|
The "futuristic design" of Star Trek has a name -- "midcentury-modern". Who knew? Star Trek "helped make midcentury-modern the signature sci-fi aesthetic." |
|
|
Low-level Guidance (llguidance) is a tool that can enforce an arbitrary context-free grammar on the output of an LLM.
"Given a context-free grammar, a tokenizer, and a prefix of tokens, llguidance computes a token mask - a set of tokens from the tokenizer - that, when added to the current token prefix, can lead to a valid string in the language defined by the grammar. Mask computation takes approximately 1ms of single-core CPU time for a tokenizer with 100k tokens. While this timing depends on the exact grammar, it holds, for example, for grammars derived from JSON schemas. There is no significant startup cost."
"The library implements a context-free grammar parser using Earley's algorithm on top of a lexer based on derivatives of regular expressions. Mask computation is achieved by traversing the prefix tree (trie) of all possible tokens, leveraging highly optimized code." |
|
|
The US military just created an "AI Rapid Capabilities Cell" "focused on accelerating Department of Defense adoption of next-generation artificial intelligence such as Generative AI (GenAI)."
"The AI Rapid Capabilities Cell will lead efforts to accelerate and scale the deployment of cutting-edge AI-enabled tools, to include Frontier models, across the Department of Defense."
The AI Rapid Capabilities Cell will replace Task Force Lima, the Department of Defense's generative AI initiative that I didn't know existed until reading this press release about how it won't exist any more. Task Force Lima identified "pilots" and the AI Rapid Capabilities Cell will execute the pilots. These are:
"Warfighting: Command and Control and decision support, operational planning, logistics, weapons development and testing, uncrewed and autonomous systems, intelligence activities, information operations, and cyber operations,"
"Enterprise management: financial systems, human resources, enterprise logistics and supply chain, health care information management, legal analysis and compliance, procurement processes, and software development and cyber security,"
Whew, got that?
Remember a decade or two ago when we futurists debated whether AI would ever be used in weapons? And here we are, watching AI get thoroughly integrated into the military, lol. Not just a weapons system here or there, but every aspect of the military. Command and Control and decision support, operational planning, logistics, weapons development and testing, uncrewed and autonomous systems, intelligence activities, information operations, cyber operations, financial systems, human resources, enterprise logistics and supply chain, health care information management, legal analysis and compliance, procurement processes, and software development and cyber security. |
|
|
"Total app reliance" is a phrase now. I'm surprised it's taken this long.
"Members use the Zipcar app to locate cars, unlock and lock them, share images of the vehicle (for proof that you didn't damage it), and report concerns. One typically goes through the entire Zipcar rental process without interacting with a human."
"Without the app support, people could not unlock cars to start rentals, open cars that didn't come with keys, lock cars, and/or return cars before their rental period expired."
"Users reported long wait times with customer support, enduring cold temperatures while locked out of vehicles, and trepidation regarding cars they couldn't lock. 404 Media spoke with an unnamed person who said their friend's passport was locked in a Zipcar, adding that he 'missed his flight last night and his final exam today because of this.'" |
|
|
Exa Websets purports to turn the whole internet into a searchable database.
"All AI startups building new LLMs chips that are post series A."
"All PhDs who have worked on developer products and graduated from a top university and have a blog."
"Obviously traditional search tools can't do these things. You don't even think to ask them that because they weren't built to be a database."
"So how do we do it? Well, we built the first web-scale embeddings-based search engine. Essentially, we trained an AI system to organize the whole web by meaning."
They claim "Exa's system knows when to use more compute to agentically research and verify each result. That means Exa Websets might take a long time to complete."
But it's not available now. You can join the waitlist. If this works as advertised, it'll be amazing. |
|
|
"Some experts give the Voyagers only about five years before we lose contact."
"The probes are running critically short of electricity from what are called their 'nuclear batteries' -- actually radioisotope thermoelectric generators that make electricity from the radioactive decay of plutonium. The fading power of the probes and the difficulties of making contact over more than 10 billion miles means that, one day soon, one or other of the Voyagers won't answer NASA's daily attempts to communicate via the Deep Space Network of radio dishes. Both probes use heaters to keep key instruments warm and keep the hydrazine in the fuel lines liquid: When the fuel freezes up, the probes won't be able to use their thrusters to keep their main radio antennae pointed at the Earth, and their communications will come to an end." |
|
|
The abject weirdness of AI ads. Not ads made using AI, ads made by AI companies about their AI products.
"I'm trying to find holiday gifts for my sisters. I open a bunch of tabs, I want my wife's advice."
"The company later pulled the ad after facing backlash for taking a sweet father-daughter exchange and automating it away."
"Many people pointed out that you could have just asked the stranger what type of dog they have, and maybe you would have found a friend alongside the dog's breed."
"An AI startup called Friend released a promotional video showing how lonely young people could have a virtual companion in the startup's AI device that they wear around their neck, instead of talking to others."
"Intelligence so big, you'd swear it was from Texas."
"Adapt your workforce at the speed of AI."
"AI that talks to cars and talks to wildlife." |
|
|
Aurora DSQL is a new "serverless" database system from Amazon Web Services.
"Aurora DSQL is a new serverless SQL database, optimized for transaction processing, and designed for the cloud. DSQL is designed to scale up and down to serve workloads of nearly any size, from your hobby project to your largest enterprise application. All the SQL stuff you expect is there: transactions, schemas, indexes, joins, and so on, all with strong consistency and isolation."
If you're wondering what they mean by "serverless":
"Here, we mean that you create a cluster in the AWS console (or API or CLI), and that cluster will include an endpoint. You connect your PostgreSQL client to that endpoint. That's all you have to do: management, scalability, patching, fault tolerance, durability, etc are all built right in. You never have to worry about infrastructure."
If you're wondering about the technology behind it, they say:
"At the same time, a few pieces of technology were coming together. One was a set of new virtualization capabilities, including Caspian (which can dynamically and securely scale the resources allocated to a virtual machine up and down), Firecracker (a lightweight VMM for fast-scaling applications), and the VM snapshotting technology we were using to build Lambda Snapstart."
"The second was EC2 time sync, which brings microsecond-accurate time to EC2 instances around the globe. High-quality physical time is hugely useful for all kinds of distributed system problems. Most interestingly, it unlocks ways to avoid coordination within distributed systems, offering better scalability and better performance."
"The third was Journal, the distributed transaction log we'd used to build critical parts of multiple AWS services (such as MemoryDB, the Valkey compatible durable in-memory database). Having a reliable, proven, primitive that offers atomicity, durability, and replication between both availability zones and regions simplifies a lot of things about building a database system (after all, Atomicity and Durability are half of ACID)."
"The fourth was AWS's strong formal methods and automated reasoning tool set. Formal methods allow us to explore the space of design and implementation choices quickly, and also helps us build reliable and dependable distributed system implementations. Distributed databases, and especially fast distributed transactions, are a famously hard design problem, with tons of interesting trade-offs, lots of subtle traps, and a need for a strong correctness argument. Formal methods allowed us to move faster and think bigger about what we wanted to build." |
|
|
"I've observed two distinct patterns in how teams are leveraging AI for development. Let's call them the "bootstrappers" and the "iterators." Both are helping engineers (and even non-technical users) reduce the gap from idea to execution (or minimum viable product (MVP))."
"The Bootstrappers: Zero to MVP: Start with a design or rough concept, use AI to generate a complete initial codebase, get a working prototype in hours or days instead of weeks, focus on rapid validation and iteration."
"The Iterators: daily development: Using AI for code completion and suggestions, leveraging AI for complex refactoring tasks, generating tests and documentation, using AI as a 'pair programmer' for problem-solving."
The "bootstrappers" use tools like Bolt, v0, and screenshot-to-code AI, while "iterators" use tools like Cursor, Cline, Copilot, and WindSurf.
But there is a "hidden cost".
"When you watch a senior engineer work with AI tools like Cursor or Copilot, it looks like magic, absolutely amazing. But watch carefully, and you'll notice something crucial: They're not just accepting what the AI suggests. They're constantly: Refactoring the generated code into smaller, focused modules, adding edge case handling the AI missed, strengthening type definitions and interfaces, questioning architectural decisions, and adding comprehensive error handling."
"In other words, they're applying years of hard-won engineering wisdom to shape and constrain the AI's output."
The author speculates on two futures for software: One is "agentic AI", where AI gets better and better and teams of AI agents take on more and more of the work done by humans. The other is "software as craft", where humans make high-quality, polished software with empathy, experience, and a deep care for craft that can't be AI-generated.
The article used the term "P2 bugs" without explaining what that means. P2 means "priority 2". The idea is people focus all their attention on "priority 1" bugs, but fixing all the "priority 2" bugs is what makes software feel "polished" to the end user.
Commentary: My own experience is that AI is useful for certain use cases. If your situation fits those use cases, AI is magic. If your situation doesn't fit those use cases, AI isn't useful, or is of marginal utility. Because AI is useful-or-not depending on situation, it doesn't provide the across-the-board 5x productivity improvement that employers expect today. My feeling is that the current generation of LLMs aren't good enough to fix this, but because of the employer expectation, I have to keep trying new AI tools in pursuit of the expected 5x improvement in productivity. (If you are able to achieve a 5x productivity improvement over 2 years ago on a large (more than a half million lines of code) codebase written in a crappy language, get in touch with me -- I want to know how you do it.) |
|
|
"Where are today's Michelangelos? Goyas? Shakespeares? Cervantes? Goethes? Montaignes? Pushkins? Dostoevskys? Balzacs? Mozarts? Where are our Einsteins, our Darwins, our Maxwells, our Newtons, our Aristotles, our Socrates?"
In the past, there were geniuses, but why not today?
The article considers some interesting hypotheses, like the switch from tutoring to bureaucratized mass education systems.
But, spoiler, I'll just jump to the conclusion: Geniuses are in new fields, not established fields. The discovery of element 117 took a large team of people across multiple continents, all to prove the element had existed in a particle accelerator for a few milliseconds. In the 1670s, when chemistry was a new field, element 15 (phosphorus) could be discovered by 1 person.
That's why today, the geniuses aren't chemistry geniuses. They're in new fields like AI and cryptocurrency. They're people like Vitalik Buterin, inventor of Ethereum, the first cryptocurrency to support smart contracts, and Geoffrey Hinton, co-inventor (with David Rumelhart and Ronald J. Williams) of the now-ubiquitous backpropagation algorithm that is essential to training any neural network that goes beyond a single layer. |
|
|
Diffusion models are evolutionary algorithms, claims a team of researchers from Tufts, Harvard, and TU Wien.
"At least two processes in the biosphere have been recognized as capable of generalizing and driving novelty: evolution, a slow variational process adapting organisms across generations to their environment through natural selection; and learning, a faster transformational process allowing individuals to acquire knowledge and generalize from subjective experience during their lifetime. These processes are intensively studied in distinct domains within artificial intelligence. Relatively recent work has started drawing parallels between the seemingly unrelated processes of evolution and learning. We here argue that in particular diffusion models, where generative models trained to sample data points through incremental stochastic denoising, can be understood through evolutionary processes, inherently performing natural selection, mutation, and reproductive isolation."
"Both evolutionary processes and diffusion models rely on iterative refinements that combine directed updates with undirected perturbations: in evolution, random genetic mutations introduce diversity while natural selection guides populations toward greater fitness, and in diffusion models, random noise is progressively transformed into meaningful data through learned denoising steps that steer samples toward the target distribution. This parallel raises fundamental questions: Are the mechanisms underlying evolution and diffusion models fundamentally connected? Is this similarity merely an analogy, or does it reflect a deeper mathematical duality between biological evolution and generative modeling?"
"To answer these questions, we first examine evolution from the perspective of generative models. By considering populations of species in the biosphere, the variational evolution process can also be viewed as a transformation of distributions: the distributions of genotypes and phenotypes. Over evolutionary time scales, mutation and selection collectively alter the shape of these distributions. Similarly, many biologically inspired evolutionary algorithms can be understood in the same way: they optimize an objective function by maintaining and iteratively changing a large population's distribution. In fact, this concept is central to most generative models: the transformation of distributions. Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and diffusion models are all trained to transform simple distributions, typically standard Gaussian distributions, into complex distributions, where the samples represent meaningful images, videos, or audio, etc."
"On the other hand, diffusion models can also be viewed from an evolutionary perspective. As a generative model, diffusion models transform Gaussian distributions in an iterative manner into complex, structured data-points that resemble the training data distribution. During the training phase, the data points are corrupted by adding noise, and the model is trained to predict this added noise to reverse the process. In the sampling phase, starting with Gaussiandistributed data points, the model iteratively denoises to incrementally refine the data point samples. By considering noise-free samples as the desired outcome, such a directed denoising can be interpreted as directed selection, with each step introducing slight noise, akin to mutations. Together, this resembles an evolutionary process, where evolution is formulated as a combination of deterministic dynamics and stochastic mutations within the framework of non-equilibrium thermodynamics. This aligns with recent ideas that interpret the genome as a latent space parameterization of a multi-scale generative morphogenetic process, rather than a direct blueprint of an organism. If one were to revert the time direction of an evolutionary process, the evolved population of potentially highly correlated high-fitness solutions will dissolve gradually, i.e., step by step and thus akin to the forward process in diffusion models, into the respectively chosen initial distribution, typically Gaussian noise."
The researchers proceed to present a mathematical representation of diffusion models. Then, "By substituting Equations 8 and 10 into Equation 5, we derive the Diffusion Evolution algorithm: an evolutionary optimization procedure based on iterative error correction akin to diffusion models but without relying on neural networks at all." They present pseudocode for an algorithm to demonstrate this.
Equations 1-3 are about the added noise, equations 4-5 are about reversing the process and using a neural network to estimate and remove the noise, equation 6 represents the process using Bayes' Theorem and introduces a representation using functions (f() and g()), and equations 7-9 are some plugging and chugging, changing the representation of those equations to get the form where you can substitute back into equation 5 as mentioned above.
"When inversely denoising, i.e., evolving from time T to 0, while increasing alpha-sub-t, the Gaussian term will initially have a high variance, allowing global exploration at first. As the evolution progresses, the variance decreases giving lower weight to distant populations, leads to local optimization (exploitation). This locality avoids global competition and thus allows the algorithm to maintain multiple solutions and balance exploration and exploitation. Hence, the denoising process of diffusion models can be understood in an evolutionary manner: x-hat-0 represents an estimated high fitness parameter target. In contrast, x-sub-t can be considered as diffused from high-fitness points. The first two parts in the Equation 5, ..., guide the individuals towards high fitness targets in small steps. The last part of Equation 5, sigma-sub-t-w, is an integral part of diffusion models, perturbing the parameters in our approach similarly to random mutations."
Obviously, consult the paper if you want the mathematical details.
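But if you'd rather see the flavor of the algorithm in code than in equations, here is a stripped-down sketch of the idea as I understand it (my paraphrase, not the authors' pseudocode): each generation, every individual estimates a high-fitness target (the analogue of x-hat-0) as a fitness-weighted average of nearby individuals, takes a small step toward it, and gets a mutation-like noise kick, with the neighborhood shrinking over time so global exploration gives way to local exploitation.

```python
# Toy sketch of "Diffusion Evolution" (a paraphrase of the paper's idea,
# not its exact pseudocode). Individuals estimate a high-fitness target as
# a locally fitness-weighted average of the population, step toward it,
# and receive a mutation-like noise kick that shrinks over time.
import numpy as np

rng = np.random.default_rng(0)

def fitness(x):
    # Two optima, near (-1, -1) and (+1, +1); locality lets the algorithm keep both.
    return np.exp(-np.sum((x - 1.0) ** 2, axis=-1)) + \
           np.exp(-np.sum((x + 1.0) ** 2, axis=-1))

pop = rng.normal(size=(256, 2))          # start from Gaussian "noise"
T = 100
for t in range(T, 0, -1):
    sigma = 2.0 * t / T                  # neighborhood and noise shrink over time
    f = fitness(pop)
    # Estimate each individual's high-fitness target from nearby, fit individuals
    # (the Gaussian kernel provides the locality / "reproductive isolation").
    d2 = np.sum((pop[:, None, :] - pop[None, :, :]) ** 2, axis=-1)
    w = np.exp(-d2 / (2 * sigma ** 2)) * f[None, :]
    w /= w.sum(axis=1, keepdims=True)
    x0_hat = w @ pop
    # Small step toward the estimate plus a mutation-like perturbation.
    pop += 0.2 * (x0_hat - pop) + 0.05 * sigma * rng.normal(size=pop.shape)

print("mean fitness:", fitness(pop).mean())
# Rough check that individuals survive on both sides, i.e. both optima are kept.
print("both modes kept:", (pop[:, 0] > 0).any() and (pop[:, 0] < 0).any())
```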
"We conduct two sets of experiments to study Diffusion Evolution in terms of diversity and solving complex reinforcement learning tasks. Moreover, we utilize techniques from the diffusion models literature to improve Diffusion Evolution. In the first experiment, we adopt an accelerated sampling method to significantly reduce the number of iterations. In the second experiment, we propose Latent Space Diffusion Evolution, inspired by latent space diffusion models, allowing us to deploy our approach to complex problems with high-dimensional parameter spaces through exploring a lower-dimensional latent space."
"Our method consistently finds more diverse solutions without sacrificing fitness performance. While CMA-ES shows higher entropy on the Ackley and Rastrigin functions, it finds significantly lower fitness solutions compared to Diffusion Evolution, suggesting it is distracted by multiple solutions rather than finding diverse ones.
"We apply the Diffusion Evolution method to reinforcement learning tasks to train neural networks for controlling the cart-pole system. This system has a cart with a hinged pole, and the objective is to keep the pole vertical as long as possible by moving the cart sideways while not exceeding a certain range."
"Deploying our original Diffusion Evolution method to this problem results in poor performance and lack of diversity. To address this issue, we propose Latent Space Diffusion Evolution: inspired by the latent space diffusion model, we map individual parameters into a lower-dimensional latent space in which we perform the Diffusion Evolution Algorithm. However, this approach requires a decoder and a new fitness function f-prime for z, which can be challenging to obtain."
"We also found that this latent evolution can still operate in a much larger dimensional parameter space, utilizing a three-layer neural network with 17,410 parameters, while still achieving strong performance. Combined with accelerated sampling method, we can solve the cart pole task in only 10 generations, with 512 population size, one fitness evaluation per individual."
"This parallel we draw here between evolution and diffusion models gives rise to several challenges and open questions. While diffusion models, by design, have a finite number of sampling steps, evolution is inherently open-ended. How can Diffusion Evolution be adapted to support open-ended evolution? Could other diffusion model implementations yield different evolutionary methods with diverse and unique features? Can advancements in diffusion models help introduce inductive biases into evolutionary algorithms? How do latent diffusion models correlate with neutral genes? Additionally, can insights from the field of evolution enhance diffusion models?" |
|
|
Genie 2 is a new foundation "world model" from DeepMind, "capable of generating an endless variety of action-controllable, playable 3D environments for training and evaluating embodied agents. Based on a single prompt image, it can be played by a human or AI agent using keyboard and mouse inputs."
Apparently these models that you can interact with like video games have a name now: "world models".
"Until now, world models have largely been confined to modeling narrow domains. In Genie 1, we introduced an approach for generating a diverse array of 2D worlds. Today we introduce Genie 2, which represents a significant leap forward in generality. Genie 2 can generate a vast diversity of rich 3D worlds."
"Genie 2 responds intelligently to actions taken by pressing keys on a keyboard, identifying the character and moving it correctly. For example, our model has to figure out that arrow keys should move the robot and not the trees or clouds."
"We can generate diverse trajectories from the same starting frame, which means it is possible to simulate counterfactual experiences for training agents."
"Genie 2 is capable of remembering parts of the world that are no longer in view and then rendering them accurately when they become observable again."
"Genie 2 generates new plausible content on the fly and maintains a consistent world for up to a minute."
"Genie 2 can create different perspectives, such as first-person view, isometric views, or third person driving videos."
"Genie 2 learned to create complex 3D visual scenes."
"Genie 2 models various object interactions, such as bursting balloons, opening doors, and shooting barrels of explosives."
"Genie 2 models other agents" -- NPCs -- "and even complex interactions with them."
"Genie 2 models water effects."
"Genie 2 models smoke effects."
"Genie 2 models gravity."
"Genie 2 models point and directional lighting."
"Genie 2 models reflections, bloom and coloured lighting."
"Genie 2 can also be prompted with real world images, where we see that it can model grass blowing in the wind or water flowing in a river."
"Genie 2 makes it easy to rapidly prototype diverse interactive experiences."
"Thanks to Genie 2's out-of-distribution generalization capabilities, concept art and drawings can be turned into fully interactive environments."
"By using Genie 2 to quickly create rich and diverse environments for AI agents, our researchers can also generate evaluation tasks that agents have not seen during training."
"The Scalable Instructable Multiworld Agent (SIMA) is designed to complete tasks in a range of 3D game worlds by following natural-language instructions. Here we used Genie 2 to generate a 3D environment with two doors, a blue and a red one, and provided instructions to the SIMA agent to open each of them."
Towards the very end of the blog post, we are given a few hints as to how Genie 2 works internally.
"Genie 2 is an autoregressive latent diffusion model, trained on a large video dataset. After passing through an autoencoder, latent frames from the video are passed to a large transformer dynamics model, trained with a causal mask similar to that used by large language models."
"At inference time, Genie 2 can be sampled in an autoregressive fashion, taking individual actions and past latent frames on a frame-by-frame basis. We use classifier-free guidance to improve action controllability." |
|
|
The first UEFI bootkit designed for Linux systems (named Bootkitty by its creators) has been discovered.
UEFI (which stands for Unified Extensible Firmware Interface) is the modern replacement for the BIOS, the first code that runs when a computer is turned on. Its job is to load the operating system. Starting with version 2 of UEFI, cryptography has been incorporated to enforce security on this whole bootstrap process (the "Secure Boot" mentioned below).
A rootkit is a piece of malware that infects and replaces part of the operating system in such a way as to conceal itself. If that rootkit sits in the boot code that the BIOS, or now the UEFI firmware, uses to bootstrap the operating system, it's called a bootkit. Such bootkits can do things like defeat disk encryption, because they are loaded before the disk encryption system is up and running. Once the full OS is bootstrapped, the bootkit can run in kernel mode with full OS privileges. In this position it can intercept anything, including encryption keys and passwords.
"The bootkit's main goal is to disable the kernel's signature verification feature and to preload two as yet unknown ELF binaries via the Linux init process (which is the first process executed by the Linux kernel during system startup). During our analysis, we discovered a possibly related unsigned kernel module -- with signs suggesting that it could have been developed by the same author(s) as the bootkit -- that deploys an ELF binary responsible for loading yet another kernel module unknown during our analysis."
ELF stands for Executable and Linkable Format and is a file format for executable code on Linux systems.
"Bootkitty is signed by a self-signed certificate, thus is not capable of running on systems with UEFI Secure Boot enabled unless the attackers certificates have been installed."
"Bootkitty is designed to boot the Linux kernel seamlessly, whether UEFI Secure Boot is enabled or not, as it patches, in memory, the necessary functions responsible for integrity verification before GRUB is executed."
"bootkit.efi contains many artifacts suggesting this is more like a proof of concept than the work of an active threat actor." |
|
|
Why OpenAI's $157B valuation misreads AI's future, according to Foundation Capital.
"OpenAI's growth has been nothing short of meteoric. Monthly revenue reached $300M in August 2023, a 1,700% increase from January. 10M users pay $20/month for ChatGPT, and the company projects $11.6B in revenue next year."
"This narrative collides with a stubborn reality: the economics of AI don't work like traditional software. OpenAI is currently valued at 13.5x forward revenue -- similar to what Facebook commanded at its IPO. But while Facebook's costs decreased as it scaled, OpenAI's costs are growing in lockstep with its revenue, and sometimes faster."
"In traditional software, increasing scale leads to improving economics. A typical software company might spend heavily on development upfront, but each additional user costs almost nothing to serve. Fixed costs are spread across a growing revenue base, creating the enviable margins that make today's tech giants among the most profitable businesses in history."
"Generative AI plays by different rules. Each query to a model costs money in compute resources, while each new model requires massive investments in training. OpenAI expects to lose $5B this year on $3.7B in revenue." |
|
|
"Global AI Vibrancy Tool" from Stanford's Human-Centered Artificial Intelligence lab.
The US ranks first, followed by China, the UK, India, the United Arab Emirates, France, South Korea...
But what's interesting is the ranking is determined by "R&D," "Responsible AI," "Economy," "Education," "Diversity," "Policy and governance," "Public opinion," and "Infrastructure," and you can change the "weighting" of each of those factors and watch how the rankings change.
Ranked by "R&D," the US comes out on top, but change the ranking to based on "Education" and the UK comes out on top (followed by France and the United Arab Emirates). Change to "Diversity" and India comes out on top (the US goes down to number 27). Change to "Public opinion" and Saudi Arabia goes up to number 2 (really? Apparently this means people talk about them a lot on social media, not that people talk *positively* about them, necessarily). Select "Infrastructure" and Israel shoots up to number 8 (from 16). |
|
|
AI won't fix the fundamental flaw of programming, says YouTuber "Philomatics".
His basic thesis is that the "fundamental flaw of programming" is that software is unreliable and people no longer even expect it to be reliable.
"Jonathan Blow did an informal experiment where he took a screenshot every time some piece of software had an obvious bug in it. He couldn't keep this up for more than a few days because there were just too many bugs happening all the time to keep track of."
"I think we've all gotten so used to this general flakiness of software that we don't even notice it anymore. Workarounds like turning it off and on again or 'force quitting' applications have become so ingrained in us that they're almost part of the normal operation of the software. Smartphones are even worse in this regard. I'm often hesitant to do things in the mobile browser, for example using a government website or uploading my résumé to a job board, because things often just don't work on mobile.
He goes on to say the cause of this is that we stack software abstractions higher and higher, but (citing Joel Spolsky), ultimately all non-trivial abstractions are leaky. (Joel Spolsky actually wrote, in 2002, an essay called "The Law of Leaky Abstractions".)
AI is the next pile of abstractions we are going to throw on the stack. Just as it's possible, in principle, for people to look at and edit a compiler's binary output but nobody does it, it's possible for people to read and edit the output of AI systems that produce code, but before long, nobody will do it. AI code generators will become the next generation of compilers, allowing people to "write" code at a higher level of abstraction while leaving the details to the AI systems. It won't make software more reliable.
Is software that unreliable, though? I recently upgraded my mobile phone and various things that were broken on the old phone (2 OS versions older) magically started working just fine. Considering the millions of lines of code running every time I run an app or view a webpage, "obvious bugs" are actually few and far between. |
|