Boulder Future Salon

"New method for producing synthetic DNA". So apparently a technique for synthesizing short DNA sequences based on phosphoramidite chemistry was invented in 1981, and to this day is considered the "gold standard", yet I only heard about it for the first time today?

First is the question of what phosphoramidite is. Apparently even that has a pretty complicated answer. Perhaps we should just say it's a phosphite ester, where "phosphite" means it's a particular type of phosphorus compound and "ester" means a compound formed from an alcohol and an acid. The DNA sequences are made in a 4-step process where the 4 steps are called coupling, capping, oxidation, and deprotection. Apparently the "protecting groups" are key to the process, but I don't understand how that works, so I'm not going to try to explain it to you. And it's not what this research is about, which is the phosphoramidite itself.

The problem is that phosphoramidites have to be stored at -20 Celsius (-4 Fahrenheit) under inert gas, because they're so reactive they will oxidize other chemicals. Once they're taken out for use in synthesis machines, they degrade rapidly, even right in the machine being used for DNA synthesis.

The solution invented here is to not have phosphoramidites at all until the last second -- to store the building blocks of phosphoramidites and do the phosphoramidites synthesis "on demand" right in the machine. "In the method of producing phosphoramidites, nucleosides (starting materials) are flushed through a solid material (resin), which can potentially be fully integrated into an automated process in the instrument for DNA synthesis. The resin ensures that the nucleosides are rapidly phosphorylated, whereby the nucleosides are converted to phosphoramidites within a few minutes. From the resin, the phosphoramidites are automatically flushed on to the part of the instrument which is responsible for the DNA synthesis."

I'm guessing the practical effect of this will be a lot more and cheaper DNA synthesis for all kinds of uses.

Loss of smell is one of the earliest and most commonly reported symptoms of covid-19 but the mechanism causing it has been unknown. Apparently people guessed that "a transient edema of the olfactory clefts inhibited airflow transporting odor molecules to the olfactory neurons", in other words, "the familiar sensation of a blocked nose experienced during a common cold", but new research shows this is incorrect. The new research shows the process is as follows:

1) "Cilia carried by sensory neurons are lost post-viral infection. These cilia enable the sensory neurons to receive odor molecules";
2) "Virus present in sensory neurons";
3) "Disruption of the olfactory epithelium (sensory organ) integrity linked to apoptosis (i.e. cell death). The epithelium is organized into regular lamellae, which are destructured by coronavirus infection";
4) "Virus dissemination to the olfactory bulb, which is the first cerebral relay station in the olfactory system";
5) "Inflammation and viral RNA present in several regions of the brain."

Apparently SARS-CoV-2, along with its associated inflammation, can persist in the olfactory neuroepithelium long after symptoms have cleared from the rest of the body, making it hard to detect in tests. The researchers verified all this by infecting golden Syrian hamsters with SARS-CoV-2, which "induced acute anosmia and ageusia" (loss of smell and loss of taste).

Cells time DNA replication to the circadian rhythm and DNA replication stops at night -- at least in Cyanobacteria. One wonders how much of this applies to multicellular animals like us humans, as we know sleep disruption, such as from shift work, can increase the odds of serious health problems like heart disease and cancer, not just less serious things like mood disorders.

Cyanobacteria are photosynthetic so it makes sense that they'd shut down energy-intensive processes at night when there's no sunlight.

The researchers here established that the whole genome replication process is timed to the circadian rhythm, with initiation taking place early in the day, enabling almost all replication to complete before sunset. They disrupted it on purpose both environmentally, by manipulating the light source (turning the lights on at night or off during the day), and genetically: there are genes called the kaiBC genes that, if you knock them out, break the link between DNA replication and the circadian rhythm. The effect of these disruptions isn't to have DNA replication continue into nighttime; it's to have DNA replication stop with a boatload of incomplete, partially replicated chromosomes lying around.

You might think that's no big deal if the DNA replication is simply put "on pause" and resumes later, and indeed the researchers found the bacteria do have the ability to resume the replication process and clean up all those incomplete, partially replicated chromosomes. But for some reason the growth of those bacteria was stunted compared with cyanobacteria that were able to do all their DNA replication "in phase" with the circadian rhythm.

Key to all this research is a chemical called "EdU" (whose full name is "5-ethynyl-2'-deoxyuridine", I'm sure you're eager to know), a thymidine analogue that gets taken up into DNA in place of thymidine (one of the DNA building blocks) during replication and can be fluorescently labeled, which enabled the researchers to see DNA replication happening (using a technique called time-lapse fluorescence microscopy).

Video from a gamer on Bitcoin. If you're wondering why a gamer cares about Bitcoin, it's because GPUs have become scarce and expensive. And why is that? Well, of course, because people are buying them for cryptocurrency mining.

As you know when I put my "futurist" hat on my prediction is that the cryptocurrency world will move away from currencies that use mining, also known as proof-of-work currencies, to currencies that don't use mining, also known as proof-of-stake (although there are some other variants with other names). Bitcoin may be king of the hill right now, but it won't stay there forever. I'm not predicting any dates, though -- predicting timelines is super hard. In the meantime, if you want to buy a GPU for gaming, you may have to pony up a bit more than you'd like.

Deepfake editing what talking heads say. This technique reduces the time required to synthesize video from hours to about 40 seconds for a six-word edit, by developing a fast algorithm for searching a source database for lip motions. The system is able to do this with only a small amount of video of the person targeted; 2 or 3 minutes is often enough. This is accomplished by pre-training the neural network on hours of video containing all the usual English phonemes and lip movements (a spoken version of something called the TIMIT corpus), so the neural network only has to learn how to modify this for the specific person being targeted. This modification step is called transfer learning, and the researchers developed a new self-supervised neural "retargeting" technique to accomplish this transfer of lip motions to the target person. Their software provides extra controls to refine results, allowing users to do things like smooth over jumpy transitions and force mouth closure. It also enables insertion of non-verbal mouth gestures with the same text interface, as well as controls to vary speaking styles.

The system has two primary steps, called the preprocessing pipeline and the synthesis pipeline. The preprocessing pipeline further breaks down into two steps, called phoneme alignment and 3D head model registration. The synthesis pipeline breaks down into four steps, called fast phoneme search and stitching, neural retargeting, expansion to full parameters, and neural rendering.

Going through each of these sub-steps in turn: The phoneme alignment step requires the video be paired with its text transcript, and it figures out what all the phonemes are and matches them to their exact time codes in the video.

The 3D head model registration step goes frame by frame through the video and figures out 80 parameters for 3D facial geometry, 80 for facial reflectance, 3 for head pose, 27 for scene illumination, and 64 for face and lip expressions.
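Tallying those groups up, each frame gets described by 254 numbers. A quick sketch (the grouping and sizes are from the paper; the class and names are my own invention):

```python
from dataclasses import dataclass, astuple
from typing import List

@dataclass
class FrameParams:
    """Per-frame 3D head model parameters, with the sizes the paper reports."""
    geometry: List[float]      # 80 values: 3D facial geometry
    reflectance: List[float]   # 80 values: facial reflectance
    pose: List[float]          # 3 values: head pose
    illumination: List[float]  # 27 values: scene illumination
    expression: List[float]    # 64 values: face and lip expressions

    def as_vector(self) -> List[float]:
        """Flatten all groups into one parameter vector."""
        vec = []
        for group in astuple(self):
            vec.extend(group)
        return vec

p = FrameParams([0.0] * 80, [0.0] * 80, [0.0] * 3, [0.0] * 27, [0.0] * 64)
print(len(p.as_vector()))  # 254 parameters per frame
```

Note that only the 64 expression values get swapped in by the synthesis pipeline; the other 190 stay tied to the original footage.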

Switching over to the synthesis pipeline, the fast phoneme search and stitching step finds subsequences of phonemes in the database and stitches together the corresponding 3D head model parameters for those phonemes.

The neural retargeting step converts the expression parameters from the fast phoneme search and stitching step into parameters that work best for the target person.

The expansion to full parameters step merges the target person expression parameters from the neural retargeting step with the full set of geometry, reflectance, pose and illumination parameters from the input target video.

The final step, the neural rendering step, as you might have guessed, uses a generative adversarial network (GAN) trained on video of the target person to generate the final photorealistic video.

Apparently in perfect information games like Go and chess, the best reinforcement learning strategies are based on either counterfactual regret minimization or Monte Carlo Tree Search, despite the heavy amount of computing power required. If memory serves the AlphaGo system made by DeepMind used Monte Carlo Tree Search.

For imperfect information games, though, games like poker and StarCraft 2, a different, lesser-known technique called "neural fictitious self-play" works better. In neural fictitious self-play, fictitious players choose the best response to their opponents' average behavior. As that suggests, this means giving up on trying to find the "best" move in any one game position. But in an imperfect information game, there isn't necessarily a single "best" move anyway. Oh, I should probably also clarify what we mean by "works better": it means the technique finds the Nash Equilibrium more easily. The Nash Equilibrium is the state where no player can improve their play by switching to any strategy other than their current one.

Normally with neural fictitious self-play, the system calculates what move to make according to its normal strategy, called the "policy" in reinforcement learning parlance, but alongside that it calculates the "best" move based on the averages, and with some probability will switch to playing the "best" move instead of the regular "policy". What this research is about is improving on this system: instead of calculating the "best" move the regular way, you use something they call a regret matching method. The method is called Advantage-based Regret Minimization.

The starting point is the aforementioned counterfactual regret minimization strategy, which involves calculating a "regret" number and trying to minimize it. The Advantage-based Regret Minimization modifies it to use a "clipped" cumulative advantage function, which chops off values below some threshold. There's a whole class of reinforcement learning algorithms, including the famous Deep Q-Learning that stunned the world by winning Breakout years ago, that are based on an advantage function. Why "clipping" the advantage function should do a better job of "regret minimization" for imperfect information games, though, I can't explain.
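To make "regret matching" concrete, here's a minimal sketch of the classic version on a one-shot decision (not the authors' Advantage-based Regret Minimization): each action gets played with probability proportional to its positive cumulative regret, and the max(r, 0) here plays a role loosely analogous to the paper's "clipping" of the advantage function.

```python
def regret_matching_strategy(cum_regret):
    """Play each action with probability proportional to its positive regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0:
        return [1.0 / n] * n  # no positive regret yet: play uniformly
    return [p / total for p in positive]

def update_regret(cum_regret, payoffs, action_played):
    """Regret for action a = payoff(a) minus payoff of the action played."""
    base = payoffs[action_played]
    for a, pay in enumerate(payoffs):
        cum_regret[a] += pay - base
    return cum_regret

# Toy example: two actions, action 1 always pays more. If we stubbornly
# play action 0, regret for action 1 accumulates, and regret matching
# shifts all probability onto action 1.
cum = [0.0, 0.0]
for _ in range(100):
    cum = update_regret(cum, payoffs=[1.0, 2.0], action_played=0)
strat = regret_matching_strategy(cum)
print(strat)  # [0.0, 1.0]: all probability on the higher-payoff action
```

Counterfactual regret minimization applies this same per-action regret bookkeeping at every decision point of a game tree, which is where the heavy computation comes from.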

"On buggy resizing libraries and surprising subtleties in FID calculation". Well, that's quite a title. But what this is about is something I did not know, which is that the way generative neural networks are evaluated is by something called the "Fréchet Inception Distance". Hence the "FID" in the title. I guess I'd never given serious thought to how to evaluate generative networks such as GANs (generative adversarial networks).

The idea is that with generative neural networks, say ones that generate photos of people who don't exist, it wouldn't make sense to compare the images to another image pixel-by-pixel, which non-generative neural networks can sometimes do (depending on their function) as part of their training. The Fréchet Inception Distance, instead, goes under the hood and looks at one of the deeper layers in a convolutional neural network. The aim is to tap into nodes of the network that correspond to real-world objects and better mimic human perception. Once these nodes are tapped into, their probability distributions are compared, using some statistical magic called the Wasserstein metric.
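That "statistical magic" has a closed form if you model each set of activations as a Gaussian: FID = ||mu1 - mu2||^2 + Tr(C1 + C2 - 2*sqrt(C1*C2)). A minimal sketch (NumPy + SciPy; real FID uses activations from a specific layer of an Inception-v3 network, which I'm standing in for here with arbitrary feature arrays):

```python
import numpy as np
from scipy import linalg

def fid(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two sets of features.

    FID = ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 * sqrtm(C_a @ C_b))
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = linalg.sqrtm(cov_a @ cov_b).real  # drop tiny imaginary parts
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

rng = np.random.default_rng(0)
real = rng.normal(size=(1000, 8))   # stand-in "real image" features
same = real.copy()                  # identical distribution
shifted = real + 3.0                # same shape, mean shifted by 3 per dim

print(abs(fid(real, same)) < 1e-3)  # True: identical distributions score ~0
print(fid(real, shifted) > 50)      # True: a shifted mean blows the score up
```

Lower is better: a score near zero means the generated features are statistically indistinguishable from the real ones, which is why subtle resizing artifacts quietly polluting the features is such a problem.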

The problem, though, is that the ordinary image manipulation libraries researchers use in their day-to-day work of shoveling images into neural networks cause the Fréchet Inception Distance scores to change. The biggest offender is image resizing. And downsizing images has a bigger impact than upscaling, probably because when you downsize an image, you are throwing away information.

The libraries tested were: the Pillow Image Library (PIL), OpenCV, TensorFlow, and PyTorch. The OpenCV, TensorFlow, and PyTorch implementations introduce severe aliasing artifacts.

"A central tenet in classic image processing, signal processing, graphics, and vision is to blur or 'prefilter' before subsampling, as a means of approximately removing the high-frequency information (thus preventing its misrepresentation downstream)." Testing with specially designed sparse input images proved that these implementations do not properly blur.
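The prefilter-then-subsample rule is easy to demonstrate in one dimension. Naively taking every other sample of a high-frequency signal misrepresents it entirely, while blurring first preserves the correct low-frequency content (a pure NumPy sketch of the principle, not any of the tested libraries):

```python
import numpy as np

# A 1-pixel "checkerboard" in 1D: the highest frequency a sampled signal
# can hold. Its true average brightness is 0.5.
signal = np.tile([0.0, 1.0], 50)  # 0, 1, 0, 1, ...

# Naive 2x downsampling: take every other sample. Aliasing: we land on
# all the zeros, so the result looks completely black.
naive = signal[::2]

# Proper downsampling: blur first (a 2-tap box prefilter), then subsample.
blurred = (signal + np.roll(signal, -1)) / 2.0
proper = blurred[::2]

print(naive[:4])   # [0. 0. 0. 0.]: high frequency misrepresented as flat 0
print(proper[:4])  # [0.5 0.5 0.5 0.5]: the correct average survives
```

The paper's sparse test images work on the same idea: if a library's resizer is correct, an isolated bright pixel spreads smoothly; if not, it aliases.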

One wonders how many research papers have reported inaccurate results because of problems like this.

Nowadays, TVs can play 4K or even 8K at 120 frames per second or even 240 frames per second. But the problem is, all the video is merely HD, merely 1920x1080 (2K) and goes at merely 30 frames per second. Oh no! We can't have that. What to do? How about use neural networks to do "space-time video superresolution"?

The system here uses a 3-step process. These go by the unglamorous names of "controllable feature interpolation", "temporal feature fusion", and "high-resolution reconstruction". The first step essentially extracts "features" from adjacent frames. They don't spell out exactly what a "feature" is, but from what I could figure out, a "feature" doesn't correspond with what you or I would think of as an object in a video, but rather corresponds to a more low-level representation of something that is "warpable" in the video.

Once these features have been extracted, intermediate feature sets are made for all the frames in between, using straightforward interpolation. After this, the second step kicks in, which takes the features from the first step and improves or "refines" them using another neural network. This apparently is quite helpful for getting the features of sufficient quality that you can blow them up into high-resolution images.

The third step, of course, is to make the high-resolution video from the feature sets.
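At its simplest, the interpolation in the first step is just a weighted blend of the feature maps of the two neighboring frames (a toy sketch; the paper's "controllable feature interpolation" is far more elaborate, with deformable alignment):

```python
import numpy as np

def interpolate_features(feat_prev, feat_next, t):
    """Blend two frames' feature maps at time t in [0, 1]."""
    return (1.0 - t) * feat_prev + t * feat_next

f0 = np.zeros((4, 4))  # toy feature map of frame 0
f1 = np.ones((4, 4))   # toy feature map of frame 1
mid = interpolate_features(f0, f1, 0.5)
print(mid[0, 0])  # 0.5: halfway between the two frames

# Going from 30 fps to 120 fps needs three intermediate feature sets
# between each pair of original frames:
intermediates = [interpolate_features(f0, f1, t) for t in (0.25, 0.5, 0.75)]
print(len(intermediates))  # 3
```

The "controllable" part is that t can be anything in [0, 1], so the same network handles any target frame rate.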

This glosses over a lot of details, starting with the fact that these neural networks are crazy complicated. The first step, extracting features, uses a "temporal modulation block" architecture that involves multiple upscaling and downscaling neural networks as well as "deformable" convolutional networks, and uses multiple of these "temporal modulation blocks". The feature "refinement" step combines more of these "deformable" convolutional networks with bi-directional LSTM networks. The final step is a simpler architecture but a more massive neural network -- a 40-layer residual network.

"Deformable" convolutional neural networks are something I'd never heard of before. Apparently it's a technique intended to deal with the fact that convolutional neural networks have a fixed architecture, and you want them to be able to learn more flexibly. The way they go about doing this, though, is something like making the normally 2D convolution operation 3D. At this point I can't really comment on it.
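As I understand it, the core trick is that each sampling location of the convolution kernel gets a learned 2D offset, and the input is sampled at those offset, fractional positions via bilinear interpolation. A bare-bones sketch of just the sampling part (in a real deformable conv layer the offsets are predicted by another convolution; here they're fixed, hypothetical values):

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Sample img at a fractional (y, x) position via bilinear interpolation."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, img.shape[0] - 1), min(x0 + 1, img.shape[1] - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x1] +
            wy * (1 - wx) * img[y1, x0] + wy * wx * img[y1, x1])

def deformable_conv_at(img, cy, cx, weights, offsets):
    """One output of a 3x3 'deformable' convolution centered at (cy, cx):
    each kernel tap samples its regular grid position plus a per-tap offset."""
    grid = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
    return sum(w * bilinear_sample(img, cy + dy + oy, cx + dx + ox)
               for (dy, dx), w, (oy, ox) in zip(grid, weights, offsets))

img = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 "feature map"
weights = [1.0 / 9] * 9              # a simple averaging kernel
zero_off = [(0.0, 0.0)] * 9          # no offsets: an ordinary convolution
half_off = [(0.5, 0.0)] * 9          # hypothetical "learned" offsets

print(round(deformable_conv_at(img, 2, 2, weights, zero_off), 6))  # 12.0: plain 3x3 mean
print(round(deformable_conv_at(img, 2, 2, weights, half_off), 6))  # 14.5: grid shifted half a pixel down
```

Because the offsets are learned per position, the kernel's sampling grid can warp to follow motion, which is presumably why these show up in a video-warping pipeline.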

"After a large amount of experimentation, I've found that StyleCLIP is essentially Photoshop driven by text, with all the good, bad, and chaos that entails." He takes a photo of Mark Zuckerberg, and then gives it transformation instructions such as: "face" to "tanned face", "face with hair" to "face with Hi-top fade hair", "face with nose" to "face with flared nostrils", "face" to "Elon Musk face", "face" to "Jesus Christ face", "face" to "Dragon Ball Z Goku face", "face" to "robot face", "face" to "troll face", "face" to "troll face with large eyes", "face with eyes" to "face with very large eyes and very large mouth", "face" to the opposite of "laughing face", "face" to the opposite of "Dragon Ball Z Goku face", "face" to the opposite of "robot face", "face" to "soldier face", and "face" to "teacher face".

Hilarity ensues.

How does it work? The simple explanation is that there is an "inverter" that goes from an image to StyleGAN's parameters, and then CLIP can be used to move those parameters in some direction, after which StyleGAN uses the parameters to generate a new image. CLIP is OpenAI's Contrastive Language-Image Pre-training, which I told you about in February. The process of moving the parameters in the desired direction is actually a complicated step. CLIP takes the text and generates an encoding for it. The change in encoding is straightforward enough to calculate, but how do you translate it into changes in StyleGAN's parameters? To do this they had to train a neural network on generated StyleGAN images with only one parameter changed, and then see how the changes affected CLIP. This system was then able to map CLIP encoding changes to StyleGAN parameter changes. That's a greatly oversimplified explanation but gives you a bit of the gist of it.
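Stripped to the arithmetic, the editing step is: invert the image to a latent vector, move that vector along a direction derived from the text change, regenerate. A toy NumPy sketch where everything that would be a trained network (the CLIP encoder, the learned CLIP-to-StyleGAN mapper, the inverter) is replaced by random stand-ins, so only the "move the latent along a mapped direction" part is real:

```python
import numpy as np

rng = np.random.default_rng(1)
CLIP_DIM, LATENT_DIM = 16, 32  # toy sizes, not the real models' dimensions

# Stand-in for the learned mapping from CLIP-space directions to
# StyleGAN-parameter directions (a trained network in the real system).
mapper = rng.normal(size=(LATENT_DIM, CLIP_DIM))

def edit_latent(w, clip_src, clip_dst, strength=1.0):
    """Move latent w along the direction implied by a text change."""
    d = clip_dst - clip_src            # change in CLIP encoding
    d = d / np.linalg.norm(d)          # keep only the direction
    return w + strength * (mapper @ d) # mapped into StyleGAN's space

w = rng.normal(size=LATENT_DIM)   # latent from the "inverter"
src = rng.normal(size=CLIP_DIM)   # stand-in CLIP encoding of "face"
dst = rng.normal(size=CLIP_DIM)   # stand-in CLIP encoding of "tanned face"
w_edited = edit_latent(w, src, dst, strength=2.0)
print(w_edited.shape)  # (32,): a new latent, ready for the generator
```

The strength knob is what lets the edits range from subtle tan to full Goku.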

LALAL.AI separates vocal tracks from instruments in music. Well, I didn't try it, I just played the sample, and it sounds like it has trouble completely removing the instruments from the vocal track, but the instrumental track does sound like the voice is completely removed. So maybe this would be useful for making karaoke versions of songs. And no, they don't say how the system works, other than that it's a 45-million-parameter neural network trained on 20 terabytes of audio.

"What a crossword AI reveals about humans' way with words." "At last week's American Crossword Puzzle Tournament, held as a virtual event with more than 1,000 participants, one impressive competitor made news. For the first time, artificial intelligence managed to outscore the human solvers in the race to fill the grids with speed and accuracy."

"Dr. Fill is trained on data gleaned from past crosswords that have appeared in various outlets. To solve a puzzle, the program refers to clues and answers it has already 'seen.'"

"The human mind often navigates what's called 'multi-hop inference,' in which different bits of knowledge are combined in a chain of reasoning. Teaching an AI to follow such leaps of logic points to the subtle ways that people find meaning in language that may be oblique or downright deceptive." "Its brain still struggles to recognize alternative, less common meanings."

The White House just launched a National Artificial Intelligence Initiative to be directed by a new National Artificial Intelligence Initiative Office. The website is

Brace yourself for bureaucratese.

"The National Artificial Intelligence Initiative (NAII) was established by the National Artificial Intelligence Initiative Act of 2020 (NAIIA) – bipartisan legislation enacted on January 1, 2021. The main purposes of the initiative are to ensure continued US leadership in AI R&D; lead the world in the development and use of trustworthy AI systems in public and private sectors; prepare the present and future US workforce for the integration of artificial intelligence systems across all sectors of the economy and society; and coordinate ongoing AI activities across all Federal agencies, to ensure that each informs the work of the others."

"Located in the White House Office of Science and Technology Policy (OSTP), the National Artificial Intelligence Initiative Office (NAIIO) is legislated by the National Artificial Intelligence Initiative Act to coordinate and support the National Artificial Intelligence Initiative. The Director of the NAIIO is appointed by the Director of OSTP. The NAIIO is tasked to: Provide technical and administrative support to the Select Committee on AI (the senior interagency committee that oversees the NAII) and the National AI Initiative Advisory Committee; Oversee interagency coordination of the NAII; Serve as the central point of contact for technical and programmatic information exchange on activities related to the AI Initiative across Federal departments and agencies, industry, academia, nonprofit organizations, professional societies, State and tribal governments, and others; Conduct regular public outreach to diverse stakeholders; and Promote access to technologies, innovations, best practices, and expertise derived from Initiative activities to agency missions and systems across the Federal Government."

267 genes linked to creativity that differentiate humans from Neanderthals and chimpanzees have been identified. This was done by combining genome-wide association (GWAS) studies with cross-species genome comparisons.

The Temperament and Character Inventory was used as a starting point to identify a 'personality type' associated with creativity; genes for that 'personality type' were identified and then compared across species.

"These networks evolved in stages. The most primitive network emerged among monkeys and apes about 40 million years ago, and is responsible for emotional reactivity -- in other words, it regulates impulses, the learning of habits, social attachment, and conflict-resolution." "Less than 2 million years ago, the second network emerged. This regulates intentional self-control: self-direction and social cooperation for mutual benefit. Finally, about 100,000 years ago, the network relating to creative self-awareness emerged."

"The genes of the oldest network, that of emotional reactivity, were almost identical in Homo sapiens, Neanderthals, and chimpanzees. By contrast, the genes linked to self-control and self-awareness among Neanderthals were 'halfway between' those of chimpanzees and Homo sapiens."

"Most of these 267 genes that distinguish modern humans from Neanderthals and chimpanzees are RNA regulatory genes and not protein-coding genes. Almost all of the latter are the same across all three species, and this research shows that what distinguishes them is the regulation of expression of their proteins by genes found exclusively in humans. Using genetic markers, gene-expression data, and integrated brain magnetic resonance imaging based on AI techniques, the scientists were able to identify the regions of the brain in which those genes (and those with which they interacted) were overexpressed." These include the right amygdala, the hippocampus (left and right), the right thalamus, and the right mid cingulum. (Diagram on page 13 of the full paper.)

The evolution of mammalian brain size. It's been known for a long time that brain size scales with body size following a power law. The exponent in this power law, known as the "scaling coefficient", has been estimated to be between 2/3 and 3/4.

What they've done here, though, is gather a massive amount of data for tons of species and spanning a great deal of evolutionary time, and calculated the scaling of the brain relative to the body as an equation of the form y = bx + a. The fact that they're using a linear equation suggests they're taking the logarithm, but they don't actually say that. Anyway, once represented in this form, they can calculate both the "slope" (b in the equation) and "intercept" (a in the equation) for a boatload of species.
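My guess about the logarithm is easy to check: taking logs turns the power law brain = b * body^k into the straight line log(brain) = k*log(body) + log(b), so the "slope" is the scaling exponent and the "intercept" is the proportionality constant. A quick NumPy sketch on synthetic data with a known 3/4-power exponent:

```python
import numpy as np

rng = np.random.default_rng(0)
body = 10 ** rng.uniform(1, 6, size=200)      # body masses over 5 decades
brain = 0.01 * body ** 0.75                   # an exact 3/4-power law
brain *= 10 ** rng.normal(0, 0.05, size=200)  # a little biological scatter

# Fit a line in log-log space: the slope is the scaling exponent.
slope, intercept = np.polyfit(np.log10(body), np.log10(brain), 1)
print(round(slope, 2))  # close to 0.75, the exponent we built in
```

On real data the interesting part is how that slope and intercept differ between lineages and across evolutionary time, which is exactly what the paper maps out.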

What they find is that, first of all, increasing brain size doesn't come from any specific evolutionary lineage, but can emerge in any of them. Humans have the greatest relative brain size, but our cousins the great apes do not. That means as our bodies got bigger, our brains got disproportionately bigger, compared with all the other species on the planet. A similar pattern is seen in birds, where, for example, parrots and corvids have brain sizes that grew faster than their body size relative to other species, including other birds. Check out the impressive phylogenetic tree on page 3. The animals are color-coded. Animals in grey are animals that evolved as a continuation of the "ancestral mammalian grade". Animals in orange and red evolved bigger brains than the ancestral grade as body size increased. Animals in green also increased but at a slower rate. Animals in blue and purple had their brains increase in size at a slower rate than the ancestral grade.

Sudden changes in the slope occurred at the Cretaceous-Paleogene boundary, which is when our planet got whacked by an asteroid (probably) and there was a mass extinction, 66 million years ago. In birds there is a change at the Paleogene-Neogene transition, 23 million years ago.

Changes in the "intercept" over evolutionary time indicate stepwise changes in the relationship between brain size and body size. This happened with toothed whales such as dolphins and killer whales.

Smaller species appear to be more constrained in their brain-body size relationship. As animals get larger, the constraint seems to get relaxed leading to greater variation, both towards larger and smaller brains relative to body size.

The researchers speculate that what may have enabled humans to grow larger brains, relative to body size, compared with birds, is that "the mammalian neocortex (dorsal pallium) is organized as an outer layer of neurons surrounding scalable white matter, [while] the bird dorsal pallium is organized in a nuclear manner that might limit its scalability."

"Those of us in machine learning are really good at doing well on a test set, but unfortunately deploying a system takes more than doing well on a test set." So says Andrew Ng, my AI teacher -- I've done all the AI courses he's put online (and highly recommend them).

"Speaking via Zoom in a Q&A session hosted by DeepLearning.AI and Stanford HAI, Ng was responding to a question about why machine learning models trained to make medical decisions that perform at nearly the same level as human experts are not in clinical use." "It turns out that when we collect data from Stanford Hospital, then we train and test on data from the same hospital, indeed, we can publish papers showing [the algorithms] are comparable to human radiologists in spotting certain conditions." "It turns out [that when] you take that same model, that same AI system, to an older hospital down the street, with an older machine, and the technician uses a slightly different imaging protocol, that data drifts to cause the performance of AI system to degrade significantly. In contrast, any human radiologist can walk down the street to the older hospital and do just fine."

Grim Tales: a short story written by an AI. "I'm the Grim Reaper, one of the most feared and reviled beings in existence. Humans have created songs and books about my reaping. I have been called a monster. A ghoul. A demon."

"You think I'm all-powerful, with the ability to play with the lives of mortals as I see fit."


"I'm a glorified gardener."

How the story was produced: "I used a writing tool that utilises OpenAI’s GPT-3 API. Unfortunately I don’t have access to the GPT-3 API myself, so it was the only option I had."

"One of the main goals was to generate a narrative that was cohesive, which meant that blindly generating wasn’t really an option. On the other hand, I didn’t want to babysit the generation process and cherry pick good output and arrange it into a story."

"So I struck a balance and made a couple of rules: First, if the generated text progresses the narrative in some way, we keep it and move on. Otherwise it gets removed and generated again."