Boulder Future Salon

Interview with Charles Hoskinson, founder of the Cardano cryptocurrency. Longer and more in depth than other interviews. Topics discussed include philosophy of engineering cryptocurrencies, the Haskell programming language and Plutus, the domain-specific language (based on Haskell) for smart contracts, proof of work vs proof of stake consensus algorithms, Cardano vs Ethereum vs Bitcoin vs Polkadot, his experience at the last Bitcoin Conference, Cardano's Extended UTXO model, Chainlink and oracle networks, decentralized exchanges, Cardano's "Hydra" scaling system vs Bitcoin's "Lightning Network", non-interactive proofs of proof-of-work, playing Devil's advocate and talking about ways Cardano could fail, Cardano's vision for decentralized governance, cryptocurrency and capitalism with long-term incentives, Cardano in Ethiopia, Bitcoin in El Salvador, various individuals including Alex Chepurnoy and the "Ergo" platform, designed for resilience, Stephen Wolfram and Wolfram Alpha, Vitalik Buterin of Ethereum, Jack Dorsey of Twitter, Elon Musk and Dogecoin, Satoshi Nakamoto, and Joe Rogan, and a few random topics like video games and (of course, because this is a Lex Fridman podcast) the meaning of life.

The House Financial Services Committee Task Force on Financial Technology held a hearing entitled "Digitizing the Dollar: Investigating the Infrastructure, Privacy, and Financial Inclusion Implications of Central Bank Digital Currency".

You could watch the whole thing, or, because that would be boring, you can just read my conclusions below (which could be different from yours, if you watched the whole thing). If you're considering watching it, you might want to know that the witnesses brought to testify were Carmelle Cadet, founder and CEO of EMTECH, Jonathan Dharmapalan, founder and CEO of eCurrency, Rohan Grey, assistant professor of law at Willamette University, Jenny Gesley, foreign law specialist at the Library of Congress, and Neha Narula, director of the Digital Currency Initiative at the MIT Media Lab.

I watched the whole thing (so you don't have to?) and my conclusions were: 1) The US Federal Reserve is *highly* likely to issue a digital currency, like the Chinese Communist Party did. 2) Right now basically nothing has been decided about how this currency will actually work. 3) Proper homage was paid to buzzwords like "security", "efficiency", "stability", "privacy", and (as you might guess from the title), "financial inclusion". The frequent mention of "stability" suggests the Federal Reserve Digital Currency will work like existing "stablecoins", such as the USD Tether coin (USDT). Basically the Federal Reserve Digital Currency is likely to work like an extension of regular dollars. Regarding "privacy", cryptocurrencies as a general rule don't have any privacy unless they are specifically designed to have privacy, and right now there are only two that I know of that fit that description: Zcash and Monero. I interpreted the frequent mentions of "privacy" as meaning that the Federal Reserve Digital Currency will have more privacy than the Chinese Digital Yuan (or E-Yuan, as it is also called), which gives the Chinese Communist Party full access to all transactions. For whatever that's worth. 4) There was no evidence of any intent to ban all cryptocurrencies except the central bank digital currency, like the Chinese Communist Party also did. I'll leave it as an exercise for the reader to decide whether that's a good or bad thing.

Found this video of a Romanian guy saying in Romania in the high inflation/hyperinflation period in the early 90s, the inflation didn't happen all at once. Instead, it started in basic commodities, and percolated from there into the rest of the economy. When the commodities went up, people initially thought it might be temporary and didn't react.

Well, I've been tracking the price of gold since May of last year, and started watching oil in February of this year. I started paying attention to gold because I heard people saying half of all dollars created had been created in 2020, in response to the pandemic. I thought all those newly created dollars would be immediately reflected in prices, but that didn't happen. It wasn't reflected in the price of gold, it wasn't reflected in the price of food at the grocery store. Nothing happened. So, I was like, ok, I'm wrong, I'm confused, I have no idea what's going on. But now I think, maybe it takes time -- a long time, like a year or several? -- for newly created currency to get around? A massive amount of new currency is created, and initially nothing happens and then over the course of a year or several, it shows up in commodities prices, and then works its way from there into other prices?

Recently there's been a split in the direction of the prices of oil vs gold, with oil going from the 50s to the 70s, while gold stayed the same. I thought, ok, maybe oil is going up because the pandemic is coming to an end and people are going on vacations and driving more or somesuch. But no inflation? Must not be, as the price of gold didn't move.

Today I thought of looking at other commodities. OMG. Copper went up 71.6% in the last year. Aluminum 55.74%. Zinc 52.07%. Iron 106.63% (over 100% means it more than doubled). Coal went up 144.94%. Corn went up 99.7% and soybean oil went up 135.71%.
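To keep the arithmetic straight: a rise of p percent multiplies the old price by 1 + p/100, which is why anything over 100% has more than doubled. A trivial illustration (my own, not from any article):

```python
# A percentage increase p turns an old price into old * (1 + p / 100),
# so a rise of more than 100% means the price more than doubled.
def price_multiplier(pct_increase):
    return 1 + pct_increase / 100

assert price_multiplier(106.63) > 2    # iron more than doubled
assert price_multiplier(144.94) > 2.4  # coal roughly 2.45x
```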

This leads me to think maybe the rumors that the price of gold is manipulated are true. The price of just about any commodity you can think of has shot up in the last year, *except* gold. Silver is up but has barely changed since last August, suggesting maybe it is manipulated, too. The only other commodity I could find that didn't shoot up in the last year is "feeder cattle" (up 14.08%). Feeder cattle are cattle bought for feedlots that will be later slaughtered for meat.

Of course, having evidence the price of gold is manipulated tells you nothing about who might be doing it, why they might be doing it, or how they might be doing it.

Corn and soybeans are the foundation of the North American food production system, so... somebody correct me if I'm wrong, but... the skyrocketing prices for corn and soybean oil means eventually prices for all foods in this country have to double. Right? Corn and soybeans are part of the feed of all the livestock and used in every packaged product. If you read the nutrition labels you'll find almost everything in the supermarket has corn or soybeans in it in some form. I get that corn and soybeans aren't literally part of all food, because you can buy apples and potatoes, and so on, but it's still part of a lot and not hyperbole to say corn and soybeans are the base of the food chain here.

Anybody more knowledgeable in economics want to chime in and explain to me what's going on?

Chip layout as a reinforcement learning problem. The chip layout is represented as a set of graphs, called a netlist, where the nodes indicate components that need to be placed on the chip and the edges represent the connections between them. They say the "state space" of placing 1,000 clusters of nodes on a grid with 1,000 cells is of the order of 1,000 factorial. This is about 10^2,500, vs 10^360 for the Chinese game of Go. Because of this, so far automated methods have not been able to beat human designers (who often have to wait 72 hours to find out if their manual placement is any good, and the resulting layout process can take several months of back-and-forth between the layout person and the chip's designers), but also because of this, perhaps it makes sense to use the same class of machine learning algorithms, reinforcement learning, that was used to beat humans at Go.
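As a quick sanity check on that state-space figure (my own check, not from the article), 1,000 factorial is too large to compute directly, but its base-10 logarithm is easy via the log-gamma function:

```python
import math

# 1000! itself is astronomically large, so compute its base-10 logarithm
# instead: log10(n!) = lgamma(n + 1) / ln(10).
log10_fact = math.lgamma(1001) / math.log(10)
assert 2500 < log10_fact < 2600  # 1000! is roughly 10^2568
```

So 1,000 factorial is roughly 10^2568, consistent with the "about 10^2,500" order of magnitude.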

The "objective function" that the reinforcement learning agent uses to play the game is based on trying to minimize wire length, congestion, and density. These all have negative signs, indicating the goal is to minimize them (increasing wire length, congestion, or density is a punishment to the agent). To speed the process up, the system is bootstrapped with transfer learning from a large assortment of existing human placements. By training on a vast assortment of layouts, the system can learn to get better and better at generating optimal placements.
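A sketch of what such a negative-cost objective might look like (the weights `lam` and `gamma` and the raw cost values are my own illustrative choices, not the paper's actual formulation, which is more involved):

```python
# Hypothetical sketch of a placement reward built from negative cost
# terms. The weights lam and gamma are made-up values for illustration.
def placement_reward(wirelength, congestion, density, lam=0.5, gamma=0.5):
    # Negative signs throughout: longer wires, more congestion, and
    # higher density all punish the agent.
    return -(wirelength + lam * congestion + gamma * density)

# A tighter placement earns a higher (less negative) reward.
assert placement_reward(10.0, 1.0, 1.0) > placement_reward(20.0, 3.0, 2.0)
```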

Google has already used the system to design the next generation of its tensor processing units (TPUs), which themselves are used for AI systems, and will probably be used for this one.

Google's chip-designing algorithm's floor plans "look quite different to those created by a human. Instead of neat rows of components laid out on the die, sub-systems look like they've almost been scattered across the silicon at random."

"A deep neural network model can accurately predict the brain age of healthy patients based on electroencephalogram data recorded during an overnight sleep study."

The neural network was accurate to within 4.6 years on average, and identified several things that increase brain age beyond chronological age: epilepsy and seizure disorders, stroke, elevated markers of sleep-disordered breathing, and low sleep efficiency. "The study also found that patients with diabetes, depression, severe excessive daytime sleepiness, hypertension, and/or memory and concentration problems showed, on average, an elevated Brain Age Index compared with the healthy population sample."

"PlayStation is collaborating with Sony's artificial intelligence department in order to create AI 'Agents' that can play games alongside human players."

"Sony AI, which we established last year, has begun a collaboration with PlayStation that will make game experiences even richer and more enjoyable. By leveraging reinforcement learning, we are developing Game AI Agents that can be a player's in-game opponent or collaboration partner."

The article references a patent that says, "This system is described as an artificial intelligence that is able to 'simulate human game play' based on a play style learned from a human user."

This is different from AI characters within games. This would be a simulated human player Sony provides for any game.

"The first fiber with digital capabilities, able to sense, store, analyze, and infer activity after being sewn into a shirt." "Until now, electronic fibers have been analog -- carrying a continuous electrical signal -- rather than digital, where discrete bits of information can be encoded and processed in 0s and 1s."

"The new fiber was created by placing hundreds of square silicon microscale digital chips into a preform that was then used to create a polymer fiber. By precisely controlling the polymer flow, the researchers were able to create a fiber with continuous electrical connection between the chips over a length of tens of meters."

"The fiber itself is thin and flexible and can be passed through a needle, sewn into fabrics, and washed at least 10 times without breaking down." "When you put it into a shirt, you can't feel it at all. You wouldn't know it was there."

Thumbnail claims to have a Python tutorial system that is "AI powered". I didn't see any details on the website about how the "AI" works.

Deep learning for data in structured tables. "Tabular data is unique in several ways that have prevented it from benefiting from the impressive success of deep learning in vision and language. First, tabular data often contain heterogeneous features that represent a mixture of continuous, categorical, and ordinal values, and these values can be independent or correlated. Second, there is no inherent positional information in tabular data, meaning that the order of columns is arbitrary. This differs from text, where tokens are always discrete, and ordering impacts semantic meaning. It also differs from images, where pixels are typically continuous, and nearby pixels are correlated. Tabular models must handle features from multiple discrete and continuous distributions, and they must discover correlations without relying on the positional information."

"We introduce SAINT, the Self-Attention and Intersample Attention Transformer." (Gotta love these acronyms.) "SAINT projects all features -- categorical and continuous -- into a combined dense vector space. These projected values are passed as tokens into a transformer encoder which uses attention in the following two ways. First, there is 'self-attention,' which attends to individual features within each data sample. Second, we propose a novel 'intersample attention,' which enhances the classification of a row (i.e., a data sample) by relating it to other rows in the table. Intersample attention is akin to a nearest-neighbor classification, where the distance metric is learned end-to-end rather than fixed. In addition to this hybrid attention mechanism, we also leverage self-supervised contrastive pre-training to boost performance for semi-supervised problems."

One important little detail is that when the system compares rows, it only compares rows in the same batch. That's all it can see at any given time. Usually neural networks don't make any assumptions about what order the training examples come in, and the batching is just to speed things up. You figure out how many training examples can fit in your GPU memory (or whatever hardware you are using to do the matrix computations) and that's what determines your batch size. So it's an arbitrary size unrelated to the meaning of your data. You're even supposed to be able to completely randomize your training examples and that might be a good idea. Continuing on...

"SAINT is composed of a stack of L identical stages. Each stage consists of one self-attention transformer block and one intersample attention transformer block. The self-attention transformer block has a multi-head self-attention layer (with h heads), followed by two fully-connected feed-forward layers with a Gaussian error linear unit non-linearity. Each layer has a skip connection and layer normalization. The intersample attention transformer block is similar to the self-attention transformer block, except that the self-attention layer is replaced by an intersample attention layer."

In actuality the way the intersample attention block works is it simply smashes the entire batch down into a single sample (using PyTorch's reshape command) and symmetrically unsmashes the result that comes back, but is otherwise exactly identical to the regular self-attention block.
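The reshape trick can be sketched in a few lines of PyTorch (the shapes, the single head, and the use of `nn.MultiheadAttention` are my own illustration, not the paper's actual code):

```python
import torch
import torch.nn as nn

# A batch of b rows, each with n feature tokens of dimension d. For
# intersample attention, flatten each row into one long token so the
# "sequence" dimension runs over the rows in the batch, attend, then
# reshape back. All sizes here are illustrative.
b, n, d = 8, 5, 16
x = torch.randn(b, n, d)

attn = nn.MultiheadAttention(embed_dim=n * d, num_heads=1, batch_first=True)

flat = x.reshape(1, b, n * d)     # "smash" the batch: rows become tokens
out, _ = attn(flat, flat, flat)   # each row attends to the other rows
out = out.reshape(b, n, d)        # "unsmash" back to the original shape
assert out.shape == x.shape
```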

If you're wondering what the point of all this is, they say that unlike methods where a value in a column is only compared with other values in the same column, this "intersample attention" system "allows all features from different samples to communicate with each other." They say, "In our experiments, we show that this ability boosts performance appreciably."

They say it not only beats random forests, but beats gradient boosting methods, including XGBoost, CatBoost, and LightGBM, which are currently the state-of-the-art in the industry for complex tabular datasets.

Anyone have under-13s in your household? Is this "Roblox" system the next great thing?

So this company, Roblox, apparently had an IPO in March, with this "Investor Day" video made shortly before. They're trying to make an "immersive" VR social networking "platform". It reminds me of Second Life, if you remember that from the olden days of the internet. Second Life didn't actually work very well, and I quit using it before long. It was later vastly surpassed by the online multiplayer video game communities like World of Warcraft and RuneScape. Minecraft also came out. But this company, Roblox, claims they already have millions of users and fast growth and fast revenue growth, and they say more than half of their users are under 13.

An AI system has found counterexamples for mathematical conjectures. Five of them. Conjectures in graph theory, to be precise.

What's interesting is it was done using a reinforcement learning system. The same class of machine learning system that was used to beat Atari games and beat human champions at the Chinese game of Go.

The reason cited for pursuing reinforcement learning algorithms is the desire for a system that can disprove conjectures without any "prior knowledge".

They specifically say they didn't use the Deep Q-Network that was used to beat Atari games, or its newer variants, because they needed something that could handle "sparser" rewards. Here they only give the learning agent feedback at the very end of each session. And the session lengths are proportional to the number of edges in the graph, which generally means they are roughly proportional to the square of the number of nodes. They felt that trying to invent artificial rewards during a session "would defeat our goal of refuting conjectures without prior knowledge about the problem."

Instead they embraced an alternative called the deep cross-entropy method. "With the cross-entropy method, the neural network learns only to predict which move is best in a given state, and does not explicitly learn a value function for the states or state-action pairs. Given any state as an input to the neural net, the output is a probability distribution on all the possible moves in that state, with higher probability assigned to the moves that the agent thinks are best."

Putting this in practice means, "Given a combinatorial problem, we can often easily translate it to a problem about generating a word of certain length from a finite alphabet. We first ask it to predict what the best first letter should be. The output is a probability distribution on the alphabet, from which we can sample an element randomly and feed it back into the network to ask what the best second letter is."

"The network receives feedback only when a session finishes, i.e. when we have created the entire word. The feedback is given by a reward function that the neural network has no knowledge about and treats as a black box. This reward function is unique to every problem."

From this they found counterexamples for a conjecture on the sum of matching number and largest eigenvalue of graphs, a conjecture about the proximity and distance eigenvalues of graphs (Aouchiche-Hansen conjecture), a conjecture on how far apart the peaks of distance and adjacency polynomials of trees can be (Collins conjecture), a conjecture related to transmission regularity and the distance Laplacian (Shor conjecture), and a conjecture relating to the permanent (matrix function) of 312-pattern avoiding 0-1 matrices (Brualdi-Cao conjecture).
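The sample-then-keep-the-elites loop described above can be sketched as a toy (a table of per-position logits stands in for the neural network, the alphabet is binary, and the black-box reward just counts 1s; all of these are my own illustrative choices, not the paper's):

```python
import math
import random

# Toy sketch of the deep cross-entropy method: sample whole words from
# the current policy, score them with a black-box reward only at the
# end, keep the top 10%, and nudge the policy toward those elites.
random.seed(0)
WORD_LEN = 10
logits = [[0.0, 0.0] for _ in range(WORD_LEN)]  # one row per position

def sample_word():
    """Sample one complete word, letter by letter, from the current policy."""
    word = []
    for pos in range(WORD_LEN):
        weights = [math.exp(l) for l in logits[pos]]  # softmax numerators
        word.append(random.choices([0, 1], weights=weights)[0])
    return word

def reward(word):
    """Black box to the learner; unique to every problem."""
    return sum(word)

for _ in range(30):
    sessions = [sample_word() for _ in range(100)]
    sessions.sort(key=reward, reverse=True)
    for word in sessions[:10]:            # keep the elite top 10%
        for pos, letter in enumerate(word):
            logits[pos][letter] += 0.1    # move the policy toward elites

# After training, the policy should strongly prefer 1s at most positions.
assert sum(1 for row in logits if row[1] > row[0]) >= 8
```

A real counterexample search replaces the logits table with a deep network and the count-of-ones reward with, say, a function of a graph's eigenvalues, but the loop is the same.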

"M4Depth: A motion-based approach for monocular depth estimation on video sequences." I assume the "M4" is short for "motion for". The idea is to use a single camera on a drone that is just a regular camera that can't do depth measurements, and use the motion of the drone itself to estimate the depth of everything in the video image.

The starting point for their system was a neural network called PWCNet. The "PWC" stands for the three key components of its architecture: feature Pyramids, Warping, and Cost volume. It's an "optical flow" network, and actually a pretty complicated one. Well, first we should probably say what "optical flow" is.

"Optical flow" is the pattern of apparent motion of image objects between two consecutive frames, caused by the movement of these objects and the camera. In neural networks, it is represented as a 2D vector field which represents how pixels are displaced between the frames. Now that we know that, we can say PWCNet is a pretty complex system. It runs the input image through two "feature pyramids", then a "warping layer", then a "cost volume" layer, then the "optical flow estimator" layer itself, and finally a "context" layer that refines the "flow" that is output as the final answer. Each of these "layers" is actually made up of multiple layers, with the "pyramids" being 12 layers each, the "optical flow estimator" being 8 layers, and the "context" layer really being 7 layers.

Anyway, what they did here was modify the PWC "optical flow" network to do "point triangulation" instead. To do this, they modified the PWC architecture so it had more of a U-Net architecture, with optical flow estimates at each layer within it, and this whole thing becomes a component that can be repeated and stacked into a bigger network. The "encoder" side of the U-net determines the optical flow and the "decoder" side refines it. The actual optical flow encoder is exactly PWCNet.

From here they change the decoder part of the network. At its input, each level of the decoder has multiple "reprojection" layers and a convolutional subnetwork whose purpose is to estimate depth from the data it receives. This "subnetwork" actually consists of seven convolutional layers. Finally there is a block to combine the depth estimate and the optical flow information to "spatially realign the data according to the depth estimate." The final output is the depth for each pixel.

Riddle me this: What do all these "acquisition corporations" do? I don't know but I just found out there's one in the music industry. Located in Hollywood. Raised $230 million earlier this year.

"The Music Acquisition Corporation is a blank check company whose business purpose is to effect a merger, capital stock exchange, asset acquisition, stock purchase, reorganization or similar business combination with one or more businesses. While the Company may pursue an initial business combination target in any industry or geographic region, the Company intends to focus its search for an initial business combination on businesses that are either directly or indirectly connected with the music sector, with particular emphasis on businesses where the Company's significant strategic and operational expertise and long-standing position within the music industry will be a value-additive proposition to potential target businesses."

A light stage is a room-scale (typically) spherical array of brightly-flashing colored lights and cameras and is designed to capture people under any illumination condition. They're used for movie special effects, video game models, Presidential portraits, and of course, data for training computer vision relighting algorithms.

The idea here is to achieve part of what light stages achieve with nothing more than a computer with a bright screen and a webcam in a dark room. You watch Netflix or YouTube and the computer screen sends patterns of light across your face. By analyzing these patterns, a neural network can be trained to do face relighting similar to a light stage.

Doing so involves some challenges, such as "passive" lighting (the system doesn't have the ability to force predetermined light patterns on your face), head motion, the limited field of view (frontal lighting and view only, and "near-field" only -- no distant lighting), and limited brightness (limited by the brightness of the monitor).

The neural network model they developed is based on U-net. The idea behind U-net is that on one side, the "down" side, you take the input image and downsample at each layer, and on the other side, the "up" side, you upsample at each layer, and at the end out pops your output. In addition there are "across" connections between layers with the same sizes. The "input" to the system consists of not just one, not just two, but three images: your face, the content of the screen you're looking at, and an image that represents the "target light" that the system is supposed to "relight" your face to. Your face is input at the beginning of the "U" while the other two are injected later, after the downsampling process, though I couldn't tell you why. The training process also involves identifying pairs of images with similar (though not identical) poses. The loss function used to train the system is complex as well, involving the use of a system called PatchGAN to act as a "discriminator" that distinguishes "real" from "fake" images in a GAN (generative adversarial network) training process.
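A minimal PyTorch sketch of that down/up/across shape (the channel counts and depth here are toy choices of mine, far smaller than the paper's model, and without the extra injected inputs):

```python
import torch
import torch.nn as nn

# Tiny U-Net sketch: a "down" side that downsamples, an "up" side that
# upsamples, and an "across" skip connection between same-sized layers.
class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.down1 = nn.Conv2d(3, 16, 3, stride=2, padding=1)   # downsample
        self.down2 = nn.Conv2d(16, 32, 3, stride=2, padding=1)  # downsample
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)      # upsample
        # the "across" skip connection doubles the channels arriving here
        self.up2 = nn.ConvTranspose2d(32, 3, 2, stride=2)

    def forward(self, x):
        d1 = torch.relu(self.down1(x))
        d2 = torch.relu(self.down2(d1))
        u1 = torch.relu(self.up1(d2))
        u1 = torch.cat([u1, d1], dim=1)  # "across" connection of the "U"
        return self.up2(u1)

net = TinyUNet()
out = net(torch.randn(1, 3, 64, 64))
assert out.shape == (1, 3, 64, 64)  # output pops out at input resolution
```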

"The grim reality of jobs in robotics and AI". Crap jobs include: "AI tagging" (annotating images to make training data), "translator" (humans who 'assist' natural language translation systems), "content moderator" (humans who filter abusive, violent, or illegal content, making AI training data in the process, but who suffer psychological trauma and often leave the job burned out after a year or two), and "warehouse robot" (human who works with real robots in a warehouse, risking physical injury, does work robots cannot yet do yet is expected to work like a robot and as fast as a robot, and whose work is controlled by algorithms).

"Amazon will extend a ban it enacted last year on the use of its facial recognition for law enforcement purposes."

"The web giant's Rekognition service is one of the most powerful facial recognition tools available. Last year, Amazon signed a one-year moratorium that banned its use by police departments."

"Amazon has now extended its ban indefinitely."