Boulder Future Salon Recent News Bits

Thumbnail The New York Times is using a recommendation algorithm called contextual multi-armed bandits. Sounds much simpler than what others (e.g. Facebook, YouTube) are using. "The algorithm we used is based on a simple linear model that relates contextual information -- like the country or state a reader is in -- to some measure of engagement with each article, like click-through rate. When making new recommendations, it chooses articles more frequently if they have high expected engagement for the given context."
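The quoted approach can be sketched as a toy contextual bandit: keep a running click-through-rate estimate per (context, article) pair, usually recommend the article with the highest estimate for the reader's context, and explore occasionally. This is an illustrative epsilon-greedy sketch, not the Times' actual code, which uses a linear model over context features rather than a lookup table.

```python
import random

class ContextualBandit:
    """Toy epsilon-greedy contextual bandit: one CTR estimate per (context, article)."""
    def __init__(self, articles, epsilon=0.1, seed=0):
        self.articles = articles
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.clicks = {}   # (context, article) -> clicks observed
        self.shows = {}    # (context, article) -> times recommended

    def ctr(self, context, article):
        shows = self.shows.get((context, article), 0)
        return self.clicks.get((context, article), 0) / shows if shows else 0.0

    def recommend(self, context):
        if self.rng.random() < self.epsilon:  # explore a random article
            return self.rng.choice(self.articles)
        # exploit: pick the article with the best observed CTR for this context
        return max(self.articles, key=lambda a: self.ctr(context, a))

    def update(self, context, article, clicked):
        self.shows[(context, article)] = self.shows.get((context, article), 0) + 1
        if clicked:
            self.clicks[(context, article)] = self.clicks.get((context, article), 0) + 1
```

After enough feedback, `recommend` favors whichever article has the highest observed engagement for the given context — the "high expected engagement for the given context" behavior the quote describes.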
Thumbnail Could a robot outdo the runners at the 2020 Tokyo Olympics? Contenders include MIT's Cheetah, Boston Dynamics' Petman, Handle, and WildCat, Michigan Robotics' MABEL, the University of Cape Town's Baleka, and the Institute for Human & Machine Cognition (IHMC)'s Planar Elliptical Runner (PER) and HexRunner.
Thumbnail Darknet: open source neural networks in C. I think I'll stick with Python and TensorFlow (and PyTorch once I've learned it) for the time being, but it's good to know this option is out there. It can use GPUs through CUDA and can run state-of-the-art neural networks like YOLO ("you only look once", a real-time object detection system).
Thumbnail Trump vs RoboTrump. Take the quiz.
Thumbnail "When AngularJS framework was released, it sky rocketed to being the most popular front-end development framework. But after a few years, React, a competing front-end framework open sourced by Facebook, quickly gained traction and is now the most popular one."

"A few years later, I began working on deep learning projects and began using the most popular framework at the time, Tensorflow 1.0, open sourced by Google. Google announced early 2019 the release of Tensorflow 2.0, which is a major, non-backward compatible rewrite of TF1.0. One of the most important changes is that TF2.0 default mode is now « eager execution », which basically means you write your neural networks as functions and not as graphs. This pattern (procedural rather than declarative) is widely considered to be more intuitive (at least for python people who are numerous in deep learning community) and closely follows PyTorch, a competing deep learning framework open sourced by Facebook, that is progressively overtaking Tensorflow as the most popular framework."

"Once again, the rewrite is not backward compatible, once again it aligns on the usage pattern of a competing framework, and once again this competing framework comes from Facebook."
Thumbnail Benchmarking transformers: PyTorch and TensorFlow. Transformers are a type of neural network architecture used for natural language processing tasks like text classification, information extraction, question answering, and text generation.

Basically, the benchmarks show that PyTorch on a GPU with TorchScript is about as fast as TensorFlow on a GPU with XLA. TorchScript was designed to take models created in PyTorch and remove their dependency on Python so they can be moved into production. XLA (Accelerated Linear Algebra), in contrast, is an actual compiler. TorchScript improved the performance of some models but not others, while XLA increased the performance of all models.
Thumbnail Kimera is a C++ library that lets robots map their environment and figure out where they are using only camera images and inertial data. (This is called SLAM, "simultaneous localization and mapping".) It works on robots that use the Robot Operating System (ROS). The input is camera and inertial data, and the output includes trajectory estimates, odometry information, and loop closures, which occur when the robot loops back to a location it has visited before and needs to update its beliefs about that location. It also outputs three meshes describing the robot's 3D environment: one updated very fast (under 20 milliseconds) so the robot can avoid obstacles; one updated slowly but mapping the environment most extensively (not just the robot's immediate vicinity), with semantic labels for everything (in other words, it differentiates between walls, floors, furniture such as tables, etc.), which is useful for long-term planning; and a third mesh in between the other two (generated semi-quickly, in about 1 second, covering the robot's immediate vicinity, with semantic labels). Kimera does this by combining a lot of different algorithms, too many to list here. The only place neural networks are used is for the semantic labeling, which lets the system run fast without GPUs. It also doesn't require depth (RGB-D) cameras.
Thumbnail "The researchers invented a 'visual deprojection' model that uses a neural network to 'learn' patterns that match low-dimensional projections to their original high-dimensional images and videos. Given new projections, the model uses what it's learned to recreate all the original data from a projection."

"In experiments, the model synthesized accurate video frames showing people walking, by extracting information from single, one-dimensional lines similar to those produced by corner cameras. The model also recovered video frames from single, motion-blurred projections of digits moving around a screen."
Thumbnail Subsystems that can be plugged together by robots to build large-scale structures. "The underlying vision is that just as the most complex of images can be reproduced by using an array of pixels on a screen, virtually any physical object can be recreated as an array of smaller three-dimensional pieces, or voxels, which can themselves be made up of simple struts and nodes. The team has shown that these simple components can be arranged to distribute loads efficiently; they are largely made up of open space so that the overall weight of the structure is minimized. The units can be picked up and placed in position next to one another by the simple assemblers, and then fastened together using latching systems built into each voxel."

"The robots themselves resemble a small arm, with two long segments that are hinged in the middle, and devices for clamping onto the voxel structures on each end. The simple devices move around like inchworms, advancing along a row of voxels by repeatedly opening and closing their V-shaped bodies to move from one to the next."

The hope is that the system can be used to produce large-scale structures, from airplanes to bridges to buildings.
Thumbnail Coanda effect hovercraft. Instead of blowing air underneath, it blows air over the outside. It works, but before someone builds a life-size version and offers you a ride on it, you might consider its tendency to flip over.
Thumbnail "According to a new study from Oracle and Future Workplace, a clear majority of Americans (64 percent) would trust a robot more than a human manager, and 32 percent think that a machine will eventually replace their boss."

"What can robots actually do better than living, breathing managers? Here's the full list: Provide unbiased information, maintain work schedules, problem-solve, manage a budget, answer confidential questions, and evaluate team performance. However, respondents didn't think machines were better than human managers at 'understanding feelings' and professional coaching."
Thumbnail "Ono Food Co. announced this week that the first mobile restaurant powered entirely by robotic technology, called Ono Blends, will open later this month in Venice, California. Not coincidentally, the company was founded by two people who know quite a bit about robotics and automation. CEO Stephen Klein came from robotic coffee bar Café X in San Francisco, and previously worked at Instacart. CTO Daniel Fukuba directed the engineering team at a firm that provided automation for Zume, SpaceX, Tesla, Apple and more."

"Every step of Ono Blends' assembly process is monitored by hundreds of sensors to ensure no spillage, cross-contamination or inconsistencies. He adds that Ono's technology creates 60 blends per hour, versus the industry standard of about 20, and uses about 28 times less water because of its cleaning system."
Thumbnail Detectron2 is a ground-up rewrite of Facebook's Detectron object detection system. "The platform is now implemented in PyTorch. With a new, more modular design, Detectron2 is flexible and extensible, and able to provide fast training on single or multiple GPU servers. Detectron2 includes high-quality implementations of state-of-the-art object detection algorithms, including DensePose, panoptic feature pyramid networks, and numerous variants of the pioneering Mask R-CNN model family also developed by Facebook AI Research."

"We built Detectron2 to meet the research needs of Facebook AI and to provide the foundation for object detection in production use cases at Facebook. We are now using Detectron2 to rapidly design and train the next-generation pose detection models that power Smart Camera, the AI camera system in Facebook's Portal video-calling devices. By relying on Detectron2 as the unified library for object detection across research and production use cases, we are able to rapidly move research ideas into production models that are deployed at scale."
Thumbnail "Excessive activity in the brain is linked to shorter life spans." "Bruce Yankner, professor of genetics at Harvard Medical School and co-director of the Paul F. Glenn Center for the Biology of Aging, and colleagues began their investigation by analyzing gene expression patterns -- the extent to which various genes are turned on and off -- in donated brain tissue from hundreds of people who died at ages ranging from 60 to over 100."

"Immediately, a striking difference appeared between the older and younger study participants, said Yankner: The longest-lived people, those over 85, had lower expression of genes related to neural excitation than those who died between the ages of 60 and 80."

"Repressor element 1-silencing transcription (REST), which is known to regulate genes, also suppresses neural excitation, the researchers found. Blocking REST or its equivalent in the animal models led to higher neural activity and earlier deaths, while boosting REST did the opposite."

"And human centenarians had significantly more REST in the nuclei of their brain cells than people who died in their 70s or 80s."
Thumbnail "This page tries to assemble all the research on Natural Language Processing (NLP) for native and indigenous languages of the American continent. Our languages are in danger, especially if they don't get involved in the new digital boom, that is introduced even into the most remote communities. Nevertheless, scientific and engineering work has been done in the field, much more work is necessary to archive usable tools that can compete with the products from the big companies (as Google Translate, Alexa, etc.). To push forward this effort, this work wants to generate an (as much as possible) complete list."

The links are organized into machine translation, automatic lexical extraction, morphological analysis and segmentation, corpora and digital resources, speech recognition, part-of-speech tagging, parsing, OCR, spell checking, WordNet, language ID, code-switching and multilingual NLP, tools, documentation, and education, and computational linguistic analyses and surveys.
Thumbnail TensorFlow vs PyTorch: PyTorch dominates research. TensorFlow dominates industry.
Thumbnail Drone Racing League's Racer AI was built for one purpose: to outmaneuver and outsmart the greatest human pilot in the world. And reign victorious. Machine over man.
Thumbnail OpenAI got a robotic hand to learn from scratch how to solve a Rubik's Cube single-handed. The way it works is they trained the hand in a computer simulation using a technique they call Automatic Domain Randomization, or ADR. The idea is that you generate random environments for the AI to learn in, but initially the randomness is restricted to a very small subset of the possible environments. Once the AI has mastered that small subset, the randomization is expanded to a wider range of possible environments. The process continues until the randomization covers the full range of possible environments and the AI has learned everything it is supposed to learn.
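The ADR curriculum can be sketched in a few lines: each environment parameter (say, cube mass or friction) gets a randomization range that starts as a single point and widens whenever the agent's measured performance clears a threshold. The names, bounds, and thresholds below are illustrative, not OpenAI's actual values.

```python
import random

class ADRParameter:
    """One randomized environment parameter whose range widens as the agent improves."""
    def __init__(self, nominal, step, lo_bound, hi_bound, seed=0):
        self.lo = self.hi = nominal  # start with no randomization at all
        self.step, self.lo_bound, self.hi_bound = step, lo_bound, hi_bound
        self.rng = random.Random(seed)

    def sample(self):
        """Draw a value for this parameter when generating a training environment."""
        return self.rng.uniform(self.lo, self.hi)

    def expand(self):
        """Widen the range (clamped to hard bounds) once the current range is mastered."""
        self.lo = max(self.lo - self.step, self.lo_bound)
        self.hi = min(self.hi + self.step, self.hi_bound)

def adr_loop(param, evaluate, threshold, rounds):
    """Each round: train/evaluate at a sampled value; expand if performance is good."""
    for _ in range(rounds):
        if evaluate(param.sample()) >= threshold:
            param.expand()
    return param.lo, param.hi
```

With a real trainer plugged in as `evaluate`, the range only grows as fast as the agent can keep up — the curriculum the item describes.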
Thumbnail I haven't posted this video from years ago because it's a YouTube "Premium" video, and I didn't know if you all could watch it without paying, but it's supposed to be free until the end of the year, so you should be able to watch it now. What the video depicts is the "Trolley Problem" done as a real-life experiment.
Thumbnail Relativity Space is "a startup that wants to combine 3D printing and artificial intelligence to do for the rocket what Henry Ford did for the automobile." "As we walk among the robots occupying Relativity's factory, Tim Ellis, the chief executive and cofounder of Relativity Space, points out the just-completed upper stage of the company's rocket, which will soon be shipped to Mississippi for its first tests." "It can make rockets anywhere. In an ideal cosmos, though, its neighbors will be even more alien than Snoop Dogg. Relativity wants to not just build rockets, but to build them on Mars."

"To make a rocket 3D-printable, Ellis's team had to totally rethink the way rockets are designed. As a result, Terran-1 will have 100 times fewer parts than a comparable rocket. Its Aeon engine, for instance, consists of just 100 parts, whereas a typical liquid-fueled rocket would have thousands."
Thumbnail Functional reinforcement learning. Functional in the "functional programming paradigm" sense. "The paradigm will be that developers write the numerics of their algorithm as independent, pure functions, and then use a library to compile them into policies that can be trained at scale. We share how these ideas were implemented in RLlib's policy builder API, eliminating thousands of lines of 'glue' code and bringing support for Keras and TensorFlow 2.0."

"One of the key ideas behind functional programming is that programs can be composed largely of pure functions, i.e., functions whose outputs are entirely determined by their inputs. Here less is more: by imposing restrictions on what functions can do, we gain the ability to more easily reason about and manipulate their execution."

"In contrast to a class-based API, in which class methods can access arbitrary parts of the class state, a functional API builds policies from loosely coupled pure functions."
Thumbnail RLlib is an open-source library for "scalable" reinforcement learning. "RLlib uses Ray actors to scale training from a single core to many thousands of cores in a cluster."
Thumbnail Facebook open-sourced a system for managing network congestion with reinforcement learning. "Existing RL environments for congestion control research are not compatible with real-world use cases because they use RL interfaces with agents that block the network sender. This is largely an artifact of building on top of frameworks designed for using RL in research with games, where resource constraints do not pose the same challenges that they do in large-scale production environments, and even a delay of just a few milliseconds negatively affects performance."

"Industry estimates show that more than 150 exabytes of data per month were sent over the internet in 2018 and this expected to nearly double by 2021. Effective network congestion control strategies are key to keeping the internet operational at this massive scale."
Thumbnail AI chips can have "hot spots" with temperatures 15-20 degrees Fahrenheit (8-11 degrees Celsius) higher than the rest of the chip. AI chips are highly repetitive, but usage is bursty, and the bursts can cause sudden floods of current into a particular part of the chip.
Thumbnail Yoshua Bengio, Geoff Hinton, and Yann LeCun think neural networks are a "universal solvent" for incorporating cognitive abilities into computers. Gary Marcus disagrees and thinks symbolic AI techniques from the pre-neural-network "good old fashioned AI" (GOFAI) era need to be combined with neural networks.

"I see no way to do robust natural language understanding in the absence of some sort of symbol manipulating system; the very idea of doing so seems to dismiss an entire field of cognitive science (linguistics). Yes, deep learning has made progress on translation, but on robust conversational interpretation, it has not."
Thumbnail Deep learning cheat sheets. One for convolutional neural networks, one for recurrent neural networks, and one for general deep learning tips and tricks.
Thumbnail "Advanced Symbolics's patented artificial intelligence, named Polly, collects millions of social-media messages, which are then fed through a proprietary algorithm that monitors how events happening in real time are being talked about. The algorithm then compares its findings to patterns Polly has uncovered in the past. In a sense, Polly resembles those computers that collect all the master chess games of history and, on the basis of that aggregated knowledge, anticipate player behaviour to win. In this case, Polly spits out numbers that express the momentum, or lack thereof, behind various political campaigns, based on an analysis of the internet chatter those campaigns generate."

"In terms of the dynamics of the Brexit campaign, the turning point for Polly came after Labour MP Jo Cox -- a politician the Guardian described as a 'passionate defender' of Remain -- was assassinated by a far-right extremist a week before the vote was to take place. 'As soon as that tragedy happened, Polly flipped,' says Kenton White, the company's cofounder and chief scientist. 'She changed her mind.'"
Thumbnail NFL Big Data Bowl. Your job is to predict how many yards an NFL player will gain after receiving a handoff. To work with, you get such data as the X and Y position of the player on the field, the player's speed, acceleration, and orientation, the player's height and weight, the yard line, the yards needed for a first down, the down number, the score at the start of the play, the formation and personnel of the offense and defense, the stadium, the weather that day, and so on.
Thumbnail How to pronounce names from Norse mythology. Óðin, Frigg, Loki, Þór, Mjölnir, Heimdallr, Nidavellir, Ragnarök.
Thumbnail "Multilingual machine translation processes multiple languages using a single translation model. The success of multilingual training for data-scarce languages has been demonstrated for automatic speech recognition and text-to-speech systems, and by prior research on multilingual translation. We previously studied the effect of scaling up the number of languages that can be learned in a single neural network, while controlling the amount of training data per language. But what happens once all constraints are removed? Can we train a single model using all of the available data, despite the huge differences across languages in data size, scripts, complexity and domains?"

"We push the limits of research on multilingual neural machine translation by training a single neural machine translation model on 25+ billion sentence pairs, from 100+ languages to and from English, with 50+ billion parameters."

"Once trained using all of the available data (25+ billion examples from 103 languages), we observe strong positive transfer towards low-resource languages, dramatically improving the translation quality of 30+ languages at the tail of the distribution by an average of 5 bilingual evaluation understudy (BLEU) points."
Thumbnail "OpenCV is today's most popular image processing library, covering everything from classic image processing algorithms to cutting-edge deep learning pretrained models. However because OpenCV is not differentiable it mainly focuses on pre-processing tasks and cannot be embedded in an entire training process. That shortcoming motivated OpenCV.org research scientist Edgar Riba to propose a new differentiable computer vision library, 'Kornia,' which has now been open-sourced on GitHub."

"Inspired by OpenCV, Kornia is based on PyTorch and designed to solve generic computer vision problems. It contains a set of routines for performing color space conversions, and differentiable modules for performing tasks such as image filtering and edge detection. Kornia's core code can efficiently define and compute the gradient of complex functions with reverse-mode auto-differentiation."

"Kornia consists of subset packages containing operators which can be inserted into neural networks to enable models to perform tasks such as image transformations, epipolar geometry, and depth estimation."
Thumbnail The 5 algorithms for efficient deep learning inference on small devices. Pruning neural networks (removing unimportant weights), deep compression (pruning plus trained quantization, where the number of bits per weight is reduced, and Huffman coding), data quantization (requires FPGAs), low-rank approximation (a non-linear layer can be replaced by 2 layers with fewer total weights), and trained ternary quantization (tries to reduce all weights to -1, 0, or +1).
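Two of the five techniques are easy to show in miniature: magnitude pruning zeroes out the smallest-magnitude weights, and ternary quantization snaps weights to -1, 0, or +1. This is a toy plain-Python sketch; real trained ternary quantization also learns per-sign scaling factors during training, which are fixed at 1 here.

```python
def prune(weights, keep_fraction):
    """Magnitude pruning: zero out all but the largest-magnitude weights."""
    k = max(1, int(len(weights) * keep_fraction))
    cutoff = sorted(abs(w) for w in weights)[-k]  # magnitude of the k-th largest weight
    return [w if abs(w) >= cutoff else 0.0 for w in weights]

def ternarize(weights, threshold):
    """Ternary quantization: map each weight to -1, 0, or +1."""
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]
```

Both transforms shrink storage and compute: pruned weights can be skipped entirely, and ternary weights need only two bits each (and turn multiplications into sign flips).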
Thumbnail "Podar and co-workers targeted Saccharibacteria, mainly mouth-dwelling bacteria of which nearly a dozen species live in humans. But because they make up such a tiny portion of the mouth's microbiome -- less than 1% -- they are difficult to isolate and grow."

"Podar's team used an antibody-based strategy to obtain isolates of Saccharibacteria that had been sequenced -- but not cultured -- from a gamish of saliva and other mouth fluids. They first searched for genes in different Saccharibacteria DNA that likely coded for proteins capable of jutting through the surfaces of cells. The immune system produces antibodies in response to the portions of cell surface proteins it can 'see.'"

"A comparison with other bacteria allowed the researchers to identify specific regions of the surface proteins that would likely trigger the strongest antibody responses. After injecting these protein fragments into rabbits, they purified and fluorescently labeled the antibodies the animals made. Mixing the labeled antibodies with fluids from people's mouths allowed the researchers to pluck out the relatively rare Saccharibacteria from the mix of cells in the samples."
Thumbnail How to learn OpenCV and deep learning for computer vision.
Thumbnail Robopsychology. "Sending a mouse through a maze can tell you a lot about how its little brain learns. But what if you could change the size and structure of its brain at will to study what makes different behaviors possible? That's what Elan Barenholtz and William Hahn are proposing. The cognitive psychologist and computer scientist, both at Florida Atlantic University in Boca Raton, are running versions of classic psychology experiments on robots equipped with artificial intelligence."

"What is the minimum complexity you need to put in one of these agents so that it acts like a squirrel or it acts like a cat?"

"What's the nature of reward that really best simulates the way it works in organisms?"

"We give [the robot] the current frame, and we say, 'What do you think the world would look like a second from now if you were to take a right?' To be able to do that, it has to know where it is. It has to build, in its own mind, a map of, 'I am here, and then there's another world over there, and if I turn, I'll now be at that world.'"
Thumbnail The farther you live from the equator, and the rainier the climate you live in, the more likely you are to like the color yellow.
Thumbnail DeepFly3D. Like pose-estimation systems for humans, but for flies.
Thumbnail "We developed a new spectrometer technology that allows us to shrink big components onto a small silicon chip and still maintain high performance. We developed an algorithm that allows us to extract the information with much better signal-to-noise ratio. We have validated the algorithm for many different kinds of spectrum. The algorithm identifies separate colors of light by comparing two repeated measurements to mitigate the impact of measurement noises. The algorithm improves resolution by 100 percent compared to the textbook limits, called the Rayleigh limits."

"We are collaborating with a group at UMass to develop a deep learning algorithm for designing 'metasurfaces,' which are a kind of optical device where instead of using conventional geometric curvature to construct, say, a lens, you use an array of specially designed optical antennas to impart phase delay on the incoming light, and therefore we can achieve all kind of functionalities."
Thumbnail "When the molecule inserts itself throughout the entire cell membrane, the resulting images are blurry because the axons and dendrites that extend from neurons also fluoresce. To overcome that, the researchers attached a small peptide that guides the probe specifically to membranes of the cell bodies of neurons. They called this modified protein SomArchon."

"With SomArchon, you can see each cell as a distinct sphere. Rather than having one cell's light blurring all its neighbors, each cell can speak by itself loudly and clearly, uncontaminated by its neighbors."

"The researchers used this probe to image activity in a part of the brain called the striatum, which is involved in planning movement, as mice ran on a ball. They were able to monitor activity in several neurons simultaneously and correlate each one's activity with the mice's movement."

"Over the years, my lab has tried many different versions of voltage sensors, and none of them have worked in living mammalian brains until this one."

"Using this fluorescent probe, the researchers were able to obtain measurements similar to those recorded by an electrical probe, which can pick up activity on a very rapid timescale."

"With the new voltage sensor, it is also possible to measure very small fluctuations in activity that occur even when a neuron is not firing a spike."
Thumbnail "We need to stop pretending that Silicon Valley can compete with China on its own." "If the race for powerful AI is indeed a race among civilizations for control of the future, the United States and European nations should be spending at least 50 times the amount they do on public funding of basic AI research."

"The history of computing research is a story not just of big corporate laboratories but also of collaboration and competition among civilian government, the military, academia and private players both big (IBM, AT&T) and small (Apple, Sun)."

"When it comes to research and development, each of these actors has advantages and limitations. Compared with government-funded research, corporate research, at its best, can offer a stimulating balance of theory and practice, yielding inventions like the transistor and the Unix operating system. But big companies can also be secretive, occasionally paranoid and sometimes just wrong, as with AT&T's dismissal of internet technologies."

"Big companies can also change their priorities. Cisco, once an industry leader, has spent more than $129 billion in stock buybacks over the past 17 years, while its chief Chinese competitor, Huawei, developed the world's leading 5G products."
Thumbnail "The Stitch Fix team uses something called eigenvector decomposition, a concept from quantum mechanics, to tease apart the overlapping 'notes' in an individual's style. Each person's individual style contains many data points -- few people are simply 'preppy' or 'boho' -- and using physics, the team can better understand the complexities of the clients' style minds."

"Chris Moody belongs to a growing group of astrophysicist deserters, who have stopped researching the cosmos to start building recommendation algorithms and data models for the tech industry. They make up the data science teams at companies like Netflix and Spotify and Google. And even at elite universities, fewer astrophysics PhDs go on to take postdoctoral fellowships or pursue competitive professorships. Now, more of them go straight to work in Silicon Valley."

"To understand what's driving astrophysicists into consumer product startups, consider the recent explosion of machine learning. Astrophysicists, who wrangle massive amounts of data collected from high-powered telescopes that survey the sky..."
Thumbnail "One of the most notable innovators in the retail space, Realeyes, works with big-name brands such as Coca-Cola, Expedia, Mars, AT&T, and LG, who deploy the technology to help them measure, optimize, and compare the effectiveness of their content.

"The Realeyes software measures viewers' emotions and attention levels using webcams. It can show a brand's content to panels of consenting consumers all around the world and measure how audiences respond to a campaign by monitoring their attention levels and logging moments of maximum engagement. Marketers are provided with an overall score based on attention and emotional engagement, which enables them to compare multiple assets or benchmark them against previous campaigns."
Thumbnail "Tesla has acquired DeepScale, a Silicon Valley startup that uses low-wattage processors to power more accurate computer vision."

"DeepScale has developed a way to use efficient deep neural networks on small, low-cost, automotive-grade sensors and processors to improve the accuracy of perception systems."
Thumbnail "One market that has already reached its automation tipping point is the enterprise building security market. Traditionally, office security has been conducted by humans. The Guardian estimates that there are 20 million private security workers worldwide. One company, Cobalt Robotics, hopes to change that. With its platform, companies can replace a guard, or guards with a 30%+ cheaper robotic security robot. [Update: Cobalt has reached out to say their platform is actually 65% cheaper than traditional security guards.] For example, instead of manning a building with three to four people, you can have one human managing a few remote robots. Moreover, all the data and insights collected via these robots is organized and made available for building and security optimization."
Thumbnail Placenta flattening algorithms. Eh, maybe not the best wording. The algorithms flatten images of the placenta, taken from MRI machines, not actual placentas.
Thumbnail Robotic pizza system makes 300 pizzas/hour. "Machines have been making frozen pizzas for years, but Picnic's robot differs in a few respects. It's small enough to fit in most restaurant kitchens, the recipes can be easily tweaked to suit the whims of the restaurants, and -- most importantly -- the ingredients are fresh."

"There are also a few details that may save Picnic's pizzas from tasting as if a robot made them. For starters, the dough preparation, sauce making and baking -- the real art of pizza -- is left in the capable, five-fingered hands of people."
Thumbnail "When a fight broke out recently in the parking lot of Salt Lake Park, a few miles south of downtown Los Angeles, Cogo Guebara did what seemed the most practical thing at the time: she ran over to the park's police robot to push its emergency alert button."

"'I was pushing the button but it said, 'step out of the way,'' Guebara said. 'It just kept ringing and ringing, and I kept pushing and pushing.'"

"She thought maybe the robot, which stands about 5 feet tall and has 'POLICE' emblazoned on its egg-shaped body, wanted a visual of her face, so she crouched down for the camera. It still didn't work."

"Without a response, Rudy Espericuta, who was with Guebara and her children at the time, dialed 911. About 15 minutes later, after the fight had ended, a woman was rolled out on a stretcher and into an ambulance, her head bleeding from a cut suffered during the altercation."

"Amid the scene, the robot continued to glide along its pre-programmed route, humming an intergalactic tune that could have been ripped from any low-budget sci-fi film."
Thumbnail YouTube was different back in the Olden Days. Random people posted random videos.
Thumbnail "Increased adoption of robotics and automation equipment has been a substantial driver of the declining labor share of income for U.S. workers, even during a period of extremely low unemployment. In the last global recession, industrial robot shipments fell significantly but did not stop; the labor-force growth rate for manufacturing, needless to say, went negative."

"Last year, Canada's average manufacturing wage (in US dollars) was $19.31 per hour. In Mexico last month, it was $2.60. Robot shipments in Mexico are exceeding those in Canada despite manufacturing wages, not because of them."

"Shipments to automotive original equipment manufacturers, such as Ford Motor Co. and General Motors Co., have plummeted since 2016. At the same time, robot shipments to the food and consumer sectors, plus the life sciences, pharmaceutical and biomedical sectors, have risen to nearly the same level as auto OEM shipments."
Thumbnail Drone-killing drone. "First he grabbed the controls for an Up Air One, a remote control hobbyist model that retails for about $300, and steered it until it was hovering about 100 feet above the ground. Next he used a laptop to activate a system he'd spent the past several months building."

"A second drone roughly the size of the Up Air quadcopter spun into action, buzzing like a mechanical wasp as it ascended to about 20 feet below its target. As it hovered, a crowd of Jason Levin's colleagues gathered around. A prompt appeared on-screen asking for permission to attack. Levin tapped a button, and the second drone, dubbed the Interceptor, shot upward, striking the Up Air One at 100 mph. The two aircraft somersaulted skyward briefly, then they plummeted back to earth and landed with two satisfying thuds."
Thumbnail "Unilever, the consumer goods giant, is among companies using AI technology to analyse the language, tone and facial expressions of candidates when they are asked a set of identical job questions which they film on their mobile phone or laptop. The algorithms select the best applicants by assessing their performances in the videos against about 25,000 pieces of facial and linguistic information compiled from previous interviews of those who have gone on to prove to be good at the job."
Thumbnail "The unexpected difficulty of comparing AlphaStar to humans." "Humans interact with the game by looking at a screen, listening through headphones or speakers, and giving commands through a mouse and keyboard. AlphaStar is given a list of units or buildings and their attributes, which includes things like their location, how much damage they've taken, and which actions they're able to take, and gives commands directly, using coordinates and unit identifiers. For most of the matches, it had access to information about anything that wouldn't normally be hidden from a human player, without needing to control a 'camera' that focuses on only one part of the map at a time. For the final match, it had a camera restriction similar to humans, though it still was not given screen pixels as input. Because it gives commands directly through the game, it does not need to use a mouse accurately or worry about tapping the wrong key by accident."

"Starcraft is a game that rewards the ability to micromanage many things at once and give many commands in a short period of time. Players must simultaneously build their bases, manage resource collection, scout the map, research better technology, build individual units to create an army, and fight battles against other players. The combat is sufficiently fine grained that a player who is outnumbered or outgunned can often come out ahead by exerting better control over the units that make up their military forces, both on a group level and an individual level. For years, there have been simple Starcraft II bots that, although they cannot win a match against a highly-skilled human player, can do amazing things that humans can't do, by controlling dozens of units individually during combat. In practice, human players are limited by how many actions they can take in a given amount of time, usually measured in actions per minute (APM). Although DeepMind imposed restrictions on how quickly AlphaStar could react to the game and how many actions it could take in a given amount of time, many people believe that the agent was sometimes able to act with superhuman speed and precision."
Thumbnail Why Tesla's Model 3 received a 5-star crash test rating. Passive safety (crumple zones, air bags, etc.) and active safety (automatic braking, automatic lane keeping). Other electric cars from Mercedes, Lexus, and Audi, and a hydrogen car from Hyundai, also received the same rating.
Thumbnail Drone bubble bursts. "French manufacturer Parrot SA announced in July that it would halt production of most of its drone lines. Software startup Airware Inc. raised $118 million from investors before shutting its doors and laying off 140 employees in late 2018. GoPro Inc. exited the drone business and laid off hundreds last year, citing an 'extremely competitive' market."

"At least 67 drone startups have been sold since their inception, according to Crunchbase, which collects data on private companies. Buyers range from rival drone operators to companies in other industries, such as Verizon Communications Inc."
Thumbnail Some napkin arithmetic on autonomous cars. Human drivers get into an accident every 165,000 miles. Waymo's last disengagement report says they have a disengagement, where the safety driver disengages the automatic system to avoid a crash, every 11,017 miles. That means that without the safety driver, Waymo would crash almost 15 times as often as humans (assuming every disengagement would otherwise have been a crash).
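The ratio is easy to check, granting the article's implicit assumption that every disengagement would otherwise have been a crash:

```python
# Napkin arithmetic: human accident rate vs. Waymo's disengagement rate.
human_miles_per_accident = 165_000
waymo_miles_per_disengagement = 11_017

# Treating every disengagement as an avoided crash:
ratio = human_miles_per_accident / waymo_miles_per_disengagement
print(round(ratio, 1))  # ~15.0
```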
Thumbnail "Researchers have developed a tiny nanolaser that can function inside of living tissues without harming them. Just 50 to 150 nanometers thick, the laser is about 1/1,000th the thickness of a single human hair. At this size, the laser can fit and function inside living tissues, with the potential to sense disease biomarkers or perhaps treat deep-brain neurological disorders, such as epilepsy."

"Not only is it made mostly of glass, which is intrinsically biocompatible, the laser can also be excited with longer wavelengths of light and emit at shorter wavelengths."

"Longer wavelengths of light are needed for bioimaging because they can penetrate farther into tissues than visible wavelength photons, but shorter wavelengths of light are often desirable at those same deep areas. We have designed an optically clean system that can effectively deliver visible laser light at penetration depths accessible to longer wavelengths."

"The nanolaser also can operate in extremely confined spaces, including quantum circuits and microprocessors for ultra-fast and low-power electronics."

The laser is made of exotic elements like ytterbium and erbium on nanoarrays made of silver, so presumably they're counting on it never escaping the glass, though minute amounts of ytterbium and erbium probably wouldn't be toxic anyway.
Thumbnail Károly Zsolnai-Fehér worries that GPT-2 will be able to take his job as host of Two Minute Papers.
Thumbnail The 7 capabilities every AI should have. Or at least reinforcement learning algorithms. Generalization (how the agent does in previously unseen environments), scale (how well the agent does at larger problems), basic learning, noise (how it handles random noise on its inputs), memory (of past observations), exploration (of "deep" environments), and, perhaps most importantly, credit assignment (how good the agent is at figuring out which of many strategic decisions in a long game led to a win/loss).
Thumbnail Videos from Andreas Mueller's Applied Machine Learning class at Columbia University are available online for free. The emphasis of the course is practical tools and techniques: rather than deeply learning the theory behind all the models (though theory is covered), the focus is on being able to implement effective models using widely available tools. Once the course has covered models such as regression, linear models for classification, trees, forests, ensembles, gradient boosting, and neural networks, it goes on to teach techniques like learning with imbalanced data, feature selection, hyperparameter tuning, dimensionality reduction, and outlier detection.
Thumbnail Transformers is a library of state-of-the-art natural language processing for TensorFlow 2.0 and PyTorch. Has 32 pretrained models in 100+ languages, including BERT (from Google), GPT (from OpenAI), GPT-2 (from OpenAI), Transformer-XL (from Google/CMU), XLNet (from Google/CMU), XLM (from Facebook), RoBERTa (from Facebook), and DistilBERT (from HuggingFace, who apparently also made this library).
Thumbnail Octopus changing color while it's dreaming.
Thumbnail The Dark Energy Spectroscopic Instrument, aka DESI, will use 5,000 fiber-optic sensors with robots to reposition them, which will feed into a room full of spectrographs to create the most accurate 3D map of the universe. This will enable spectrographs of 5,000 galaxies every 15 minutes. Scientists will look at the spectrograph results to deduce the expansion history of the universe. The key is baryon acoustic oscillations. The idea is that in the primordial plasma (made of baryons -- aka normal matter like us in the standard model of quantum physics) of the early universe, sound waves aka acoustic density waves could only travel a certain maximum distance before the plasma cooled to the point where it switched from plasma, which is electrically charged, to electrically neutral atoms, thus "freezing" the waves in place. As the universe went from there to the large-scale structure we see today, this freezing has resulted in the tendency for galaxies to be separated by the distance the waves traveled, and by accurately pinpointing the positions of galaxies we can trace their origins back to the acoustic waves of the early universe. This in turn can be used to understand when the acceleration of the expansion rate of the universe began.
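The quoted throughput works out as follows (trivial arithmetic, shown only to put the number in context; the 8-hour observing night is my assumption, not a figure from the article):

```python
# Throughput implied by the article's numbers for DESI.
spectra_per_exposure = 5_000
minutes_per_exposure = 15

spectra_per_hour = spectra_per_exposure * (60 // minutes_per_exposure)
print(spectra_per_hour)  # 20000 galaxy spectra per hour

# Assuming a hypothetical 8-hour observing night:
print(spectra_per_hour * 8)  # 160000 per night
```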
Thumbnail TensorFlow 2.0 has finally been officially released. "TensorFlow 2.0 focuses on simplicity and ease of use, featuring updates like: Easy model building with Keras and eager execution. Robust model deployment in production on any platform. Powerful experimentation for research. API simplification by reducing duplication and removing deprecated endpoints."
Thumbnail "Researchers found that the neuron coverage behaviors between real and fake faces in deep face recognition systems can provide a critical clue for differentiating fake from real. Using this neuron coverage technique, researchers captured fine-grain facial features with deep facial recognition systems such as VGG-Face, OpenFace, and FaceNet. Because neurons can learn meaningful representations of inputs in image processing, the researchers focused on the behaviour of activated neurons as determined by a neuron coverage criteria they call 'MNC.'"

"Experiment results show FakeSpotter reaching fake face detection accuracy of 78.23 percent, 80.54 percent, and 84.78 percent on VGG-Face, OpenFace, and FaceNet respectively, better performance than traditional deep CNNs."
Thumbnail "Now you can see the Mona Lisa nodding and laughing."

"In 2018, Christie's, a British auction house, sold a GAN generated painting, 'A portrait of Edmond Belamy', for $432,500, along with the following artist's signature:" "The signature belongs to our AI creative painter, GAN. It's the GAN's objective function." "Edmond Belamy, to whom this portrait belongs, is a part of the Belamy family -- all created with the GAN model."

"Kenny Jones and Derrick Bonafilia developed a fascinating project based on GANs -- GANGogh, which include huge dataset of artistic works with different styles. The network then learned how to create paintings mixing those styles."

"Of all art generation algorithms, I find AICAN the most interesting. AICAN is an AI application based on creative adversarial networks developed by professor Ahmed Elgammal, the director of Rutgers university's Art and Artificial Intelligence Lab. These paintings are revolutionary! The fancy artistic style, the dream-like mood, the swaying lines and shapes, and the harmonic mixture of colors make them indistinguishable from contemporary human-created art."
Thumbnail "Goodhart's Law states that 'When a measure becomes a target, it ceases to be a good measure.' At their heart, what most current AI approaches do is to optimize metrics. The practice of optimizing metrics is not new nor unique to AI, yet AI can be particularly efficient (even too efficient!) at doing so."

"Metrics are typically just a proxy for what we really care about. The paper Does Machine Learning Automate Moral Hazard and Error? covers an interesting example: the researchers investigate which factors in someone's electronic medical record are most predictive of a future stroke. However, the researchers found that several of the most predictive factors (such as accidental injury, a benign breast lump, or colonoscopy) don't make sense as risk factors for stroke. So, just what is going on? It turned out that the model was just identifying people who utilize health care a lot. They didn't actually have data of who had a stroke (a physiological event in which regions of the brain are denied new oxygen); they had data about who had access to medical care, chose to go to a doctor, were given the needed tests, and had this billing code added to their chart."

"You want to know what content users like, so you measure what they click on. You want to know which teachers are most effective, so you measure their students test scores. You want to know about crime, so you measure arrests. These things are not the same."
Thumbnail Paging Dr. Robot. "At the age of sixty-six, Pier Giu­lianotti, professor of surgery at the University of Illinois College of Medicine, has now performed roughly three thousand procedures with the aid of a robot, and has helped train nearly two thousand surgeons in the art. Farid Gharagozloo, a professor at the University of Central Florida and a surgeon at the Global Robotics Institute, said of Giu­lianotti, 'He single-handedly started the area of general surgery in robotics, and I don't think that's an overstatement. No matter what the field, there's a certain panache and sort of genetic makeup that makes people the leaders -- makes them do things that no one else wants -- and Pier has that.' Gharagozloo said that, when he watched videos of Giu­lianotti's surgeries, he was left 'in awe.' Giu­lianotti was the first surgeon to perform more than a dozen robotic procedures, ranging from kidney transplants to lung resections. In the operating room, he relies on one robot: a multi-armed, one-and-a half-­million-dollar device named the da Vinci."

"Despite the enthusiasm of such practitioners as Giu­lianotti, many members of the American surgical establishment remain skeptical of robotic surgery."
Thumbnail Deep learning for manufacturing. "Deep learning architectures like convolutional neural nets are particularly poised to take over from human operators to spot and detect visual clues indicative of quality problems in manufactured goods and parts in a large assembly process."

"To detect anomaly or departure from the norm, often dimensionality reduction techniques like PCA (Principal Component Analysis) is used from traditional statistical signal processing domain. However, one can use static or variational Autoencoders, which are deep neural networks with layers consisting of progressively decreasing and increasing convolutional filters (and pooling). These type of encoder networks look past the noise and usual variance and encode the essential features of a signal or datastream in a small number of high-dimensional bits. It is much easier to track highly encoded bits if they are changing unexpectedly when one is looking for anomalies in a continuously running, high-volume process."

"Deep learning models have already proven to be highly effective in the domain of economics and financial modeling, dealing with time-series data. Similarly, in predictive maintenance, the data is collected over time to monitor the health of an asset with the goal of finding patterns to predict failures. Consequently, deep learning can be of significant aid for predictive maintenance of complex machinery and connected systems."
Thumbnail Chest x-ray dataset, consisting of 377,110 chest radiographs in DICOM format (the standard file format for medical imaging) with free-form text radiology reports, with personally identifiable information removed. They come from the Beth Israel Deaconess Medical Center in Boston, MA.

"Images sometimes contain 'burned in' annotations: areas where pixel values have been modified after image acquisition in order to display text. Annotations contain relevant information including: image orientation, anatomical position of the subject, timestamp of image capture, and so on. The resulting image, with textual annotations encoded within the pixel themselves, is then transferred from the modality to PACS [the hospital picture archiving and communication system]. Since the annotations are applied at the modality level, it is impossible to recover the original image without annotations."

"Due to the burned in annotations, image pixel values required de-identification. A custom algorithm was developed which removed dates and patient identifiers, but retained radiologically relevant information such as orientation. The algorithm applied an ensemble of image preprocessing and optical character recognition approaches to detect text within an image."

"Radiology reports were de-identified using a rule-based approach based upon prior work combined with a newly developed neural network approach."

They note that "Only 11 radiologists served the 12 million people of Rwanda, while the entire country of Liberia, with a population of four million, had only two practicing radiologists," so if you can develop an automated way of analyzing this data, you can save lives.
Thumbnail Before-and-after natural disaster satellite imagery dataset. Consists of building polygons, which are granular representations of a building footprint, regression numbers which rate how damaged a building is, labels for all the environmental factors that caused the damage seen in the imagery, and environmental factor bounding boxes and labels, which are a rough approximation of the area covered by smoke, water, fire, and other environmental factors.

The dataset was produced in collaboration with imagery analysts from the California Air National Guard, by studying the process human analysts use to help first responders. When a disaster occurs, analysts receive aerial and satellite imagery of the impacted regions from state, federal, and commercial sources, make an initial overall assessment of what sub-regions look the most damaged, then identify and count the number of structures damaged.

The disaster types are "earthquake/tsunami", "dam collapse", "flood", "landslide", "volcanic eruption", "wildfire", and "wind". Additional environmental factors are "smoke", "landslide", "water", "fire", "wind", and "other".

The damage scores are 0: no damage, 1: minor damage, 2: major damage, and 3: destroyed. This simple scale is based on HAZUS, FEMA's Damage Assessment Operations Manual, the Kelman scale, EMS-98, the California and Indiana Air National Guards, and the US Air Force, and was determined to be the most helpful for on-the-ground operational relevance.

All the images are from DigitalGlobe, have a resolution of 0.5m ground distance, are multi-band (though I don't know what the bands are), have metadata such as "sun azimuth" and "off-nadir", and cover 22 different disasters from 15, which occurred at various times of the year, picked for range of severity, large geographical diversity, and diversity in the design and organization of buildings. There are approximately 700,000 building polygons.
Thumbnail "On my not-too-shabby laptop CPU, I can run most common CNN models in (at most) 10 -- 100 milliseconds with libraries like TensorFlow. In 2019, even a smartphone can run 'heavy' CNN models (like ResNet) in less than half a second. So imagine my surprise when I timed my own simple implementation of a convolution layer and found that it took over 2 seconds for a single layer!"

"It's no surprise that modern deep learning libraries have production-level, highly-optimized implementations of most operations. But what exactly is the black magic that these libraries use that we mere mortals don't? How are they able to improve performance by 100x? What exactly does one do to 'optimize' or accelerate neural network operations?"

Optimizing generalized matrix multiplication, loop reordering, loop unrolling, single instruction multiple data CPU instructions, and multithreading.
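As a toy illustration of the gap those tricks close, here is a naive Python triple-loop matrix multiply next to NumPy's BLAS-backed `@` (a deliberately unfair comparison, which is the article's point):

```python
import time
import numpy as np

def matmul_naive(a, b):
    """Textbook triple loop: the kind of 'simple implementation'
    the article benchmarks against optimized libraries."""
    n, k = a.shape
    _, m = b.shape
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out

a = np.random.rand(64, 64)
b = np.random.rand(64, 64)

t0 = time.perf_counter()
slow = matmul_naive(a, b)
naive_time = time.perf_counter() - t0

t0 = time.perf_counter()
fast = a @ b
blas_time = time.perf_counter() - t0

# Same numbers, wildly different speed: BLAS applies loop reordering,
# tiling, SIMD, and multithreading behind the scenes.
assert np.allclose(slow, fast)
```

Convolution layers get the same treatment by being rewritten as one big matrix multiplication (im2col plus GEMM), which is how libraries reach the 10-100 millisecond range the article quotes.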
Thumbnail Are hippos OP? I clicked this because I was like, "What does OP mean?" Vocabulary for today.
Thumbnail LiquidWarping GAN takes video of someone dancing and outputs video of someone else dancing. The system can also be used to change the viewpoint or change attire.

The way it works is in 3 stages: the first is called "body mesh recovery" which figures out the rotation of the limbs in both the reference (dancer) image and source (someone else) image. The second is called "flow composition" and it figures out what parts of the source mesh (the someone else) correspond to the same parts of the reference mesh (dancer). The final stage is the actual generative network that generates the final image frames for the output video.

The body mesh recovery module is a ResNet-50, the flow composition module is a neural mesh renderer, and the generator is a full-fledged GAN that combines the background with the correspondence meshes and uses the same discriminator as Pix2Pix, the well-known image-to-image translation network.
Thumbnail Generate images with Generative Adversarial Networks in your browser. "Select one or more of the automatically generated low-resolution images below to create a new image at the desired resolution and zoom level. If you select more than one image, a mix of the selected images will be generated."

I selected the "Flowers" model. Selected 2 images. It generated an... image. I turned up the resolution and zoomed in a little.
Thumbnail Scott Aaronson made a Quantum Supremacy FAQ. Questions answered: What is quantum computational supremacy? If Google has indeed achieved quantum supremacy, does that mean that now 'no code is uncrackable', as Democratic presidential candidate Andrew Yang recently tweeted? What calculation is Google planning to do, or has it already done, that's believed to be classically hard? But if the quantum computer is just executing some random garbage circuit, whose only purpose is to be hard to simulate classically, then who cares? Isn't this a big overhyped nothingburger? Years ago, you scolded the masses for being super-excited about D-Wave, and its claims to get huge quantum speedups for optimization problems via quantum annealing. Today you scold the masses for not being super-excited about quantum supremacy. Why can't you stay consistent? If quantum supremacy calculations just involve sampling from probability distributions, how do you check that they were done correctly? Wait. If classical computers can only check the results of a quantum supremacy experiment, in a regime where the classical computers can still simulate the experiment (albeit extremely slowly), then how do you get to claim 'quantum supremacy'? Is there a mathematical proof that no fast classical algorithm could possibly spoof the results of a sampling-based quantum supremacy experiment? Does sampling-based quantum supremacy have any applications in itself? If the quantum supremacy experiments are just generating random bits, isn't that uninteresting? Isn't it trivial to convert qubits into random bits, just by measuring them? Haven't decades of quantum-mechanical experiments -- for example, the ones that violated the Bell inequality -- already demonstrated quantum supremacy? Even so, there are countless examples of materials and chemical reactions that are hard to classically simulate, as well as special-purpose quantum simulators (like those of Lukin's group at Harvard). 
Why don't these already count as quantum computational supremacy? Did you (Scott Aaronson) invent the concept of quantum supremacy? If quantum supremacy was achieved, what would it mean for the QC skeptics? What's next?
Thumbnail Ben Goertzel gave a talk about the Singularity and mind uploading at Brain Bar. He touched on a variety of other topics, which I thought were more interesting than the mind uploading part, which is a philosophical debate for which it seems there is never any new knowledge. For example he touched on the question of who is likely to control AGI once it is developed, and what those organizations are likely to use it for.
Thumbnail 3D Ken Burns effect from a single image. The idea is to do a 3D zoom instead of a 2D zoom. The way the system works is by first estimating the depth of everything in the original image using neural networks that take into account both the geometry and the semantics (by which they mean, knowing something about what the objects in the image are), followed by neural networks that do "inpainting," filling in the parts of the image that are exposed to the (virtual) camera by the zoom. This is done by converting the image to a point cloud, rendering a new image with inpainted points, then converting that image to a point cloud and combining it with the point cloud from the original image.
Thumbnail Fine-tuning GPT-2 from human preferences. They tested it on four tasks: text continuation with positive sentiment, text continuation with physically descriptive text, summarization based on a model trained by summaries on CNN and Daily Mail articles, and summarization based on a model trained by finding Reddit posts with a "TL; DR" summary section.

The original GPT-2 was trained only on text input, but this system uses human judgments of which of several outputs they prefer as part of its training signal.
Thumbnail Boston Dynamics finally put their Spot robots up for sale. But if you want one, you'll have to convince them that you either have a compelling use case or a development team that can do something really interesting with the robot.
Thumbnail "Facebook is buying CTRL-labs, a NY-based startup building an armband that translates movement and the wearer's neural impulses into digital input signals."

"Facebook has talked a lot about working on a non-invasive brain input device that can make things like text entry possible just by thinking." "With this acquisition, the company appears to be working more closely with technology that could one day be productized."
Thumbnail Multi-agent hide-and-seek video.
Thumbnail RISC-V is emerging as the open source equivalent for hardware CPU instruction set architectures, analogous to USB and Linux.
Thumbnail Google reportedly attains "quantum supremacy". This is the kind of thing I ordinarily wouldn't report, since my attitude is "I'll believe it when I see it" (or something like that). But since everyone's talking about it, I figured I should post and comment on it.

"A Google research paper was temporarily posted online this week, the Financial Times reported Friday, and said the quantum computer's processor allowed a calculation to be performed in just over 3 minutes. That calculation would take 10,000 years on IBM's Summit, the world's most powerful commercial computer, Google reportedly said."

"Google researchers are throwing around the term 'quantum supremacy' as a result, the FT said, because their computer can solve tasks that can't otherwise be solved. 'To our knowledge, this experiment marks the first computation that can only be performed on a quantum processor,' the research paper reportedly said."

According to the paper (which was taken down, but not before somebody posted a clumsy cut-and-paste of it), what the Google quantum computer did was simulate a pseudo-random quantum circuit, which I'd never heard of, but apparently the quantum superposition in the circuit makes it hard to simulate on a classical computer. I was expecting something like number factorization (Shor's algorithm). I have no idea what a pseudo-random quantum circuit is good for.

The Google quantum computer is claimed to do this with 53 qubits, though the article mentions Google has claimed to have built a different quantum computer with 72 qubits. IBM has its own 53-qubit quantum computer, and claims that theirs is a general-purpose computer, unlike the one Google used here, which was designed specifically for this experiment (simulating a pseudo-random quantum circuit).
Thumbnail Hao Li, an associate professor of computer science at the University of Southern California, is predicting that deepfakes that look real are 6 months away, and "perfect and virtually undetectable" deepfakes are "a few years" away.
Thumbnail Flying saucer. Basically a ducted quadcopter with jets that uses the saucer body as a wing.
Thumbnail "Digital timber construction" aka robots making wooden frames.
Thumbnail OpenAI got AI agents in a hide-and-seek competition to invent 6 strategies that involved using objects in the environment as tools. "In our environment, agents play a team-based hide-and-seek game. Hiders (blue) are tasked with avoiding line-of-sight from the seekers (red), and seekers are tasked with keeping vision of the hiders. There are objects scattered throughout the environment that hiders and seekers can grab and lock in place, as well as randomly generated immovable rooms and walls that agents must learn to navigate. Before the game begins, hiders are given a preparation phase where seekers are immobilized to give hiders a chance to run away or change their environment."

"Agents are given a team-based reward; hiders are given a reward of +1 if all hiders are hidden and -1 if any hider is seen by a seeker. Seekers are given the opposite reward, -1 if all hiders are hidden and +1 otherwise. To confine agent behavior to a reasonable space, agents are penalized if they go too far outside the play area. During the preparation phase, all agents are given zero reward."

"As agents train against each other in hide-and-seek, as many as six distinct strategies emerge. Each new strategy creates a previously nonexistent pressure for agents to progress to the next stage."

The 6 strategies are: 1. Seekers learn to chase hiders, and hiders learn to run away. 2. Hiders learn to use boxes and existing walls to construct forts. 3. Seekers learn to use ramps to jump into the hiders' shelter. 4. Hiders learn to move ramps to the edge of the play area, far from where they will build their fort, and lock them in place. 5. Seekers learn that they can jump from locked ramps to unlocked boxes and then surf the box to the hiders' shelter, which is possible because the environment allows agents to move together with the box regardless of whether they are on the ground or not. 6. Hiders learn to lock all the unused boxes before constructing their fort.

One thing you might not notice if you're not paying close attention is that the light grey captions under the pictures indicate it took the AI agents 481 million games of hide-and-seek to learn all this. So they are still inefficient learners. But they do learn, so this shows it is possible.

"Additionally, hiders learn to coordinate who will block which door and who will go grab the ramp. In cases where the boxes are far from the doors, hiders pass boxes to each other in order to block the doors in time."
Thumbnail François Chollet, creator of the Keras deep learning framework, interviewed by Lex Fridman. Typical measurements of the progress of science measure resource consumption, like the number of papers published, rather than the significance of the discoveries. If science were exponential, you would expect the temporal density of significant discoveries to be exponential, but it's actually nearly flat. As you make progress in a subfield of science, it becomes more difficult to make further progress: you have to work harder to make smaller discoveries, and to make the same amount of progress, you need more headcount. The number of people is increasing, and the amount of CPU power devoted to science is increasing, but that's resource consumption. He explains this to refute the "intelligence explosion" idea, and finds that when he tells people this, some take the "intelligence explosion" idea as a religious belief, make it part of their identity, and react as though he is attacking them personally.

History of Keras: in 2014, Caffe was more popular than Theano, but it was a C++ library for computer vision, and using Python to define models was an unusual idea at the time -- the convention was to define models in YAML (a configuration file format) rather than in code (e.g. Python) -- and he was trying to develop LSTMs. He was later hired by Google, which had just developed TensorFlow, so he ported Keras to TensorFlow.

He talks about Keras 2.0 and the challenges of making an API for other developers. It needs to be as small as possible, hierarchical, and reflect the way domain experts think about the problem, so they can map their mental model onto the API as easily as possible. The API shouldn't be self-referential -- referencing implementation details -- only domain-specific concepts.

He wants to make higher-level APIs, higher-level than Keras. A deep neural network can only learn sorting point-by-point, but a sorting algorithm can sort anything. The future is combining the two -- deep learning's ability to learn with the power of abstractions. The really successful AI systems today are mostly rule-based, with deep learning for perception. A self-driving car is mostly software coded by hand, interfacing with the real world through deep learning systems.

He's not interested in the Turing Test because he doesn't consider imitating intelligence the same as intelligence. Anything can be solved with deep learning extremely inefficiently. An explicit model is much more efficient. Can we get deep learning to write the explicit models, that is, program synthesis? Right now nobody knows how. We haven't found the "backpropagation" of program synthesis.

You don't see papers about data annotation because the people doing that are spending all their time solving real-world problems, not writing papers. A deep learning system that solves natural language questions generated by an algorithm is just mirroring the algorithm, not learning about question answering in general, which is the point. So far, we've gotten further by increasing computation power for learning algorithms than by hard-coding knowledge, but while that has held for the last 70 years, it might not hold for the next 70. Moore's Law, for example, might not continue. Data efficiency could become the bottleneck. Unsupervised and reinforcement learning are frameworks for learning, not techniques. When people say "reinforcement learning", they usually mean deep reinforcement learning (the reinforcement learning framework combined with the deep learning technique).

AI even in its current state can do mass surveillance and manipulation of behavior. The current objective function, "engagement", is not ideal. "Loss function engineer" is probably going to be a job title in the future, as everything else becomes increasingly automated. If you could make human-like intelligence, that would be interesting because it would mean you could understand human intelligence, including things like emotions. His definition of "intelligence" is the efficiency with which you turn experience into generalizable programs: measure skill against priors and experience, and the agent that achieves greater skill from the same priors and experience is the more intelligent.

Another "AI winter" could happen because people over-hype the capabilities of AI and over-promise when they will be able to deliver solutions; the AI winter is the backlash. He says too much AI already delivers real value for there to be a winter for all of AI, but he predicts a backlash against autonomous vehicles and AGI.
Thumbnail China clamping down.
Thumbnail Overton "automates the life cycle of model construction, deployment, and monitoring." "The machine itself fixes and adjusts machine learning models in response to external stimuli, making it more accurate and repairing logical flaws that might lead to an incorrect conclusion. The idea is that humans can then focus on the high-end supervision of machine learning models."

With Overton, the schema defines what the model computes but not how the model computes it. In other words, Overton is free to embed sentences using an LSTM or a Transformer, or to change hyperparameters like hidden state size. Overton engineers focus on monitoring application quality and improving the supervision of the deep learning models, not on building the models. Overton is responsible for training the model and producing a production-ready binary.
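As a rough sketch of this what/how split (hypothetical field names, not Overton's actual schema syntax), the engineer might declare inputs, intermediate payloads, and tasks, while the framework decides how each payload is computed:

```python
# Hypothetical sketch (invented field names) of a declarative schema in
# the spirit of Overton: the engineer declares *what* to compute --
# inputs, intermediate payloads, and tasks -- while the framework
# decides *how*, e.g. whether the sequence encoder is an LSTM or a
# Transformer, and what hidden state size to use.
schema = {
    "inputs": ["query_tokens"],
    "payloads": [
        # No architecture is specified here; any sequence encoder
        # (LSTM, Transformer, ...) can fill this slot.
        {"name": "query_embedding", "type": "sequence_encoder"},
    ],
    "tasks": [
        # A main task and an auxiliary task share the same payload,
        # matching Overton's multi-task setup.
        {"name": "intent_classification", "payload": "query_embedding"},
        {"name": "pos_tagging", "payload": "query_embedding"},
    ],
}

def payload_names(schema):
    """Names of the intermediate representations the framework must produce."""
    return [p["name"] for p in schema["payloads"]]

print(payload_names(schema))  # ['query_embedding']
```

Because the schema never names a model architecture, the framework can swap encoders or retune hyperparameters without the engineer touching the schema.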

Overton can accept supervision at multiple granularities, and Overton models often perform other tasks alongside the main task, for example part-of-speech tagging.

Overton is often good at training models even with low-quality supervision.

Engineers can define "slices", such as "nutrition-related queries" (to Siri) or "queries with complex disambiguation". The engineer defines what supervision would improve the slice and Overton figures out how to improve the model.

Overton has already powered industry-grade systems at Apple for more than a year, reducing the error rates of those systems, enabling small teams to perform the same duties as several larger teams, and improving product turn-around times.
Thumbnail Larry Ellison proclaims Oracle Linux is "autonomous." "It provisions itself, it scales itself, it tunes itself, it does it all while it's running. It patches itself. While it's running." "Once a vulnerability is discovered, we fix it while it's running."
Thumbnail Face recognition technology in China beaten by plastic surgery. "Huan said she discovered she had been logged out of the online shopping and payment gateways she used because the secure identification process, backed by facial recognition technology, simply did not know who she was. Huan said her work was also affected as she could no longer sign in and off work by scanning her face. Checking in to hotels and boarding high-speed trains had also become a problem as she had used facial recognition to register on those platforms."
Thumbnail "If a fishing vessel had steamed past the area last October, the crew might have glimpsed half a dozen or so 35-foot-long inflatable boats darting through the shallows, and thought little of it. But if crew members had looked closer, they would have seen that no one was aboard: The engine throttle levers were shifting up and down as if controlled by ghosts. The boats were using high-tech gear to sense their surroundings, communicate with one another, and automatically position themselves so, in theory, .50-caliber machine guns that can be strapped to their bows could fire a steady stream of bullets to protect troops landing on a beach."

"The secretive effort -- part of a Marine Corps program called Sea Mob -- was meant to demonstrate that vessels equipped with cutting-edge technology could soon undertake lethal assaults without a direct human hand at the helm. It was successful: Sources familiar with the test described it as a major milestone in the development of a new wave of artificially intelligent weapons systems soon to make their way to the battlefield."
Thumbnail "When humans face a complex challenge, we create a plan composed of individual, related steps. Often, these plans are formed as natural language sentences."

"Facebook AI has developed a new method of teaching AI to plan effectively, using natural language to break down complex problems into high-level plans and lower-level actions. Our system innovates by using two AI models -- one that gives instructions in natural language and one that interprets and executes them -- and it takes advantage of the structure in natural language in order to address unfamiliar tasks and situations. We've tested our approach using a new real-time strategy game called MiniRTSv2, and found it outperforms AI systems that simply try to directly imitate human gameplay."

"MiniRTSv2 is a streamlined strategy game designed specifically for AI research. In the game, a player commands archers, dragons, and other units in order to defeat an opponent."

"Though MiniRTSv2 is intentionally simpler and easier to learn than commercial games such as DOTA 2 and StarCraft, it still allows for complex strategies that must account for large state and action spaces, imperfect information (areas of the map are hidden when friendly units aren't nearby), and the need to adapt strategies to the opponent's actions."

"We used MiniRTSv2 to train AI agents to first express a high-level strategic plan as natural language instructions and then to act on that plan with the appropriate sequence of low-level actions in the game environment. This approach leverages natural language's built-in benefits for learning to generalize to new tasks. Those include the expressive nature of language -- different combinations of words can represent virtually any concept or action -- as well as its compositional structure, which allows people to combine and rearrange words to create new sentences that others can then understand. We applied these features to the entire process of planning and execution, from the generation of strategy and instructions to the interface that bridges the different parts of the system's hierarchical structure."
Thumbnail Pinterest claims Pinterest Lens, their visual search technology that lets you snap a photo of a product in the real world and Pinterest will tell you what it is and where you can find it or something just like it, "can identify more than 2.5 billion objects."
Thumbnail "10 reasons why PyTorch is the deep learning framework of the future." "1. PyTorch is Pythonic." "2. Easy to learn." "3. Higher developer productivity." "4. Easy debugging." "5. Data Parallelism." "6. Dynamic Computational Graph Support." "7. Hybrid Front-End." "8.Useful Libraries." "9. Open Neural Network Exchange support." "10. Cloud support."
Thumbnail "Aristo was tested on 119 questions from the eighth-grade exam and was correct on over 90 percent of them, a remarkable performance. It was also correct on over 83 percent of 12th-grade questions. While the Times reported that Aristo 'passed the test,' the AI2 team noted that the actual tests New York students take include questions that refer to diagrams, as well as 'direct answer' questions, neither of which Aristo was able to handle."

"This is exciting progress, but we must keep in mind that a high score on a particular data set does not always mean that a machine has actually learned the task its human programmers intended. Sometimes the data used to train and test a learning system has subtle statistical patterns -- I'll call these giveaways -- that allow the system to perform well without any real understanding or reasoning."

"For example, one neural-network language model -- similar to the one Aristo uses -- was reported in 2019 to capably determine whether one sentence logically implies another. However, the reason for the high performance was not that the network understood the sentences or their connecting logic; rather, it relied on superficial syntactic properties such as how much the words in one sentence overlapped those in the second sentence."
Thumbnail "A few years ago, Joelle Pineau, a computer science professor at McGill, was helping her students design a new algorithm when they fell into a rut." "Pineau's students hoped to improve on another lab's system. But first they had to rebuild it, and their design, for reasons unknown, was falling short of its promised results. Until, that is, the students tried some 'creative manipulations' that didn't appear in the other lab's paper."

"Lo and behold, the system began performing as advertised. The lucky break was a symptom of a troubling trend, according to Pineau."

"Pineau is trying to change the standards. She's the reproducibility chair for NeurIPS, a premier artificial intelligence conference. Under her watch, the conference now asks researchers to submit a 'reproducibility checklist' including items often omitted from papers, like the number of models trained before the 'best' one was selected, the computing power used, and links to code and datasets. That's a change for a field where prestige rests on leaderboards -- rankings that determine whose system is the 'state of the art' for a particular task -- and offers great incentive to gloss over the tribulations that led to those spectacular results."
Thumbnail "DeepMind has quietly open sourced three new impressive reinforcement learning frameworks." OpenSpiel, SpriteWorld, and bsuite.

"OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games."

"Spriteworld is a python-based RL environment that consists of a 2-dimensional arena with simple shapes that can be moved freely."

"bsuite is a collection of experiments designed to highlight key aspects of agent scalability."
Thumbnail Grover's algorithm for quantum computers allows a search of an unordered list of items to be done in time proportional to the square root of the number of items. "Despite the interest, implementing Grover's algorithm has taken time because of the significant technical challenges involved. The first quantum computer capable of implementing it appeared in 1998, but the first scalable version didn't appear until 2017, and even then it worked with only three qubits. So new ways to implement the algorithm are desperately needed."

"Today Stéphane Guillet and colleagues at the University of Toulon in France say this may be easier than anybody expected. They say they have evidence that Grover's search algorithm is a naturally occurring phenomenon. 'We provide the first evidence that under certain conditions, electrons may naturally behave like a Grover search, looking for defects in a material.'" "Free electrons naturally implement the Grover search algorithm when moving across the surface of certain crystals."