|
"Zoox's robotaxi is designed from the ground up just for passengers -- hence the lack of a steering wheel altogether. Next to each seat is a touchscreen for controlling temperature, playing music or looking at a route map. The robotaxi is symmetrical and bidirectional, so it'll never have to reverse out of a parking spot. And like Waymo's and Cruise's fleets, it's all-electric.
"Zoox hopes to make a strong first impression by deploying its purpose-built robotaxi out of the gate, instead of gradually working toward a rider-focused vehicle like its competitors. It plans to launch commercially in the coming months, starting in Las Vegas." |
|
|
Technique for adding compile-time checks to anything you can define as an invariant.
Many people have tried to make it so that buggy programs simply don't compile, but the netstack3 team has a concrete, general framework for approaching this kind of design. The article breaks the process into three steps: definition, enforcement, and consumption. For definition, the programmer takes something that Rust can reason about (usually types) and attaches the desired property to it. This is usually done via documentation -- stating that a particular trait represents a particular property, for example. The programmer then enforces the property by making sure that all of the code that directly deals with the type upholds the relevant invariant. Finally, other code can consume the property by relying on the invariant without re-checking it.
The article goes on to describe some specific techniques for doing this: adding a hidden field to a structure that is used to verify the invariant is being upheld, and using zero-sized types, which don't exist at run time and add no run-time overhead, but let the compiler check things. The example language is Rust, but these techniques may generalize to other languages and type systems.
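To make that concrete, here's a minimal sketch of the definition/enforcement/consumption pattern -- in Python with mypy rather than the article's Rust, since the techniques generalize. The SortedList type and both functions are my own illustration, not from the article:

```python
from typing import NewType

# Definition: attach the property "this list has been sorted" to a
# distinct static type. NewType is free at run time -- a SortedList
# is just a plain list once the program runs.
SortedList = NewType("SortedList", list)

def into_sorted(xs: list) -> SortedList:
    # Enforcement: this is the only sanctioned constructor, and it
    # upholds the invariant before handing out the type.
    return SortedList(sorted(xs))

def binary_search(xs: SortedList, target: int) -> int:
    # Consumption: the signature demands the invariant, so the body
    # can rely on sortedness without re-checking it.
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo

binary_search(into_sorted([3, 1, 2]), 2)  # fine
# binary_search([3, 1, 2], 2)             # rejected by mypy: not a SortedList
```

Rust's zero-sized types give a stronger version of the same idea, enforced by the compiler itself rather than by an external checker like mypy. |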
|
|
TrapC is a fork of the C (not C++) programming language for memory safety. Well, it's an idea for a fork -- it hasn't been implemented yet. |
|
|
"Time alone heightens 'threat alert' in teenagers -- even when connecting online."
"This is according to latest findings from a cognitive neuroscience experiment conducted at the University of Cambridge, which saw 40 young people aged 16-19 undergo testing before and after several hours alone -- both with and without their smartphones."
The study was done during the pandemic (April 2021). The researchers had an isolation room with an armchair, desk, office chair, desktop computer, physiological hardware (for the electrodermal activity measurement), a fridge with food and beverages, and "non-social materials": puzzles, sudoku books, and digital and analogue games. The teenagers were allowed to bring any "non-social items" (crafts, textbooks, writing materials) of their own.
They then compared isolation with and without smartphones and internet access.
"Although virtual social interactions helped our participants feel less lonely compared to total isolation, their heightened threat response remained," said Emily Towner, study lead author from Cambridge's Department of Psychology.
What the "threat response" is about is, before sending people into the isolation room, they taught them a certain shape on a computer screen was associated with a painful noise. Afterwards they tested to see people people reacted to this learned threat. People who were isolated had a heightened reaction. Isolated but with social media made people feel less lonely, but did nothing to reduce the "threat response". For that, people need to not be isolated.
So it looks like real-life interaction is different from virtual interaction: virtual interaction reduces subjective loneliness but leaves the heightened reaction to perceived threat intact. Note that the study only tested 16-19-year-olds, so it offers no comparison with other age groups. |
|
|
"At Antithesis, we build an autonomous, deterministic simulation testing (DST) tool. Determinism is so in the water here that it has even seeped into our front-end: our reactive notebook. In this case, determinism was a tool that enabled us to build the low-latency, reactive experiences our users enjoy."
"Reactivity is traditionally defined as a system reacting in response to changing data. In the UI/UX world, reactivity is considered a feature of some libraries (denoting automatic interface updates as data changes), rather than a programming style."
"We're seeing glimmers of instant reactivity in dev tools. First it was syntax highlighting that updates without saving; later it was syntax checks, autocomplete, and linters. Now we even have AI copilots suggesting code as you type. But great developers know there's something more important than what color the code is or how your linter feels: what's most important is what the code does when it runs."
"By running your code on keystroke, the Antithesis Notebook's reactive paradigm informs you of just that, and with an immediacy that's essential to shortening iteration cycles and flattening learning curves. When you're in a reactive regime, you're immediately forced to reckon with the result of your code. The age-old saying of 'test early, test often' becomes the default."
"It turns out that if you build something reactive enough, something magical falls out: reproducibility. In this case, maintaining the illusion of having just run every line of Notebook code from top-to-bottom mandates that if we actually did restart the Notebook and run from top-to-bottom, then we should be in the same state."
Wow, that's a strong claim. They have a demo you can interact with, and it seems to work as advertised.
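As a toy model of that claim (my sketch, not Antithesis's actual implementation): if every edit re-executes the whole notebook into a fresh namespace, there is no hidden state, and restarting trivially lands you in the same state:

```python
# Toy reactive notebook: any edit reruns everything top-to-bottom,
# so the visible state is always what a restart would produce.
class ToyNotebook:
    def __init__(self) -> None:
        self.cells: list[str] = []

    def edit(self, index: int, source: str) -> dict:
        if index == len(self.cells):
            self.cells.append(source)
        else:
            self.cells[index] = source
        return self._run_all()

    def _run_all(self) -> dict:
        env: dict = {}           # fresh state on every edit
        for cell in self.cells:
            exec(cell, env)      # deterministic, in document order
        return env

nb = ToyNotebook()
nb.edit(0, "x = 2")
nb.edit(1, "y = x * 10")
state = nb.edit(0, "x = 5")      # editing cell 0 recomputes y as well
assert state["y"] == 50
```

A real system would cache results and re-run only the cells downstream of a change, but it has to preserve this run-from-the-top illusion to keep the reproducibility guarantee.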
"This stands in stark contrast to Jupyter, the best known notebook out there, where users decide which cells to run and in which order. Imagine Google Sheets allowing you to decide which cells were up-to-date. Chaos. For Jupyter, this scheme produces enough hidden state to motivate research on the resulting bugs. One study found that only 24% of sampled Jupyter notebooks ran without exceptions." |
|
|
"Comparing algorithms for extracting content from web pages."
Remember, kids, it's only legal to extract content from web pages if the Terms of Service permit it.
That said, extractors compared: BTE (Python), Goose3 (Python), jusText (Python), Newspaper3k (Python), Readability (JavaScript), Resiliparse (Python), Trafilatura (Python), news-please (Python), Boilerpipe (Java), Dragnet (Python), ExtractNet (Python), Go DOM Distiller (Go), BoilerNet (Python + JavaScript), and Web2Text (Python).
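As a taste of what using one of these looks like, here's a minimal Trafilatura call (the URL is a placeholder; fetch_url and extract are the library's top-level helpers):

```python
import trafilatura

# Download a page and strip it down to the main article text.
downloaded = trafilatura.fetch_url("https://example.com/article")
if downloaded is not None:
    text = trafilatura.extract(downloaded)  # boilerplate and navigation removed
    print(text)
```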
Looks like if you want to extract content from web pages, you should be using Python. |
|
|
Is desalination everywhere realistic? According to Tomas Pueyo, Israel has a population of 9.5 million people and 5 big seawater desalination plants in operation. These produce 55% of the country's fresh water and about 85% of its tap water.
Israel has 2 more desalination plants under construction.
Israeli households pay ~$30/month for their water.
Israel is becoming a freshwater exporter. |
|
|
"AI progress has plateaued at GPT-4 level",
"According to inside reports, Orion (codename for the attempted GPT-5 release from OpenAI) is not significantly smarter than the existing GPT-4. Which likely means AI progress on baseline intelligence is plateauing."
"Ilya Sutskever, co-founder of AI labs Safe Superintelligence (SSI) and OpenAI, told Reuters recently that results from scaling up pre-training -- the phase of training an AI model that uses a vast amount of unlabeled data to understand language patterns and structures -- have plateaued."
The article points out that models are now being trained on essentially all the knowledge humans have created. OpenAI has called many models "GPT-4-something". OpenAI never released Sora, and it seems common now for companies not to release models to the public; a lot of internal models are probably just not good enough to release.
The author says new techniques like OpenAI o1's "chain of thought" system aren't as good as you'd expect given the amount of power they consume.
"Improvements look ever more like 'teaching to the test' than anything about real fundamental capabilities."
"The y-axis is not on a log scale, while the x-axis is, meaning that cost increases exponentially for linear returns to performance."
"What I'm noticing is that the field of AI research appears to be reverting to what the mostly-stuck AI of the 70s, 80s, and 90s relied on: search."
"AlphaProof just considers a huge number of possibilities."
"I think the return to search in AI is a bearish sign, at least for achieving AGI and superintelligence."
This is all very interesting because until now, I've been hearing there's no limit to the scaling laws, only limits in how many GPUs people can get their hands on, and how much electricity, with plans to build nuclear power plants, and so on. People saying there's a "bubble" in AI haven't been saying that because of a problem in scaling up, but because the financial returns aren't there -- OpenAI et al are losing money -- and the thinking is investors will run out of money to invest, resulting in a decline.
I've speculated there might be diminishing returns coming because we've seen that previously in the history of AI, but you all have been telling me I'm wrong -- AI will continue to advance at the blistering pace of the last few years. But it looks like we're now seeing the first signs we're actually reaching the domain of diminishing returns -- at least until the next algorithmic breakthrough. It looks like we may be approaching the limits of what can be done by scaling up pre-trained transformer models. |
|
|
"Micron breaks out a fast 60TB SSD for mega data centers"
60TB, holy moly that's huge. Speed is 12 GB/s while using just 20 watts of power. Well, 12 GB/s is the read speed. 5 GB/s for writing. That's still fast enough that the whole drive can be fully written in just 3.4 hours.
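That figure checks out as a back-of-the-envelope calculation (the exact number depends on the drive's formatted capacity):

```python
# Time to fill the whole drive at the sustained sequential write speed.
capacity_bytes = 60e12       # 60 TB, decimal
write_speed = 5e9            # 5 GB/s
hours = capacity_bytes / write_speed / 3600
print(f"{hours:.2f} hours")  # ~3.33 -- close to the quoted 3.4
```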
"The 6550 ION also excels in critical AI training workloads compared to competitive 60TB SSDs, achieving:"
"147% higher performance for NVIDIA Magnum IO GPUDirect Storage (GDS) and 104% better energy efficiency",
"30% higher 4KB transfer performance for deep learning IO Unet3D testing and 20% better energy efficiency,
"151% improvement in completion times for AI model checkpointing while competitors consume 209% more energy."
Nvidia Magnum IO GPUDirect Storage is a technology from Nvidia that enables data to flow directly from NVMe storage to GPU memory without having to pass through a bounce buffer in CPU memory.
Apparently they used Unet3D to test it. Unet3D is a video segmentation model. Video segmentation means that for each frame of the video, it "segments" pixels into groups that all belong to the same concept -- for example, one segment might be "road", another "sidewalk", another "yard", and so on. It's based on a "U-net" architecture, so called because it has a large input layer that gets progressively smaller until some encoding is output, which then goes into a series of layers that get progressively bigger until the output, which is the same size as the input. You can think of the input as going "down" one side of a "U" to the encoding, then "up" the other side of the "U" to the output, hence the name "U-net". |
|
|
FrontierMath is a new math benchmark for AI systems, with original, exceptionally challenging mathematics problems -- all new and previously unpublished, so they can't already be in large language models' (LLMs') training sets.
We don't have a good measurement of super-advanced mathematics capability in AI models. The researchers note that current mathematics benchmarks for AI systems, like the MATH dataset and GSM8K, measure ability at the high-school and early-undergraduate level. The researchers are motivated by a desire to measure deep theoretical understanding, creative insight, and specialized expertise.
There's also the problem of "data contamination" -- "the inadvertent inclusion of benchmark problems in training data." "This causes artificially inflated performance scores for LLMs, and that masks the models' true reasoning (or lack of reasoning) capabilities."
"The benchmark spans the full spectrum of modern mathematics, from challenging competition-style problems to problems drawn directly from contemporary research, covering most branches of mathematics in the 2020 Mathematics Subject Classification."
I had a look at the 2020 Mathematics Subject Classification. It's a 224-page document that is just a big list of subject areas with number-and-letter codes assigned to them. For example "11N45" means "Asymptotic results on counting functions for algebraic and topological structures".
"Current state-of-the-art AI models are unable to solve more than 2% of the problems in FrontierMath, even with multiple attempts, highlighting a significant gap between human and AI capabilities in advanced mathematics."
"To understand expert perspectives on FrontierMath's difficulty and relevance, we interviewed several prominent mathematicians, including Fields Medalists Terence Tao, Timothy Gowers, and Richard Borcherds, and Internatinal Mathematics Olympiad coach Evan Chen. They unanimously characterized the problems as exceptionally challenging, requiring deep domain expertise and significant time investment to solve."
Unlike many International Mathematics Olympiad problems, the FrontierMath problems have a single numerical answer, which makes them possible to check in an automated manner -- no human hand-grading required. At the same time, they have worked to make the problems "guess-proof".
"Problems often have numerical answers that are large and nonobvious." "As a rule of thumb, we require that there should not be a greater than 1% chance of guessing the correct answer without doing most of the work that one would need to do to 'correctly' find the solution."
The numerical calculations don't need to be done inside the language model -- the models are given access to Python to perform mathematical calculations.
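The grading itself can then be as simple as an exact comparison. A sketch (the answer value here is made up, not a real FrontierMath answer):

```python
# Automated grading of a single-numerical-answer problem:
# exact match against the reference, no human grader needed.
REFERENCE_ANSWER = 367_548_193   # hypothetical large, nonobvious integer

def grade(submitted: int) -> bool:
    return submitted == REFERENCE_ANSWER

assert grade(367_548_193)
assert not grade(367_548_192)    # near misses earn no credit
```

Because the answers are large and "guess-proof", an exact-match check like this is enough. |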
|
|
"Taiwan's technology protection rules prohibits Taiwan Semiconductor Manufacturing Co (TSMC) from producing 2-nanometer chips abroad, so the company must keep its most cutting-edge technology at home, Minister of Economic Affairs J.W. Kuo."
"Kuo made the remarks in response to concerns that TSMC might be forced to produce advanced 2-nanometer chips at its fabs in Arizona ahead of schedule after former US president Donald Trump was re-elected as the next US president." |
|
|
"The open source project DeFlock is mapping license plate surveillance cameras all over the world."
"On his drive to move from Washington state to Huntsville, Alabama, Will Freeman began noticing lots of cameras."
"Once I started getting into the South, I saw a ton of these black poles with a creepy looking camera and a solar panel on top. I took a picture of it and ran it through Google, and it brought me to the Flock website. And then I knew like, 'Oh, that's a license plate reader.' I started seeing them all over the place and realized that they were for the police."
"Flock is one of the largest vendors of automated license plate readers (ALPRs) in the country. The company markets itself as having the goal to fully 'eliminate crime' with the use of ALPRs and other connected surveillance cameras."
"And so he made a map, and called it DeFlock. DeFlock runs on Open Street Map, an open source, editable mapping software." |
|
|
"3DPrinterOS, a cloud-based 3D printing management solutions company, has entered a collaboration with the MIX Lab at Montclair State University to develop an algorithm designed to identify 3D printed gun parts."
Ok, sounds like they haven't done it yet. So, we'll get another article saying they did it... if they succeed. |
|
|
OpenFlexure "uses 3D printers and off the shelf components to build open-source, lab-grade microscopes for a fraction of traditional prices. Used in over 50 countries and every continent, the project aims to enable Microscopy for Everyone."
"An open flexure microscope is built from a combination of off-the-shelf electronics, standard optical equipment, and 3D printed parts. The 3D printed parts are designed to be made on any entry grade printer anywhere in the world." "Nothing is proprietary or hidden."
"The finished microscope can run automatically for several hours, scanning samples with a built-in autofocus. The 8-megapixel camera is comparable to many commercial sight scanners, achieving a resolution below 400 nanometers."
"In practical terms this means that individual cell damage or parasites can be identified on a microscope with parts costing under $300. The stage is fully automated, intelligently planning its own path around samples. It can also self-calibrate, warning the user if there's any damage that could impact the diagnosis. The automated stage allows huge data sets to be collected and stored.
"In pathology, this let samples be archived, shared, or used for the training of medical students. this can also be the platform for low resource artificial intelligence systems or automated image processing, making emerging technologies more accessible in low resource settings."
Something else I didn't know exists until just now. Developed at the University of Bath, University of Cambridge, and the University of Glasgow, with contributions from the Baylor College of Medicine, Bongo Tech & Research Labs, and Mboalab. |
|
|
"No more nanometers: It's time for new node naming", says Kevin Morris.
"Intel held the line from '10 micron' in 1972 through '0.35 micron' in 1995, an impressive 23-year run where the node name matched gate length. Then, in 1997 with the '0.25 micron/250 nm' node they started over-achieving with an actual gate length of 200 nm -- 20% better than the name would imply. This 'sandbagging' continued through the next 12 years, with one node (130nm) having gate length of only 70nm -- almost a 2x buffer. Then, in 2011, Intel jumped over to the other side of the ledger, ushering in what we might call the 'overstating decade' with the '22nm' node sporting a gate length of 26 nm. Since then, things have continued to slide further in that direction, with the current '10nm' node measuring in with a gate length of 18 nm -- almost 2x on the other side of the 'named' dimension."
"So essentially, since 1997, the node name has not been a representation of any actual dimension on the chip, and it has erred in both directions by almost a factor of 2."
"For the last few years, this has been a big marketing problem for Intel from a 'perception' point of view. Most industry folks understand that Intel's '10nm' process is roughly equivalent to TSMC and Samsung's '7nm' processes. But non-industry publications regularly write that Intel has 'fallen behind' because they are still back on 10nm when other fabs have 'moved on to 7nm and are working on 5nm.' The truth is, these are marketing names only, and in no way do they represent anything we might expect in comparing competing technologies."
He mentions a proposal from Philip Wong of TSMC: replace "nanometers" with three numbers that reflect the actual usefulness of the chip -- the density of logic transistors in number per square millimeter, the density of off-chip DRAM memory in bits per square millimeter, and the density of connections between the memory and the logic transistors in number of interconnects per square millimeter. So a chip could be described as "DL: 38M, DM: 383M, DC: 12K", or just "38M, 383M, 12K".
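For illustration, here's how those three densities might be rendered as a label (the numbers are the article's example; the formatting helper is my own):

```python
def node_label(dl: float, dm: float, dc: float) -> str:
    """Format Wong's proposed logic/memory/connection densities
    (all per square millimeter) as a single readable label."""
    def fmt(x: float) -> str:
        for unit, scale in (("G", 1e9), ("M", 1e6), ("K", 1e3)):
            if x >= scale:
                return f"{x / scale:.0f}{unit}"
        return f"{x:.0f}"
    return f"DL: {fmt(dl)}, DM: {fmt(dm)}, DC: {fmt(dc)}"

print(node_label(38e6, 383e6, 12e3))  # DL: 38M, DM: 383M, DC: 12K
```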
That proposal was made in 2020 and so far hasn't caught on. Maybe it's tough to market because it's not as exciting as a single-digit number like "5 nm"? But "nanometers" needs to be replaced by something.
I looked online for a website with nice tables of those three numbers for all the recent chips. Couldn't find it.
It looks like these metrics are often kept secret by semiconductor companies seeking a competitive advantage, but there are websites with information on new chips and the underlying technology, such as WikiChip, SemiWiki, AnandTech, Tom's Hardware, IEDM, VLSI Symposium, IEEE Xplore, EE Times, Nature Electronics, the IEEE International Roadmap for Devices and Systems, the Semiconductor Industry Association, and so on. (Links to some of these below. I haven't had time to delve too deeply into any of these sites.) |
|
|
What does the election mean for AI? Matt Wolfe says: the upcoming Trump Administration will repeal the Biden Administration's AI regulation executive order, which required developers of foundation models to report their safety tests to the federal government, and in general it will oppose regulation of AI, pushing for AI to advance as fast as possible so the US stays ahead of geopolitical competitors. "Make America First in AI". The upcoming Trump Administration might even fund AI "Manhattan Projects" to further speed up AI advancement. J.D. Vance is an advocate for open-source AI models. Elon Musk runs X.AI and will be working with the upcoming Trump Administration.
That's the first 6 minutes of the video. The rest is about other stuff: Runway AI's new camera control feature, Kling Face Swap, ByteDance's new X-Portrait 2, Facepoke, Black Forest Labs's FLUX 1.1, Krea.AI Loras (character models), Anthropic PDF reading, Anthropic's Haiku price increase, partnerships between AI companies and the US military/defense industry, Instagram age verification, OpenAI possibly wanting to get into hardware, GPT-4o Predicted Outputs, Prime Video AI recaps, iOS 18.2's new AI features, bolt.new building a Tetris game from a prompt, Wendy's working with Palantir on supply chain AI, SingularityNET's AI that plays Minecraft, Nvidia robotics simulation tools, and a Unitree walking robot and robot dog that a guy tries to beat the heck out of. |
|