This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

OpenAI’s new LLM exposes the secrets of how AI really works

The news: ChatGPT maker OpenAI has built an experimental large language model that is far easier to understand than typical models.

Why it matters: It’s a big deal, because today’s LLMs are black boxes: Nobody fully understands how they do what they do. Building a model that is more transparent sheds light on how LLMs work in general, helping researchers figure out why models hallucinate, why they go off the rails, and just how far we should trust them with critical tasks. Read the full story.

—Will Douglas Heaven

Google DeepMind is using Gemini to train agents inside Goat Simulator 3

Google DeepMind has built a new video-game-playing agent called SIMA 2 that can navigate and solve problems in 3D virtual worlds. The company claims it’s a big step toward more general-purpose agents and better real-world robots.   

The company first demoed SIMA (which stands for “scalable instructable multiworld agent”) last year. But this new version has been built on top of Gemini, the firm’s flagship large language model, which gives the agent a huge boost in capability. Read the full story.

—Will Douglas Heaven

These technologies could help put a stop to animal testing

Earlier this week, the UK’s science minister announced an ambitious plan: to phase out animal testing.

Testing potential skin irritants on animals will be stopped by the end of next year. By 2027, researchers are “expected to end” tests of the strength of Botox on mice. And drug tests in dogs and nonhuman primates will be reduced by 2030.

It’s good news for activists and scientists who don’t want to test on animals. And it’s timely too: In recent decades, we’ve seen dramatic advances in technologies that offer new ways to model the human body and test the effects of potential therapies, without experimenting on animals. Read the full story.

—Jessica Hamzelou

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Chinese hackers used Anthropic’s AI to conduct an espionage campaign   
It automated a number of attacks on corporations and governments in September. (WSJ $)
+ The AI was able to handle the majority of the hacking workload itself. (NYT $)
+ Cyberattacks by AI agents are coming. (MIT Technology Review)

2 Blue Origin successfully launched and landed its New Glenn rocket
It managed to deploy two NASA satellites into space without a hitch. (CNN)
+ The New Glenn is the company’s largest reusable rocket. (FT $)
+ The launch had been delayed twice before. (WP $)

3 Brace yourself for flu season
It started five weeks earlier than usual in the UK, and the US is next. (Ars Technica)
+ Here’s why we don’t have a cold vaccine. Yet. (MIT Technology Review)

4 Google is hosting a Border Protection facial recognition app    
The app tells officials whether to contact ICE about the immigrants it identifies. (404 Media)
+ Another effort to track ICE raids was just taken offline. (MIT Technology Review)

5 OpenAI is trialling group chats in ChatGPT
It’d essentially make AI a participant in a conversation of up to 20 people. (Engadget)

6 A TikTok stunt sparked debate over how charitable America’s churches really are
Content creator Nikalie Monroe asked churches for help feeding her baby. Very few stepped up. (WP $)

7 Indian startups are attempting to tackle air pollution
But their solutions are far beyond the means of the average Indian household. (NYT $)
+ OpenAI is huge in India. Its models are steeped in caste bias. (MIT Technology Review)

8 An AI tool could help reduce wasted efforts to transplant organs
It predicts how likely the would-be recipient is to die during the brief transplantation window. (The Guardian)
+ Putin says organ transplants could grant immortality. Not quite. (MIT Technology Review)

9 3D-printing isn’t making prosthetics more affordable
It turns out that plastic prostheses are often really uncomfortable. (IEEE Spectrum)
+ These prosthetics break the mold with third thumbs, spikes, and superhero skins. (MIT Technology Review)

10 What happens when relationships with AI fall apart
Can you really file for divorce from an LLM? (Wired $)
+ It’s surprisingly easy to stumble into a relationship with an AI chatbot. (MIT Technology Review)

Quote of the day

“It’s a funky time.”

—Aileen Lee, founder and managing partner of Cowboy Ventures, tells TechCrunch the AI boom has torn up the traditional investment rulebook.

One more thing

Restoring an ancient lake from the rubble of an unfinished airport in Mexico City

Weeks after Mexican President Andrés Manuel López Obrador took office in 2018, he controversially canceled ambitious plans to build an airport on the deserted site of the former Lake Texcoco—despite the fact it was already around a third complete.

Instead, he tasked Iñaki Echeverria, a Mexican architect and landscape designer, with turning it into a vast urban park, an artificial wetland that aims to transform the future of the entire Valley region.

But as López Obrador’s presidential term nears its end, the plans for Lake Texcoco’s rebirth could yet vanish. Read the full story.

—Matthew Ponsford

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Maybe Gen Z is onto something when it comes to vibe dating.
+ Trust AC/DC to give the fans what they want, performing Jailbreak for the first time since 1991.
+ Nieves González, the artist behind Lily Allen’s new album cover, has an eye for detail.
+ Here’s what AI determines is a catchy tune.


Earlier this week, the UK’s science minister announced an ambitious plan: to phase out animal testing.

Testing potential skin irritants on animals will be stopped by the end of next year, according to a strategy released on Tuesday. By 2027, researchers are “expected to end” tests of the strength of Botox on mice. And drug tests in dogs and nonhuman primates will be reduced by 2030. 

The news follows similar moves by other countries. In April, the US Food and Drug Administration announced a plan to replace animal testing for monoclonal antibody therapies with “more effective, human-relevant models.” And, following a workshop in June 2024, the European Commission also began working on a “road map” to phase out animal testing for chemical safety assessments.

Animal welfare groups have been campaigning for commitments like these for decades. But a lack of alternatives has made it difficult to put a stop to animal testing. Advances in medical science and biotechnology are changing that.

Animals have been used in scientific research for thousands of years. Animal experimentation has led to many important discoveries about how the brains and bodies of animals work. And because regulators require drugs to be first tested in research animals, it has played an important role in the creation of medicines and devices for both humans and other animals.

Today, countries like the UK and the US regulate animal research and require scientists to hold multiple licenses and adhere to rules on animal housing and care. Still, millions of animals are used annually in research. Plenty of scientists don’t want to take part in animal testing. And some question whether animal research is justifiable—especially considering that around 95% of treatments that look promising in animals don’t make it to market.

In recent decades, we’ve seen dramatic advances in technologies that offer new ways to model the human body and test the effects of potential therapies, without experimenting on humans or other animals.

Take “organs on chips,” for example. Researchers have been creating miniature versions of human organs inside tiny plastic cases. These systems are designed to contain the same mix of cells you’d find in a full-grown organ and receive a supply of nutrients that keeps them alive.

Today, multiple teams have created models of livers, intestines, hearts, kidneys and even the brain. And they are already being used in research. Heart chips have been sent into space to observe how they respond to low gravity. The FDA used lung chips to assess covid-19 vaccines. Gut chips are being used to study the effects of radiation.

Some researchers are even working to connect multiple chips to create a “body on a chip”—although this has been in the works for over a decade and no one has quite managed it yet.

In the same vein, others have been working on creating model versions of organs, and even embryos, in the lab. By growing groups of cells into tiny 3D structures, scientists can study how organs develop and work, and test drugs on them. These structures, known as organoids, can also be personalized: if you take cells from someone, you should be able to model that person’s specific organs. Some researchers have been able to create organoids of developing fetuses.

The UK government strategy mentions the promise of artificial intelligence, too. Many scientists have been quick to adopt AI as a tool to help them make sense of vast databases, and to find connections between genes, proteins and disease, for example. Others are using AI to design all-new drugs.

Those new drugs could potentially be tested on virtual humans. Not flesh-and-blood people, but digital reconstructions that live in a computer. Biomedical engineers have already created digital twins of organs. In ongoing trials, digital hearts are being used to guide surgeons on how—and where—to operate on real hearts.

When I spoke to Natalia Trayanova, the biomedical engineering professor behind this trial, she told me that her model could recommend regions of heart tissue to be burned off as part of treatment for atrial fibrillation. Her tool would normally suggest two or three regions but occasionally would recommend many more. “They just have to trust us,” she told me.

It is unlikely that we’ll completely phase out animal testing by 2030. The UK government acknowledges that animal testing is still required by lots of regulators, including the FDA, the European Medicines Agency, and the World Health Organization. And while alternatives to animal testing have come a long way, none of them perfectly capture how a living body will respond to a treatment.

At least not yet. Given all the progress that has been made in recent years, it’s not too hard to imagine a future without animal testing.



ChatGPT maker OpenAI has built an experimental large language model that is far easier to understand than typical models.

That’s a big deal, because today’s LLMs are black boxes: Nobody fully understands how they do what they do. Building a model that is more transparent sheds light on how LLMs work in general, helping researchers figure out why models hallucinate, why they go off the rails, and just how far we should trust them with critical tasks.

“As these AI systems get more powerful, they’re going to get integrated more and more into very important domains,” Leo Gao, a research scientist at OpenAI, told MIT Technology Review in an exclusive preview of the new work. “It’s very important to make sure they’re safe.”

This is still early research. The new model, called a weight-sparse transformer, is far smaller and far less capable than top-tier mass-market models like the firm’s GPT-5, Anthropic’s Claude, and Google DeepMind’s Gemini. At most it’s as capable as GPT-1, a model that OpenAI developed back in 2018, says Gao (though he and his colleagues haven’t done a direct comparison).    

But the aim isn’t to compete with the best in class (at least, not yet). Instead, by looking at how this experimental model works, OpenAI hopes to learn about the hidden mechanisms inside those bigger and better versions of the technology.

It’s interesting research, says Elisenda Grigsby, a mathematician at Boston College who studies how LLMs work and who was not involved in the project: “I’m sure the methods it introduces will have a significant impact.” 

Lee Sharkey, a research scientist at AI startup Goodfire, agrees. “This work aims at the right target and seems well executed,” he says.

Why models are so hard to understand

OpenAI’s work is part of a hot new field of research known as mechanistic interpretability, which is trying to map the internal mechanisms that models use when they carry out different tasks.

That’s harder than it sounds. LLMs are built from neural networks, which consist of nodes, called neurons, arranged in layers. In most networks, each neuron is connected to every other neuron in its adjacent layers. Such a network is known as a dense network.

Dense networks are relatively efficient to train and run, but they spread what they learn across a vast knot of connections. The result is that simple concepts or functions can be split up between neurons in different parts of a model. At the same time, specific neurons can also end up representing multiple different features, a phenomenon known as superposition (a term borrowed from quantum physics). The upshot is that you can’t relate specific parts of a model to specific concepts.

“Neural networks are big and complicated and tangled up and very difficult to understand,” says Dan Mossing, who leads the mechanistic interpretability team at OpenAI. “We’ve sort of said: ‘Okay, what if we tried to make that not the case?’”

Instead of building a model using a dense network, OpenAI started with a type of neural network known as a weight-sparse transformer, in which each neuron is connected to only a few other neurons. This forced the model to represent features in localized clusters rather than spread them out.
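
To make the contrast concrete, here is a minimal Python sketch of a single layer built both ways. It is an illustration only, not OpenAI’s method: the article does not say how the sparsity is enforced, so the random choice of which connections to keep, and the names make_layer, forward, count_connections, and keep_per_neuron, are all assumptions made for this example.

```python
import random

def make_layer(n_in, n_out, keep_per_neuron=None, seed=0):
    """Return an n_out x n_in weight matrix as a list of rows.

    keep_per_neuron=None gives a dense layer. An integer k keeps only k
    nonzero incoming weights per output neuron (hypothetical: the kept
    connections are picked at random purely for illustration).
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(n_out):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
        if keep_per_neuron is not None:
            kept = set(rng.sample(range(n_in), keep_per_neuron))
            row = [w if i in kept else 0.0 for i, w in enumerate(row)]
        weights.append(row)
    return weights

def forward(weights, inputs):
    """Apply one layer: a weighted sum per neuron, followed by a ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)))
            for row in weights]

def count_connections(weights):
    """Count the nonzero weights, i.e. the live connections."""
    return sum(1 for row in weights for w in row if w != 0.0)

dense = make_layer(n_in=8, n_out=4)
sparse = make_layer(n_in=8, n_out=4, keep_per_neuron=2)

print("dense connections: ", count_connections(dense))
print("sparse connections:", count_connections(sparse))
```

In this toy setup the dense layer wires every one of the 8 inputs to each of the 4 neurons, while the sparse layer leaves each neuron with just 2 incoming connections. With so few wires per neuron, it becomes feasible to trace which inputs a given neuron depends on.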

Their model is far slower than any LLM on the market. But it is easier to relate its neurons or groups of neurons to specific concepts and functions. “There’s a really drastic difference in how interpretable the model is,” says Gao.

Gao and his colleagues have tested the new model with very simple tasks. For example, they asked it to complete a block of text that opens with quotation marks by adding matching marks at the end.  

It’s a trivial request for an LLM. The point is that figuring out how a model does even a straightforward task like that involves unpicking a complicated tangle of neurons and connections, says Gao. But with the new model, they were able to follow the exact steps the model took.

“We actually found a circuit that’s exactly the algorithm you would think to implement by hand, but it’s fully learned by the model,” he says. “I think this is really cool and exciting.”
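
For a sense of what “the algorithm you would think to implement by hand” might look like, here is a hand-written Python sketch of the quote-matching task. It is not the circuit the researchers found, which the article does not describe in detail. The function name closing_quote and the restriction to straight single and double quotes are assumptions made for this example.

```python
def closing_quote(text):
    """Return the quote character needed to close `text`, or None.

    Scans left to right, remembering which kind of quote (if any) is
    currently open. A different quote character inside an open quote
    is treated as ordinary text.
    """
    open_quote = None
    for ch in text:
        if ch in ("'", '"'):
            if open_quote is None:
                open_quote = ch        # a quote has just been opened
            elif ch == open_quote:
                open_quote = None      # the open quote has been closed
    return open_quote

for sample in ['"hello world', "'hello world", '"all done"']:
    print(repr(sample), "->", repr(closing_quote(sample)))
```

The whole job comes down to two steps: remember which kind of quote was opened, then reproduce that same kind at the end.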

Where will the research go next? Grigsby is not convinced the technique would scale up to larger models that have to handle a variety of more difficult tasks.    

Gao and Mossing acknowledge that this is a big limitation of the model they have built so far and agree that the approach will never lead to models that match the performance of cutting-edge products like GPT-5. And yet OpenAI thinks it might be able to improve the technique enough to build a transparent model on a par with GPT-3, the firm’s breakthrough 2020 LLM.

“Maybe within a few years, we could have a fully interpretable GPT-3, so that you could go inside every single part of it and you could understand how it does every single thing,” says Gao. “If we had such a system, we would learn so much.”
