
The Download: how AI really works, and phasing out animal testing

2025-11-14 21:10:00

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

OpenAI’s new LLM exposes the secrets of how AI really works

The news: ChatGPT maker OpenAI has built an experimental large language model that is far easier to understand than typical models.

Why it matters: It’s a big deal, because today’s LLMs are black boxes: Nobody fully understands how they do what they do. Building a model that is more transparent sheds light on how LLMs work in general, helping researchers figure out why models hallucinate, why they go off the rails, and just how far we should trust them with critical tasks. Read the full story.

—Will Douglas Heaven

Google DeepMind is using Gemini to train agents inside Goat Simulator 3

Google DeepMind has built a new video-game-playing agent called SIMA 2 that can navigate and solve problems in 3D virtual worlds. The company claims it’s a big step toward more general-purpose agents and better real-world robots.   

The company first demoed SIMA (which stands for “scalable instructable multiworld agent”) last year. But this new version has been built on top of Gemini, the firm’s flagship large language model, which gives the agent a huge boost in capability. Read the full story.

—Will Douglas Heaven

These technologies could help put a stop to animal testing

Earlier this week, the UK’s science minister announced an ambitious plan: to phase out animal testing.

Testing potential skin irritants on animals will be stopped by the end of next year. By 2027, researchers are “expected to end” tests of the strength of Botox on mice. And drug tests in dogs and nonhuman primates will be reduced by 2030.

It’s good news for activists and scientists who don’t want to test on animals. And it’s timely too: In recent decades, we’ve seen dramatic advances in technologies that offer new ways to model the human body and test the effects of potential therapies, without experimenting on animals. Read the full story.

—Jessica Hamzelou

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Chinese hackers used Anthropic’s AI to conduct an espionage campaign   
It automated a number of attacks on corporations and governments in September. (WSJ $)
+ The AI was able to handle the majority of the hacking workload itself. (NYT $)
+ Cyberattacks by AI agents are coming. (MIT Technology Review)

2 Blue Origin successfully launched and landed its New Glenn rocket
It managed to deploy two NASA satellites into space without a hitch. (CNN)
+ The New Glenn is the company’s largest reusable rocket. (FT $)
+ The launch had been delayed twice before. (WP $)

3 Brace yourself for flu season
It started five weeks earlier than usual in the UK, and the US is next. (Ars Technica)
+ Here’s why we don’t have a cold vaccine. Yet. (MIT Technology Review)

4 Google is hosting a Border Protection facial recognition app    
The app tells officials whether to contact ICE about the people it identifies. (404 Media)
+ Another effort to track ICE raids was just taken offline. (MIT Technology Review)

5 OpenAI is trialing group chats in ChatGPT
It’d essentially make AI a participant in a conversation of up to 20 people. (Engadget)

6 A TikTok stunt sparked debate over how charitable America’s churches really are
Content creator Nikalie Monroe asked churches for help feeding her baby. Very few stepped up. (WP $)

7 Indian startups are attempting to tackle air pollution
But their solutions are far beyond the means of the average Indian household. (NYT $)
+ OpenAI is huge in India. Its models are steeped in caste bias. (MIT Technology Review)

8 An AI tool could help reduce wasted efforts to transplant organs
It predicts how likely the would-be recipient is to die during the brief transplantation window. (The Guardian)
+ Putin says organ transplants could grant immortality. Not quite. (MIT Technology Review)

9 3D printing isn’t making prosthetics more affordable
It turns out that plastic prostheses are often really uncomfortable. (IEEE Spectrum)
+ These prosthetics break the mold with third thumbs, spikes, and superhero skins. (MIT Technology Review)

10 What happens when relationships with AI fall apart
Can you really file for divorce from an LLM? (Wired $)
+ It’s surprisingly easy to stumble into a relationship with an AI chatbot. (MIT Technology Review)

Quote of the day

“It’s a funky time.”

—Aileen Lee, founder and managing partner of Cowboy Ventures, tells TechCrunch the AI boom has torn up the traditional investment rulebook.

One more thing

Restoring an ancient lake from the rubble of an unfinished airport in Mexico City

Weeks after Mexican President Andrés Manuel López Obrador took office in 2018, he controversially canceled ambitious plans to build an airport on the deserted site of the former Lake Texcoco—despite the fact it was already around a third complete.

Instead, he tasked Iñaki Echeverria, a Mexican architect and landscape designer, with turning it into a vast urban park, an artificial wetland that aims to transform the future of the entire Valley region.

But as López Obrador’s presidential term nears its end, the plans for Lake Texcoco’s rebirth could yet vanish. Read the full story.

—Matthew Ponsford

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Maybe Gen Z is onto something when it comes to vibe dating.
+ Trust AC/DC to give the fans what they want, performing Jailbreak for the first time since 1991.
+ Nieves González, the artist behind Lily Allen’s new album cover, has an eye for detail.
+ Here’s what AI determines is a catchy tune.

These technologies could help put a stop to animal testing

2025-11-14 18:00:00

Earlier this week, the UK’s science minister announced an ambitious plan: to phase out animal testing.

Testing potential skin irritants on animals will be stopped by the end of next year, according to a strategy released on Tuesday. By 2027, researchers are “expected to end” tests of the strength of Botox on mice. And drug tests in dogs and nonhuman primates will be reduced by 2030. 

The news follows similar moves by other countries. In April, the US Food and Drug Administration announced a plan to replace animal testing for monoclonal antibody therapies with “more effective, human-relevant models.” And, following a workshop in June 2024, the European Commission also began working on a “road map” to phase out animal testing for chemical safety assessments.

Animal welfare groups have been campaigning for commitments like these for decades. But a lack of alternatives has made it difficult to put a stop to animal testing. Advances in medical science and biotechnology are changing that.

Animals have been used in scientific research for thousands of years. Animal experimentation has led to many important discoveries about how the brains and bodies of animals work. And because regulators require drugs to be first tested in research animals, it has played an important role in the creation of medicines and devices for both humans and other animals.

Today, countries like the UK and the US regulate animal research and require scientists to hold multiple licenses and adhere to rules on animal housing and care. Still, millions of animals are used annually in research. Plenty of scientists don’t want to take part in animal testing. And some question whether animal research is justifiable—especially considering that around 95% of treatments that look promising in animals don’t make it to market.

In recent decades, we’ve seen dramatic advances in technologies that offer new ways to model the human body and test the effects of potential therapies, without experimenting on humans or other animals.

Take “organs on chips,” for example. Researchers have been creating miniature versions of human organs inside tiny plastic cases. These systems are designed to contain the same mix of cells you’d find in a full-grown organ and receive a supply of nutrients that keeps them alive.

Today, multiple teams have created models of livers, intestines, hearts, kidneys, and even the brain. And they are already being used in research. Heart chips have been sent into space to observe how they respond to low gravity. The FDA used lung chips to assess covid-19 vaccines. Gut chips are being used to study the effects of radiation.

Some researchers are even working to connect multiple chips to create a “body on a chip”—although this has been in the works for over a decade and no one has quite managed it yet.

In the same vein, others have been working on creating model versions of organs—and even embryos—in the lab. By growing groups of cells into tiny 3D structures, scientists can study how organs develop and work, and test drugs on them. These organoids can also be personalized: take cells from someone, and you should be able to model that person’s specific organs. Some researchers have even created organoids of developing fetuses.

The UK government strategy mentions the promise of artificial intelligence, too. Many scientists have been quick to adopt AI as a tool to make sense of vast databases and to find connections between genes, proteins, and disease, for example. Others are using AI to design all-new drugs.

Those new drugs could potentially be tested on virtual humans. Not flesh-and-blood people, but digital reconstructions that live in a computer. Biomedical engineers have already created digital twins of organs. In ongoing trials, digital hearts are being used to guide surgeons on how—and where—to operate on real hearts.

When I spoke to Natalia Trayanova, the biomedical engineering professor behind this trial, she told me that her model could recommend regions of heart tissue to be burned off as part of treatment for atrial fibrillation. Her tool would normally suggest two or three regions but occasionally would recommend many more. “They just have to trust us,” she told me.

It is unlikely that we’ll completely phase out animal testing by 2030. The UK government acknowledges that animal testing is still required by lots of regulators, including the FDA, the European Medicines Agency, and the World Health Organization. And while alternatives to animal testing have come a long way, none of them perfectly capture how a living body will respond to a treatment.

At least not yet. Given all the progress that has been made in recent years, it’s not too hard to imagine a future without animal testing.

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

OpenAI’s new LLM exposes the secrets of how AI really works

2025-11-14 02:00:00

ChatGPT maker OpenAI has built an experimental large language model that is far easier to understand than typical models.

That’s a big deal, because today’s LLMs are black boxes: Nobody fully understands how they do what they do. Building a model that is more transparent sheds light on how LLMs work in general, helping researchers figure out why models hallucinate, why they go off the rails, and just how far we should trust them with critical tasks.

“As these AI systems get more powerful, they’re going to get integrated more and more into very important domains,” Leo Gao, a research scientist at OpenAI, told MIT Technology Review in an exclusive preview of the new work. “It’s very important to make sure they’re safe.”

This is still early research. The new model, called a weight-sparse transformer, is far smaller and far less capable than top-tier mass-market models like the firm’s GPT-5, Anthropic’s Claude, and Google DeepMind’s Gemini. At most it’s as capable as GPT-1, a model that OpenAI developed back in 2018, says Gao (though he and his colleagues haven’t done a direct comparison).    

But the aim isn’t to compete with the best in class (at least, not yet). Instead, by looking at how this experimental model works, OpenAI hopes to learn about the hidden mechanisms inside those bigger and better versions of the technology.

It’s interesting research, says Elisenda Grigsby, a mathematician at Boston College who studies how LLMs work and who was not involved in the project: “I’m sure the methods it introduces will have a significant impact.” 

Lee Sharkey, a research scientist at AI startup Goodfire, agrees. “This work aims at the right target and seems well executed,” he says.

Why models are so hard to understand

OpenAI’s work is part of a hot new field of research known as mechanistic interpretability, which is trying to map the internal mechanisms that models use when they carry out different tasks.

That’s harder than it sounds. LLMs are built from neural networks, which consist of nodes, called neurons, arranged in layers. In most networks, each neuron is connected to every other neuron in its adjacent layers. Such a network is known as a dense network.

Dense networks are relatively efficient to train and run, but they spread what they learn across a vast knot of connections. The result is that simple concepts or functions can be split up between neurons in different parts of a model. At the same time, specific neurons can also end up representing multiple different features, a phenomenon known as superposition (a term borrowed from quantum physics). The upshot is that you can’t relate specific parts of a model to specific concepts.

“Neural networks are big and complicated and tangled up and very difficult to understand,” says Dan Mossing, who leads the mechanistic interpretability team at OpenAI. “We’ve sort of said: ‘Okay, what if we tried to make that not the case?’”

Instead of building a model using a dense network, OpenAI started with a type of neural network known as a weight-sparse transformer, in which each neuron is connected to only a few other neurons. This forced the model to represent features in localized clusters rather than spread them out.
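To make that concrete, here is a minimal PyTorch sketch of a weight-sparse layer. The masking scheme (a fixed set of k random connections per output neuron) is an illustrative assumption; OpenAI has not published the details of its architecture.

import torch
import torch.nn as nn

class WeightSparseLinear(nn.Module):
    # Each output neuron keeps only k incoming weights, so learned features
    # stay in localized clusters instead of smearing across the whole layer.
    # (Hypothetical sketch: the fixed random mask is my assumption, not
    # OpenAI's published design.)
    def __init__(self, in_features: int, out_features: int, k: int):
        super().__init__()
        self.weight = nn.Parameter(0.02 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        mask = torch.zeros(out_features, in_features)
        for row in mask:
            row[torch.randperm(in_features)[:k]] = 1.0  # k connections per neuron
        self.register_buffer("mask", mask)  # fixed, not trained

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Most connections are pinned to zero by the mask.
        return x @ (self.weight * self.mask).t() + self.bias

layer = WeightSparseLinear(in_features=16, out_features=8, k=3)
print(layer(torch.randn(2, 16)).shape)  # torch.Size([2, 8])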

Their model is far slower than any LLM on the market. But it is easier to relate its neurons or groups of neurons to specific concepts and functions. “There’s a really drastic difference in how interpretable the model is,” says Gao.

Gao and his colleagues have tested the new model with very simple tasks. For example, they asked it to complete a block of text that opens with quotation marks by adding matching marks at the end.  

It’s a trivial request for an LLM. The point is that figuring out how a model does even a straightforward task like that involves unpicking a complicated tangle of neurons and connections, says Gao. But with the new model, they were able to follow the exact steps the model took.

“We actually found a circuit that’s exactly the algorithm you would think to implement by hand, but it’s fully learned by the model,” he says. “I think this is really cool and exciting.”
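For reference, the by-hand version of that algorithm takes only a few lines. This sketch is an illustration of what such a circuit computes, not OpenAI’s published result.

def close_quotes(text: str) -> str:
    # Note which mark opened the text and append its matching closer.
    # The model learned an equivalent circuit on its own.
    closers = {'"': '"', "'": "'", '“': '”', '‘': '’'}
    if text and text[0] in closers:
        return text + closers[text[0]]
    return text

print(close_quotes('“To be, or not to be'))  # “To be, or not to be”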

Where will the research go next? Grigsby is not convinced the technique would scale up to larger models that have to handle a variety of more difficult tasks.    

Gao and Mossing acknowledge that this is a big limitation of the model they have built so far and agree that the approach will never lead to models that match the performance of cutting-edge products like GPT-5. And yet OpenAI thinks it might be able to improve the technique enough to build a transparent model on a par with GPT-3, the firm’s breakthrough 2020 LLM.

“Maybe within a few years, we could have a fully interpretable GPT-3, so that you could go inside every single part of it and you could understand how it does every single thing,” says Gao. “If we had such a system, we would learn so much.”

Google DeepMind is using Gemini to train agents inside Goat Simulator 3

2025-11-13 23:00:00

Google DeepMind has built a new video-game-playing agent called SIMA 2 that can navigate and solve problems in a wide range of 3D virtual worlds. The company claims it’s a big step toward more general-purpose agents and better real-world robots.   

Google DeepMind first demoed SIMA (which stands for “scalable instructable multiworld agent”) last year. But SIMA 2 has been built on top of Gemini, the firm’s flagship large language model, which gives the agent a huge boost in capability.

The researchers claim that SIMA 2 can carry out a range of more complex tasks inside virtual worlds, figure out how to solve certain challenges by itself, and chat with its users. It can also improve itself by tackling harder tasks multiple times and learning through trial and error.

“Games have been a driving force behind agent research for quite a while,” Joe Marino, a research scientist at Google DeepMind, said in a press conference this week. He noted that even a simple action in a game, such as lighting a lantern, can involve multiple steps: “It’s a really complex set of tasks you need to solve to progress.”

The ultimate aim is to develop next-generation agents that are able to follow instructions and carry out open-ended tasks inside more complex environments than a web browser. In the long run, Google DeepMind wants to use such agents to drive real-world robots. Marino claimed that the skills SIMA 2 has learned, such as navigating an environment, using tools, and collaborating with humans to solve problems, are essential building blocks for future robot companions.

Unlike previous work on game-playing agents such as AlphaGo, which beat a Go grandmaster in 2016, or AlphaStar, which beat 99.8% of ranked human players at the video game StarCraft II in 2019, the idea behind SIMA is to train an agent to play an open-ended game without preset goals. Instead, the agent learns to carry out instructions given to it by people.

Humans control SIMA 2 via text chat, by talking to it out loud, or by drawing on the game’s screen. The agent takes in a video game’s pixels frame by frame and figures out what actions it needs to take to carry out its tasks.
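In rough pseudocode, that perception-action loop looks something like the toy sketch below. Every class and method name here is hypothetical, since SIMA 2’s interfaces aren’t public.

import random
from dataclasses import dataclass

@dataclass
class Action:
    key: str               # keyboard press, e.g. "w"
    mouse: tuple = (0, 0)  # relative cursor movement

class ToyGame:
    # Stands in for a game engine; a real frame would be an image, not a counter.
    def __init__(self, horizon: int = 5):
        self.t, self.horizon = 0, horizon
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action: Action):
        self.t += 1
        return self.t, self.t >= self.horizon  # next frame, done flag

class ToyAgent:
    # A real agent would run a vision-language model over the pixels.
    def act(self, frame, instruction: str) -> Action:
        return Action(key=random.choice("wasd"))

game, agent = ToyGame(), ToyAgent()
frame, done = game.reset(), False
while not done:
    action = agent.act(frame, "chop down a tree")
    frame, done = game.step(action)
    print(f"frame {frame}: pressed {action.key}")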

Like its predecessor, SIMA 2 was trained on footage of humans playing eight commercial video games, including No Man’s Sky and Goat Simulator 3, as well as three virtual worlds created by the company. The agent learned to match keyboard and mouse inputs to actions.

Hooked up to Gemini, the researchers claim, SIMA 2 is far better at following instructions (asking questions and providing updates as it goes) and figuring out for itself how to perform certain more complex tasks.  

Google DeepMind tested the agent inside environments it had never seen before. In one set of experiments, researchers asked Genie 3, the latest version of the firm’s world model, to produce environments from scratch and dropped SIMA 2 into them. They found that the agent was able to navigate and carry out instructions there.

The researchers also used Gemini to generate new tasks for SIMA 2. If the agent failed, at first Gemini generated tips that SIMA 2 took on board when it tried again. Repeating a task multiple times in this way often allowed SIMA 2 to improve by trial and error until it succeeded, Marino said.
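Here is a minimal sketch of that feedback loop, with placeholder functions standing in for the agent and for Gemini’s critiques; none of these names are DeepMind APIs.

def practice(task: str, attempt, critique, max_tries: int = 5) -> bool:
    # Try the task; on each failure, fold a critic's tip into the next attempt.
    # attempt() and critique() are hypothetical stand-ins.
    tips: list[str] = []
    for _ in range(max_tries):
        if attempt(task, tips):
            return True  # success; the trajectory could be kept for training
        tips.append(critique(task, tips))
    return False

# Stub run: the "agent" succeeds once two tips have accumulated.
print(practice(
    task="light the lantern",
    attempt=lambda task, tips: len(tips) >= 2,
    critique=lambda task, tips: f"tip {len(tips) + 1}: try a different tool",
))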

Git gud

SIMA 2 is still an experiment. The agent struggles with complex tasks that require multiple steps and more time to complete. It also remembers only its most recent interactions (to make SIMA 2 more responsive, the team cut its long-term memory). It’s also still nowhere near as good as people at using a mouse and keyboard to interact with a virtual world.

Julian Togelius, an AI researcher at New York University who works on creativity and video games, thinks it’s an interesting result. Previous attempts at training a single system to play multiple games haven’t gone too well, he says. That’s because training models to control multiple games just by watching the screen isn’t easy: “Playing in real time from visual input only is ‘hard mode,’” he says.

In particular, Togelius calls out Gato, a previous system from Google DeepMind, which—despite being hyped at the time—could not transfer skills across a significant number of virtual environments.

Still, he is open-minded about whether or not SIMA 2 could lead to better robots. “The real world is both harder and easier than video games,” he says. It’s harder because you can’t just press A to open a door. At the same time, a robot in the real world will know exactly what its body can and can’t do at any time. That’s not the case in video games, where the rules inside each virtual world can differ.

Others are more skeptical. Matthew Guzdial, an AI researcher at the University of Alberta, isn’t too surprised that SIMA 2 can play many different video games. He notes that most games have very similar keyboard and mouse controls: Learn one and you learn them all. “If you put a game with weird input in front of it, I don’t think it’d be able to perform well,” he says.

Guzdial also questions how much of what SIMA 2 has learned would really carry over to robots. “It’s much harder to understand visuals from cameras in the real world compared to games, which are designed with easily parsable visuals for human players,” he says.

Still, Marino and his colleagues hope to continue their work with Genie 3 to allow the agent to improve inside a kind of endless virtual training dojo, where Genie generates worlds for SIMA to learn in via trial and error guided by Gemini’s feedback. “We’ve kind of just scratched the surface of what’s possible,” he said at the press conference.  

The Download: AI to measure pain, and how to deal with conspiracy theorists

2025-11-13 21:10:00

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

AI is changing how we quantify pain

Researchers around the world are racing to turn pain—medicine’s most subjective vital sign—into something a camera or sensor can score as reliably as blood pressure.

The push has already produced PainChek—a smartphone app that scans people’s faces for tiny muscle movements and uses artificial intelligence to output a pain score—which has been cleared by regulators on three continents and has logged more than 10 million pain assessments. Other startups are beginning to make similar inroads.

The way we assess pain may finally be shifting, but when algorithms measure our suffering, does that change the way we treat it? Read the full story.

—Deena Mousa

This story is from the latest print issue of MIT Technology Review magazine, which is full of fascinating stories about our bodies. If you haven’t already, subscribe now to receive future issues once they land.

How to help friends and family dig out of a conspiracy theory black hole

—Niall Firth 

Someone I know became a conspiracy theorist seemingly overnight.

It was during the pandemic. They suddenly started posting daily on Facebook about the dangers of covid vaccines and masks, warning of an attempt to control us.

As a science and technology journalist, I felt that my duty was to respond. I tried, but all I got was derision. Even now I still wonder: Are there things I could have done differently to talk them back down and help them see sense? 

I gave Sander van der Linden, professor of social psychology in society at the University of Cambridge, a call to ask: What would he advise if family members or friends show signs of having fallen down the rabbit hole? Read the full story.

This story is part of MIT Technology Review’s series “The New Conspiracy Age,” on how the present boom in conspiracy theories is reshaping science and technology. Check out the rest of the series here. It’s also part of our How To series, giving you practical advice to help you get things done. 

If you’re interested in hearing more about how to survive in the age of conspiracies, join our features editor Amanda Silverman and executive editor Niall Firth for a subscriber-exclusive Roundtable conversation with conspiracy expert Mike Rothschild. It’s at 1pm ET on Thursday November 20—register now to join us!

Google is still aiming for its “moonshot” 2030 energy goals

—Casey Crownhart

Last week, we hosted EmTech MIT, MIT Technology Review’s annual flagship conference in Cambridge, Massachusetts. As you might imagine, some of this climate reporter’s favorite moments came in the climate sessions. I was listening especially closely to my colleague James Temple’s discussion with Lucia Tian, head of advanced energy technologies at Google.

They spoke about the tech giant’s growing energy demand and the sorts of technologies the company is looking to in order to help meet it. In case you weren’t able to join us, let’s dig into that session and consider how the company is thinking about energy in the face of AI’s rapid rise. Read the full story.

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 ChatGPT is now “warmer and more conversational”
But it’s also slightly more willing to discuss sexual and violent content. (The Register)
+ ChatGPT has a very specific writing style. (WP $)
+ The looming crackdown on AI companionship. (MIT Technology Review)

2 The US could deny visas to visitors with obesity, cancer, or diabetes
As part of its ongoing efforts to stem the flow of people trying to enter the country. (WP $)

3 Microsoft is planning to create its own AI chip
And it’s going to use OpenAI’s internal chip-building plans to do it. (Bloomberg $)
+ The company is working on a colossal data center in Atlanta. (WSJ $)

4 Early AI agent adopters are convinced they’ll see a return on their investment soon 
Mind you, they would say that. (WSJ $)
+ An AI adoption riddle. (MIT Technology Review)

5 Waymo’s robotaxis are hitting American highways
Until now, they’ve typically gone out of their way to avoid them. (The Verge)
+ Its vehicles will now reach speeds of up to 65 miles per hour. (FT $)
+ Waymo is proving long-time detractor Elon Musk wrong. (Insider $)

6 A new Russian unit is hunting down Ukraine’s drone operators
It’s tasked with killing the pilots behind Ukraine’s successful attacks. (FT $)
+ US startup Anduril wants to build drones in the UAE. (Bloomberg $)
+ Meet the radio-obsessed civilian shaping Ukraine’s drone defense. (MIT Technology Review)

7 Anthropic’s Claude successfully controlled a robot dog
It’s important to know what AI models may do when given access to physical systems. (Wired $)

8 Grok briefly claimed Donald Trump won the 2020 US election
As reliable as ever, I see. (The Guardian)

9 The Northern Lights are playing havoc with satellites
Solar storms may look spectacular, but they make it harder to keep tabs on space. (NYT $)
+ Seriously though, they look amazing. (The Atlantic $)
+ NASA’s new AI model can predict when a solar storm may strike. (MIT Technology Review)

10 Apple users can now use digital versions of their passports
But it’s strictly for domestic flights within the US. (TechCrunch)

Quote of the day

“I hope this mistake will turn into an experience.”

—Vladimir Vitukhin, chief executive of the company behind Russia’s first anthropomorphic robot AIDOL, offers a philosophical response to the machine falling flat on its face during a reveal event, the New York Times reports.

One more thing

Welcome to the oldest part of the metaverse

Headlines treat the metaverse as a hazy dream yet to be built. But if it’s defined as a network of virtual worlds we can inhabit, its oldest corner has already been running for 25 years.

It’s a medieval fantasy kingdom created for the online role-playing game Ultima Online. It was the first to simulate an entire world: a vast, dynamic realm where players could interact with almost anything, from fruit on trees to books on shelves.

Ultima Online has already endured a quarter-century of market competition, economic turmoil, and political strife. So what can this game and its players tell us about creating the virtual worlds of the future? Read the full story.

—John-Clark Levin

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Unlikely duo Sting and Shaggy are starring together in a New York musical.
+ Barry Manilow was almost in Airplane!? That would be an entirely different kind of flying, altogether ✈
+ What makes someone sexy? Well, that depends.
+ Keep an eye on those pink dolphins, they’re notorious thieves.