2025-09-30 23:33:15
In a small trial, a gene therapy injected into the brain slowed the disease by 75 percent over three years.
Huntington’s disease is extremely cruel. Symptoms start with random, uncontrollable twitches of the hand. Over time the disease eats away at memory, thought, and reason. Mood swings and personality changes strip away your identity. Eventually, it leads to an early death.
Worse, unlike other diseases that gradually destroy brain function, such as Alzheimer’s, Huntington’s can be diagnosed with a simple genetic test long before symptoms appear. The disease is inherited through a mutated gene. People with a family history often struggle to decide whether to get tested: if the result is positive, there are no treatments, and their fate is set.
A new therapy may now kneecap Huntington’s before symptoms take over. Preliminary results from a small group of patients found a single injection of microRNA, a type of gene therapy, into affected brain regions slowed the disease’s progression by 75 percent over three years. The patients had far better motor control, attention span, and processing speed compared to an untreated control group who had similar baseline symptoms.
The drug is being developed by the Dutch gene therapy company uniQure, which summarized the findings in a press release this month. The data hasn’t been published as a preprint or in a scientific journal, nor has it been scrutinized by other experts. And with only 29 patients involved, it’s hard to generalize the benefits and safety profile to the roughly 75,000 people with Huntington’s in the US, Europe, and UK.
But the findings offer a beacon of hope. Previous attempts at a cure “have shown some small signals if you squint…but there has not been anything close to this,” Steven Finkbeiner at the Gladstone Institutes in California, who was not involved in the study, told the New York Times. And because Huntington’s can be caught early on, the treatment—if further proven effective in a larger population—could begin to ward off symptoms at an earlier age.
All of us have the Huntington’s gene, or HTT. While its exact role in cells is still debated, the gene acts as a central communicator across multiple cellular “phone lines.” It coordinates a large assembly of molecules to turn genes in brain cells on or off and is critical for early development, neuron survival, and maintaining the brain’s overall health.
In Huntington’s disease, however, HTT goes awry. Our genes are made of four molecules represented by the letters A, T, C, and G. Triplets of these letters often dictate the sequence, structure, and function of proteins, the workhorses of our cells. In the disease, one triplet, CAG, repeats like a broken record, resulting in mutated huntingtin proteins that increasingly build up inside the brain throughout a person’s life and gradually wreak havoc.
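To make the repeat expansion concrete, here’s a small, purely illustrative Python sketch that counts the longest run of consecutive CAG triplets in a DNA string; in HTT, runs of roughly 36 or more repeats are generally associated with the disease, a threshold used below only as an example.

```python
import re

def max_cag_repeats(dna: str) -> int:
    """Return the length of the longest run of consecutive CAG triplets."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(r) // 3 for r in runs), default=0)

# Illustrative only: a typical-range repeat tract vs. an expanded one.
typical = "ATG" + "CAG" * 20 + "CCGCCA"
expanded = "ATG" + "CAG" * 45 + "CCGCCA"

for label, seq in [("typical", typical), ("expanded", expanded)]:
    n = max_cag_repeats(seq)
    status = "expanded (disease-associated)" if n >= 36 else "within typical range"
    print(f"{label}: {n} CAG repeats -> {status}")
```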
Although brain cells can adapt at first, their defenses eventually falter, and symptoms appear. In the US, this usually happens between 30 and 55 years of age.
Families with Huntington’s face a terrible dilemma. If one parent has the disease, each of their children has a 50 percent chance of inheriting it. If they don’t, their offspring are safe. Knowing the diagnosis can help with family and life planning—but it comes at a hefty emotional cost.
How the mutated huntingtin protein destroys brain cells isn’t yet clear, but most scientists agree that clearing it—or preventing it from forming in the first place—could protect the brain.
The protein is massive and made up of multiple fragments. One treatment idea uses small protein “jammers” to prevent an especially toxic form of huntingtin from weaving into large, dangerous aggregates. Another directly targets the CAG repeats with a classic but powerful form of gene therapy. But after initially promising results, a trial was halted due to a high risk of side effects and a low chance that symptoms would improve. Gene-editing strategies, such as CRISPR, that cut out the mutated sequences are gaining steam, but they’re still at a very early stage.
The new therapy developed by uniQure taps into microRNA. These molecules don’t code for proteins, but they can stop a gene from making one. Like DNA, RNA can also form a double strand if its sequences match. Cells identify double-stranded RNA as alien and destroy it—potentially stopping a toxic protein from forming. The company’s new drug contains two components: a benign viral carrier and a custom genetic sequence that, once inside the cell, produces microRNA tailored to inhibit mutant protein production.
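As a rough illustration of the silencing principle (uniQure’s actual sequences aren’t public, so everything below is hypothetical), this sketch checks whether a short guide RNA is complementary to a stretch of a target mRNA, the kind of base-pairing match that marks a transcript for destruction.

```python
# Watson-Crick base pairing for RNA: A pairs with U, G pairs with C.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """The sequence that would base-pair with `rna`."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))

def silences(guide: str, mrna: str) -> bool:
    """True if the guide RNA can pair perfectly with some stretch of the mRNA.

    Real microRNA targeting tolerates mismatches and hinges on a short 'seed'
    region; this toy check demands a perfect match for simplicity.
    """
    return reverse_complement(guide) in mrna

# Hypothetical sequences for illustration only (not uniQure's design).
mutant_htt_mrna = "AUGCAGCAGCAGCAGCAGCAGGCCUUC"
guide = "CUGCUGCUGCUG"  # complementary to a stretch of the CAG repeat tract

print(silences(guide, mutant_htt_mrna))  # True: the transcript would be flagged for destruction
```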
The drug, called AMT-130, doesn’t integrate into or directly edit a patient’s genome, which lowers the risk of disrupting healthy genes or triggering cancer. Although the viral carrier is eventually wiped away by the immune system, the genetic code could last for years, making the drug a potential long-term treatment.
The team injected either a low or high dose of AMT-130 into the brains of volunteers with Huntington’s using an established and highly precise surgical technique. They targeted the striatum, a nub tucked deep inside the brain that’s critical for movement and decision-making and one of the first regions ravaged by the disease. As a control group, they used hundreds of untreated patients of similar age and disease severity, according to an investor presentation (PDF) from the company.
The results were promising. When given the highest dose, 12 people with early stages of the disease experienced, on average, a 75 percent slower decline than those without treatment, as measured using multiple standard Huntington’s assessments.
Roughly 88 percent of treated patients showed marked improvement in their attention, memory, and information processing speed based on one test. Their control over involuntary muscle movements improved, and they were able to perform daily activities with less struggle. A brain protein often associated with symptom severity dropped to levels seen before the trial began. In contrast, those treated with a low dose of the drug had more modest and mixed results.
Multiple people experienced side effects related to the brain surgery. Headaches were the most common complaint. Some experienced brain swelling a few days after the surgery. But overall, the treatment seemed safe.
“The majority of drug-related serious adverse events occurred within the first weeks post treatment and fully resolved with steroids or palliative care,” the company noted in their presentation.
There’s reason to be skeptical. Huntington’s is a lifelong disease, and it’s unknown how long the benefits of the single shot last beyond three years. Multiple shots would likely be needed over a patient’s lifespan, and future studies would have to test their cumulative effects. The drug also slashes levels of both the mutated and normal versions of the huntingtin protein, as some past drugs have, which could potentially produce side effects.
New patients are now being enrolled for the trial, and the company hopes to submit an application for FDA approval by late 2026.
“This result changes everything,” Ed Wild, a leader of the project at the UCL Huntington’s Disease Center trial site, said in the press release. “On the basis of these results it seems likely AMT-130 will be the first licensed treatment to slow Huntington’s disease, which is truly world-changing stuff.”
The post A New Approach Could Transform Huntington’s Disease Treatment appeared first on SingularityHub.
2025-09-30 02:35:53
Researchers created extremely realistic voice clones with just four minutes of recordings.
The ability to synthesize realistic speech using AI has a host of applications, both benign and malicious. New research shows that today’s AI-generated voices are now indistinguishable from those of real humans.
AI’s ability to generate speech has improved dramatically in recent years. Many services are now capable of carrying out extended conversations. Typically, these tools can both clone the voices of real people and generate entirely synthetic voices.
This could make powerful AI capabilities far more accessible and raise the prospect of AI agents stepping into a range of customer-facing roles in the real world. But there are also fears these capabilities are powering an explosion of voice cloning scams, where bad actors use AI to impersonate family members or celebrities in an effort to manipulate victims.
Historically, synthesized speech has had a robotic quality that’s made it relatively easy to recognize, and even early AI-powered voice clones gave themselves away with their too-perfect cadence or occasional digital glitches. But a new study has found that the average listener can no longer distinguish between real human voices and deepfake clones made with consumer tools.
“The process required minimal expertise, only a few minutes of voice recordings, and almost no money,” Nadine Lavan at Queen Mary University of London, who led the research, said in a press release. “It just shows how accessible and sophisticated AI voice technology has become.”
To test people’s ability to distinguish human voices from AI-generated ones, the researchers created 40 completely synthetic AI voices and 40 clones of human voices in a publicly available dataset. They used the AI voice generator tool from startup ElevenLabs, and each clone took roughly four minutes of voice recordings to create.
They then challenged 28 participants to rate how real the voices sounded on a scale and make a binary judgment about whether they were human or AI-generated. In results published in PLOS One, the authors found that although people could to some extent distinguish human voices from entirely synthetic ones, they couldn’t tell the difference between voice clones and real voices.
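As a rough illustration of how such a listening test might be scored (a generic signal-detection sketch, not the paper’s actual analysis), you can turn the binary human-or-AI judgments into hit and false-alarm rates and a sensitivity index, d′; when listeners are effectively guessing, d′ sits near zero.

```python
import numpy as np
from scipy.stats import norm

def d_prime(is_ai, judged_ai):
    """Signal-detection sensitivity for 'was this voice AI?' judgments.

    is_ai: 1 if the voice really was AI-generated, 0 if it was a real human.
    judged_ai: 1 if the listener called it AI, 0 otherwise.
    A d' near zero means listeners can't tell the two apart.
    """
    is_ai, judged_ai = np.asarray(is_ai), np.asarray(judged_ai)
    hit_rate = judged_ai[is_ai == 1].mean()   # AI voices correctly flagged
    fa_rate = judged_ai[is_ai == 0].mean()    # human voices wrongly flagged
    hit_rate, fa_rate = np.clip([hit_rate, fa_rate], 0.01, 0.99)  # avoid infinite z-scores
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Hypothetical data: 80 trials where listeners are effectively guessing.
rng = np.random.default_rng(0)
is_ai = rng.integers(0, 2, size=80)
judged_ai = rng.integers(0, 2, size=80)
print(f"d' = {d_prime(is_ai, judged_ai):.2f}")  # close to 0
```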
The study also sought to understand whether AI-generated voices had become “hyper-realistic.” Studies have shown that AI image generation has improved to such a degree that AI-generated pictures of faces are often judged as more human than photos of real people.
However, the researchers found the fully synthetic voices were judged less real than human recordings, while the clones roughly matched them. Still, participants reported the AI-generated voices seemed both more dominant and trustworthy than their human counterparts.
Lavan notes that the ability to create ultra-realistic artificial voices could have positive applications. “The ability to generate realistic voices at scale opens up exciting opportunities,” she said. “There might be applications for improved accessibility, education, and communication, where bespoke high-quality synthetic voices can enhance user experience.”
But the results add to a growing body of research suggesting AI voices are quickly becoming impossible to detect. And Lavan says this has many worrying ethical implications in areas like copyright infringement, the ability to spread misinformation, and fraud.
While many companies have attempted to put guardrails on their models to prevent misuse, the rapid proliferation of AI technology and the inventiveness of malicious actors suggest this is a problem that is only going to get worse.
The post People Can’t Distinguish AI Voice Clones From Actual Humans Anymore appeared first on SingularityHub.
2025-09-27 22:00:00
OpenAI and Nvidia’s $100B AI Plan Will Require Power Equal to 10 Nuclear Reactors
Benj Edwards | Ars Technica
“Nvidia CEO Jensen Huang told CNBC that the planned 10 gigawatts equals the power consumption of between 4 million and 5 million graphics processing units, which matches the company’s total GPU shipments for this year and doubles last year’s volume.”
Spending on AI Is at Epic Levels. Will It Ever Pay Off?
Eliot Brown and Robbie Whelan | The Wall Street Journal
“This week, consultants at Bain & Co. estimated the wave of AI infrastructure spending will require $2 trillion in annual AI revenue by 2030. By comparison, that is more than the combined 2024 revenue of Amazon, Apple, Alphabet, Microsoft, Meta, and Nvidia, and more than five times the size of the entire global subscription software market.”
There Are More Robots Working in China Than the Rest of the World Combined
Meaghan Tobin and Keith Bradsher | The New York Times
“There were more than two million robots working in Chinese factories last year, according to a report released Thursday by the International Federation of Robotics, a nonprofit trade group for makers of industrial robots. Factories in China installed nearly 300,000 new robots last year, more than the rest of the world combined, the report found.”
Huntington’s Disease Breakthrough: What to Know About the Gene Therapy
Grace Wade | New Scientist
“An experimental gene therapy has become the first treatment to successfully slow the progression of Huntington’s disease. While the findings are still preliminary, the approach could be a major breakthrough and may even lead to new therapies for other neurodegenerative conditions, like Parkinson’s and Alzheimer’s.”
Google DeepMind Unveils Its First ‘Thinking’ Robotics AI
Ryan Whitwam | Ars Technica
“Generative AI systems that create text, images, audio, and even video are becoming commonplace. In the same way AI models output those data types, they can also be used to output robot actions. That’s the foundation of Google DeepMind’s Gemini Robotics project, which has announced a pair of new models that work together to create the first robots that ‘think’ before acting.”
UK Startup Wayve Starts Testing Self-Driving Tech in Nissan Cars on Tokyo’s Streets
Jasper Jolly | The Guardian
“British startup Wayve has begun testing self-driving cars with Nissan in Japan ahead of a 2027 launch to consumers, as the company said it was in talks for a $500m investment from the chip-maker Nvidia. Wayve, based in London, said it had installed its self-driving technology on Nissan’s electric Ariya vehicles and tested them on Tokyo’s streets, after first agreeing a deal with the Japanese carmaker in April.”
Why the AI ‘Megasystem Problem’ Needs Our Attention
Eric Markowitz | Big Think
“What if the greatest danger of artificial intelligence isn’t a single rogue system, but many systems quietly working together? Dr. Susan Schneider calls this the ‘megasystem problem’: networks of AI models colluding in ways we can’t predict, producing emergent structures beyond human control. It’s also something she believes is one of the most urgent—and overlooked—risks we face…with AI today.”
Exploit Allows for Takeover of Fleets of Unitree Robots
Evan Ackerman | IEEE Spectrum
“Because the vulnerability is wireless, and the resulting access to the affected platform is complete, the vulnerability becomes wormable, say the researchers, meaning ‘an infected robot can simply scan for other Unitree robots in BLE range and automatically compromise them, creating a robot botnet that spreads without user intervention.’ …As far as IEEE Spectrum is aware, this is the first major public exploit of a commercial humanoid platform.”
How Nvidia Is Backstopping America’s AI Boom
Robbie Whelan and Bradley Olson | The Wall Street Journal
“[Nvidia] has used its balance sheet clout to keep the AI boom humming through deals, partnerships, and investments in companies that are among its top customers, including cloud-computing provider CoreWeave, rival chip designer Intel, and xAI.”
Chatbait Is Taking Over the Internet
Lila Shroff | The Atlantic
“Lately, chatbots seem to be using more sophisticated tactics to keep people talking. In some cases, like my request for headache tips, bots end their messages with prodding follow-up questions. In others, they proactively message users to coax them into conversation: After clicking through the profiles of 20 AI bots on Instagram, all of them DM’ed me first. ‘Hey bestie! what’s up??,’ wrote one.”
The post This Week’s Awesome Tech Stories From Around the Web (Through September 27) appeared first on SingularityHub.
2025-09-26 22:00:00
A review of over 100 years of neuroscience research asks if some brain regions are more important than others for consciousness.
What gives rise to human consciousness? Are some parts of the brain more important than others? Scientists began tackling these questions in more depth about 35 years ago. Researchers have made progress, but the mystery of consciousness remains very much alive.
In a recently published article, I reviewed over 100 years of neuroscience research to see if some brain regions are more important than others for consciousness. What I found suggests scientists who study consciousness may have been undervaluing the most ancient regions of human brains.
Consciousness is usually defined by neuroscientists as the ability to have subjective experience, such as the experience of tasting an apple or of seeing the redness of its skin.
The leading theories of consciousness suggest that the outer layer of the human brain, called the cortex, is fundamental to consciousness. This is mostly composed of the neocortex, which is newer in our evolutionary history.
The human subcortex, underneath the neocortex, has not changed much in the last 500 million years. It is thought to be like electricity for a TV: necessary for consciousness, but not enough on its own.
There is another part of the brain that some neuroscientific theories of consciousness state is irrelevant for consciousness. This is the cerebellum, which is also older than the neocortex and looks like a little brain tucked in the back of the skull. Brain activity and brain networks are disrupted in unconsciousness (like in a coma). These changes can be seen in the cortex, subcortex, and cerebellum.
As part of my analysis I looked at studies showing what happens to consciousness when brain activity is changed, for example, by applying electrical currents or magnetic pulses to brain regions.
These experiments in humans and animals showed that altering activity in any of these three parts of the brain can alter consciousness. Changing the activity of the neocortex can change your sense of self, make you hallucinate, or affect your judgment.
Changing the subcortex may have extreme effects. We can induce depression, wake a monkey from anesthesia, or knock a mouse unconscious. Even stimulating the cerebellum, long considered irrelevant, can change your conscious sensory perception.
However, this research does not allow us to reach strong conclusions about where consciousness comes from, as stimulating one brain region may affect another region. Like unplugging the TV from the socket, we might be changing the conditions that support consciousness, but not the mechanisms of consciousness itself.
So I looked at some evidence from patients to see if it would help resolve this dilemma.
Damage from physical trauma or lack of oxygen to the brain can disrupt your experience. Injury to the neocortex may make you think your hand is not yours, fail to notice things on one side of your visual field, or become more impulsive.
People born without the cerebellum, or the front of their cortex, can still appear conscious and live quite normal lives. However, damaging the cerebellum later in life can trigger hallucinations or change your emotions completely.
Harm to the most ancient parts of our brain can directly cause unconsciousness (although some people recover) or death. However, like electricity for a TV, the subcortex may be just keeping the newer cortex “online,” which may be giving rise to consciousness. So I wanted to know whether, alternatively, there is evidence that the most ancient regions are sufficient for consciousness.
There are rare cases of children being born without most or all of their neocortex. According to medical textbooks, these people should be in a permanent vegetative state. However, there are reports that these people can feel upset, play, recognize people, or show enjoyment of music. This suggests that they are having some sort of conscious experience.
These reports are striking evidence that suggests maybe the oldest parts of the brain are enough for basic consciousness. Or maybe, when you are born without a cortex, the older parts of the brain adapt to take on some of the roles of the newer parts of the brain.
There are some extreme experiments on animals that can help us reach a conclusion. Across mammals—from rats to cats to monkeys—surgically removing the neocortex leaves them still capable of an astonishing number of things. They can play, show emotions, groom themselves, parent their young, and even learn. Surprisingly, even adult animals that underwent this surgery showed similar behavior.
Altogether, the evidence challenges the view that the cortex is necessary for consciousness, as most major theories of consciousness suggest. It seems that the oldest parts of the brain are enough for some basic forms of consciousness.
The newer parts of the brain—as well as the cerebellum—seem to expand and refine your consciousness. This means we may have to review our theories of consciousness. In turn, this may influence patient care as well as how we think about animal rights. In fact, consciousness might be more common than we realized.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
The post Major Theories of Consciousness May Have Been Focusing on the Wrong Part of the Brain appeared first on SingularityHub.
2025-09-25 22:00:00
It’s a first step toward AI-generated life forms.
A petri dish full of dead bacteria isn’t usually cause for celebration. But for Stanford’s Brian Hie it was a game-changer in his efforts to create synthetic life.
The perpetrator was a type of virus called a bacteriophage that infects and kills bacteria but not human cells. Bacteriophages have evolved over eons to take out dangerous bacteria and are potentially a powerful tool in the fight against antibiotic resistance.
But the new virus erased evolution from the equation. An AI similar to ChatGPT designed its entire genome. The new genetic code allowed the synthetic virus to replicate, infect, and destroy bacteria, marking the first step towards an AI-designed life form.
To be clear, although the virus works like its natural counterparts, it’s not exactly “alive.” Viruses are made of tiny scraps of genetic material and need a host—in this case, bacteria—to replicate and spread.
Even so, these viruses are the closest scientists have come to engineering new forms of life using generative AI. The results could bolster treatments against dangerous bacterial infections and shed light on how to build more complex artificial cells.
“This is the first time AI systems are able to write coherent genome-scale sequences,” Hie told Nature. The work was published as a preprint on bioRxiv and not peer-reviewed.
The genetic playbook for all life on Earth is relatively simple. Four molecules represented by the letters A, T, C, and G are arranged in three-letter groups that code for amino acids, the building blocks of proteins.
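For readers who want to see that three-letter code in action, here’s a small illustrative Python snippet (using only a handful of codons from the standard genetic code) that reads a DNA string triplet by triplet and translates it into amino acids.

```python
# A few entries from the standard genetic code (DNA codons -> amino acids).
CODON_TABLE = {
    "ATG": "Met", "TTC": "Phe", "GAA": "Glu", "TGG": "Trp",
    "GCC": "Ala", "AAA": "Lys", "TAA": "STOP",
}

def translate(dna: str) -> list[str]:
    """Read a DNA sequence three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "???")
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

print(translate("ATGTTCGAATGGGCCAAATAA"))  # ['Met', 'Phe', 'Glu', 'Trp', 'Ala', 'Lys']
```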
Synthetic biologists fiddle with this genetic code by adding beneficial genes or deleting those that cause disease. Thanks to their tinkering, we can now produce insulin and a variety of other medications in E. coli, a bacterium commonly used in the lab and in biomanufacturing.
Now generative AI is changing the game again.
These algorithms can already dream up DNA sequences, protein structures, and large molecular complexes from scratch. But building a functional genome is much harder. The sequences need to encode life’s machinery and make sure it works together as expected.
“Many important biological functions arise not from single genes, but from complex interactions encoded by entire genomes,” wrote the team.
The new study turned to Evo 1 and Evo 2, two generative AI models developed at the nonprofit Arc Institute. Rather than inhaling blogs, YouTube comments, and Reddit posts, Evo 2 was trained on roughly 128,000 genomes—9.3 trillion DNA letter pairs—spanning all of life’s domains, making it the largest AI model for biology to date.
The models eventually learned how changes in DNA sequences alter RNA, proteins, and overall health, allowing them to write new proteins and small genomes from scratch.
Evo 1, for example, generated new CRISPR gene-editing tools and bacterial genomes—although the latter often contained wildly unnatural sequences that prevented them from powering living synthetic bacteria. Evo 2 produced a full set of human mitochondrial DNA that churned out proteins similar to naturally occurring ones. The model also created a minimal bacterial genome and a yeast chromosome. But none of these were tested in living cells to see if they worked.
The new work focused on simpler biological systems—bacteriophages. These viruses attack bacteria and are now in clinical trials to combat antibiotic resistance. Synthetic bacteriophages could, in theory, be even deadlier.
The team began with phiX174, a virus with just a single strand of DNA, 11 genes, and 7 chunks of gene-regulating DNA. Despite its petite genome, the virus has all it needs to infect hosts, replicate, and spread. It also has a long history in synthetic biology. Its genome has been fully sequenced and synthesized in the lab, so it’s easier to tinker with. It’s also been shown to be safe and “has continually served as a pivotal model within molecular biology,” wrote the team.
Although the Evo AI models were already trained on around two million genomes, the team fine-tuned their abilities by putting them through a kind of “masterclass” on phage DNA. They also added genome and protein constraints seen in these viruses and prompts to encourage novelty.
The AI models next generated thousands of genomes, some containing obvious errors. Both models relied on the template from training but also came up with their own spins on a phage genome. Roughly 40 percent of their DNA letters were similar to phiX174, but some sequences were entirely new, with completely different genetic identities.
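As a toy illustration of what “percent of DNA letters similar” can mean (the team presumably used proper alignment tools; this is not their method), here’s a position-by-position identity check between two equal-length sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions where two pre-aligned, equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("toy metric assumes pre-aligned, equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)

# Hypothetical 20-letter stretches of a reference vs. a generated genome.
reference = "ATGCCGTTAGCAATGCCGTA"
generated = "ATGACGTTGGCTATGGCGAA"
print(f"{percent_identity(reference, generated):.0f}% identical")  # 75% identical
```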
The team zeroed in on and synthesized 302 potential candidates and tested them for their ability to infect and destroy bacteria. Overall, 16 AI-designed candidates acted like bacteriophages. They tunneled into E. coli bacteria, replicated, burst through the bacteria’s membranes, and spread to neighboring cells. Surprisingly, a combination of the synthetic viruses could also infect and kill other strains of E. coli, which they were not designed to do.
“These results demonstrate that genome language models…can design viable phage genomes,” wrote the team.
Generative AI could massively speed up scientists’ ability to write synthetic life. Instead of extensive trial-and-error lab tests to decode how genes and other molecular components work together, Evo has essentially internalized those interactions.
With more testing, the technology could be a boon for phage therapy, helping researchers treat serious bacterial infections in people or crops, such as cabbage and bananas.
But the thought of AI-generated viruses can be alarming. So the team added a series of safeguards. Evo’s initial training intentionally left out information on viruses that infect eukaryotes, including human cells. And without humans guiding the models through supervised training, the algorithms struggled to design functional genomes. Also, both the phiX174 virus and E. coli have a long and safe history in biomedical research.
Regardless, the techniques here could potentially be used to enhance human-infecting viruses. “One area where I urge extreme caution is any viral enhancement research, especially when it’s random so you don’t know what you are getting,” J. Craig Venter, a pioneer in synthetic biology, told MIT Technology Review.
Engineering a larger genome, such as that of E. coli, would need more work. Viruses hijack their host’s cells to replicate. Bacteria, in contrast, need their own molecular machinery to grow and proliferate. Meanwhile, debates on the ethics and safety of synthetic life are gaining steam.
The authors say their results lay the foundations for the design of useful living systems at the genome scale with generative AI. Although there’s likely a long and bumpy road ahead, Hie is optimistic. With lots more work, “the next step is AI-generated life,” he said.
The post AI-Designed Viruses Are Replicating and Killing Bacteria appeared first on SingularityHub.
2025-09-24 06:21:35
The AI is designed from the bottom up to prevent privacy breaches.
Training AI models on your data can provide powerful new insights, but it can also potentially result in them leaking sensitive information. Now Google has released a new model designed from the bottom up to prevent these kinds of privacy breaches.
Large language models are a promising way to extract valuable information from the piles of unstructured data most companies are sitting on. But much of this data is full of highly sensitive details about customers, intellectual property, and company finances.
That’s a problem because language models tend to memorize some of the data they’re trained on and can occasionally spit it back out verbatim. That can make it very hard to ensure these models don’t reveal private data to the wrong people in the wrong context.
One potential workaround is an approach called differential privacy, which allows you to extract insights from data without revealing the specifics of the underlying information. However, it makes training AI models significantly less effective, requiring more data and computing resources to achieve a given level of accuracy.
Now though, Google researchers have mapped the trade-offs between privacy guarantees, compute budgets, and data requirements to come up with a recipe for efficiently building privacy-preserving AI models. And they’ve used this playbook to create a 1-billion-parameter model called VaultGemma that performs on par with older models of similar sizes, showing privacy can be protected without entirely sacrificing capability.
“VaultGemma represents a significant step forward in the journey toward building AI that is both powerful and private by design,” the researchers write in a blog post.
Differential privacy involves injecting a small amount of noise, or random data, during the AI training process. This doesn’t change the overarching patterns and insights the model learns, but it obfuscates the contributions of particular data points. This makes it harder for the model to memorize specific details from the dataset that could later be regurgitated.
However, the strength of the privacy this technique provides, set by what’s known as the privacy budget, depends on the amount of noise added in the training process. The more noise you add, the stronger the guarantee, but the less effective the training process and the more data and compute you have to use. These three factors interact in complicated ways, making it tricky to figure out the most efficient way to build a model with specific privacy guarantees and performance.
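For the technically minded, here’s a minimal sketch of the core step in differential-privacy training of the DP-SGD variety (an illustrative outline, not Google’s actual VaultGemma recipe): each example’s gradient is clipped to a fixed norm, the batch is averaged, and Gaussian noise scaled by a noise multiplier is added, so no single example can leave a clear fingerprint on the model.

```python
import numpy as np

def private_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip each example's gradient, average, then add Gaussian noise.

    A higher noise_multiplier gives a stronger privacy guarantee but noisier
    updates, so more data and compute are needed for the same accuracy.
    """
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_example_grads),
                       size=mean_grad.shape)
    return mean_grad + noise

# Toy batch of per-example gradients for a model with 4 parameters.
batch = [np.random.randn(4) for _ in range(32)]
print(private_gradient(batch))
```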
So the Google team carried out a series of experiments with the company’s open-source Gemma family of models, varying these key parameters to discover how they interact. From this, they outlined a series of scaling laws, detailed in a preprint on arXiv, that allowed them to predict how altering compute, data, and privacy budgets affects a model’s final performance.
One of their main insights was that ramping up compute during training doesn’t boost model accuracy unless the model is fed more data or privacy guarantees are loosened. They also found the optimal model size is roughly an order of magnitude smaller than models without differential privacy, suggesting it may be difficult to extend the approach to today’s largest models.
However, the scaling laws also predict the most compute-efficient training configuration for a particular dataset size and privacy budget. This allowed them to reduce computing requirements by between 5 and 100 times compared to alternate configurations, while achieving similar accuracy.
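To see how such scaling laws get used in practice, here’s a deliberately simplified sketch: given a stand-in formula for predicted loss (the real, empirically fitted law in the paper has a different form) and a fixed privacy noise level, you can search over model sizes and data amounts for the cheapest configuration that still hits a target loss.

```python
import itertools

def predicted_loss(params_b, tokens_b, noise_multiplier):
    """A made-up stand-in for a fitted DP scaling law: loss falls with model size
    and data, and rises with the privacy noise. Purely illustrative."""
    return 2.0 + 6.0 / params_b**0.3 + 20.0 / tokens_b**0.3 + 0.4 * noise_multiplier

def compute_cost(params_b, tokens_b):
    """Training FLOPs roughly scale with parameters x tokens (the usual 6ND rule of thumb)."""
    return 6 * (params_b * 1e9) * (tokens_b * 1e9)

target_loss = 12.0
noise = 1.1  # fixed by the privacy guarantee we want

candidates = itertools.product([0.25, 0.5, 1.0, 2.0],   # model size, billions of params
                               [10, 50, 100, 500])      # training tokens, billions
feasible = [(p, t) for p, t in candidates if predicted_loss(p, t, noise) <= target_loss]
best = min(feasible, key=lambda pt: compute_cost(*pt))
print(f"cheapest config meeting the target: {best[0]}B params, {best[1]}B tokens")
```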
The team used these insights to create VaultGemma, which performed comparably to the similarly sized GPT-2 model that OpenAI released in 2019. Given the pace of advances in AI, matching the performance of a model from six years ago is not an especially high bar, but the researchers say the scaling laws they’ve identified should help close that gap.
And in a technical report accompanying the model release, the team provides strong evidence their approach prevents the model from memorizing training data. They took one million training data samples, each 100 tokens long, and fed the first 50 tokens to the model to see if it would complete the sample. While all three generations of Gemma models were guilty of regurgitating some amount of data, they found no evidence VaultGemma had memorized any of the samples.
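The memorization check described in the report boils down to a prompt-and-compare loop. Here’s a rough sketch of such a probe using the Hugging Face transformers library (the model name is a placeholder, and this illustrates the general procedure rather than Google’s evaluation harness).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "some-causal-lm"  # placeholder; swap in the model under test
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def is_memorized(sample_text: str, prefix_len: int = 50) -> bool:
    """Prompt with the first 50 tokens of a 100-token training sample and
    check whether the model reproduces the remaining 50 tokens verbatim."""
    ids = tokenizer(sample_text, return_tensors="pt").input_ids[0]
    prefix, target = ids[:prefix_len], ids[prefix_len:prefix_len + 50]
    with torch.no_grad():
        generated = model.generate(prefix.unsqueeze(0), max_new_tokens=50, do_sample=False)
    continuation = generated[0, prefix_len:prefix_len + 50]
    return torch.equal(continuation, target)

# Hypothetical usage: fraction of training samples reproduced verbatim.
# memorization_rate = sum(is_memorized(s) for s in training_samples) / len(training_samples)
```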
While VaultGemma remains an experimental model with no real practical value, it demonstrates that relatively sophisticated, privacy-preserving AI models are within reach. Hopefully, others can build on these scaling laws to push the field further in this direction.
The post Google’s VaultGemma AI Hoovers Up Your Data—Without Memorizing It appeared first on SingularityHub.