Science
When A.I.’s Output Is a Threat to A.I. Itself
The internet is becoming awash in words and images generated by artificial intelligence.
Sam Altman, OpenAI’s chief executive, wrote in February that the company generated about 100 billion words per day — a million novels’ worth of text, every day, an unknown share of which finds its way onto the internet.
A.I.-generated text may show up as a restaurant review, a dating profile or a social media post. And it may show up as a news article, too: NewsGuard, a group that tracks online misinformation, recently identified over a thousand websites that churn out error-prone A.I.-generated news articles.
In reality, with no foolproof methods to detect this kind of content, much will simply remain undetected.
All this A.I.-generated information can make it harder for us to know what’s real. And it also poses a problem for A.I. companies. As they trawl the web for new data to train their next models on — an increasingly challenging task — they’re likely to ingest some of their own A.I.-generated content, creating an unintentional feedback loop in which what was once the output from one A.I. becomes the input for another.
In the long run, this cycle may pose a threat to A.I. itself. Research has shown that when generative A.I. is trained on a lot of its own output, it can get a lot worse.
Here’s a simple illustration of what happens when an A.I. system is trained on its own output, over and over again:
This is part of a data set of 60,000 handwritten digits.
When we trained an A.I. to mimic those digits, its output looked like this.
This new set was made by an A.I. trained on the previous A.I.-generated digits. What happens if this process continues? After 20 generations of training new A.I.s on their predecessors’ output, the digits blur and start to erode.
After 30 generations, they converge into a single shape.
While this is a simplified example, it illustrates a problem on the horizon.
Imagine a medical-advice chatbot that lists fewer diseases that match your symptoms, because it was trained on a narrower spectrum of medical knowledge generated by previous chatbots. Or an A.I. history tutor that ingests A.I.-generated propaganda and can no longer separate fact from fiction.
Just as a copy of a copy can drift away from the original, when generative A.I. is trained on its own content, its output can also drift away from reality, growing further apart from the original data that it was intended to imitate.
In a paper published last month in the journal Nature, a group of researchers in Britain and Canada showed how this process results in a narrower range of A.I. output over time — an early stage of what they called “model collapse.”
The eroding digits we just saw show this collapse. When untethered from human input, the A.I. output dropped in quality (the digits became blurry) and in diversity (they grew similar).
How an A.I. that draws digits “collapses” after being trained on its own output
If only some of the training data were A.I.-generated, the decline would be slower or more subtle. But it would still occur, researchers say, unless the synthetic data was complemented with a lot of new, real data.
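The effect of mixing in real data can be sketched with a toy simulation. This is not the researchers' experiment: the "model" below is just a bell curve refit each generation, and the function name `final_spread`, the sample size and the generation count are all assumptions chosen for illustration.

```python
import random
import statistics


def final_spread(real_fraction, generations=300, sample_size=10, seed=0):
    """Refit a Gaussian 'model' each generation on a mix of real and synthetic samples.

    Returns the fitted standard deviation after the last generation.
    All parameters are illustrative assumptions, not values from the research.
    """
    rng = random.Random(seed)
    n_real = round(real_fraction * sample_size)
    n_synthetic = sample_size - n_real

    # Generation 0 is trained on real data only (a standard normal here).
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    mean, spread = statistics.fmean(data), statistics.pstdev(data)

    for _ in range(generations):
        # Output of the previous generation's model.
        synthetic = [rng.gauss(mean, spread) for _ in range(n_synthetic)]
        # Fresh real data drawn from the original distribution.
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        data = synthetic + real
        mean, spread = statistics.fmean(data), statistics.pstdev(data)
    return spread


if __name__ == "__main__":
    for fraction in (0.0, 0.2, 0.5):
        print(f"real fraction {fraction:.1f}: final spread {final_spread(fraction):.3g}")
```

With no real data, the fitted spread in this toy shrinks toward zero; with a steady supply of fresh real samples, it stays in a usable range. Real generative models are far more complex, so this shows only the direction of the effect.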
Degenerative A.I.
In one example, the researchers trained a large language model on its own sentences over and over again, asking it to complete the same prompt after each round.
When they asked the A.I. to complete a sentence that started with “To cook a turkey for Thanksgiving, you…,” at first, it responded like this:
Even at the outset, the A.I. “hallucinates.” But when the researchers further trained it on its own sentences, it got a lot worse…
An example of text generated by an A.I. model.
After two generations, it started simply printing long lists.
An example of text generated by an A.I. model after being trained on its own sentences for 2 generations.
And after four generations, it began to repeat phrases incoherently.
An example of text generated by an A.I. model after being trained on its own sentences for 4 generations.
“The model becomes poisoned with its own projection of reality,” the researchers wrote of this phenomenon.
This problem isn’t just confined to text. Another team of researchers at Rice University studied what would happen when the kinds of A.I. that generate images are repeatedly trained on their own output — a problem that could already be occurring as A.I.-generated images flood the web.
They found that glitches and image artifacts started to build up in the A.I.’s output, eventually producing distorted images with wrinkled patterns and mangled fingers.
When A.I. image models are trained on their own output, they can produce distorted images, mangled fingers or strange patterns.
A.I.-generated images by Sina Alemohammad and others.
“You’re kind of drifting into parts of the space that are like a no-fly zone,” said Richard Baraniuk, a professor who led the research on A.I. image models.
The researchers found that the only way to stave off this problem was to ensure that the A.I. was also trained on a sufficient supply of new, real data.
While selfies are certainly not in short supply on the internet, there could be categories of images where A.I. output outnumbers genuine data, they said.
For example, A.I.-generated images in the style of van Gogh could outnumber actual photographs of van Gogh paintings in A.I.’s training data, and this may lead to errors and distortions down the road. (Early signs of this problem will be hard to detect because the leading A.I. models are closed to outside scrutiny, the researchers said.)
Why collapse happens
All of these problems arise because A.I.-generated data is often a poor substitute for the real thing.
This is sometimes easy to see, like when chatbots state absurd facts or when A.I.-generated hands have too many fingers.
But the differences that lead to model collapse aren’t necessarily obvious — and they can be difficult to detect.
When generative A.I. is “trained” on vast amounts of data, what’s really happening under the hood is that it is assembling a statistical distribution — a set of probabilities that predicts the next word in a sentence, or the pixels in a picture.
For example, when we trained an A.I. to imitate handwritten digits, its output could be arranged into a statistical distribution that looks like this:
Distribution of A.I.-generated data
Examples of initial A.I. output:
The distribution shown here is simplified for clarity.
The peak of this bell-shaped curve represents the most probable A.I. output — in this case, the most typical A.I.-generated digits. The tail ends describe output that is less common.
Notice that when the model was trained on human data, it had a healthy spread of possible outputs, which you can see in the width of the curve above.
But after it was trained on its own output, this is what happened to the curve:
Distribution of A.I.-generated data when trained on its own output
It gets taller and narrower. As a result, the model becomes more and more likely to produce a smaller range of output, and the output can drift away from the original data.
Meanwhile, the tail ends of the curve — which contain the rare, unusual or surprising outcomes — fade away.
This is a telltale sign of model collapse: Rare data becomes even rarer.
If this process went unchecked, the curve would eventually become a spike:
Distribution of A.I.-generated data when trained on its own output
This was when all of the digits became identical, and the model completely collapsed.
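The way rare outcomes fade can also be sketched without a neural network. The toy below is an assumption-laden stand-in: a "vocabulary" of 100 hypothetical words with long-tailed frequencies, where each generation "trains" by counting word frequencies and "generates" by sampling from those counts. The function name `surviving_categories` and all sizes are illustrative.

```python
import random
from collections import Counter


def surviving_categories(generations=50, sample_size=200, seed=0):
    """Count distinct 'words' left after each generation of self-training.

    Each generation's training set is sampled from the previous
    generation's empirical word frequencies.
    """
    rng = random.Random(seed)
    # Hypothetical long-tailed "real" vocabulary: word i has weight 1 / (i + 1).
    vocabulary = list(range(100))
    weights = [1.0 / (i + 1) for i in vocabulary]
    data = rng.choices(vocabulary, weights=weights, k=sample_size)

    history = [len(set(data))]
    for _ in range(generations):
        counts = Counter(data)  # "train": estimate word frequencies
        words = list(counts)
        # "generate": sample a new data set from the estimated frequencies
        data = rng.choices(words, weights=[counts[w] for w in words], k=sample_size)
        history.append(len(set(data)))
    return history


if __name__ == "__main__":
    history = surviving_categories()
    print("distinct words per generation:", history)
```

Once a word fails to appear in one generation's output, the next model has no way to produce it, so the count of distinct words can only stay flat or fall. That one-way loss is the toy version of the tails fading from the curve.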
Why it matters
This doesn’t mean generative A.I. will grind to a halt anytime soon.
The companies that make these tools are aware of these problems, and they will notice if their A.I. systems start to deteriorate in quality.
But it may slow things down. As existing sources of data dry up or become contaminated with A.I. “slop,” researchers say it makes it harder for newcomers to compete.
A.I.-generated words and images are already beginning to flood social media and the wider web. They’re even hiding in some of the data sets used to train A.I., the Rice researchers found.
“The web is becoming increasingly a dangerous place to look for your data,” said Sina Alemohammad, a graduate student at Rice who studied how A.I. contamination affects image models.
Big players will be affected, too. Computer scientists at N.Y.U. found that when there is a lot of A.I.-generated content in the training data, it takes more computing power to train A.I. — which translates into more energy and more money.
“Models won’t scale anymore as they should be scaling,” said Julia Kempe, the N.Y.U. professor who led this work.
The leading A.I. models already cost tens to hundreds of millions of dollars to train, and they consume staggering amounts of energy, so this can be a sizable problem.
‘A hidden danger’
Finally, there’s another threat posed by even the early stages of collapse: an erosion of diversity.
And it’s an outcome that could become more likely as companies try to avoid the glitches and “hallucinations” that often occur with A.I. data.
This is easiest to see when the data matches a form of diversity that we can visually recognize — people’s faces:
This set of A.I. faces was created by the same Rice researchers who produced the distorted faces above. This time, they tweaked the model to avoid visual glitches.
A grid of A.I.-generated faces showing variations in their poses, expressions, ages and races.
This is the output after they trained a new A.I. on the previous set of faces. At first glance, it may seem like the model changes worked: The glitches are gone.
After one generation of training on A.I. output, the A.I.-generated faces appear more similar.
After two generations …
After two generations of training on A.I. output, the A.I.-generated faces are less diverse than the original image.
After three generations …
After three generations of training on A.I. output, the A.I.-generated faces grow more similar.
After four generations, the faces all appeared to converge.
After four generations of training on A.I. output, the A.I.-generated faces appear almost identical.
This drop in diversity is “a hidden danger,” Mr. Alemohammad said. “You might just ignore it and then you don’t understand it until it’s too late.”
Just as with the digits, the changes are clearest when most of the data is A.I.-generated. With a more realistic mix of real and synthetic data, the decline would be more gradual.
But the problem is relevant to the real world, the researchers said, and will inevitably occur unless A.I. companies go out of their way to avoid their own output.
Related research shows that when A.I. language models are trained on their own words, their vocabulary shrinks and their sentences become less varied in their grammatical structure — a loss of “linguistic diversity.”
And studies have found that this process can amplify biases in the data and is more likely to erase data pertaining to minorities.
Ways out
Perhaps the biggest takeaway of this research is that high-quality, diverse data is valuable and hard for computers to emulate.
One solution, then, is for A.I. companies to pay for this data instead of scooping it up from the internet, ensuring both human origin and high quality.
OpenAI and Google have made deals with some publishers or websites to use their data to improve A.I. (The New York Times sued OpenAI and Microsoft last year, alleging copyright infringement. OpenAI and Microsoft say their use of the content is considered fair use under copyright law.)
Better ways to detect A.I. output would also help mitigate these problems.
Google and OpenAI are working on A.I. “watermarking” tools, which introduce hidden patterns that can be used to identify A.I.-generated images and text.
But watermarking text is challenging, researchers say, because these watermarks can’t always be reliably detected and can easily be subverted (they may not survive being translated into another language, for example).
A.I. slop is not the only reason that companies may need to be wary of synthetic data. Another problem is that there are only so many words on the internet.
Some experts estimate that the largest A.I. models have been trained on a few percent of the available pool of text on the internet. They project that these models may run out of public data to sustain their current pace of growth within a decade.
“These models are so enormous that the entire internet of images or conversations is somehow close to being not enough,” Professor Baraniuk said.
To meet their growing data needs, some companies are considering using today’s A.I. models to generate data to train tomorrow’s models. But researchers say this can lead to unintended consequences (such as the drop in quality or diversity that we saw above).
There are certain contexts where synthetic data can help A.I.s learn — for example, when output from a larger A.I. model is used to train a smaller one, or when the correct answer can be verified, like the solution to a math problem or the best strategies in games like chess or Go.
And new research suggests that when humans curate synthetic data (for example, by ranking A.I. answers and choosing the best one), it can alleviate some of the problems of collapse.
Companies are already spending a lot on curating data, Professor Kempe said, and she believes this will become even more important as they learn about the problems of synthetic data.
But for now, there’s no replacement for the real thing.
About the data
To produce the images of A.I.-generated digits, we followed a procedure outlined by researchers. We first trained a type of neural network known as a variational autoencoder using a standard data set of 60,000 handwritten digits. We then trained a new neural network using only the A.I.-generated digits produced by the previous neural network, and repeated this process in a loop 30 times.
To create the statistical distributions of A.I. output, we used each generation’s neural network to create 10,000 drawings of digits. We then used the first neural network (the one that was trained on the original handwritten digits) to encode these drawings as a set of numbers, known as a “latent space” encoding. This allowed us to quantitatively compare the output of different generations of neural networks. For simplicity, we used the average value of this latent space encoding to generate the statistical distributions shown in the article.
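The overall shape of that procedure can be sketched as a short loop. This is a skeleton under stated assumptions, not the code used for the article: `run_generations` and the `toy_*` stand-ins are hypothetical names, the stand-in "model" is a one-dimensional bell curve rather than a variational autoencoder, and "average value" is read here as the mean across latent dimensions for each drawing.

```python
import random
import statistics


def run_generations(original_data, train, generate, encode,
                    n_generations=30, n_outputs=10_000):
    """Train a chain of models, each only on its predecessor's output.

    Returns, for each generation, one number per generated sample:
    the mean of that sample's latent encoding under the FIRST model.
    """
    first_model = train(original_data)
    model = first_model
    encodings_per_generation = []
    for _ in range(n_generations):
        outputs = generate(model, n_outputs)
        # Always encode with the first model so generations are comparable.
        encodings_per_generation.append(
            [statistics.fmean(encode(first_model, x)) for x in outputs]
        )
        model = train(outputs)  # the next generation sees only synthetic data
    return encodings_per_generation


# Toy stand-ins (assumptions): a "model" is just a (mean, spread) pair.
_rng = random.Random(0)


def toy_train(data):
    return (statistics.fmean(data), statistics.pstdev(data))


def toy_generate(model, n):
    mean, spread = model
    return [_rng.gauss(mean, spread) for _ in range(n)]


def toy_encode(model, x):
    mean, spread = model
    return [(x - mean) / spread]  # a one-dimensional "latent vector"


if __name__ == "__main__":
    real = [_rng.gauss(0.0, 1.0) for _ in range(1000)]
    result = run_generations(real, toy_train, toy_generate, toy_encode,
                             n_generations=5, n_outputs=200)
    for i, values in enumerate(result):
        print(f"generation {i}: spread of encodings {statistics.pstdev(values):.3f}")
```

Keeping the encoder fixed to the first model is the design choice that matters: every generation's output is measured with the same yardstick, so a histogram of each generation's values can be compared directly.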
Science
Commentary: My toothache led to a painful discovery: The dental care system is full of cavities as you age
I had a nagging toothache recently, and it led to an even more painful revelation.
If you X-rayed the state of oral health care in the United States, particularly for people 65 and older, the picture would be full of cavities.
“It’s probably worse than you can even imagine,” said Elizabeth Mertz, a UC San Francisco professor and Healthforce Center researcher who studies barriers to dental care for seniors.
Mertz once referred to the snaggletoothed, gap-filled oral health care system — which isn’t really a system at all — as “a mess.”
But let me get back to my toothache, while I reach for some painkiller. It had been bothering me for a couple of weeks, so I went to see my dentist, hoping for the best and preparing for the worst, having had two extractions in less than two years.
Let’s make it a trifecta.
My dentist said a molar needed to be yanked because of a cellular breakdown called resorption, and a periodontist in his office recommended a bone graft and probably an implant. The whole process would take several months and cost roughly the price of a swell vacation.
I’m lucky to have a great dentist and dental coverage through my employer, but as anyone with a private plan knows, dental insurance can barely be called insurance. It’s fine for cleanings and basic preventive routines. But for more complicated and expensive procedures — which multiply as you age — you can be on the hook for half the cost, if you’re covered at all, with annual payout caps in the $1,500 range.
“The No. 1 reason for delayed dental care,” said Mertz, “is out-of-pocket costs.”
So I wondered if cost-wise, it would be better to dump my medical and dental coverage and switch to a Medicare plan that costs extra — Medicare Advantage — but includes dental care options. Almost in unison, my two dentists advised against that because Medicare supplemental plans can be so limited.
Sorting it all out can be confusing and time-consuming, and nobody warns you in advance that aging itself is a job, the benefits are lousy, and the specialty care you’ll need most (dental, vision, hearing and long-term care) is not covered in the basic package. It’s as if Medicare was designed by pranksters, and we’re paying the price now as the percentage of the 65-and-up population explodes.
So what are people supposed to do as they get older and their teeth get looser?
A retired friend told me that she and her husband don’t have dental insurance because it costs too much and covers too little, and it turns out they’re not alone. By some estimates, half of U.S. residents 65 and older have no dental insurance.
That’s actually not a bad option, said Mertz, given the cost of insurance premiums and co-pays, along with the caps. And even if you’ve got insurance, a lot of dentists don’t accept it because the reimbursements have stagnated as their costs have spiked.
But without insurance, a lot of people simply don’t go to the dentist until they have to, and that can be dangerous.
“Dental problems are very clearly associated with diabetes,” as well as heart problems and other health issues, said Paul Glassman, associate dean of the California Northstate University dentistry school.
There is one other option, and Mertz referred to it as dental tourism, saying that Mexico and Costa Rica are popular destinations for U.S. residents.
“You can get a week’s vacation and dental work and still come out ahead of what you’d be paying in the U.S.,” she said.
Tijuana dentist Dr. Oscar Ceballos told me that roughly 80% of his patients are from north of the border, and come from as far away as Florida, Wisconsin and Alaska. He has patients in their 80s and 90s who have been returning for years because in the U.S. their insurance was expensive, the coverage was limited and out-of-pocket expenses were unaffordable.
“For example, a dental implant in California is around $3,000-$5,000,” Ceballos said. At his office, depending on the specifics, the same service “is like $1,500 to $2,500.” The cost is lower because personnel, office rent and other overhead costs are cheaper than in the U.S., Ceballos said.
As we spoke by phone, Ceballos peeked into his waiting room and said three patients were from the U.S. He handed his cellphone to one of them, San Diegan John Lane, who said he’s been going south of the border for nine years.
“The primary reason is the quality of the care,” said Lane, who told me he refers to himself as 39, “with almost 40 years of additional” time on the clock.
Ceballos is “conscientious and he has facilities that are as clean and sterile and as medically up to date as anything you’d find in the U.S.,” said Lane, who had driven his wife down from San Diego for a new crown.
“The cost is 50% less than what it would be in the U.S.,” said Lane, and sometimes the savings is even greater than that.
Come this summer, Lane may be seeing even more Californians in Ceballos’ waiting room.
“Proposed funding cuts to the Medi-Cal Dental program would have devastating impacts on our state’s most vulnerable residents,” said dentist Robert Hanlon, president of the California Dental Assn.
Dental student Somkene Okwuego smiles after completing her work on patient Jimmy Stewart, 83, who receives affordable dental work at the Ostrow School of Dentistry of USC on the USC campus in Los Angeles on February 26, 2026.
(Genaro Molina / Los Angeles Times)
Supplemental reimbursements to dentists, funded by Proposition 56’s 2016 tobacco tax, have been in place, but those increases could be wiped out under a budget-cutting proposal. Only about 40% of the state’s dentists accept Medi-Cal payments as it is, and Hanlon told me a CDA survey indicates that half would stop accepting Medi-Cal patients and many others would accept fewer patients.
“It’s appalling that when the cost of providing healthcare is at an all-time high, the state is considering cutting program funding back to 1990s levels,” Hanlon said. “These cuts … will force patients to forgo or delay basic dental care, driving completely preventable emergencies into already overcrowded emergency departments.”
Somkene Okwuego, who as a child in South L.A. was occasionally a patient at USC’s Herman Ostrow School of Dentistry clinic, will graduate from the school in just a few months.
I first wrote about Okwuego three years ago, after she got an undergrad degree in gerontology, and she told me a few days ago that many of her dental patients are elderly and have Medi-Cal or no insurance at all. She has also worked at a Skid Row dental clinic, and plans after graduation to work at a clinic where dental care is free or discounted.
Okwuego said “fixing the smiles” of her patients is a privilege and boosts their self-image, which can help “when they’re trying to get jobs.” When I dropped by to see her Thursday, she was with 83-year-old patient Jimmy Stewart.
Stewart, an Army veteran, told me he had trouble getting dental care at the VA and had gone years without seeing a dentist before a friend recommended the Ostrow clinic. He said he’s had extractions and top-quality restorative care at USC, with the work covered by his Medi-Cal insurance.
I told Stewart there could be some Medi-Cal cuts in the works this summer.
“I’d be screwed,” he said.
Him and a lot of other people.
steve.lopez@latimes.com
Science
Diablo Canyon clears last California permit hurdle to keep running
Central Coast water authorities approved waste discharge permits for Diablo Canyon nuclear plant Thursday, making it nearly certain it will remain running through 2030, and potentially through 2045.
The Pacific Gas & Electric-owned plant was originally supposed to shut down in 2025, but lawmakers extended that deadline by five years in 2022, fearing power shortages if a plant that provides about 9 percent of the state’s electricity were to shut off.
In December, Diablo Canyon received a key permit from the California Coastal Commission through an agreement that involved PG&E giving up about 12,000 acres of nearby land for conservation in exchange for the loss of marine life caused by the plant’s operations.
Today’s 6-0 vote by the Central Coast Regional Water Board approved PG&E’s plans to limit discharges of pollutants into the water and continue to run its “once-through cooling system.” The cooling technology flushes ocean water through the plant to absorb heat and discharges it, killing what the Coastal Commission estimated to be two billion fish each year.
The board also granted the plant a certification under the Clean Water Act, the last state regulatory hurdle the facility needed to clear before the federal Nuclear Regulatory Commission (NRC) is allowed to renew its permit through 2045.
The new regional water board permit made several changes since the last one was issued in 1990. One was a first-time limit on the chemical tributyltin-10, a toxic, internationally banned compound added to paint to prevent organisms from growing on ship hulls.
Additional changes stemmed from a 2025 Supreme Court ruling that said if pollutant permits like this one impose specific water quality requirements, they must also specify how to meet them.
The plant’s biggest water quality impact is the heated water it discharges into the ocean, and that part of the permit remains unchanged. Radioactive waste from the plant is regulated not by the state but by the NRC.
California state law only allows the plant to remain open to 2030, but some lawmakers and regulators have already expressed interest in another extension given growing electricity demand and the plant’s role in providing carbon-free power to the grid.
Some board members raised concerns about granting a certification that would allow the NRC to reauthorize the plant’s permits through 2045.
“There’s every reason to think the California entities responsible for making the decision about continuing operation, namely the California [Independent System Operator] and the Energy Commission, all of them are sort of leaning toward continuing to operate this facility,” said board member Dominic Roques. “I’d like us to be consistent with state law at least, and imply that we are consistent with ending operation at five years.”
Other board members noted that regulators could revisit the permits in five years or sooner if state or federal laws change, and the board ultimately approved the permit.
Science
Deadly bird flu found in California elephant seals for the first time
The H5N1 bird flu virus that devastated South American elephant seal populations has been confirmed in seals at California’s Año Nuevo State Park, researchers from UC Davis and UC Santa Cruz announced Wednesday.
The virus has ravaged wild, commercial and domestic animals across the globe and was found last week in seven weaned pups. The confirmation came from the U.S. Department of Agriculture’s National Veterinary Services Laboratory in Ames, Iowa.
“This is exceptionally rapid detection of an outbreak in free-ranging marine mammals,” said Professor Christine Johnson, director of the Institute for Pandemic Insights at UC Davis’ Weill School of Veterinary Medicine. “We have most likely identified the very first cases here because of coordinated teams that have been on high alert with active surveillance for this disease for some time.”
Since last week, when researchers began noticing neurological and respiratory signs of the disease in some animals, 30 seals have died, said Roxanne Beltran, a professor of ecology and evolutionary biology at UC Santa Cruz. Twenty-nine were weaned pups and the other was an adult male. The team has so far confirmed the virus in only seven of the dead pups.
Infected animals often have tremors, convulsions, seizures and muscle weakness, Johnson said.
Beltran said teams from UC Santa Cruz, UC Davis and California State Parks monitor the animals 260 days of the year, “including every day from December 15 to March 1” when the animals typically come ashore to breed, give birth and nurse.
The concerning behavior and deaths were first noticed Feb. 19.
“This is one of the most well-studied elephant seal colonies on the planet,” she said. “We know the seals so well that it’s very obvious to us when something is abnormal. And so my team was out that morning and we observed abnormal behaviors in seals and increased mortality that we had not seen the day before in those exact same locations. So we were very confident that we caught the beginning of this outbreak.”
In late 2022, the virus decimated southern elephant seal populations in South America and several sub-Antarctic Islands. At some colonies in Argentina, 97% of pups died, while on South Georgia Island, researchers reported a 47% decline in breeding females between 2022 and 2024. Researchers believe tens of thousands of animals died.
More than 30,000 sea lions in Peru and Chile died between 2022 and 2024. In Argentina, roughly 1,300 sea lions and fur seals perished.
At the time, researchers were not sure why northern Pacific populations were not infected, but suspected previous or milder strains of the virus conferred some immunity.
The virus is better known in the U.S. for sweeping through the nation’s dairy herds, where it infected dozens of dairy workers, millions of cows and thousands of wild, feral and domestic mammals. It’s also been found in wild birds and killed millions of commercial chickens, geese and ducks.
Two Americans have died from the virus since 2024, and 71 have been infected. The vast majority were dairy or commercial poultry workers. One death was that of a Louisiana man who had underlying conditions and was believed to have been exposed via backyard poultry or wild birds.
Scientists at UC Santa Cruz and UC Davis increased their surveillance of the elephant seals in Año Nuevo in recent years. The catastrophic effect of the disease prompted worry that it would spread to California elephant seals, said Beltran, whose lab leads UC Santa Cruz’s northern elephant seal research program at Año Nuevo.
Johnson, the UC Davis researcher, said the team has been working with stranding networks across the Pacific region for several years — sampling the tissue of birds, elephant seals and other marine mammals. They have not seen the virus in other California marine mammals. Two previous outbreaks of bird flu in U.S. marine mammals occurred in Maine in 2022 and Washington in 2023, affecting gray and harbor seals.
The virus in the animals has not yet been fully sequenced, so it’s unclear how the animals were exposed.
“We think the transmission is actually from dead and dying sea birds” living among the elephant seals, Johnson said. “But we’ll certainly be investigating if there’s any mammal-to-mammal transmission.”
Genetic sequencing from southern elephant seal populations in Argentina suggested that version of the virus had acquired mutations that allowed it to pass between mammals.
The H5N1 virus was first detected in geese in China in 1996. Since then it has spread across the globe, reaching North America in 2021. The only continent where it has not been detected is Oceania.
Año Nuevo State Park, just north of Santa Cruz, is home to a colony of some 5,000 elephant seals during the winter breeding season. About 1,350 seals were on the beach when the outbreak began, and most of those animals, roughly 900, are weaned pups. Other large California colonies are located at Piedras Blancas and Point Reyes National Seashore.
It’s “important to keep this in context. So far, avian influenza has affected only a small proportion of the weaned at this time, and there are still thousands of apparently healthy animals in the population,” Beltran said in a press conference.
Public access to the park has been closed and guided elephant seal tours canceled.
Health and wildlife officials urge beachgoers to keep a safe distance from wildlife and keep dogs leashed because the virus is contagious.