Science
When A.I.’s Output Is a Threat to A.I. Itself
The internet is becoming awash in words and images generated by artificial intelligence.
Sam Altman, OpenAI’s chief executive, wrote in February that the company generated about 100 billion words per day — a million novels’ worth of text, every day, an unknown share of which finds its way onto the internet.
A.I.-generated text may show up as a restaurant review, a dating profile or a social media post. And it may show up as a news article, too: NewsGuard, a group that tracks online misinformation, recently identified over a thousand websites that churn out error-prone A.I.-generated news articles.
In reality, with no foolproof methods to detect this kind of content, much will simply remain undetected.
All this A.I.-generated information can make it harder for us to know what’s real. And it also poses a problem for A.I. companies. As they trawl the web for new data to train their next models on — an increasingly challenging task — they’re likely to ingest some of their own A.I.-generated content, creating an unintentional feedback loop in which what was once the output from one A.I. becomes the input for another.
In the long run, this cycle may pose a threat to A.I. itself. Research has shown that when generative A.I. is trained on a lot of its own output, it can get a lot worse.
Here’s a simple illustration of what happens when an A.I. system is trained on its own output, over and over again:
This is part of a data set of 60,000 handwritten digits.
When we trained an A.I. to mimic those digits, its output looked like this.
This new set was made by an A.I. trained on the previous A.I.-generated digits. What happens if this process continues? After 20 generations of training new A.I.s on their predecessors’ output, the digits blur and start to erode.
After 30 generations, they converge into a single shape.
While this is a simplified example, it illustrates a problem on the horizon.
Imagine a medical-advice chatbot that lists fewer diseases that match your symptoms, because it was trained on a narrower spectrum of medical knowledge generated by previous chatbots. Or an A.I. history tutor that ingests A.I.-generated propaganda and can no longer separate fact from fiction.
Just as a copy of a copy can drift away from the original, when generative A.I. is trained on its own content, its output can also drift away from reality, growing further apart from the original data that it was intended to imitate.
In a paper published last month in the journal Nature, a group of researchers in Britain and Canada showed how this process results in a narrower range of A.I. output over time — an early stage of what they called “model collapse.”
The eroding digits we just saw show this collapse. When untethered from human input, the A.I. output dropped in quality (the digits became blurry) and in diversity (they grew similar).
How an A.I. that draws digits “collapses” after being trained on its own output
If only some of the training data were A.I.-generated, the decline would be slower or more subtle. But it would still occur, researchers say, unless the synthetic data was complemented with a lot of new, real data.
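One rough way to build intuition for why fresh data matters is a toy simulation. This is an assumption of the sketch, not the researchers' experiment: the "model" here is nothing more than a bell curve fitted to a handful of numbers, and `real_fraction` is a made-up parameter that sets how much of each generation's training data is drawn fresh from the original source.

```python
import random
import statistics


def final_spread(real_fraction, generations=200, sample_size=10, seed=0):
    """Return the width of the fitted bell curve after many generations.

    real_fraction is the (hypothetical) share of each generation's training
    data that comes fresh from the original source instead of the model.
    """
    rng = random.Random(seed)
    mean, spread = 0.0, 1.0  # stand-in for the original, human-made data
    n_real = round(sample_size * real_fraction)
    for _ in range(generations):
        # Fresh "real" data always comes from the original distribution.
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        # Synthetic data comes from the current generation's model.
        synthetic = [rng.gauss(mean, spread) for _ in range(sample_size - n_real)]
        data = real + synthetic
        # "Training" the next model means refitting the curve to this data.
        mean = statistics.fmean(data)
        spread = statistics.pstdev(data)
    return spread


if __name__ == "__main__":
    print("synthetic only:", final_spread(0.0))
    print("half real data:", final_spread(0.5))
```

In this toy, a run fed only its own output loses nearly all of its spread, while a run that keeps drawing half its data from the original source tends to keep it. Real image and language models are far more complex, so this is intuition only.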
Degenerative A.I.
In one example, the researchers trained a large language model on its own sentences over and over again, asking it to complete the same prompt after each round.
When they asked the A.I. to complete a sentence that started with “To cook a turkey for Thanksgiving, you…,” at first, it responded like this:
Even at the outset, the A.I. “hallucinates.” But when the researchers further trained it on its own sentences, it got a lot worse…
An example of text generated by an A.I. model.
After two generations, it started simply printing long lists.
An example of text generated by an A.I. model after being trained on its own sentences for 2 generations.
And after four generations, it began to repeat phrases incoherently.
An example of text generated by an A.I. model after being trained on its own sentences for 4 generations.
“The model becomes poisoned with its own projection of reality,” the researchers wrote of this phenomenon.
This problem isn’t just confined to text. Another team of researchers at Rice University studied what would happen when the kinds of A.I. that generate images are repeatedly trained on their own output — a problem that could already be occurring as A.I.-generated images flood the web.
They found that glitches and image artifacts started to build up in the A.I.’s output, eventually producing distorted images with wrinkled patterns and mangled fingers.
When A.I. image models are trained on their own output, they can produce distorted images, mangled fingers or strange patterns.
A.I.-generated images by Sina Alemohammad and others.
“You’re kind of drifting into parts of the space that are like a no-fly zone,” said Richard Baraniuk, a professor who led the research on A.I. image models.
The researchers found that the only way to stave off this problem was to ensure that the A.I. was also trained on a sufficient supply of new, real data.
While selfies are certainly not in short supply on the internet, there could be categories of images where A.I. output outnumbers genuine data, they said.
For example, A.I.-generated images in the style of van Gogh could outnumber actual photographs of van Gogh paintings in A.I.’s training data, and this may lead to errors and distortions down the road. (Early signs of this problem will be hard to detect because the leading A.I. models are closed to outside scrutiny, the researchers said.)
Why collapse happens
All of these problems arise because A.I.-generated data is often a poor substitute for the real thing.
This is sometimes easy to see, like when chatbots state absurd facts or when A.I.-generated hands have too many fingers.
But the differences that lead to model collapse aren’t necessarily obvious — and they can be difficult to detect.
When generative A.I. is “trained” on vast amounts of data, what’s really happening under the hood is that it is assembling a statistical distribution — a set of probabilities that predicts the next word in a sentence, or the pixels in a picture.
For example, when we trained an A.I. to imitate handwritten digits, its output could be arranged into a statistical distribution that looks like this:
Distribution of A.I.-generated data
Examples of initial A.I. output:
The distribution shown here is simplified for clarity.
The peak of this bell-shaped curve represents the most probable A.I. output — in this case, the most typical A.I.-generated digits. The tail ends describe output that is less common.
Notice that when the model was trained on human data, it had a healthy spread of possible outputs, which you can see in the width of the curve above.
But after it was trained on its own output, this is what happened to the curve:
Distribution of A.I.-generated data when trained on its own output
It gets taller and narrower. As a result, the model becomes more and more likely to produce a smaller range of output, and the output can drift away from the original data.
Meanwhile, the tail ends of the curve — which contain the rare, unusual or surprising outcomes — fade away.
This is a telltale sign of model collapse: Rare data becomes even rarer.
If this process went unchecked, the curve would eventually become a spike:
Distribution of A.I.-generated data when trained on its own output
This was when all of the digits became identical, and the model completely collapsed.
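The narrowing curve can be reproduced in miniature. The sketch below is not the neural network used for the digits. It assumes the simplest possible "model", a bell curve described by a mean and a spread, which is refit each generation to a small sample of its predecessor's output. The sample size and the number of generations are arbitrary choices made so the effect shows up quickly.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Refit a bell curve to its own samples, generation after generation.

    Returns a list of (mean, spread) pairs, starting with the original curve.
    """
    rng = random.Random(seed)
    mean, spread = 0.0, 1.0  # stand-in for the original, human-made data
    history = [(mean, spread)]
    for _ in range(generations):
        # The current model generates a small batch of synthetic data.
        samples = [rng.gauss(mean, spread) for _ in range(sample_size)]
        # The next model is fitted to that batch and nothing else.
        mean = statistics.fmean(samples)
        spread = statistics.pstdev(samples)
        history.append((mean, spread))
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for generation in (0, 50, 100, 200):
        mean, spread = history[generation]
        print(f"generation {generation}: mean={mean:.4f} spread={spread:.6f}")
```

Each refit sees only a finite sample, so values in the tails are the first to go missing, and the next curve is fitted to a slightly narrower set. Over many rounds the spread heads toward zero, which corresponds to the spike, while the mean wanders away from its starting point.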
Why it matters
This doesn’t mean generative A.I. will grind to a halt anytime soon.
The companies that make these tools are aware of these problems, and they will notice if their A.I. systems start to deteriorate in quality.
But it may slow things down. As existing sources of data dry up or become contaminated with A.I. “slop,” researchers say, it becomes harder for newcomers to compete.
A.I.-generated words and images are already beginning to flood social media and the wider web. They’re even hiding in some of the data sets used to train A.I., the Rice researchers found.
“The web is becoming increasingly a dangerous place to look for your data,” said Sina Alemohammad, a graduate student at Rice who studied how A.I. contamination affects image models.
Big players will be affected, too. Computer scientists at N.Y.U. found that when there is a lot of A.I.-generated content in the training data, it takes more computing power to train A.I. — which translates into more energy and more money.
“Models won’t scale anymore as they should be scaling,” said Julia Kempe, the N.Y.U. professor who led this work.
The leading A.I. models already cost tens to hundreds of millions of dollars to train, and they consume staggering amounts of energy, so this can be a sizable problem.
‘A hidden danger’
Finally, there’s another threat posed by even the early stages of collapse: an erosion of diversity.
And it’s an outcome that could become more likely as companies try to avoid the glitches and “hallucinations” that often occur with A.I. data.
This is easiest to see when the data matches a form of diversity that we can visually recognize — people’s faces:
This set of A.I. faces was created by the same Rice researchers who produced the distorted faces above. This time, they tweaked the model to avoid visual glitches.
A grid of A.I.-generated faces showing variations in their poses, expressions, ages and races.
This is the output after they trained a new A.I. on the previous set of faces. At first glance, it may seem like the model changes worked: The glitches are gone.
After one generation of training on A.I. output, the A.I.-generated faces appear more similar.
After two generations …
After two generations of training on A.I. output, the A.I.-generated faces are less diverse than the original image.
After three generations …
After three generations of training on A.I. output, the A.I.-generated faces grow more similar.
After four generations, the faces all appeared to converge.
After four generations of training on A.I. output, the A.I.-generated faces appear almost identical.
This drop in diversity is “a hidden danger,” Mr. Alemohammad said. “You might just ignore it and then you don’t understand it until it’s too late.”
Just as with the digits, the changes are clearest when most of the data is A.I.-generated. With a more realistic mix of real and synthetic data, the decline would be more gradual.
But the problem is relevant to the real world, the researchers said, and will inevitably occur unless A.I. companies go out of their way to avoid their own output.
Related research shows that when A.I. language models are trained on their own words, their vocabulary shrinks and their sentences become less varied in their grammatical structure — a loss of “linguistic diversity.”
And studies have found that this process can amplify biases in the data and is more likely to erase data pertaining to minorities.
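The way rare groups drop out of the data can be illustrated with a toy resampling loop. This is a sketch under stated assumptions, not the method of the studies mentioned: the category names, the starting proportions and the sample size are all invented, and the "model" is just the table of frequencies observed in the previous generation's output.

```python
import random
from collections import Counter


def surviving_categories(generations=100, sample_size=50, seed=0):
    """Track which categories still appear after each generation."""
    rng = random.Random(seed)
    # Hypothetical starting mix: one common category and four rare ones.
    weights = {"common": 0.80, "rare_a": 0.05, "rare_b": 0.05,
               "rare_c": 0.05, "rare_d": 0.05}
    history = [set(weights)]
    for _ in range(generations):
        labels = list(weights)
        # The current model generates a finite batch of synthetic examples.
        samples = rng.choices(labels,
                              weights=[weights[k] for k in labels],
                              k=sample_size)
        counts = Counter(samples)
        # The next model only knows the categories it actually saw, so a
        # category that drew zero samples can never come back.
        weights = {label: count / sample_size for label, count in counts.items()}
        history.append(set(weights))
    return history


if __name__ == "__main__":
    history = surviving_categories()
    print("start:", sorted(history[0]))
    print("end:  ", sorted(history[-1]))
```

The loss only runs one way. A common category can survive an unlucky batch, but a rare one that is missed once has no route back into later generations.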
Ways out
Perhaps the biggest takeaway of this research is that high-quality, diverse data is valuable and hard for computers to emulate.
One solution, then, is for A.I. companies to pay for this data instead of scooping it up from the internet, ensuring both human origin and high quality.
OpenAI and Google have made deals with some publishers or websites to use their data to improve A.I. (The New York Times sued OpenAI and Microsoft last year, alleging copyright infringement. OpenAI and Microsoft say their use of the content is considered fair use under copyright law.)
Better ways to detect A.I. output would also help mitigate these problems.
Google and OpenAI are working on A.I. “watermarking” tools, which introduce hidden patterns that can be used to identify A.I.-generated images and text.
But watermarking text is challenging, researchers say, because these watermarks can’t always be reliably detected and can easily be subverted (they may not survive being translated into another language, for example).
A.I. slop is not the only reason that companies may need to be wary of synthetic data. Another problem is that there are only so many words on the internet.
Some experts estimate that the largest A.I. models have been trained on a few percent of the available pool of text on the internet. They project that these models may run out of public data to sustain their current pace of growth within a decade.
“These models are so enormous that the entire internet of images or conversations is somehow close to being not enough,” Professor Baraniuk said.
To meet their growing data needs, some companies are considering using today’s A.I. models to generate data to train tomorrow’s models. But researchers say this can lead to unintended consequences (such as the drop in quality or diversity that we saw above).
There are certain contexts where synthetic data can help A.I.s learn — for example, when output from a larger A.I. model is used to train a smaller one, or when the correct answer can be verified, like the solution to a math problem or the best strategies in games like chess or Go.
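The verifiable case can be sketched in a few lines. Everything here is hypothetical: `noisy_solver` stands in for an A.I. model that sometimes gets a sum wrong, and the error rate is invented. The point is only that when answers can be checked exactly, wrong output can be filtered out before it becomes training data.

```python
import random


def noisy_solver(a, b, rng, error_rate=0.3):
    """Hypothetical stand-in for a model: sometimes returns a wrong sum."""
    answer = a + b
    if rng.random() < error_rate:
        answer += rng.choice([-2, -1, 1, 2])
    return answer


def build_verified_dataset(num_problems=1000, seed=0):
    """Keep only the generated answers that pass an exact check."""
    rng = random.Random(seed)
    kept = []
    for _ in range(num_problems):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        proposed = noisy_solver(a, b, rng)
        # The verifier is plain arithmetic, so wrong answers are discarded.
        if proposed == a + b:
            kept.append((a, b, proposed))
    return kept


if __name__ == "__main__":
    dataset = build_verified_dataset()
    print(f"kept {len(dataset)} of 1000 generated answers")
```

No such check exists for most text and images on the web, which is part of why synthetic data is harder to trust there.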
And new research suggests that when humans curate synthetic data (for example, by ranking A.I. answers and choosing the best one), it can alleviate some of the problems of collapse.
Companies are already spending a lot on curating data, Professor Kempe said, and she believes this will become even more important as they learn about the problems of synthetic data.
But for now, there’s no replacement for the real thing.
About the data
To produce the images of A.I.-generated digits, we followed a procedure outlined by researchers. We first trained a type of neural network known as a variational autoencoder using a standard data set of 60,000 handwritten digits. We then trained a new neural network using only the A.I.-generated digits produced by the previous neural network, and repeated this process in a loop 30 times.
To create the statistical distributions of A.I. output, we used each generation’s neural network to create 10,000 drawings of digits. We then used the first neural network (the one that was trained on the original handwritten digits) to encode these drawings as a set of numbers, known as a “latent space” encoding. This allowed us to quantitatively compare the output of different generations of neural networks. For simplicity, we used the average value of this latent space encoding to generate the statistical distributions shown in the article.
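The summarizing step described above might look roughly like the sketch below. It assumes the latent encodings already exist as lists of numbers (the encoder itself is not shown), and the equal-width binning in `histogram` is an assumption, since the procedure above does not say how the curves were drawn.

```python
import statistics


def latent_averages(encodings):
    """Reduce each drawing's latent encoding to one number: its average."""
    return [statistics.fmean(vector) for vector in encodings]


def histogram(values, low, high, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    counts = [0] * bins
    width = (high - low) / bins
    for value in values:
        if low <= value < high:
            # min() guards against rounding pushing a value past the last bin.
            index = min(int((value - low) / width), bins - 1)
            counts[index] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical encodings for four drawings, two numbers each.
    encodings = [[0.0, 1.0], [2.0, 3.0], [2.0, 2.0], [3.0, 4.0]]
    averages = latent_averages(encodings)
    print(averages)
    print(histogram(averages, 0.0, 4.0, 4))
    print("spread:", statistics.pstdev(averages))
```

Running the same two steps on each generation's drawings, always encoded by the first network, gives one histogram per generation, and those histograms can then be compared for width.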
Science
Video: NASA Announces Artemis III Crew
NASA announced the crew of the Artemis III mission, which will fly to low-Earth orbit to test rendezvous and docking maneuvers with one or two lunar landers.
“I am excited to welcome you as the next crew in the Artemis journey to successfully return to the moon — this time to stay.” “I’m honored by the role that I’ve been given. I’m also very humbled by the task in front of us. But first and foremost, I’m grateful.” “So with that, the Artemis II crew, comrade, hands you the baton. You got the controls.” “As you know, we had a significant anomaly at our Launch Complex 36A on May 28. We’ve redoubled our efforts and are moving forward.”
By Alisa Shodiyev Kaff
June 9, 2026
Science
Santa Monica Mountains’ last steelhead trout survived the Palisades fire — and even had babies
Scientists feared the Santa Monica Mountains’ last remaining steelhead trout were dead, smothered by debris flows unleashed by the Palisades fire.
But the endangered fish surprised them: A team of biologists recently spotted 30 of the rare trout — and 21 babies — in Topanga Creek.
“There was a lot of happy dancing in the creek,” said Rosi Dagit, principal conservation biologist for the Resource Conservation District of the Santa Monica Mountains, which works with public and private landowners to conserve natural resources.
That’s because the steelhead here are endangered, at both the state and federal levels. Once, they swam in most streams of the Santa Monicas, but their numbers plummeted amid overfishing and coastal development. Increasingly frequent wildfire has further stressed their habitat. Topanga Creek, a biodiversity hot spot, is home to their last known population in the mountains that stretch from the Hollywood Hills to Point Mugu in Ventura County.
The trout that were spotted, including this one, are part of a distinct Southern California population that’s listed as endangered at the state and federal levels.
(RCDSMM Stream Team)
The California Department of Fish and Wildlife spearheaded a complex mission to rescue trout threatened by the Palisades fire that sparked in January 2025.
Time was of the essence. The fire hadn’t yet been fully contained. But rain was on the way, which would sweep massive amounts of sediment from the denuded hillsides into the water. Fish are often killed this way.
Crews stunned the fish with electricity, scooped them up in buckets, trucked them to a hatchery and ultimately moved them to Arroyo Hondo Creek in Santa Barbara County.
Within days, Topanga Creek was choked with mud. Some assumed the fish left behind were goners.
But in March, the conservation district’s team found four. The following month, when water conditions were clearer, they saw more.
“These fish continue to amaze me,” said Kyle Evans, environmental program manager for the state Department of Fish and Wildlife, who had seen the damage to the creek. “I had seen populations get wiped out in similar situations. So when I heard, I was thrilled.”
Evans surmises the fish that survived were in an area of the creek where less charred material and sediment were swept in.
“These fish likely hunkered down, were hiding under some rocks or places to try to get away from the main concentration of flow,” he said. “And luckily they weren’t buried.”
The ones that were spotted were fairly small, around 6 to 14 inches. Rainbow trout and steelhead trout are the same species, but with different lifestyles. If the fish remain in freshwater, they’ll be considered rainbows. However, those that migrate to the ocean become steelhead and typically grow larger there before returning to their natal waters to spawn.
Topanga Creek hasn’t fully recovered from the damage it sustained, but scientists say it’s looking better. Surveys last year were “so depressing,” Dagit said, with very few animals, and stretches that were essentially transformed into flat roads from all the sediment buildup. Some of the riparian canopy burned right down to the creek.
Then came 32 inches of rain over the last nine months, scouring out and moving sediment, creating deeper pools. Dagit said they recently found newt egg masses for the first time in years, as well as a few adult newts and many frogs. Plants that provide cover are starting to recover.
She provided photos comparing certain pools last year and this year, some dramatically transformed. In September 2025, the Shrine Pool could have been an overgrown hiking trail. This April, it was filled with shallow water.
The Shrine Pool in September 2025, left, and the same location in April 2026, right, with RCDSMM’s Isaac Yelchin donning a wetsuit.
(RCDSMM Stream Team)
Topanga Creek is home to another endangered fish, the small but hardy northern tidewater goby, often described as cute. Not long before the trout operation, Dagit led a rescue of hundreds of these fish too. Many were repatriated to the lagoon at the mouth of the creek in a moving ceremony last June.
There’s still the matter of what to do with the trout that were moved to Santa Barbara County last year. Evans would like to bring them home to the Santa Monicas at some point, but isn’t sure if it will happen. On one hand, they could bolster the small, genetically isolated surviving population. On the other, they might inadvertently bring in a disease or bacteria. There is some time to decide. Evans estimates the creek still needs to recover for two to three more years.
For now, the fish are functioning fine in their adopted creek. Experts worried the trauma wrought by the move would disrupt their spawning process, but they had babies that spring. This year, they spawned again.
Science
Pacifica pier cracks, another coastal casualty as seas continue to rise
The Pacifica Municipal Pier was shut down and taped off Thursday after city workers noticed cracks running through the landmark structure and concrete chunks falling into the ocean.
It’s just one of many coastal California structures that have recently crumbled under pressure from a rising and relentless ocean.
Officials from the small beach city south of San Francisco said the pier was closed due to “cracking, separation, and displacement of the concrete walkway and structural elements.”
It will stay closed while structural engineers assess its safety.
Photos taken by city employees show a wide crack that runs from top to bottom and across the structure as well. Other photos show a large horizontal crack under the foundation of a small restaurant on the pier, the Chit Chat Cafe.
The cafe was also shut down.
This is not the first time the 53-year-old pier has shown signs of stress. In 2021, part of it was shut down after handrails along the edge collapsed. And in 2023, after a series of storms pummeled the Central California coast, damaging parts of the pier, the structure was partially closed for more than a year.
Those same storms caused extensive damage in Aptos and Capitola, 70 miles south, where piers and waterfront infrastructure were swept away or damaged.
In 2024, a 150- to 180-foot section of the Santa Cruz wharf was ripped off by powerful waves.
At least 10 of the state’s dozens of coastal public piers were closed for part or all of 2024 due to structural damage sustained in winter storms since 2022. At least five others have longer-term upgrades planned to address structural issues.
“These things are costly to maintain,” said Zach Plopper, senior environmental director at Surfrider. “They are a part of our California coastal culture in many ways, but we’re going to need to reckon with, one, the state that they’re in, and two, the continuous and worsening threats they’re going to experience.”
He said most of the piers were constructed in the early 1900s, and they weren’t built to withstand decades of rough seas, storms and rising sea level.
“With this incoming El Niño, which is forecasted to be significant, and this marine heat wave we’re in the midst of, we’re kind of in uncharted waters as far as what this winter could bring in terms of storms and swells to the California coast, and we’re likely going to see a lot more damage,” he said. “Not just piers, but roads and other coastal infrastructure up and down the state.”
There was no storm in Pacifica earlier this week, so no single event could be blamed for the destruction.
However, a 2025 report from an outside engineering firm, GHD, found that several sections of the pier were in “poor” or “serious” condition, and they recommended closure before anticipated storms or events that could “subject the piles to high winds, swells and large waves.”
The firm found several areas of the pier where concrete was missing and rebar was exposed and corroding.
“The pier has continued to experience high winds and large waves in a harsh marine environment,” the engineers wrote in the report, noting that continuous exposure to seawater or marine spray was “detrimental” to the structure.
A 2023 city report estimated it would cost $19 million to repair.
That same year, a state law was enacted to require local governments along the California coast to plan for sea level rise in the coming decades.
Sea level has risen some 8 inches, on average, along the coast in the past 150 years, Plopper said, and researchers anticipate another foot in the next 25 years.
“We’re going to see profound shifts on our coastline, none that we have ever experienced before, and building static structures on the coast just doesn’t work all that well,” he said. “We’re going to have to make some really hard decisions.”