Researchers warn that AI could ruin the web if left unchecked

AIs training themselves off other AI-produced content will lead to a downward spiral, with severe consequences on the quality front, experts warn.


Artificial intelligence is taking a lot of flak on all sorts of fronts of late, and the latest shot to be fired is the suggestion that AI might effectively mess up the worldwide web.

The AI hype train is only equalled by the AI-dire-warnings train (Image Credit: Pixabay)


As you may be aware, the current surge in AI is driven by Large Language Models or LLMs (like those behind ChatGPT), which are basically giant data hoovers, trained on content drawn from multiple sources online.

As The Independent reports, the problem that a new study has highlighted is that AI chatbots training themselves on content produced by other AIs could lead to a "downward spiral of gibberish on the internet," or so we're warned.

The fear is that junk content produced by AIs now will lead to even shabbier content down the line, which will be reused itself, producing the aforementioned downward spiral towards nonsense.

Currently, the web is full of largely human-generated content - which admittedly contains a whole lot of gibberish already - but that'll change down the line as AIs produce more content.

The study, which involved researchers from Oxford University in the UK among others, experimented with training successive generations of AI models on content produced by their predecessors.

The results? The paper (still in preprint and not yet peer-reviewed) observed:

"We discover that learning from data produced by other models causes model collapse - a degenerative process whereby, over time, models forget the true underlying data distribution."

Just a few generations of AI trained in this way lead to 'major' degradation in the quality of content.
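To get a feel for how this kind of feedback loop can go wrong, here is a minimal toy sketch in Python. It is not the study's experiment: it assumes the "model" is nothing more than a bell curve (a mean and a spread) fitted to a handful of samples, with each new generation trained only on samples drawn from the previous one. The function name simulate_collapse and the default settings (1,000 generations, 10 samples each, a fixed seed) are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_collapse(generations=1000, sample_size=10, seed=0):
    """Toy illustration only (assumed setup, not the study's method).

    Each generation fits a Gaussian to a small sample drawn from the
    previous generation's fitted Gaussian. Returns the fitted spread
    (standard deviation) at every generation, starting with the original.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the "true" data distribution
    history = [sigma]
    for _ in range(generations):
        # "Content" produced by the current model.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # The next model learns only from that content.
        mu = statistics.mean(samples)
        sigma = statistics.stdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    spreads = simulate_collapse()
    for gen in (0, 10, 100, 1000):
        print(f"generation {gen:4d}: spread = {spreads[gen]:.3g}")
```

In this toy setup the fitted spread tends to shrink from one generation to the next, because every small sample slightly under-represents the rarer values and that loss is never recovered. Real language models are vastly more complicated, so treat this as an intuition aid rather than a reproduction of the paper's results.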

So, what's needed, the researchers argue, is a system to clearly label AI-produced content online, ensuring that future AIs don't train themselves off any such content, and stick to human-generated material instead.

That sounds very sensible, though clearly, this is something that'll need to be acted on quickly - and given the sheer scale of the worldwide web, and the fact that AI content is already being put out there, well, the clock is definitely ticking.

Elsewhere, fears expressed about AI reach far beyond the ruin of the web - like the threat of the extinction of humanity (yes, that old chestnut).


Darren has written for numerous magazines and websites in the technology world for almost 30 years, including TechRadar, PC Gamer, Eurogamer, Computeractive, and many more. He worked on his first magazine (PC Home) long before Google and most of the rest of the web existed. In his spare time, he can be found gaming, going to the gym, and writing books (his debut novel – ‘I Know What You Did Last Supper’ – was published by Hachette UK in 2013).
