Researchers warn that AI could ruin the web if left unchecked

AIs training themselves off other AI-produced content will lead to a downward spiral, with severe consequences on the quality front, experts warn.

Darren Allan

Tech Reporter

Published Jun 20, 2023 8:15 AM CDT
Updated Jul 11, 2023 2:02 AM CDT

Artificial intelligence is taking a lot of flak on all sorts of fronts of late, and the latest shot to be fired is the suggestion that AI might effectively mess up the worldwide web.

The AI hype train is only equalled by the AI-dire-warnings train (Image Credit: Pixabay)

As you may be aware, the current surge in AI is driven by Large Language Models or LLMs (such as the models behind ChatGPT), which are basically giant data hoovers, training themselves on, and picking up their answers from, multiple sources online.

As The Independent reports, the problem that a new study has highlighted is that AI chatbots training themselves on content produced by other AIs could lead to a "downward spiral of gibberish on the internet," or so we're warned.

The fear is that junk content produced by AIs now will lead to even shabbier content down the line, which will itself be reused, producing the aforementioned downward spiral towards nonsense.

Currently, the web is full of largely human-generated content - which admittedly contains a whole lot of gibberish already - but that'll change down the line as AIs produce more content.

The study, which involved researchers from Oxford University in the UK among others, experimented with training successive generations of AI models on content produced by their predecessors.

The results? The paper (still in preprint and not yet peer-reviewed) observed:

"We discover that learning from data produced by other models causes model collapse - a degenerative process whereby, over time, models forget the true underlying data distribution."

Training just a few generations of AI in this way leads to 'major' degradation in the quality of content.
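
To make the idea of "model collapse" a little more concrete, here is a toy sketch in Python. It is not the researchers' code or method, just an illustration under simple assumptions: the "model" is nothing more than a bell curve fitted to a handful of numbers, and every name in it (simulate_collapse, sample_size, and so on) is made up for this example. Each generation learns only from samples produced by the generation before it.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Return the spread (standard deviation) of the training data at each generation.

    Generation 0 is "human" data drawn from a standard normal distribution.
    Every later generation is sampled from a normal distribution fitted to
    the previous generation's data, i.e. the model only ever sees model output.
    """
    rng = random.Random(seed)
    # Generation 0: stand-in for real, human-made data (an assumption of this toy).
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = [statistics.pstdev(data)]

    for _ in range(generations):
        # "Train" a model: estimate mean and spread from the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        # "Publish" content: the next training set is model output only.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        spreads.append(statistics.pstdev(data))

    return spreads


if __name__ == "__main__":
    spreads = simulate_collapse()
    for generation in (0, 10, 50, 100, 200):
        print(f"generation {generation:3d}: spread = {spreads[generation]:.3g}")
```

In a setup like this, the spread of the data tends to shrink towards zero as the generations pile up, because each small sample loses a little of the original variety and nothing ever puts it back. A real LLM is vastly more complicated, so treat this only as a sketch of the mechanism, not a measurement of it.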

So, what's needed, the researchers argue, is a system to clearly label AI-produced content online, ensuring that future AIs don't train themselves off any such content, and stick to human-generated material instead.

That sounds very sensible, though clearly, this is something that'll need to be acted on quickly - and given the sheer scale of the worldwide web, and the fact that AI content is already being put out there, well, the clock is definitely ticking.

Elsewhere, fears have been expressed about AI that reach far beyond the ruin of the web - like the threat of the extinction of humanity (yes, that old chestnut).
