Setting aside the fact that it's probably not the brightest idea to let AI use people's unfiltered online personas as a way to think and sound like us, Google's not-so-subtle update to its privacy policies states that it's OK to use the entire public internet to train its AI models.

Artist's impression of an AI trained on public social media and Reddit posts.
"Google uses information to improve our services and to develop new products, features, and technologies that benefit our users and the public," the Google policy update says. "For example, we use publicly available information to help train Google's AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities."
If you've posted any comment or statement online about anything, even something as innocuous as a hot take on who should play the next Superman in DC's cinematic reboot, the odds are that it's now sitting somewhere within the data-driven brain of an AI chatbot.
Before the change, this part of Google's privacy policy mentioned using public data only to help train Google Translate; it has now been expanded to include Bard and Cloud AI. What makes this strange is that the policy isn't limited to data on Google services and accounts but covers the entire public internet.
Google Bard is the company's answer to ChatGPT. There's an ongoing debate about how and where AI models obtain the data they train on, to the point where issues like 'data scraping' affecting social-focused platforms like Twitter and Reddit are real concerns for companies and the millions of people who post daily.
Outlets including Gizmodo have reached out to Google to clarify what it means when it says it will use "publicly available information" to train its AI models, but so far they have not received a response.