Microsoft responds to claims all Word and Excel files are being used to train AI

Microsoft has responded to claims that it has quietly switched on an 'opt-out' feature that enables the scraping of Word and Excel files for AI training.

TL;DR: According to nixCraft, Microsoft's Connected Experiences feature in Office apps like Word and Excel collects data for AI training. Microsoft denies using customer data to train AI models like Copilot and says the feature only supports functions that require internet access, such as co-authoring. Users can disable the feature via Privacy Settings.

Companies developing artificial intelligence tools require vast amounts of data for AI training, and what better way to gather it than by scraping it from people using popular applications or programs?

@nixCraft, an author of Cyberciti.biz, has claimed Microsoft is participating in this type of scheme with Office and its Connected Experiences feature. According to nixCraft, Redmond's Connected Experiences feature automatically scrapes data from Word and Excel files, and that data is used to train Microsoft's AI tools, such as Copilot. According to reports, the feature is turned on by default, which means user-generated Word documents and Excel files are included in Microsoft's AI training dataset unless the user manually disables it.

However, following reports citing @nixCraft's claims, Microsoft has responded, saying customer data within Microsoft 365 apps, which include Word and Excel, isn't used to train the company's Large Language Models (LLMs), the underlying technology powering AI tools such as Copilot and ChatGPT. Microsoft also added, "This setting only enables features requiring internet access like co-authoring a document."

If you want to check whether this feature is enabled, or would like to disable it, follow these steps: File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences and uncheck the appropriate box.

