What actually happens to data you put into ChatGPT
OpenAI's data handling varies significantly depending on which ChatGPT product you use. The free consumer tier (ChatGPT.com) trains on conversations by default. You can opt out in settings, but most business users are not aware of this and have not done so. The paid ChatGPT Plus tier also trains on conversations by default, with the same opt-out. ChatGPT Team, ChatGPT Enterprise, and API access are the options where your data can be kept out of training, but this requires a paid account and deliberate setup, and the settings should be verified rather than assumed.
The practical implication: if your employees are using the free tier or a personal Plus subscription to process client data, that data is being sent to OpenAI and may be used for training. This is a data breach in the privacy law sense: you have disclosed personal information about your clients to a third party without their consent and without a proper legal basis.
The confidentiality breach risk
Most professional service firms (law firms, accounting firms, consulting firms, financial advisers) have confidentiality obligations to clients that are broader than privacy law. Confidentiality agreements typically prohibit disclosure of client information to any third party without the client's consent. Inputting client information into a commercial AI tool operated by a third party (OpenAI, Google, Anthropic, or any other provider) is disclosure to a third party for the purpose of these agreements. Unless your client agreement specifically permits this (and almost none do, because most were drafted before this was a relevant consideration), you are likely breaching your confidentiality obligation every time an employee inputs client information into a commercial AI tool.
Building a practical data classification framework
The solution is not to ban AI tools: the efficiency gains are too significant and the practice is too widespread to be effectively banned. The solution is a data classification framework that tells employees what data they can and cannot put into which AI tools.
Public data (information that is already public knowledge, your own published content, publicly available research) can generally go into commercial AI tools.
Internal data (internal processes, non-confidential business information) can go into approved enterprise AI tools with appropriate settings.
Confidential client data requires either approved enterprise tools with verified data handling settings or internal AI infrastructure.
Regulated data (health information, financial data subject to specific rules) requires the highest level of protection and should only go into specifically approved tools.
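One way to make a framework like this enforceable is to write it down as a small lookup that an internal tool or browser extension could consult. The sketch below is illustrative only. The tool names, the APPROVED_TOOLS register, and the is_allowed helper are all hypothetical. It also assumes the four classes form a strict ordering, so that a tool approved for a higher class is acceptable for every lower one, which your own policy may or may not want.

```python
from enum import IntEnum


class DataClass(IntEnum):
    """The four classes from the framework, ordered least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL_CLIENT = 2
    REGULATED = 3


# Hypothetical register: tool name -> most sensitive class it is approved for.
# Every name here is a placeholder; a real register would list the tools
# your firm has actually reviewed.
APPROVED_TOOLS = {
    "personal-chatbot-account": DataClass.PUBLIC,
    "enterprise-assistant": DataClass.INTERNAL,
    "enterprise-assistant-verified": DataClass.CONFIDENTIAL_CLIENT,
    "internal-llm": DataClass.CONFIDENTIAL_CLIENT,
    "regulated-data-tool": DataClass.REGULATED,
}


def is_allowed(tool: str, data_class: DataClass) -> bool:
    """Return True if data of this class may be entered into the tool.

    Tools missing from the register are treated as ordinary commercial
    tools, so only public data is permitted (an assumption of this sketch).
    """
    ceiling = APPROVED_TOOLS.get(tool, DataClass.PUBLIC)
    return data_class <= ceiling
```

For example, is_allowed("personal-chatbot-account", DataClass.CONFIDENTIAL_CLIENT) evaluates to False, while is_allowed("internal-llm", DataClass.CONFIDENTIAL_CLIENT) evaluates to True. Treating unlisted tools as public-only keeps the default conservative without blocking harmless use.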