What's safe to put into AI: a CEO's data-handling guide

A traffic-light rule your whole team can remember: what's always safe to put into AI, what needs care, and what should never touch a consumer account.

What you'll have when you're done

A clear, memorable rule for what data is safe to put into AI, and the confidence to give your team a straight answer instead of a nervous "be careful." You will understand the one fact that governs all of it (consumer plans versus business plans), and you will have a green-yellow-red framework simple enough that a new hire gets it in 30 seconds. This is the foundation under any safe AI rollout.

Your team is already pasting company data into AI, and they don't know the rule

Picture the scene one operator described: your security protocols look on while an employee pastes decades of company information into free ChatGPT to draft an email. It happens constantly, not because anyone is reckless but because nobody told them the rule, and the rule is genuinely counterintuitive.

Here is the fact that changes everything: on consumer AI plans (free or personal-paid), your conversations are, by default, used to train the model. Anthropic's consumer terms moved to that default in August 2025; consumer ChatGPT works the same way. So the customer list someone pasted to "just draft a quick email" is, by default, training data, not a private vault. The fix is not fear; it is a simple rule everyone can follow. (The deeper mechanics live in "Is your data safe in AI"; this is the rule you hand your team.)

I will admit I got this wrong before I understood it. Early on I pasted a chunk of a real customer contract into a personal account to get a plain-English summary. It felt like the most natural thing in the world, and only later did I register that I had just handed a third party a confidential document under an NDA, on a tier that by default could train on it. Nothing came of it. But "nothing came of it" is not a control; it is luck, and luck is not a policy you can give 30 people. That near-miss is exactly why the rule below is blunt and memorable rather than nuanced and forgettable.

What you need first

A list of the plans your team actually uses, including the personal accounts nobody mentions in the meeting.
Five minutes to learn the green-yellow-red rule and make it yours.
A way to communicate it (this becomes the core of your AI usage policy).

The rule, step by step

Step 1. Know which world you're in: consumer or business

Everything hinges on this. Consumer plans (Claude Free/Pro/Max, ChatGPT Free/Plus) may train on your chats by default and retain them. Business plans (Claude Team/Enterprise, ChatGPT Team/Enterprise) contractually do not train on your data and give you retention control. The plan, not the product, decides whether data is safe. Before applying the rest of the rule, know your tier.

The trap is that the app looks identical across tiers, so people assume "we pay for it, so it's safe," which conflates a personal paid plan with a business one. They are not the same: a personal Pro subscription is still a consumer plan. The way to actually know is to check whether the account was provisioned by your company (through Google Workspace or your SSO) and whether there is a signed agreement with the vendor. If someone signed up with their own email and a credit card, it is consumer, no matter how much it costs. When in doubt, assume consumer until proven otherwise.
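The check above can be written down as a tiny decision function. This is only a sketch: the `Account` fields and the `account_tier` name are hypothetical labels for the two questions in the paragraph (was the account provisioned by the company, and is there a signed vendor agreement), not part of any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Facts about one AI account. Field names are hypothetical."""
    provisioned_by_company: bool   # created through Google Workspace or SSO
    signed_vendor_agreement: bool  # a signed agreement with the vendor exists


def account_tier(account: Account) -> str:
    """Return "business" only when both checks pass; otherwise "consumer"."""
    if account.provisioned_by_company and account.signed_vendor_agreement:
        return "business"
    # Price is not an input: a personal paid plan is still a consumer plan.
    # Anything unproven falls through to "consumer".
    return "consumer"


personal_pro = Account(provisioned_by_company=False, signed_vendor_agreement=False)
company_seat = Account(provisioned_by_company=True, signed_vendor_agreement=True)
print(account_tier(personal_pro))  # consumer
print(account_tier(company_seat))  # business
```

Note that the function has no "unknown" result. An account that passes only one of the two checks is treated as consumer, which mirrors the "assume consumer until proven otherwise" default.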

Step 2. Green: public or already-shared. Safe anywhere

If the information is already public or freely shared (your website copy, a published blog post, a generic question), it is green. Put it into any AI, any tier. No risk, because there is nothing to leak. Concretely, green is: your published marketing copy, a press release, a generic how-to question ("how do I structure a cold email?"), anything already on your website, and public information about other companies. If it is already on the open internet, AI seeing it changes nothing.

Step 3. Yellow: internal but not sensitive. De-identify or use business tier

Internal information that is not sensitive (a draft strategy memo with no names, an internal process question) is yellow. Fine on a business plan. On a consumer plan, strip the identifying specifics first. When in doubt between green and yellow, treat it as yellow and de-identify. Concretely, yellow is: an unannounced product idea with no customer names attached, internal process docs, a rough org-design question, anonymized metrics ("revenue grew 30% last quarter, why might margin still drop?"). The move that makes yellow safe on any tier is de-identification: "our customer Acme is unhappy about latency" becomes "a key customer is unhappy about latency." Same useful question, nothing leaked.
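For teams that want a helper for the de-identification step, here is a minimal sketch. It assumes you maintain your own list of customer names; the function name `deidentify` and the `[CUSTOMER]` placeholder are hypothetical choices. It only catches names on that list, so emails, account numbers, and other identifiers still need a human read before anything is pasted.

```python
import re


def deidentify(text: str, customer_names: list[str], placeholder: str = "[CUSTOMER]") -> str:
    """Replace each known customer name with a generic placeholder."""
    for name in customer_names:
        # The lookarounds stop "Acme" from matching inside a longer word,
        # and re.escape keeps punctuation in names (e.g. "Acme Inc.") literal.
        pattern = re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)", flags=re.IGNORECASE)
        text = pattern.sub(placeholder, text)
    return text


print(deidentify("Our customer Acme is unhappy about latency", ["Acme"]))
# Our customer [CUSTOMER] is unhappy about latency
```

A fixed placeholder keeps the question useful while removing the one detail that identifies the customer.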

Step 4. Red: PII, regulated, secret, or under NDA. Business tier only, never personal

Customer or employee data, regulated data (health, financial, payment), trade secrets, anything under an NDA: all of this is red. Red only goes into a business plan with training off, and never into a personal account. The blunt version for your team: if you would be uncomfortable seeing it on a competitor's screen, it is red. Concretely, red is: a customer list or any customer PII, employee records and comp, your actual financials, source code or proprietary algorithms, anything a contract or NDA covers, board materials, and regulated data (health under HIPAA, payment-card data, personal data covered by GDPR). The test that resolves most arguments: would a screenshot of this in the wrong hands be a problem you would have to manage? If yes, red.
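Taken together, the three buckets and the two tiers form a small lookup table. The sketch below encodes it; the names `Bucket` and `decide` and the three result strings are hypothetical, and "business" is assumed to mean a business plan with training off, as described above.

```python
from enum import Enum


class Bucket(Enum):
    GREEN = "green"    # public or already shared
    YELLOW = "yellow"  # internal but not sensitive
    RED = "red"        # PII, regulated, secret, or under NDA


def decide(bucket: Bucket, tier: str) -> str:
    """Return "allow", "deidentify_first", or "block" for a bucket and tier."""
    if tier not in ("consumer", "business"):
        raise ValueError(f"unknown tier: {tier!r}")
    if bucket is Bucket.GREEN:
        return "allow"  # nothing to leak, any tier
    if bucket is Bucket.YELLOW:
        return "allow" if tier == "business" else "deidentify_first"
    # RED: business tier only, never a personal account.
    return "allow" if tier == "business" else "block"


print(decide(Bucket.GREEN, "consumer"))   # allow
print(decide(Bucket.YELLOW, "consumer"))  # deidentify_first
print(decide(Bucket.RED, "consumer"))     # block
print(decide(Bucket.RED, "business"))     # allow
```

The only outright "block" in the table is red data on a consumer account, which is the single case the rule exists to prevent.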

Step 5. Remember that "delete" isn't "gone"

One fact that kills the "I'll just delete the chat" defense: in 2025 a court order forced an AI provider to preserve consumer chats, including deleted ones, for a period of time. On consumer tiers, deletion is not a guarantee of erasure. That is another reason red data belongs only on a business tier built for it.

How you'll know it's working

Your team can answer "is this safe to put in AI?" themselves, without asking you, because the rule is simple enough to internalize. You stop having the vague anxiety about what people are pasting, because there is a clear line and a sanctioned place for the sensitive stuff. The real win is that people keep using AI (you want that) without the leaks.

When it breaks

People still paste red data into personal accounts. The rule alone is not enough without a sanctioned alternative. That is the next workflow: set up AI without leaking customer data.
"Yellow vs red" gets debated forever. Default to the stricter bucket. When unsure, treat it as red; the cost of over-caution is low.
Someone says "but it's encrypted / they promise privacy." Marketing promises are not the same as the contractual terms of your tier. Verify against the plan, not the homepage.
People paste red data because de-identifying is annoying. That friction is real. The durable fix is not nagging, it is giving them a business-tier account where red data is allowed, so there is nothing to strip. The rule and the sanctioned tool work together; the rule alone leans on willpower.
A clever tool "needs" the raw data to be useful. Sometimes true (a customer-support bot has to see customer messages). That is fine on the business tier with training off; it is not fine on a personal account. The rule is about where, not whether you can use real data at all.
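The "default to the stricter bucket" tie-breaker from the list above can be sketched as taking the maximum over an ordered set of buckets. The names here are hypothetical, and treating an empty list of opinions as red is an assumption that follows the "when unsure, treat it as red" guidance.

```python
from enum import IntEnum


class Bucket(IntEnum):
    # Ordered from least to most strict so that max() picks the stricter one.
    GREEN = 0
    YELLOW = 1
    RED = 2


def resolve(candidates: list[Bucket]) -> Bucket:
    """Settle a debate by returning the strictest bucket anyone proposed."""
    if not candidates:
        # Assumption: nobody can classify it, so treat it as red.
        return Bucket.RED
    return max(candidates)


print(resolve([Bucket.YELLOW, Bucket.RED]).name)    # RED
print(resolve([Bucket.GREEN, Bucket.YELLOW]).name)  # YELLOW
```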

Make it yours. The three buckets are universal; the examples should be yours. Spend ten minutes writing the five most common red items for your specific business (for a clinic, patient records; for an agency, client creative under NDA; for a fintech, anything touching account numbers) and the yellow items people debate. A rule with your own examples in it is one people actually apply, because they recognize their own work in it rather than translating from a generic list.

Where this fits in your harness

This rule is the foundation of safe AI use, and it is deliberately simple so it travels. The full mechanics are in "Is your data safe in AI"; the reason banning AI backfires is in "Shadow AI". Once your team knows the rule, give them the safe default with "Setting up AI without leaking customer data" and codify it in a one-page usage policy.
