Every AI vendor’s website has some version of this language: “We take your data privacy seriously,” “Your data is your data,” “Enterprise grade security.” It is so universal that it has stopped meaning anything. Buyers skim past it. AI vendors copy each other’s privacy pages. The whole category has become unfalsifiable marketing.
Underneath the marketing, the practices are very different. Some vendors train on customer data by default. Some train unless customers opt out. Some train only when customers opt in. Some do not train but reserve the right to. Some make structural commitments that can be audited. The differences matter enormously, but they are invisible to most buyers because everyone uses the same language.
This article is about why the differences matter, what to look for instead of marketing claims, and what a real structural commitment looks like.
Why training on customer data is so common
Three reasons AI vendors train on customer data, listed in order of how charitable they sound.
Reason 1: To improve the product for the customer. The argument goes: if we train on your data, our model gets better at your specific kind of work, and you benefit. This is partially true at the technical level. Models do get better at specific domains when trained on domain data. But the structure of “training on customer data” almost always means training a shared model that all customers see, not training a customer specific model. The benefit to any specific customer is marginal, and the cost (their data informs a model competitors might access) is significant.
Reason 2: To improve the product for all customers. This is more honest. The vendor argues that aggregating training data across customers produces a better product for everyone. This is the rationale most vendors actually pursue, often quietly. It is also the rationale that customers most clearly disagree with: most customers do not want their data benefiting their competitors.
Reason 3: To monetize the data. This is the rationale vendors never state publicly but sometimes pursue. Aggregated data has value beyond improving a model. It can be sold to research firms, used to train downstream products, or exposed to partners. This is the practice that makes “we train on customer data” most alarming to enterprise buyers.
Three reasons, three different levels of justification, all enabled by the same broad permission language: “we may use customer data to improve our services.”
The Atlassian moment
In April 2026, Atlassian announced that starting August 2026, Rovo would train on customer data from Free, Standard, and Premium tiers by default. Enterprise tier customers could opt out, but lower tiers could not. This was a policy change disclosed via an updated terms of service document, with limited proactive communication.
The reaction was telling. Enterprise customers, who had opt out options, mostly accepted the change. Small teams and mid market customers, who did not have opt out, started actively looking for alternatives. The Atlassian decision did not break their enterprise business, but it did permanently undermine trust at the SMB and mid market level.
This is the pattern. Once a vendor establishes a position on data training, changing it is incredibly costly. Vendors who start out training on customer data tend to keep training. Vendors who commit to never training tend to keep that commitment. The starting position becomes the long term position because the cost of breaking customer trust is much higher than the benefit of changing the policy.
This is why structural commitments matter. The decision a vendor makes at founding is the decision they are likely to keep for a decade.
The range of positions vendors actually take:
- Aggressive: default opt in. Trains automatically on customer content under broad permissions language.
- Common: default training, opt out available. Trains by default; the opt out exists but is rarely exercised because it is hard to find.
- Customer controlled: limited training. Trains only on explicitly opted in tenants.
- Contractual: no training (policy). A promise in the terms of service, reversible by updating the terms.
- Where Pulse sits: no training (structural). The training pipeline does not exist; there is nothing to opt out of.
What “structural commitment” actually means
There is a meaningful difference between policy commitments and structural commitments. The distinction matters because policy commitments can be reversed at any time. Structural commitments cannot.
Policy commitment example. A vendor says in their terms of service that they will not train on customer data. This is a contractual promise. The vendor could update the terms (with notice) and start training. The commitment lasts as long as the policy stays in place. Real, but reversible.
Structural commitment example. A vendor builds their product so that customer data never enters their training pipeline. The training infrastructure has no connection to customer data. There is no internal team that has access to both customer data and training systems. Reversing this commitment would require rebuilding internal systems, not just changing a policy. Real and structurally durable.
The practical test is what a vendor’s engineering team would have to do to start training on customer data tomorrow.
For Pulse, the answer is: rebuild meaningful parts of the system. Customer data lives in tenant isolated databases with row level security. Training infrastructure does not exist as a separate concept because we do not train on anything. To start, we would have to build a training pipeline, decide what to train it on, build the consent mechanisms (or eliminate them), build the data export mechanisms, and survive the customer reaction. That is a multi month commitment.
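To make “the pipeline does not exist” concrete, here is a minimal sketch of tenant isolation enforced at the database layer. Everything in it is hypothetical: the withTenant helper, the app.tenant_id setting, and the documents table are illustrative, not Pulse’s actual code. It only shows the pattern, in which the database itself filters every row by tenant and no reader exists that is not tenant scoped.

```typescript
// Hypothetical sketch of tenant isolation with Postgres row level
// security. Names and schema are illustrative, not Pulse's actual code.
//
// One-time migration (SQL), so the database enforces isolation:
//   ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
//   CREATE POLICY tenant_isolation ON documents
//     USING (tenant_id = current_setting('app.tenant_id')::uuid);

import { Pool, PoolClient } from "pg";

const pool = new Pool();

// Every read or write runs inside a transaction whose tenant id is bound
// as a transaction-local setting; the RLS policy filters every row.
async function withTenant<T>(
  tenantId: string,
  fn: (client: PoolClient) => Promise<T>
): Promise<T> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    // set_config(..., true) is transaction-local, so nothing leaks
    // between pooled connections.
    await client.query("SELECT set_config('app.tenant_id', $1, true)", [
      tenantId,
    ]);
    const result = await fn(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}

// Note what is absent: there is no cross tenant reader and no bulk
// export helper. A training job would have nothing to call.
```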
For most vendors, the answer is: flip a config flag. Their training pipelines already exist. They train on something. Switching to “also train on customer data” is operationally a small change. The contractual commitment is what keeps them from doing it.
Buyers cannot easily tell the difference between policy and structural commitments from marketing copy. But the difference is enormous.
The four structural commitments worth looking for
When evaluating an AI vendor, ask about these four specifically. Vague answers tell you the commitment is not structural.
- Training on customer data. “Never” is what you want to hear. “Only with explicit opt in” is acceptable. “We may train to improve our services” is a red flag.
- Individual productivity metrics. Does the system measure and surface “user X sent N messages this week” or similar individual level productivity data? If yes, the vendor is building surveillance infrastructure regardless of how they market it. Microsoft’s 2020 Productivity Score scandal still haunts that company; vendors building similar capabilities are setting up the same scandal.
- Permission expansion. Does the AI tool show users content they could not access in the source systems? Some tools “expand” permissions to make answers more useful. This is a security gap waiting to be discovered by an enterprise customer. Tools with permission inheritance (showing only what the user can see in source systems) avoid this gap structurally; a sketch of the pattern follows this list.
- Audit logging. Is every retrieval, action, and decision logged with full traceability? Audit logs are how enterprises detect when something has gone wrong. Tools without comprehensive audit logging are unauditable, which is itself a security gap.
Each of these is a yes or no question. Vendors who answer with hedged language are revealing where they actually stand.
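Here is a minimal sketch of the permission inheritance pattern from the third question. The gate name anticipates the one Pulse describes below; vectorSearch, SearchHit, and the limits are assumptions, not any vendor’s real API.

```typescript
// Hypothetical sketch of a permission-inheriting retrieval gate.

interface SearchHit {
  documentId: string;
  score: number;
}

// Assumption: connectors mirror each source system's ACLs, so this can
// answer "which documents can this user see right now?"
declare function visibleDocumentIds(userId: string): Promise<Set<string>>;

// Assumption: an index search that knows nothing about permissions.
declare function vectorSearch(
  query: string,
  limit: number
): Promise<SearchHit[]>;

// The filter runs on every retrieval, and the signature has no bypass
// flag, so "permission expansion" has no code path.
async function retrieve(userId: string, query: string): Promise<SearchHit[]> {
  const visible = await visibleDocumentIds(userId);
  const candidates = await vectorSearch(query, 200); // over-fetch, then gate
  return candidates
    .filter((hit) => visible.has(hit.documentId))
    .slice(0, 20);
}
```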
What this means for Pulse
We made these four commitments structural before launch, not because we expected immediate enterprise buyers, but because we believed they would become a competitive advantage over time.
The reasoning was: AI tools see more sensitive information than any prior generation of enterprise software. The category will be judged on its trust posture. Vendors who commit early to anti surveillance, anti training, anti permission expansion practices will become the safe choice as the category matures. Vendors who do not will get caught by the next Atlassian moment.
Pulse’s manifesto codifies this. Specifically:
- No training on customer data, ever. Structural, not policy.
- No individual productivity metrics. We do not surface “Sarah sent 47 messages this week” to anyone.
- No permission expansion. Every retrieval inherits from source system ACLs through our visibleDocumentIds() gate. There is no admin override.
- Full audit logging. Every retrieval, every action, every Skill invocation, every connector sync. (The event shape is sketched below.)
These are not marketing claims. They are architectural decisions. The product would not work without them.
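As one concrete illustration of the fourth commitment, here is a minimal sketch of what a fully traceable audit event can look like. The field names and event types are assumptions, not Pulse’s published schema.

```typescript
// Hypothetical audit event shape; fields are illustrative only.
interface AuditEvent {
  timestamp: string; // ISO 8601
  tenantId: string;
  actorId: string; // the user or Skill that acted
  eventType: "retrieval" | "action" | "skill_invocation" | "connector_sync";
  resourceIds: string[]; // documents or records touched
  requestId: string; // correlates events across one end-to-end request
}

// Assumption: an append-only sink (table, log stream, or WORM storage)
// that nothing else in the product can update or delete.
declare function appendAudit(event: AuditEvent): Promise<void>;

// Example: the retrieval gate sketched earlier would emit one event per
// call, so an unlogged retrieval has no code path either.
async function logRetrieval(
  tenantId: string,
  userId: string,
  requestId: string,
  documentIds: string[]
): Promise<void> {
  await appendAudit({
    timestamp: new Date().toISOString(),
    tenantId,
    actorId: userId,
    eventType: "retrieval",
    resourceIds: documentIds,
    requestId,
  });
}
```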
- 01 · Structural: No training on customer data. The pipeline does not exist. Enforced in the architecture, not the policy.
- 02 · Structural: No individual productivity metrics. We do not track what individual users do. Enforced in the architecture.
- 03 · Structural: No permission expansion. Inherits from source, with no admin override. Enforced in the architecture.
- 04 · Structural: Full audit logging. Every action recorded with traceability. Enforced in the architecture.
Why this is genuine differentiation
A common pushback to anti training positioning is: “But your competitors will eventually adopt the same position, and then this is not a differentiator anymore.”
This is wrong in a specific way. The commitment is durable precisely because it is structural. Once a vendor like Atlassian starts training, undoing that decision is enormously costly: it requires unwinding existing training data, breaking customer relationships built on the training enabled features, and rebuilding trust that has been broken once.
Vendors who started without training do not face this cost. They keep the commitment because it is not costly to keep. They benefit from the trust position as the category matures.
The differentiation is not “we do not train on data right now.” The differentiation is “we structurally cannot train on data, and our position will be the same in five years.” That is defensible.
For customers, this matters because the AI tools your team uses today are tools you will likely keep for years. Picking a vendor whose privacy posture might shift under competitive pressure is a different decision than picking a vendor whose privacy posture is structurally fixed.
Closing: what to ask your AI vendor
Three questions to ask any AI vendor before buying.
- Do you train on customer data? If yes (in any form), proceed cautiously and read the fine print. If no, ask the next question.
- Is the no training commitment policy or structural? If policy, it can change at any time. If structural, ask the next question to test what they mean.
- What would have to happen at your company for you to start training on customer data tomorrow? This is the critical question. If the answer is “we would update our terms of service and start training,” the commitment is weak. If the answer is “we would have to rebuild meaningful parts of our infrastructure,” the commitment is real.
Pulse’s answer to the third question: we would have to build a training pipeline that currently does not exist, decide what to train it on, build consent mechanisms, and survive the inevitable customer reaction. This is multi month work and probably company ending. The commitment is structural.
If you are evaluating AI tools for your team, the trust posture is one of the most important and most underestimated factors. Pulse is built around structural commitments specifically because we expect the next decade of AI to be defined by which vendors can be trusted with sensitive team data. The architecture is designed for that future.
Live demo at pulsehq.tech. The full trust manifesto is linked from the landing page.