
AI Systems Should Not Learn From You

AI systems don’t just raise questions about where data is stored, but about how it influences behaviour. This article explains why architectural boundaries matter - and how we keep data contained and system behaviour predictable and under control.

Yves-Philipp Rentsch
6 min read
2 February 2026

Many companies say: “We don’t train on your data.” It sounds precise. Reassuring, even. But in practice, it often skips over the part that actually matters. Because depending on how a system is built, your data can still influence behaviour without ever being used to explicitly retrain a model.

That influence just shows up in less obvious places. Shared embeddings get updated. Ranking improves globally. Retrieval logic adapts based on aggregated usage. Nothing is labelled as “training,” but behaviour still shifts based on inputs that originate somewhere else.

From a technical perspective, this is where things start to diverge. There is a meaningful difference between systems that update shared model weights, systems that optimise globally across tenants, and systems that keep each environment properly isolated. They may look similar from the outside. They are not.
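The distinction can be made concrete with a toy sketch. All names here (`TenantIndex`, `record_feedback`) are hypothetical, not a real API: the point is only that when ranking state is scoped per tenant, one tenant's usage cannot shift another tenant's behaviour, which is exactly what a globally optimised ranker cannot guarantee.

```python
class TenantIndex:
    """Ranking state scoped to one tenant: feedback recorded here never leaks out."""

    def __init__(self):
        self.click_counts = {}  # doc_id -> clicks, used to boost ranking

    def record_feedback(self, doc_id):
        self.click_counts[doc_id] = self.click_counts.get(doc_id, 0) + 1

    def rank(self, doc_ids):
        # Order candidates by this tenant's own feedback only.
        return sorted(doc_ids, key=lambda d: -self.click_counts.get(d, 0))


# Isolated deployments: tenant A's usage does not shift tenant B's ranking.
tenant_a, tenant_b = TenantIndex(), TenantIndex()
tenant_a.record_feedback("doc-42")

print(tenant_a.rank(["doc-1", "doc-42"]))  # ['doc-42', 'doc-1']
print(tenant_b.rank(["doc-1", "doc-42"]))  # ['doc-1', 'doc-42'] - unaffected
```

A shared ranker would merge those click counts into one structure, and tenant B's ordering would quietly change because of tenant A's traffic - influence without any raw data moving.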

The Difference Between Data and Influence

Most conversations around AI security still revolve around data exposure. Is it encrypted? Who can access it? Where is it stored? All valid questions. Just not the full picture.

A system can keep data technically secure and still allow influence to move in ways that are much harder to see. If interactions from one customer affect shared embeddings, tweak ranking behaviour, or shape how retrieval is optimised globally, then one environment is influencing another. No raw data needs to be visible for that to happen.

Over time, this creates coupling. Behaviour starts to depend on signals that are not visible within a given system. When something changes, it becomes difficult to explain why. We’ve seen systems where behaviour shifted week to week, and nobody could point to a single change that caused it. That is usually the moment people realise they are no longer fully in control of the system.

Why Architectural Boundaries Matter

A lot of systems today process interactions locally but optimise globally. On paper, that sounds efficient. In practice, it introduces a mismatch. Execution happens inside a tenant, but learning happens across tenants. Behaviour evolves based on signals that are not contained within the system’s own context.

You can get away with that for a while. In regulated environments, not for long. If outputs change, someone will eventually ask why. If decisions differ, there needs to be a traceable path from input to outcome. Once influence is distributed across environments, that traceability starts to fall apart. The system still works. But it becomes harder to reason about - and even harder to defend.
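What a traceable path from input to outcome can look like is easiest to show in code. This is a minimal sketch, not our implementation; the function and field names are illustrative. The key property is that every element of the trace is visible inside the tenant: the query, a hash of the configuration, and the exact documents that shaped the answer.

```python
import hashlib
import json


def answer_with_trace(query, config, retrieve, generate):
    """Produce an answer plus a trace that maps the output back to concrete inputs."""
    docs = retrieve(query)
    output = generate(query, docs)
    trace = {
        "query": query,
        # Hashing the config makes "what settings produced this?" answerable later.
        "config_hash": hashlib.sha256(
            json.dumps(config, sort_keys=True).encode()
        ).hexdigest(),
        "retrieved": [d["id"] for d in docs],
        "output": output,
    }
    return output, trace


# Toy retrieval/generation stand-ins, just to exercise the sketch.
docs = [{"id": "kb-1", "text": "reset your password via settings"}]
out, trace = answer_with_trace(
    "how do I reset my password?",
    {"model": "local", "top_k": 1},
    retrieve=lambda q: docs,
    generate=lambda q, ds: ds[0]["text"],
)
print(trace["retrieved"])  # ['kb-1'] - the output maps back to named inputs
```

If behaviour instead depended on signals aggregated across tenants, no record like this could be complete: part of the explanation would live outside the system being audited.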

What This Looks Like in Practice

Many modern AI systems rely on external model providers to generate responses. That means data leaves the system boundary, is processed by a third-party model, and then returned as an output. Providers will usually state that the data is not stored or used for training. Contractually, that may well be true. But that is not the whole story.

Because the system still depends on what is sent to that model. And in many implementations, that responsibility sits with the customer or the application layer. If personal data is included in prompts, it will be processed. Not maliciously. Not incorrectly. Just… by design. At that point, you are no longer dealing with a purely contained system. You are relying on a combination of configuration, discipline, and, if we’re honest, a bit of hope that nothing unintended slips through.

And yes, that includes systems like Intercom’s Fin or similar AI agents. They sit on top of external LLMs. They generate responses based on customer data. And while they provide controls, they do not fundamentally eliminate the possibility that personal data is processed externally. If your architecture allows that path, you own that risk.
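To see where that responsibility sits, consider a sketch of the kind of redaction step an application layer might run before a prompt leaves the boundary. This is deliberately naive - two regexes for emails and phone numbers - and real deployments need far more than pattern matching; it only illustrates that whatever is not stripped here gets processed by the external model by design.

```python
import re

# Illustrative patterns only; real PII detection is much harder than this.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\+?\d[\d\s-]{7,}\d"), "[PHONE]"),
]


def redact(prompt):
    """Replace obvious personal identifiers before the prompt crosses the boundary."""
    for pattern, placeholder in PATTERNS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt


prompt = "Customer jane.doe@example.com (+41 79 123 45 67) asked about billing."
print(redact(prompt))
# Customer [EMAIL] ([PHONE]) asked about billing.
```

Everything this step misses - names, addresses, free-text identifiers - still travels to the provider. That is the gap between a contractual "we don't train on it" and an architectural guarantee.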

A Deliberate Approach to Isolation

At Kolsetu, we chose to remove that entire class of problems at the architectural level. We do not fine-tune shared foundation models with customer data, and we do not allow behavioural optimisation to happen across tenants. Model weights stay exactly as they are, regardless of how individual systems are used. Instead, behaviour is shaped through context.

Each deployment runs in its own environment, with its own knowledge layer and its own data pipeline. Information is stored and retrieved per tenant, using embeddings and vector stores that never leave that boundary. Retrieval is scoped, indexing is isolated, and access is controlled end to end.

It is not the most “efficient” way to build a global system. It is a much cleaner way to build something you can actually control.
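The scoping idea can be sketched in a few lines. This is not our stack - the "embedding" below is just token overlap, and `TenantVectorStore` and `store_for` are made-up names - but it shows the structural property: every lookup goes through a store keyed by tenant, so a query can never touch another tenant's index.

```python
class TenantVectorStore:
    """A toy per-tenant index; a real one would hold embedding vectors."""

    def __init__(self):
        self.docs = {}  # doc_id -> set of tokens

    def index(self, doc_id, text):
        self.docs[doc_id] = set(text.lower().split())

    def search(self, query, top_k=1):
        q = set(query.lower().split())
        scored = [(doc_id, len(tokens & q)) for doc_id, tokens in self.docs.items()]
        hits = [d for d, s in sorted(scored, key=lambda x: -x[1]) if s > 0]
        return hits[:top_k]


stores = {}  # one isolated store per tenant, never merged


def store_for(tenant_id):
    return stores.setdefault(tenant_id, TenantVectorStore())


store_for("acme").index("kb-1", "vpn setup guide")
store_for("globex").index("kb-9", "expense report policy")

print(store_for("acme").search("vpn setup"))    # ['kb-1']
print(store_for("globex").search("vpn setup"))  # [] - only its own documents exist
```

The isolation here is structural rather than procedural: there is no code path on which a query and a foreign tenant's index ever meet, so no filter or policy check has to catch the mistake.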

How Systems Improve Without Shared Learning

None of this means the system stands still. Improvement still happens. It just happens locally. Over time, systems become more effective because the knowledge base gets cleaner, retrieval improves, and context is assembled more precisely. The model itself does not change. What changes is how information is selected and used. It is a quieter form of learning. Less impressive in a demo. Much more predictable in production. And importantly, when behaviour improves, you can actually explain why.
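The "quieter form of learning" amounts to this: the model function never changes, and improvement comes entirely from the context fed into it. A minimal sketch, with all names hypothetical:

```python
def frozen_model(query, context):
    """Stand-in for a fixed model: its behaviour is a pure function of its inputs."""
    return f"Answer based on: {context}"


knowledge_base = {"billing": "old billing note"}

before = frozen_model("billing question", knowledge_base["billing"])

# "Improvement" is curating the knowledge base - the model function is
# byte-for-byte the same before and after, so the change is fully explainable.
knowledge_base["billing"] = "updated, cleaner billing procedure"
after = frozen_model("billing question", knowledge_base["billing"])

print(before)  # Answer based on: old billing note
print(after)   # Answer based on: updated, cleaner billing procedure
```

When the output changes, the diff is sitting in the knowledge base, not buried in adjusted weights.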

Implications for Data Protection and Governance

This architecture has direct consequences. Personal data stays where it originates. There is no cross-tenant influence, no aggregation of behavioural signals, and no blending of context across systems. When outputs change, you can trace it back to something concrete: data, configuration, workflow. Not some invisible feedback loop buried in a shared system.

It also avoids drifting into areas that raise regulatory eyebrows. There is no cross-context profiling, no hidden optimisation layer that mixes signals across environments, and no reliance on customers to “be careful” with what they send. From a compliance standpoint, that matters.

Security Is About Controlling Influence

Most security discussions still focus on protecting data. That is necessary, but it is only half the story. In AI systems, you also need to control how data affects behaviour. Where influence flows. Where it stops. What it can and cannot change. If you do not control that, systems slowly become harder to explain - even if everything is encrypted and access-controlled. If you do control it, behaviour stays predictable, even as the system evolves. That distinction tends to matter more over time than most people expect.

The Role of Systems Like Elba

This is the principle behind how Elba is designed. Elba operates in structured environments where context persists and workflows are explicit. It retains relevant information over time, but only within the scope of a given system. That allows it to combine past interactions and current inputs in a way that improves outcomes - without introducing cross-tenant dependencies. Because retrieval is controlled, outputs stay grounded. Because workflows are defined, decisions remain traceable. And because environments are isolated, behaviour does not drift just because something changed somewhere else. It is a slightly less magical approach. But it is a lot more reliable.

Conclusion

The real question is not whether a system trains on your data. It is how your data can influence the system at all. Once influence starts moving across boundaries, systems become harder to understand, harder to control, and harder to explain. Keeping that influence contained does not make systems simpler, but it makes them predictable. At Kolsetu, that is a trade-off we are willing to make. Because in operational systems, especially in regulated environments, predictability tends to matter more than cleverness.

About the author

Yves-Philipp Rentsch

Yves-Philipp is Kolsetu's CISO and DPO with nearly two decades of experience in information security, business continuity, and compliance across finance, software, and fintech. Outside his day-to-day work, he enjoys writing about cybersecurity, data privacy, and the occasional industry rant - usually with the goal of making complex security topics a bit more understandable.
