How to actually build GDPR-compliant systems
Most teams don’t fail GDPR because they ignore the rules. They fail because they don’t understand what the law is trying to prevent. This post explains how that intent translates into system design, and what builders need to change to avoid data drifting out of control over time.
Most teams don’t misunderstand the regulation; they misunderstand the intent.
If you ask a typical engineering team about GDPR, they usually know the surface-level requirements. They know users can request deletion, they know data shouldn’t be kept forever, and they know consent matters in some cases.
These requirements are treated as isolated obligations instead of expressions of how a system is supposed to behave over time.
So teams implement what looks necessary. They add a deletion endpoint, define retention policies, maybe introduce a consent layer, and move on. Each decision makes sense on its own, but none of them shape the system itself.
Over time, the system evolves in a different direction. Data accumulates, spreads, and gets reused in ways that no longer match the original intent. Compliance becomes something you try to reconstruct after the fact, instead of something the system naturally supports.
What the law is actually trying to prevent
GDPR is not trying to make systems “compliant” in an abstract sense. It is trying to prevent a specific failure mode that is common in software systems.
Left unchecked, personal data expands. It gets collected with vague justification, copied into logs and secondary systems, reused for new features, and retained long after its original purpose has disappeared. Over time, it becomes difficult to answer basic questions about where data lives, why it exists, and what is happening to it.
The law is designed to stop exactly that.
Read this way, the rules stop feeling separate. They all constrain how data is allowed to evolve inside a system. Data should exist for a reason, it should not quietly turn into something else, it should not live longer than intended, and it should not end up in places you did not account for.
This is what “privacy by design and by default” actually means
“Privacy by design and by default” is not something you add later. It means these constraints are built into the system, and still hold even if no one touches it later.
In practice, that shows up in defaults.
If your system logs full request bodies by default, keeps data indefinitely by default, and shares data with integrations without restriction, then it is non-compliant by default.
The inverse is what the law expects.
If you deploy your system and do nothing else, it should collect only the data it needs. Logs should not contain raw personal data unless explicitly required. Data should expire unless something actively extends its lifetime. Integrations should only receive what they need, not everything that is available.
These outcomes should not depend on discipline or cleanup work. They should be how the system behaves when no one is paying attention.
And that is decided in code.
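As a sketch of what “decided in code” can look like, here is a hypothetical configuration object (the names are illustrative, not from any real framework). The point is that the zero-argument default is already the safe case:

```python
from dataclasses import dataclass

# Hypothetical configuration: deploying with PrivacyConfig() and changing
# nothing should already be the compliant behaviour, not the risky one.
@dataclass(frozen=True)
class PrivacyConfig:
    retention_days: int = 30            # data expires unless something extends it
    log_request_bodies: bool = False    # raw payloads stay out of logs by default
    redact_fields: tuple = ("email", "name", "address")
    integration_fields: tuple = ()      # integrations receive nothing until allowed

default = PrivacyConfig()
assert not default.log_request_bodies  # safe even if no one touches it
```

Making the safe option the default means compliance survives the cases where no one thinks about it, which is exactly where drift starts.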
How systems drift away from that intent
Systems rarely violate this intent in one obvious step. They drift.
A developer adds a field because it might be useful later. For example, instead of storing “is_verified”, the system stores full identity data because it might be needed in the future. That field becomes part of a request payload that gets logged. The logging is configured once, so everything goes in - emails, identifiers, raw input.
Later, the same data is exported to an analytics tool because it is already available. Months later, it is reused in a feature no one originally planned.
Data that once had a narrow purpose now exists in multiple places, is used in ways no one planned, and has no clear end.
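The “is_verified” example above can be made concrete. A minimal sketch, assuming a hypothetical verification flow: the raw identity data is used for the check and then discarded, and only the derived fact is stored.

```python
def store_verification_result(user_id: str, identity_document: dict, db: dict) -> None:
    """Store only the derived fact, not the raw identity data.

    The raw document is used for the check and never persisted; the system
    keeps a boolean it can justify instead of data it "might need later".
    """
    verified = identity_document.get("status") == "valid"  # illustrative check
    db[user_id] = {"is_verified": verified}  # no name, no document number

db = {}
store_verification_result("u1", {"status": "valid", "passport_no": "X123"}, db)
# db now holds only {"u1": {"is_verified": True}}; the passport number never persists
```

Whatever is never written down cannot later be logged, exported, or reused, which is why minimisation at the point of entry is the cheapest control available.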
The same pattern shows up with logging. It starts as observability, but in practice often means logging full payloads because that is the easiest way to debug. At that point, logs stop being just operational data and become a parallel data store - one that is harder to control and often retained longer than primary data.
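One way to stop logs from becoming that parallel data store is to redact known personal fields before anything reaches a log line. A minimal sketch, with an illustrative field list:

```python
import logging

SENSITIVE = {"email", "password", "token"}  # illustrative, not exhaustive

def redact(payload: dict) -> dict:
    """Replace known personal fields before the payload reaches a log line."""
    return {k: ("[REDACTED]" if k in SENSITIVE else v) for k, v in payload.items()}

logger = logging.getLogger("api")
payload = {"email": "a@example.com", "path": "/signup"}
logger.info("request received: %s", redact(payload))
# whatever handler is attached sees the path, not the address
```

A denylist like this is the simplest version; an allowlist of fields that are permitted in logs is stricter and fails safer when new fields are added.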
Integrations create another layer. A third-party service receives data because it is convenient to pass it through. But now that data follows a different lifecycle: different retention, different access, different visibility. From your system’s perspective, it has left. From a legal perspective, it has not.
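The pass-through problem has a simple structural counter: integrations receive an explicit projection of the record, never the record itself. A hedged sketch with hypothetical field names:

```python
def payload_for_integration(record: dict, allowed: frozenset) -> dict:
    """Send an integration only the fields it needs, never the whole record.

    Anything not explicitly allowed simply never leaves the system, so a
    third party cannot start its own lifecycle for data it never received.
    """
    return {k: v for k, v in record.items() if k in allowed}

user = {"id": "u1", "email": "a@example.com", "plan": "pro", "address": "..."}
analytics_payload = payload_for_integration(user, frozenset({"id", "plan"}))
# analytics_payload == {"id": "u1", "plan": "pro"} - the email stays home
```

The allowlist also doubles as documentation: it is a machine-checked answer to “what do we send this vendor, and why?”.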
Why implementing “the rules” is not enough
When teams don’t understand this dynamic, they try to fix compliance at the edges.
They build deletion mechanisms without knowing where data exists. A “delete user” function removes the main record, but leaves logs untouched, analytics data intact, and third-party systems unchanged.

They define retention policies without controlling all storage locations. The database may clean up data after 30 days, but logs keep it for a year because that is the default configuration.

They introduce consent flows without aligning them with actual data usage. The UI reflects one thing, while the system continues behaving the same way underneath.
From the outside, this looks fine.
In reality, it doesn’t address the core issue. If data has already spread, deletion becomes incomplete. If it is copied into logs or external tools, retention becomes inconsistent. If purpose is not reflected in how data is structured and used, limiting its use becomes difficult.
The system appears compliant, but behaves in ways that contradict the intent of the law.
What changes when you understand the intent
For a builder, the shift is not about learning more rules. It is about recognising what must not happen over time.
Data should not expand beyond its original purpose. It should not duplicate into parts of the system that were never meant to handle it. It should not outlive the reason it was collected. And it should not be reused without that change being explicit.
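The last constraint, explicit reuse, can be encoded directly. A minimal sketch (the purpose tags are hypothetical): data carries the purposes it was collected for, and access for any other purpose fails loudly instead of silently becoming a new use.

```python
# Hypothetical purpose tag: reading data for a purpose it was never collected
# for raises an error instead of quietly becoming a new use.
record = {"value": "a@example.com", "purposes": {"account_login"}}

def read_for(record: dict, purpose: str) -> str:
    if purpose not in record["purposes"]:
        raise PermissionError(f"{purpose!r} was never a declared purpose")
    return record["value"]

read_for(record, "account_login")   # fine
# read_for(record, "marketing")     # raises: reuse must be an explicit decision
```

Extending the purposes set then becomes a visible, reviewable change rather than an accident of availability.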
Once you see that, the requirements stop feeling arbitrary. They become guardrails that keep the system from drifting.
That changes how decisions are made.
Adding a field is no longer trivial, because you need to decide whether you actually need the raw value or whether a derived version would be enough. Logging is no longer just debugging, because you have to decide whether that payload needs to be there at all. Integrating another system is no longer just a shortcut, because you need to understand what data you are sending and why.
None of this has to slow you down. It just makes the consequences visible at the moment decisions are made.
Fixing this early matters
These dynamics are easiest to manage when the system is small and data flows are simple.
At that stage, it is still possible to answer questions like “where does this field end up?” or “how would we delete this user completely?” without much effort.
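Answering “where does this field end up?” is easiest when there is something to ask. A minimal, hand-maintained data map is enough at this stage; the entries below are illustrative:

```python
# Illustrative data map: each field lists every place it flows to. The value
# is not the structure itself but that the question has a queryable answer.
DATA_MAP = {
    "email": ["users_db", "auth_logs"],
    "plan": ["users_db", "analytics"],
}

def destinations(field: str) -> list:
    """Every store and integration a field reaches, or empty if untracked."""
    return DATA_MAP.get(field, [])
```

Kept up to date in code review, a map like this is also the starting point for complete deletion: it tells you every location a removal has to cover.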
As the system grows, each decision compounds. Data gets duplicated, dependencies increase, and the cost of change rises. Fixing issues stops being local and becomes systemic. Removing a single piece of data might require changes across multiple services, pipelines, and integrations.
That is why GDPR feels difficult in mature systems. Not because the requirements are complex, but because the system was never designed with those constraints in mind.
What building with the law actually looks like
Building with GDPR in mind does not mean turning engineers into legal experts. It means designing systems that resist the natural tendency of data to spread and accumulate.
In practice, that means you know what happens to data after you introduce it. You know where it flows, where it is stored, and how it can be removed. You avoid introducing data you cannot trace or delete later. You make conscious decisions about what gets logged, what gets shared, and what gets retained.
Data enters the system for a clear reason. Its movement is intentional. Its lifetime is bounded. And its removal is possible without reverse-engineering the system.
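The last of those properties, removal without reverse-engineering, can be sketched as a pattern (the names are hypothetical): every store that holds personal data registers its own deletion handler, so “delete user” is complete by construction rather than by memory.

```python
from typing import Callable

# Hypothetical registry: each store that holds personal data registers a
# handler, so deletion covers every location without anyone remembering them.
deleters: list[Callable[[str], None]] = []

def deletes_user_data(fn: Callable[[str], None]) -> Callable[[str], None]:
    deleters.append(fn)
    return fn

main_db = {"u1": {"email": "a@example.com"}}
log_index = {"u1": ["req-1", "req-2"]}

@deletes_user_data
def delete_from_db(user_id: str) -> None:
    main_db.pop(user_id, None)

@deletes_user_data
def delete_from_logs(user_id: str) -> None:
    log_index.pop(user_id, None)

def delete_user(user_id: str) -> None:
    for fn in deleters:
        fn(user_id)

delete_user("u1")
assert "u1" not in main_db and "u1" not in log_index
```

Adding a new store without a handler is then a reviewable omission, not an invisible gap that surfaces during the next deletion request.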
When those properties are present, compliance is no longer something you assemble later. It is already reflected in how the system behaves.
Most teams don’t fail GDPR because they ignore it. They fail because they never understood what it was trying to prevent - and by the time they realise it, the system has already drifted too far.
About the Author
Yves-Philippe Rentsch
Yves-Philippe is Kolsetu's CISO and DPO with nearly two decades of experience in information security, business continuity, and compliance across finance, software, and fintech. Outside his day-to-day work, he enjoys writing about cybersecurity, data privacy, and the occasional industry rant - usually with the goal of making complex security topics a bit more understandable.