Is GDPR-compliant AI on a website even possible?

Yes, but not as a blanket answer. What matters is which data goes where and whether it ends up in training. Server-side AI with no personal data is unproblematic; a chatbot that sends user input to an external LLM needs a data processing agreement, a transparent privacy notice, genuine consent, and, for non-EU vendors, a third-country transfer assessment.

Do I need a data processing agreement for an LLM?

Yes, as soon as personal data, even just the IP address or the input content, flows to an external LLM vendor. The DPA governs the processing on your behalf. For vendors outside the EU, Standard Contractual Clauses and a third-country transfer assessment under Schrems II apply on top. Check in the same step whether a training exclusion is contractually guaranteed.

What is the most common mistake in data protection for AI features?

The forgotten deletion path. Many teams implement consent and the DPA cleanly but cannot fully remove the data on an Article 17 erasure request, because embeddings sit in the vector store and conversation histories live outside the main database. The deletion path has to cover every storage location from day one.

Which AI features should you deliberately not build?

Emotion analysis, behavioural profiling via webcam, microphone, or input patterns, and fully automated decisions with legal effect under Article 22 GDPR. These functions usually force a Data Protection Impact Assessment, are hard to justify, and increasingly collide with the EU AI Act. In nine out of ten cases the business goal is reachable without them.

GDPR-Compliant AI on Your Website: What Works, What Doesn't

GDPR-compliant AI on your website is possible, but never as a blanket yes, and never by embedding a chatbot shipped out of the United States. If you want a clear line, you get one the moment you keep two questions apart: where does the data go, and where does it come from?

Why "GDPR-compliant AI on your website" so rarely gets a clear answer

In roughly every second conversation over the past twelve months, some version of this lands on the table: "Can we use AI on our website without running into the GDPR?" We are not lawyers and we do not replace legal advice. But we have spent two years building AI-assisted features into B2B websites (lead qualification, content generation, semantic search) and we have wrestled, very concretely, with what it takes technically and contractually for this to hold up inside the European framework. Anyone who needs legal certainty at the end still talks to a lawyer, but by then with far sharper questions.

The question stays open because it conflates two things that belong apart. The first fault line is data flow: which data moves, where to, and who processes it? That determines whether you need a data processing agreement, whether a third-country transfer happens, and whether Standard Contractual Clauses have to apply. The second fault line is data origin: whose data may be processed at all, and does it end up in a model's training data?

Lay those two axes down and every use case sorts into one of four categories: unproblematic, consent-required, contractually coverable, or better left unbuilt. That replaces gut feel with a decision logic you can defend in front of your data protection officer or the board.

The four categories for any AI use case

Case 1: server-side AI with no personal data. A keyword generator, a translation, a summary of editorial content. Here the GDPR is not the bottleneck, because no personal data arises in the first place. The basics still apply: a vendor with a European server region (Anthropic Claude, OpenAI via Azure EU, Mistral, Aleph Alpha), a clean data processing agreement, disciplined API key management. This is the unproblematic category, and the bulk of what marketing departments actually want falls right here.

Case 2: AI with user data in real time. Chatbot, semantic search, lead qualification. The moment the IP address or the input content flows to an external LLM vendor, you are processing personal data. What holds up here: a documented data flow map, a data processing agreement with the LLM vendor, a transparent privacy notice that names the vendor and the type of data, and genuine consent to the AI feature before activation, not a pre-ticked banner. If the vendor sits outside the EU, Standard Contractual Clauses and a third-country transfer assessment under Schrems II apply on top. This is the consent-required, contractually coverable category.

Case 3: AI with long-term data storage. Conversations that "learn", profiles, personalisation. Here purpose limitation, data minimisation, storage limitation, and the rights to access and erasure hit at full force. Pseudonymise or anonymise wherever you can. Define retention periods and enforce them automatically. And, the point where it gets technically serious, keep a deletion path ready that also reaches the embeddings in the vector store and the conversation history. If you run pgvector as your AI backend, you have to design that path in from the start, or erasure stays theory; we have described the architecture behind it in detail elsewhere.

Case 4: AI we don't recommend. A separate section for that in a moment; this category deserves more than a single line.
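The sorting into the four cases can be written down as a small function. What follows is a minimal sketch in Python, not legal logic: the flag names, the `UseCase` type, and `classify` are hypothetical, and the order of the checks is only one reading of the cases above (the flags for Case 4 anticipate the feature classes named further down).

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    NO_PERSONAL_DATA = "case 1: server-side AI, no personal data"
    REAL_TIME_USER_DATA = "case 2: user data in real time"
    LONG_TERM_STORAGE = "case 3: long-term data storage"
    NOT_RECOMMENDED = "case 4: not recommended"


@dataclass(frozen=True)
class UseCase:
    name: str
    # IP address or user input reaches an external LLM vendor
    sends_personal_data: bool = False
    # conversations, profiles, or personalisation data are kept over time
    stores_personal_data_long_term: bool = False
    # emotion analysis or behavioural profiling (webcam, microphone, input patterns)
    emotion_or_behaviour_profiling: bool = False
    # fully automated decision with legal effect (Article 22 territory)
    automated_legal_decision: bool = False


def classify(use_case: UseCase) -> Category:
    """Return the strictest case that applies to the use case."""
    if use_case.emotion_or_behaviour_profiling or use_case.automated_legal_decision:
        return Category.NOT_RECOMMENDED
    if use_case.stores_personal_data_long_term:
        return Category.LONG_TERM_STORAGE
    if use_case.sends_personal_data:
        return Category.REAL_TIME_USER_DATA
    return Category.NO_PERSONAL_DATA
```

Checking the strictest case first is a deliberate choice in this sketch: a chatbot that also stores conversations lands in Case 3, not Case 2, so the heavier obligations are never skipped.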

The point that gets overlooked: training data

Are the data sent to an LLM vendor used for training? With the large vendors (OpenAI, Anthropic, Google) the answer in B2B, API, and enterprise plans is no by default. In consumer plans that default does not hold, and the practice can change. That is exactly why the training exclusion belongs in the contract before any integration, not in an assumption.

We prefer vendors who communicate training exclusion as the default rather than selling it as a premium option. This isn't a detail for legal; it's an argument that goes straight into your privacy notice and that you can stand behind in front of customers. Build on a US vendor's goodwill here and you build on sand, because the question of whether European data is safe at all on US infrastructure is not settled. The CLOUD Act and the wider matter of data sovereignty are the reference points here.

Which AI features you deliberately don't build

Here the honest answer gets uncomfortable. Three classes of feature are technically buildable but shouldn't be built.

Emotion analysis and behavioural profiling via webcam, microphone, or input patterns almost always force a Data Protection Impact Assessment, are hard to justify, and carry a negative social charge. The EU AI Act sharpens this line further: emotion recognition in the workplace and in education falls under the prohibited practices, and biometric categorisation counts as a high-risk application with substantial obligations. So you add a second regulatory dimension on top of the GDPR, with the transition periods still to be verified as of mid-2026.

The third class is fully automated decisions with legal effect, such as a lead score that governs access or terms. That is the territory of Article 22 GDPR, which ties such decisions to strict conditions by default. Our recommendation in all three cases: first check whether the business goal is reachable without the mechanism. In nine out of ten cases it is.

And then there is the genuinely hard part, the one that appears in no contract template: the deletion path. An Article 17 erasure request has to reach the data everywhere, in the main database, in the conversation history, and in the embeddings of the vector store. That is exactly where systems fail in practice, because the vector store, a downstream index, is so often forgotten. On top of that, the legal ground itself is in motion: the relationship between the EU-US Data Privacy Framework and Schrems, vendors' training practices, the interpretation of the AI Act, none of it is finally sorted. Build on a cut-off date here and you build wrong.
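A deletion path that reaches every storage location can be sketched as one function that walks all stores and then verifies that nothing is left. This is a minimal illustration under assumptions: `InMemoryStore` is a hypothetical stand-in for the main database, the conversation history, and the vector store, and `erase_everywhere` is not taken from any library. In a real system each store would wrap its own delete query.

```python
class InMemoryStore:
    """Hypothetical stand-in for one storage location; rows are (user_id, payload) pairs."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)

    def delete_user(self, user_id):
        """Remove all rows of the user and return how many were removed."""
        before = len(self.rows)
        self.rows = [row for row in self.rows if row[0] != user_id]
        return before - len(self.rows)

    def count_user(self, user_id):
        """Return how many rows of the user are still stored."""
        return sum(1 for row in self.rows if row[0] == user_id)


def erase_everywhere(user_id, stores):
    """Delete the user's data in every store, then verify nothing remains.

    Returns a report {store name: rows removed}; raises if any store
    still holds data for the user after deletion.
    """
    report = {store.name: store.delete_user(user_id) for store in stores}
    leftovers = [store.name for store in stores if store.count_user(user_id) > 0]
    if leftovers:
        raise RuntimeError(f"erasure incomplete in: {leftovers}")
    return report
```

The verification step after deletion is the design choice that matters in this sketch: a store that was registered but fails to delete raises an error instead of passing silently. A store that was never registered is still invisible, which is why the list of stores has to come from the data flow map.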

The pragmatic procedure before any integration

Before a single line of integration code exists, we answer five questions in this order. First: which data actually flows, recorded end to end, from click to answer? Second: which vendor, European-hosted and with a training exclusion? Third: which contractual basis, DPA, Standard Contractual Clauses where applicable, a third-country transfer assessment? Fourth: which consent, technically enforced before activation? Fifth: what does the deletion path look like, where does the data live, everywhere, and how does it get removed without remainder?
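The five questions can be kept as a small checklist that reports what is still open, in the same order. This is a sketch under assumptions: `IntegrationPlan`, its field names, and `open_questions` are hypothetical, and it presumes a use case in which personal data actually flows, so consent is treated as required.

```python
from dataclasses import dataclass


@dataclass
class IntegrationPlan:
    data_flow_documented: bool         # 1: recorded end to end, from click to answer
    vendor_eu_hosted: bool             # 2: European-hosted vendor
    training_excluded: bool            # 2: training exclusion in the contract
    dpa_signed: bool                   # 3: data processing agreement
    third_country_transfer: bool       # 3: does data leave the EU?
    scc_and_assessment_done: bool      # 3: only relevant if the line above is True
    consent_enforced: bool             # 4: technically enforced before activation
    storage_locations: frozenset[str]  # 5: everywhere the data lives
    deletion_covers: frozenset[str]    # 5: locations the deletion path reaches


def open_questions(plan: IntegrationPlan) -> list[str]:
    """Return the unanswered questions in the order they should be settled."""
    open_items = []
    if not plan.data_flow_documented:
        open_items.append("1: data flow not documented end to end")
    if not (plan.vendor_eu_hosted and plan.training_excluded):
        open_items.append("2: vendor not European-hosted or training not excluded")
    transfer_uncovered = plan.third_country_transfer and not plan.scc_and_assessment_done
    if not plan.dpa_signed or transfer_uncovered:
        open_items.append("3: contractual basis incomplete")
    if not plan.consent_enforced:
        open_items.append("4: consent not enforced before activation")
    missing = plan.storage_locations - plan.deletion_covers
    if missing:
        open_items.append(f"5: deletion path misses {sorted(missing)}")
    return open_items
```

An empty result means all five questions have an answer on paper; it says nothing about whether those answers hold up legally.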

The fifth point is the one most often forgotten and at the same time the best litmus test for a team's technical maturity. If you can't draw the deletion path, you haven't understood the architecture. The same question of data sovereignty and storage location, incidentally, already arises one layer down, at the backend itself; which architecture makes that control possible in the first place is what we cover in our Supabase architecture overview.

What I tell decision-makers

Don't rely on a single vendor's compliance statement; rely on an architecture that enforces three things: the training exclusion in the contract, consent technically before activation, and a deletion path that reaches every storage location down to the vector store. Those three pieces of homework are non-negotiable. They decide whether the system stays legally operable the next time the legal ground shifts.
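What "consent technically before activation" can mean in code is a gate in front of the vendor call, so that no prompt leaves the system without recorded consent. The following is a minimal sketch: `guarded_llm_call`, `ConsentMissing`, and the `send` callback are hypothetical names, and `send` stands in for whatever client actually talks to the LLM vendor.

```python
from typing import Callable


class ConsentMissing(Exception):
    """Raised when an AI call is attempted without recorded consent."""


def guarded_llm_call(prompt: str, consent_given: bool, send: Callable[[str], str]) -> str:
    """Forward the prompt to the vendor only if consent has been recorded.

    `send` is the function that performs the real request; it is never
    invoked when consent is missing, so no data leaves the system.
    """
    if not consent_given:
        raise ConsentMissing("no consent recorded; prompt was not sent")
    return send(prompt)
```

Placing the check on the server side, directly in front of the outgoing request, means a manipulated or misconfigured front end cannot bypass it.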

The rest is sorting work. Separate data flow from data origin, place every use case into the four categories, and strike the fourth without debate. Work this way and you run AI on the website not in spite of the GDPR, but in a form you can defend before the board and before the supervisory authority alike. That is no legal sleight of hand. It's engineering hygiene.

About the author: Matthias Radscheit
