How It Started#
A few months ago I built a one-prompt website with an AI agent. I was showing it off to a colleague from data governance when he stopped me mid-sentence: "I found at least five GDPR violations in two minutes."
He was right. The AI had generated a site that loaded tracking scripts without consent, set long-lived cookies with no policy, and embedded third-party resources with zero safeguards. It looked great. It was a compliance nightmare.
That got me curious. If an AI can accidentally build a non-compliant site in seconds, how are the sites that should know better actually doing? I manage server-side GTM tracking infrastructure at my day job — the GA4 migration, consent mode, first-party cookies on a custom subdomain. I know how it's supposed to work. So I built a tool to check.
The Scanner#
It's an AI-powered privacy auditor. It uses Playwright with Firefox in headless mode to visit a website three times:
- Ignore — Load the page, don't touch the consent banner, observe what fires
- Accept — Click accept, observe what changes
- Reject — Click reject, observe what should stop
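In code, each variant boils down to one optional interaction with the consent banner before observation starts. Here is a minimal sketch of that dispatch; the selectors and the `scan_one` callback are hypothetical, since real banners vary wildly and need per-site selector maps:

```python
from enum import Enum
from typing import Callable, Optional

class Variant(Enum):
    IGNORE = "ignore"  # load the page, leave the banner alone
    ACCEPT = "accept"  # click accept, then observe
    REJECT = "reject"  # click reject, then observe

# Hypothetical selectors -- a real scanner needs a per-site
# (or heuristic) selector map for each consent banner vendor.
CONSENT_SELECTOR: dict[Variant, Optional[str]] = {
    Variant.IGNORE: None,
    Variant.ACCEPT: "button:has-text('Accept all')",
    Variant.REJECT: "button:has-text('Reject all')",
}

def scan_all_variants(
    scan_one: Callable[[Variant, Optional[str]], dict]
) -> dict:
    """Run all three passes; collect findings keyed by variant name."""
    return {v.value: scan_one(v, CONSENT_SELECTOR[v]) for v in Variant}
```

The `scan_one` callback would open a fresh browser context per pass, so cookies from one variant never leak into the next.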
For each variant it captures cookies, trackers, third-party requests, fingerprinting APIs, localStorage, IndexedDB, security headers, and legal page content. Then Claude analyzes the findings against GDPR, ePrivacy, and actual DPA enforcement precedent to produce a scored report.
The entire pipeline — scanning, analysis, scoring, and presentation generation — runs without manual intervention. I point it at a URL, it gives me back a 20+ slide interactive audit deck.
The Scores#
I scanned 10 major websites: a mix of social media, e-commerce, travel, news, and fast fashion, all either Dutch sites or sites that serve Dutch visitors.
The scoring uses 7 weighted categories: consent mechanism, pre-consent tracking, legal pages, cross-border transfers, security headers, cookie management, and dark patterns. Each site gets a score from 1.0 (failing) to 10.0 (exemplary).
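The composite score is a plain weighted average over those categories. A sketch of the arithmetic; the weights below are illustrative assumptions, not the scanner's actual weighting:

```python
# Illustrative weights (must sum to 1.0) -- the real weighting
# the scanner uses may differ per category.
WEIGHTS = {
    "consent_mechanism":    0.20,
    "pre_consent_tracking": 0.25,
    "legal_pages":          0.10,
    "cross_border":         0.10,
    "security_headers":     0.10,
    "cookie_management":    0.15,
    "dark_patterns":        0.10,
}

def composite_score(category_scores: dict[str, float]) -> float:
    """Weighted average of per-category scores, each on a 1.0-10.0 scale."""
    total = sum(WEIGHTS[c] * category_scores[c] for c in WEIGHTS)
    return round(total / sum(WEIGHTS.values()), 1)
```

Weighting pre-consent tracking heaviest reflects the enforcement reality: it is the violation DPAs fine most readily.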
Here's the leaderboard:
| # | Website | Score | Verdict |
|---|---|---|---|
| #01 | x.com | 6.1 / 10 | — |
| #02 | mediamarkt.nl | 6.0 / 10 | — |
| #03 | nu.nl | 5.8 / 10 | — |
| #04 | coolblue.nl | 5.4 / 10 | — |
| #05 | linkedin.com | 5.1 / 10 | — |
| #06 | tiktok.com | 4.9 / 10 | — |
| #07 | booking.com | 4.5 / 10 | — |
| #08 | facebook.com | 4.1 / 10 | — |
| #09 | shein.com | 3.7 / 10 | — |
| #10 | dyson.com | 3.4 / 10 | — |
Average: 4.9 / 10. The best score was 6.1 — not a single site even reached a 7.
What I Found#
Without spoiling the individual deep dives, here are the patterns:
Pre-consent tracking is rampant. Multiple sites fire trackers, set cookies, and run fingerprinting scripts before the consent banner even finishes loading. One site fires 20 trackers and sets 41 cookies before you see the first pixel of the consent dialog.
Reject buttons lie. On at least two sites, clicking "Reject" or "Decline" does not actually stop tracking. One site sets more cookies after you click reject than before you interacted at all.
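Detecting this is a set comparison between the cookies captured in the ignore pass and those captured after clicking reject. A minimal sketch, assuming cookies are reduced to `(name, domain)` pairs:

```python
Cookie = tuple[str, str]  # (name, domain)

def reject_violations(before: set[Cookie],
                      after_reject: set[Cookie]) -> set[Cookie]:
    """Cookies that appeared only AFTER the user clicked reject.

    `before` is the cookie set from the ignore pass, `after_reject`
    from the reject pass. Anything new post-reject is a red flag,
    unless it is the consent-choice cookie itself.
    """
    return after_reject - before
```

In practice you would whitelist the CMP's own preference cookie (e.g. the one recording the reject choice) before counting violations.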
Fingerprinting is the new frontier. WebGL renderer queries, Canvas fingerprinting, WebRTC peer connection probing, AudioContext analysis — multiple sites collect device identity without consent. This falls squarely under Article 5(3) of the ePrivacy Directive, but most sites don't even mention it in their cookie policies.
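Once the scanner has hooked these APIs in the page and recorded which ones fired, classifying the calls is a simple tally. A sketch with an illustrative (not exhaustive) mapping of browser APIs to fingerprinting surfaces:

```python
# Illustrative mapping -- the scanner's real hook list is broader.
FINGERPRINT_SURFACES = {
    "HTMLCanvasElement.toDataURL": "canvas",
    "CanvasRenderingContext2D.getImageData": "canvas",
    "WebGLRenderingContext.getParameter": "webgl",
    "RTCPeerConnection": "webrtc",
    "AudioContext.createAnalyser": "audio",
}

def fingerprint_report(api_calls: list[str]) -> dict[str, int]:
    """Tally observed API calls per fingerprinting surface."""
    report: dict[str, int] = {}
    for call in api_calls:
        surface = FINGERPRINT_SURFACES.get(call)
        if surface:
            report[surface] = report.get(surface, 0) + 1
    return report
```

A pre-consent pass that produces a non-empty report is the smoking gun: device identity was probed before the user made any choice.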
Security headers are an afterthought. The average site is missing 3-4 of the 6 standard security headers. SRI coverage is effectively zero — one site loads 103 external scripts without a single integrity hash.
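SRI coverage is straightforward to measure: count external `<script src>` tags and check how many carry an `integrity` attribute. A minimal check using only Python's standard-library HTML parser:

```python
from html.parser import HTMLParser

class SRIAudit(HTMLParser):
    """Count external scripts, and how many carry an integrity hash."""

    def __init__(self):
        super().__init__()
        self.external = 0
        self.with_sri = 0

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        a = dict(attrs)
        src = a.get("src", "")
        # Only cross-origin-capable URLs count as external loads.
        if src.startswith(("http://", "https://", "//")):
            self.external += 1
            if a.get("integrity"):
                self.with_sri += 1

def sri_coverage(html: str) -> tuple[int, int]:
    """Return (external script count, scripts with integrity hash)."""
    p = SRIAudit()
    p.feed(html)
    return p.external, p.with_sri
```

A return of `(103, 0)` is exactly the failure mode described above: every one of those scripts could be silently swapped for a malicious payload at the CDN.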
What's Next#
Starting next week, I'll publish one full audit every week. Each post will include the complete interactive scan deck, the key findings, the score breakdown, and specific recommendations citing the GDPR articles and DPA enforcement decisions that apply.
I'm starting with the worst. Next week: dyson.com at 3.4 / 10.
Every scan is available in full at datagobes.dev/playbooks/privacy-audit.