How It Started#
A few months ago I built a one-prompt website with an AI agent. I was showing it off to a colleague from data governance when he stopped me mid-sentence: "I found at least five GDPR violations in two minutes."
He was right. The AI had generated a site that loaded tracking scripts without consent, set long-lived cookies with no policy, and embedded third-party resources with zero safeguards. It looked great. It was a compliance nightmare.
That got me curious. If an AI can accidentally build a non-compliant site in seconds, how are the sites that should know better actually doing? I manage server-side GTM tracking infrastructure at my day job — the GA4 migration, consent mode, first-party cookies on a custom subdomain. I know how it's supposed to work. So I built a tool to check.
The Scanner#
It's an AI-powered privacy auditor. It uses Playwright with Firefox in headless mode to visit a website three times:
- Ignore — Load the page, don't touch the consent banner, observe what fires
- Accept — Click accept, observe what changes
- Reject — Click reject, observe what should stop
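In code, each variant boils down to one optional interaction with the consent banner before observation starts. Here is a minimal sketch of that dispatch; the selectors and the `scan_one` callback are hypothetical, since real banners vary wildly and need per-site selector maps:

```python
from enum import Enum
from typing import Callable, Optional

class Variant(Enum):
    IGNORE = "ignore"  # load the page, leave the banner alone
    ACCEPT = "accept"  # click accept, then observe
    REJECT = "reject"  # click reject, then observe

# Hypothetical selectors -- a real scanner needs a per-site
# (or heuristic) selector map for each consent banner vendor.
CONSENT_SELECTOR: dict[Variant, Optional[str]] = {
    Variant.IGNORE: None,
    Variant.ACCEPT: "button:has-text('Accept all')",
    Variant.REJECT: "button:has-text('Reject all')",
}

def scan_all_variants(
    scan_one: Callable[[Variant, Optional[str]], dict]
) -> dict:
    """Run all three passes; collect findings keyed by variant name."""
    return {v.value: scan_one(v, CONSENT_SELECTOR[v]) for v in Variant}
```

The `scan_one` callback would open a fresh browser context per pass, so cookies from one variant never leak into the next.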
For each variant it captures cookies, trackers, third-party requests, fingerprinting APIs, localStorage, IndexedDB, security headers, and legal page content. Then Claude analyzes the findings against GDPR, ePrivacy, and actual DPA enforcement precedent to produce a scored report.
The entire pipeline — scanning, analysis, scoring, and presentation generation — runs without manual intervention. I point it at a URL, it gives me back a 20+ slide interactive audit deck.
The Scores#
I scanned 10 major websites: a mix of social media, e-commerce, travel, news, and fast fashion, all either Dutch sites or sites that serve Dutch visitors.
The scoring uses 7 weighted categories: consent mechanism, pre-consent tracking, legal pages, cross-border transfers, security headers, cookie management, and dark patterns. Each site gets a score from 1.0 (failing) to 10.0 (exemplary).
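The composite score is a plain weighted average over those categories. A sketch of the arithmetic; the weights below are illustrative assumptions, not the scanner's actual weighting:

```python
# Illustrative weights (must sum to 1.0) -- the real weighting
# the scanner uses may differ per category.
WEIGHTS = {
    "consent_mechanism":    0.20,
    "pre_consent_tracking": 0.25,
    "legal_pages":          0.10,
    "cross_border":         0.10,
    "security_headers":     0.10,
    "cookie_management":    0.15,
    "dark_patterns":        0.10,
}

def composite_score(category_scores: dict[str, float]) -> float:
    """Weighted average of per-category scores, each on a 1.0-10.0 scale."""
    total = sum(WEIGHTS[c] * category_scores[c] for c in WEIGHTS)
    return round(total / sum(WEIGHTS.values()), 1)
```

Weighting pre-consent tracking heaviest reflects the enforcement reality: it is the violation DPAs fine most readily.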
Here's the leaderboard:
| # | Website | Score | Verdict |
|---|---|---|---|
| #01 | x.com | 6.1 / 10 | — |
| #02 | mediamarkt.nl | 6.0 / 10 | — |
| #03 | nu.nl | 5.8 / 10 | — |
| #04 | coolblue.nl | 5.4 / 10 | — |
| #05 | linkedin.com | 5.1 / 10 | — |
| #06 | tiktok.com | 4.9 / 10 | — |
| #07 | booking.com | 4.5 / 10 | — |
| #08 | facebook.com | 4.1 / 10 | — |
| #09 | shein.com | 3.7 / 10 | — |
| #10 | dyson.com | 3.4 / 10 | — |
Average: 4.9 / 10. The best score was 6.1 — not a single site even reached a 7.
What I Found#
Without spoiling the individual deep dives, here are the patterns:
Pre-consent tracking is rampant. Multiple sites fire trackers, set cookies, and run fingerprinting scripts before the consent banner even finishes loading. One site fires 20 trackers and sets 41 cookies before you see the first pixel of the consent dialog.
Reject buttons lie. On at least two sites, clicking "Reject" or "Decline" does not actually stop tracking. One site sets more cookies after you click reject than before you interacted at all.
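Detecting this is a set comparison between the cookies captured in the ignore pass and those captured after clicking reject. A minimal sketch, assuming cookies are reduced to `(name, domain)` pairs:

```python
Cookie = tuple[str, str]  # (name, domain)

def reject_violations(before: set[Cookie],
                      after_reject: set[Cookie]) -> set[Cookie]:
    """Cookies that appeared only AFTER the user clicked reject.

    `before` is the cookie set from the ignore pass, `after_reject`
    from the reject pass. Anything new post-reject is a red flag,
    unless it is the consent-choice cookie itself.
    """
    return after_reject - before
```

In practice you would whitelist the CMP's own preference cookie (e.g. the one recording the reject choice) before counting violations.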
Fingerprinting is the new frontier. WebGL renderer queries, Canvas fingerprinting, WebRTC peer connection probing, AudioContext analysis — multiple sites collect device identity without consent. This falls squarely under Article 5(3) of the ePrivacy Directive, but most sites don't even mention it in their cookie policies.
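Once the scanner has hooked these APIs in the page and recorded which ones fired, classifying the calls is a simple tally. A sketch with an illustrative (not exhaustive) mapping of browser APIs to fingerprinting surfaces:

```python
# Illustrative mapping -- the scanner's real hook list is broader.
FINGERPRINT_SURFACES = {
    "HTMLCanvasElement.toDataURL": "canvas",
    "CanvasRenderingContext2D.getImageData": "canvas",
    "WebGLRenderingContext.getParameter": "webgl",
    "RTCPeerConnection": "webrtc",
    "AudioContext.createAnalyser": "audio",
}

def fingerprint_report(api_calls: list[str]) -> dict[str, int]:
    """Tally observed API calls per fingerprinting surface."""
    report: dict[str, int] = {}
    for call in api_calls:
        surface = FINGERPRINT_SURFACES.get(call)
        if surface:
            report[surface] = report.get(surface, 0) + 1
    return report
```

A pre-consent pass that produces a non-empty report is the smoking gun: device identity was probed before the user made any choice.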
Security headers are an afterthought. The average site is missing 3-4 of the 6 standard security headers. SRI coverage is effectively zero — one site loads 103 external scripts without a single integrity hash.
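SRI coverage is straightforward to measure: count external `<script src>` tags and check how many carry an `integrity` attribute. A minimal check using only Python's standard-library HTML parser:

```python
from html.parser import HTMLParser

class SRIAudit(HTMLParser):
    """Count external scripts, and how many carry an integrity hash."""

    def __init__(self):
        super().__init__()
        self.external = 0
        self.with_sri = 0

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        a = dict(attrs)
        src = a.get("src", "")
        # Only cross-origin-capable URLs count as external loads.
        if src.startswith(("http://", "https://", "//")):
            self.external += 1
            if a.get("integrity"):
                self.with_sri += 1

def sri_coverage(html: str) -> tuple[int, int]:
    """Return (external script count, scripts with integrity hash)."""
    p = SRIAudit()
    p.feed(html)
    return p.external, p.with_sri
```

A return of `(103, 0)` is exactly the failure mode described above: every one of those scripts could be silently swapped for a malicious payload at the CDN.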
What's Next#
Starting next week, I'll publish one full audit every week. Each post will include the complete interactive scan deck, the key findings, the score breakdown, and specific recommendations citing the GDPR articles and DPA enforcement decisions that apply.
I'm starting with the worst. Next week: dyson.com at 3.4 / 10.
Every scan is available in full at datagobes.dev/playbooks/privacy-audit.