Headless CMS SEO Audits: Contentful, Sanity, and Strapi

Headless CMS architectures decouple content from rendering. That's the feature. It's also the SEO problem: content editors publish through one interface, the frontend renders through another, and bugs in the rendering layer don't show up in the CMS. An editor pushes a new article. They see it live in Contentful's preview. Three weeks later you discover the structured data isn't being emitted, the canonical tag has been wrong the entire time, and the LCP regressed because someone updated the hero image transform.

This post is about how to wire the SEO Score API into a headless CMS pipeline so those gaps close automatically. We'll cover Contentful, Sanity, and Strapi specifically. The pattern is the same for all three; what differs is which webhooks fire and what payload they send.

The two patterns that work

You want one or both of these patterns running:

Pattern 1 — Audit on publish. A content editor hits "publish." A webhook fires. Your handler audits the URL that piece of content will live at. If the score is below threshold, post a comment back into the CMS or Slack.

Pattern 2 — Audit on deploy. The Next.js / Astro / Nuxt frontend rebuilds. The build hook fires. You audit a list of critical pages and gate on regression. This is the pre-deploy gate pattern applied to ISR/SSG sites.

Most teams want both. Pattern 1 catches content authoring problems; pattern 2 catches code/template problems.

Pattern 1 — Audit on publish (Contentful)

Contentful fires a webhook on Entry.publish. Subscribe to it, audit the resolved URL, and alert when the score falls below your threshold:

# webhook.py — Flask handler, deploy to Vercel/Cloudflare/Lambda
from flask import Flask, request, jsonify
import os, requests

app = Flask(__name__)
API_KEY = os.environ["SEOSCORE_API_KEY"]
SITE_BASE = "https://yoursite.com"
BASE = "https://api.seoscoreapi.com"

@app.post("/contentful-publish")
def handle_publish():
    payload = request.get_json()
    if payload.get("sys", {}).get("type") != "Entry":
        return "ignored", 200

    entry_id = payload["sys"]["id"]
    slug = payload.get("fields", {}).get("slug", {}).get("en-US")
    content_type = payload["sys"]["contentType"]["sys"]["id"]

    # Map content type → URL pattern
    url_template = {
        "blogPost": f"{SITE_BASE}/blog/{{slug}}",
        "landingPage": f"{SITE_BASE}/{{slug}}",
    }.get(content_type)

    if not url_template or not slug:
        return "no url mapping", 200

    url = url_template.format(slug=slug)

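    # Wait for ISR to regenerate the page; auditing too early hits the stale version
    import time
    time.sleep(15)

    # Pass the URL via params so requests percent-encodes it in the query string
    r = requests.get(
        f"{BASE}/audit",
        params={"url": url},
        headers={"X-API-Key": API_KEY},
        timeout=120,
    )
    # Surface API errors instead of reporting them as a score of 0
    r.raise_for_status()
    data = r.json()
    score = data.get("score", 0)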

    if score < 75:
        post_to_slack(
            f":warning: New publish scored {score} on {url}\n"
            f"Top issues: {', '.join(i['name'] for i in data.get('priority', [])[:3])}"
        )

    return jsonify({"score": score}), 200

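The 15-second sleep is the awkward but necessary part of headless CMS auditing. ISR (incremental static regeneration) on Vercel/Netlify takes a beat to rebuild the page after a content change, so if you audit too fast you'll audit the stale version. 15 seconds is comfortably outside that window for most setups; use 60 seconds if your build is slow. Make sure the sleep plus the audit fits inside your function's execution limit and the CMS's webhook timeout; if it doesn't, acknowledge the webhook immediately and run the audit from a queue or background job.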
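
The handler calls post_to_slack without defining it. A minimal sketch, assuming a Slack incoming webhook whose URL is stored in a SLACK_WEBHOOK_URL environment variable (that variable name and the build_slack_request helper are illustrative, not part of the original setup). It sticks to the standard library so it can sit next to the Flask handler without extra dependencies:

```python
import json
import os
import urllib.request


def build_slack_request(message, webhook_url):
    """Build the POST request for a Slack incoming webhook."""
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_to_slack(message, webhook_url=None, opener=urllib.request.urlopen):
    """Send `message` to Slack and return the HTTP status code.

    SLACK_WEBHOOK_URL is an assumed environment variable name.
    `opener` is injectable so the function can be tested offline.
    """
    webhook_url = webhook_url or os.environ["SLACK_WEBHOOK_URL"]
    request = build_slack_request(message, webhook_url)
    with opener(request, timeout=10) as response:
        return response.status
```

Keeping the request construction separate from the network call makes the payload easy to unit-test without hitting Slack.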

Pattern 1 — Audit on publish (Sanity)

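Sanity uses GROQ-powered webhooks. By default the payload is the document itself (or whatever projection you configure), so fields like _type and slug sit at the top level. The handler is essentially the same: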

@app.post("/sanity-publish")
def handle_sanity():
    payload = request.get_json()
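    doc_type = payload.get("_type")
    slug = (payload.get("slug") or {}).get("current")

    url_template = {
        "post": f"{SITE_BASE}/blog/{{slug}}",
        "page": f"{SITE_BASE}/{{slug}}",
    }.get(doc_type)

    # Skip unmapped types and documents without a slug;
    # otherwise we'd audit a URL ending in "/None"
    if not url_template or not slug:
        return "", 200

    url = url_template.format(slug=slug)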

    # ... same audit + gate logic ...

Configure the webhook in Sanity Manage with a GROQ filter like _type in ["post", "page"] to avoid getting paged on every taxonomy edit.

Pattern 1 — Audit on publish (Strapi)

Strapi sends webhooks for entry.publish with the full entry serialized. The shape is closer to Contentful's:

@app.post("/strapi-publish")
def handle_strapi():
    payload = request.get_json()
    if payload.get("event") != "entry.publish":
        return "", 200

    entry = payload.get("entry", {})
    model = payload.get("model")
    slug = entry.get("slug")

    url = build_url(model, slug)
    if not url:
        return "", 200

    # ... same audit + gate logic ...

The pattern is the same across all three. The only thing you need to do per-CMS is map the webhook payload to a URL.
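
The Strapi handler calls build_url, which is not shown. One way to sketch that payload-to-URL mapping as a single helper shared by all three handlers is below. The template table is hypothetical: the blogPost, landingPage, post, and page keys mirror the examples above, while the Strapi model name article is an assumption you would replace with your own.

```python
from urllib.parse import quote

SITE_BASE = "https://yoursite.com"

# Hypothetical mapping: content type / model name -> path template.
URL_TEMPLATES = {
    "blogPost": "/blog/{slug}",    # Contentful content type ID
    "landingPage": "/{slug}",
    "post": "/blog/{slug}",        # Sanity _type
    "page": "/{slug}",
    "article": "/blog/{slug}",     # Strapi model (assumed name)
}


def build_url(model, slug, base=SITE_BASE):
    """Return the public URL for a content item, or None if it can't be mapped."""
    template = URL_TEMPLATES.get(model)
    if not template or not slug:
        return None
    # Percent-encode the slug but keep "/" so nested slugs survive
    safe_slug = quote(str(slug).strip("/"), safe="/")
    return base.rstrip("/") + template.format(slug=safe_slug)
```

Returning None for unmapped types or missing slugs lets each handler bail out with a 200 instead of auditing a malformed URL.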

Pattern 2 — Audit on deploy

After Vercel/Netlify finishes building, audit the critical page list using batch:

@app.post("/build-hook")
def handle_build():
    critical_urls = [
        f"{SITE_BASE}/",
        f"{SITE_BASE}/blog",
        f"{SITE_BASE}/pricing",
        f"{SITE_BASE}/about",
    ]
    # Add the top 10 most-trafficked blog posts dynamically
    critical_urls += fetch_top_blog_posts(10)

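    resp = requests.post(
        f"{BASE}/audit/batch",
        headers={"X-API-Key": API_KEY},
        json={"urls": critical_urls},
        timeout=300,  # batch audits can be slow; never wait indefinitely
    )
    resp.raise_for_status()
    results = resp.json().get("results", [])

    failures = [
        result for result in results
        if result.get("score", 100) < 75
    ]

    if failures:
        post_to_slack(f"{len(failures)} pages scored below 75 after deploy")

    # A Flask view must return a response; returning None raises an error
    return jsonify({"audited": len(results), "failures": len(failures)}), 200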

This catches the case where a code change broke rendering for a type of content without any single piece of content being edited — exactly the gap that headless architectures introduce.
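
The handler above alerts on an absolute threshold. To gate on regression in the stricter sense, you can also compare each score with the last known good one. A sketch under stated assumptions: each batch result is taken to carry a url key alongside score (the url key is an assumption about the response shape), and baseline is a plain dict you persist yourself between deploys.

```python
def find_regressions(results, baseline, floor=75, max_drop=5):
    """Return (url, previous, current) tuples for pages that scored below
    `floor` or lost more than `max_drop` points against the stored baseline."""
    regressions = []
    for result in results:
        url = result.get("url")
        score = result.get("score")
        if url is None or score is None:
            continue  # skip results we can't evaluate
        previous = baseline.get(url)
        dropped = previous is not None and previous - score > max_drop
        if score < floor or dropped:
            regressions.append((url, previous, score))
    return regressions


baseline = {"https://yoursite.com/": 92, "https://yoursite.com/pricing": 88}
results = [
    {"url": "https://yoursite.com/", "score": 91},         # 1-point dip: fine
    {"url": "https://yoursite.com/pricing", "score": 79},  # 9-point drop: flagged
    {"url": "https://yoursite.com/blog", "score": 70},     # no baseline, below floor
]
regressed = find_regressions(results, baseline)
```

With these inputs, the pricing page is flagged for the drop even though 79 still clears the floor, which is the case an absolute threshold misses.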

What goes wrong specifically with headless CMS sites

Patterns we see flagged repeatedly on headless audits:

Image transforms drift. Contentful Images API and Sanity's image pipeline can serve images at the wrong size if the rendering layer's <img sizes> attribute doesn't match. LCP regresses silently.
Schema markup defined in the CMS but not rendered. A Product schema entry exists in Contentful. The frontend was supposed to read it. Someone refactored the template, the read broke, the schema is gone from the rendered HTML. No CMS warning, because as far as the CMS knows the field is fine.
Canonical tags pointing at the preview environment. Preview-vs-production environment variables get crossed. Half your blog posts canonicalize to preview.yoursite.com, and Google can drop the production URLs from its index.
Alt text in the CMS that isn't being emitted. Editor fills in alt text. The frontend renders the image without the alt prop because of a typo in the JSX. Editor never knows.
Stale ISR. A piece of content was updated but the CDN's still serving the previous version because the revalidate hook misfired. The CMS shows the new content; production serves the old.

Every one of these is invisible to the editor and invisible to anyone reading the CMS. They're only visible to something that reads the rendered HTML. Which is what the API does.
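
To make "reads the rendered HTML" concrete, here is a small standard-library sketch that checks three of the failure modes above (canonical host, missing JSON-LD, images without alt text) against a page's HTML. It only illustrates the idea and is not a description of how the SEO Score API works internally; the class and function names are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class RenderedSeoCheck(HTMLParser):
    """Collects a few SEO-relevant facts from rendered HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.has_json_ld = False
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self.has_json_ld = True
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            # Note: an empty alt is valid for decorative images; this sketch
            # treats it as missing to keep the check simple.
            self.images_missing_alt += 1


def check_rendered_html(html, expected_host):
    """Return a list of human-readable problems found in `html`."""
    parser = RenderedSeoCheck()
    parser.feed(html)
    problems = []
    canonical_host = urlparse(parser.canonical or "").netloc
    if canonical_host != expected_host:
        problems.append(
            f"canonical host is {canonical_host or 'missing'}, expected {expected_host}"
        )
    if not parser.has_json_ld:
        problems.append("no JSON-LD structured data in rendered HTML")
    if parser.images_missing_alt:
        problems.append(f"{parser.images_missing_alt} image(s) without alt text")
    return problems
```

Feeding it the HTML your production URL actually returns (not the CMS preview) is the point: none of these checks can be answered from the CMS side.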

Budget math for headless

How many audits a typical headless setup uses:

Audit on publish: 1 per piece of content published. A team publishing 30 articles/month and 5 pages/month = 35 audits.
Audit on deploy: 10-20 critical pages × ~30 deploys/month = 300-600 audits.
Weekly full-site audit (catch the slow regressions): depends on site size, usually 50-500 audits/week.

Total for a typical mid-size headless site: ~1,000-3,000 audits/month. That's Basic at $15/mo for the small end and Pro at $39/mo for the typical case.
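
The arithmetic behind that estimate can be reproduced with the figures from the three lines above. Assuming four weekly runs per month, the items sum to roughly 535 at the low end and 2,635 at the high end, in the same ballpark as the rounded range. The function name and parameters are illustrative.

```python
def monthly_audits(publishes, critical_pages, deploys, weekly_site_audit, weeks_per_month=4):
    """Estimate audits per month from the three sources described above."""
    on_publish = publishes
    on_deploy = critical_pages * deploys
    weekly = weekly_site_audit * weeks_per_month
    return on_publish + on_deploy + weekly


# 35 publishes/month, 10-20 critical pages x 30 deploys, 50-500 audits/week
low = monthly_audits(publishes=35, critical_pages=10, deploys=30, weekly_site_audit=50)
high = monthly_audits(publishes=35, critical_pages=20, deploys=30, weekly_site_audit=500)
print(low, high)  # 535 2635
```

Plug in your own deploy count first; in this breakdown the deploy and weekly terms dominate, and the publish term barely moves the total.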

For very high-cadence publishers (news sites doing 200+ articles/day) you're in Ultra ($99/mo) territory and probably want to sample rather than audit every post.
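
One way to sample, sketched here as an assumption rather than a recommendation from the API itself, is to hash a stable key such as the entry ID or slug and audit only keys that fall under a chosen rate. Hashing keeps the decision deterministic, so webhook retries for the same entry make the same choice.

```python
import hashlib


def should_audit(key, sample_rate):
    """Deterministically select roughly `sample_rate` (0.0 to 1.0) of keys."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64)
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < sample_rate * 2**64


# Audit about one in ten published entries
if should_audit("entry-id-123", 0.10):
    pass  # call the audit endpoint here
```

A rate of 1.0 audits everything and 0.0 audits nothing, so the same code path works if you later lower publishing volume or raise your plan.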

The webhook security note

One thing to handle properly: webhook authentication. Contentful and Sanity can sign webhook requests with a secret you configure, and Strapi lets you attach a secret header to each webhook. Verify the signature (or secret) before doing anything. Without that, anyone who finds your webhook endpoint URL can trigger arbitrary audit calls against your API key.

import hmac, hashlib

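@app.before_request
def verify_signature():
    # Only the Contentful route carries this header; guard other routes separately
    if request.path != "/contentful-publish":
        return None
    sig = request.headers.get("X-Contentful-Signature")
    # Simplified: HMAC over the raw body. Check the CMS docs for the exact
    # string that is signed (it may also cover a timestamp or headers).
    expected = hmac.new(
        os.environ["CONTENTFUL_SECRET"].encode(),
        request.get_data(),
        hashlib.sha256,
    ).hexdigest()
    if not hmac.compare_digest(sig or "", expected):
        return "invalid signature", 403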

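Sanity has an analogous signature scheme with its own header and format; for Strapi, compare the secret header you configured on the webhook rather than a signature.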
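
For a webhook that carries a static secret header instead of a signature, the check reduces to a constant-time comparison. A minimal sketch, assuming you configured an Authorization header on the webhook and stored the same token in an environment variable such as STRAPI_WEBHOOK_TOKEN (both the header choice and the variable name are assumptions):

```python
import hmac


def has_valid_token(header_value, expected_token):
    """Constant-time check of a shared-secret header value.

    Accepts either the bare token or the "Bearer <token>" form.
    """
    if not header_value or not expected_token:
        return False
    presented = header_value.strip()
    if presented.startswith("Bearer "):
        presented = presented[len("Bearer "):]
    return hmac.compare_digest(presented.encode("utf-8"), expected_token.encode("utf-8"))
```

In the Flask handler this would be called with request.headers.get("Authorization") and the token from the environment, returning a 403 when it fails.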

Getting started

Pick the pattern that matches the leakier side of your team. If editors are publishing fast and code changes infrequently, prioritize Pattern 1. If you're shipping code daily and the editorial cadence is slower, prioritize Pattern 2.
Stand up the webhook handler on whatever serverless platform you already use (Vercel Functions, Cloudflare Workers, Lambda). It's a few dozen lines of code.
Run it for a week. The first few alerts are calibration; the alert noise floor should drop within a sprint once you address the standing issues.
Add the second pattern.

If your headless setup is medium-cadence (a typical SaaS marketing site with weekly editorial and weekly deploys), Basic ($15/mo) covers everything in this post comfortably. If you're a higher-volume publisher, Pro ($39/mo) gives you the headroom plus the 1-year history that makes "what broke when?" answerable for the long-tail content.