"Audit my site" almost never means one URL. It means the homepage, the pricing page, the top twenty blog posts, every product, every category, every location page. On any real site that's hundreds to thousands of URLs, and clicking each one through a free checker is not a workflow — it's a way to lose an afternoon.
This post walks through the workflow we recommend instead: pull your sitemap.xml, hand the URL list to the batch audit endpoint, and export a single CSV ranked by priority. It's ~40 lines of Python. It works on any site that publishes a sitemap. And the size of your site tells you which plan tier to start on.
## The whole script
```python
import csv
import os
import xml.etree.ElementTree as ET

import requests

API_KEY = os.environ["SEOSCORE_API_KEY"]
SITEMAP_URL = "https://example.com/sitemap.xml"
BASE = "https://api.seoscoreapi.com"

# 1. Fetch the sitemap
xml = requests.get(SITEMAP_URL, timeout=20).text
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(xml)
urls = [loc.text for loc in root.findall(".//sm:url/sm:loc", ns)]
print(f"Found {len(urls)} URLs in sitemap")

# 2. Batch audit (chunked at 50 to stay under per-call limits)
rows = []
for i in range(0, len(urls), 50):
    chunk = urls[i:i + 50]
    r = requests.post(
        f"{BASE}/audit/batch",
        headers={"X-API-Key": API_KEY},
        json={"urls": chunk},
        timeout=180,
    )
    r.raise_for_status()
    for result in r.json()["results"]:
        categories = result.get("categories", {})
        rows.append({
            "url": result["url"],
            "score": result.get("score", 0),
            "grade": result.get("grade", "F"),
            "seo": categories.get("seo", 0),
            "performance": categories.get("performance", 0),
            "accessibility": categories.get("accessibility", 0),
            "ai_readability": categories.get("ai_readability", 0),
            "priority_issues": len(result.get("priority", [])),
        })

# 3. Sort worst-first and write CSV
rows.sort(key=lambda r: r["score"])
with open("audit-report.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)

print(f"Wrote {len(rows)} rows to audit-report.csv")
print(f"Worst page: {rows[0]['url']} ({rows[0]['score']})")
```
That's the whole thing. Drop it into `audit.py`, set `SEOSCORE_API_KEY`, run it, and open the CSV in whatever spreadsheet you like.
## Why batch matters

Running 500 URLs through `GET /audit` one at a time means 500 round trips, 500 rate-limit hits, and 500 chances for a transient error to break the loop. `POST /audit/batch` accepts up to 50 URLs per call, runs them concurrently on our side, and returns a single response. For 500 URLs you do 10 batch calls instead of 500 sequential ones, and the whole audit finishes in two or three minutes.
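Transient errors are still worth defending against even with only 10 calls. Here's a minimal retry sketch; the `with_retries` helper and its backoff schedule are our own convention, not part of the API:

```python
import time


def with_retries(call, attempts=3, retry_on=(Exception,), sleep=time.sleep):
    """Run call() up to `attempts` times, sleeping 1s, 2s, 4s... between tries.

    Re-raises the last error if every attempt fails. `sleep` is injectable
    so the backoff can be skipped in tests.
    """
    for i in range(attempts):
        try:
            return call()
        except retry_on:
            if i == attempts - 1:
                raise
            sleep(2 ** i)


# Wrapping the batch POST from step 2 might look like (hypothetical usage;
# requests.RequestException covers connection errors, and raise_for_status
# turns 5xx responses into HTTPError):
#
# r = with_retries(
#     lambda: do_batch_post(chunk),
#     retry_on=(requests.RequestException,),
# )
```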
Batch is not available on the free tier — it's the line where a free SEO checker stops being useful and an API starts paying for itself. Here's the rough math by site size:
| Site size | Plan we recommend | Why |
|---|---|---|
| Up to 200 pages | Starter — $5/mo | Monthly cap covers one full re-audit per month |
| 200–1,000 pages | Basic — $15/mo | Re-audit weekly without burning your quota |
| 1,000–5,000 pages | Pro — $39/mo | Weekly re-audits plus headroom for ad-hoc spot checks |
| 5,000+ pages | Ultra — $99/mo | Daily or per-commit audits across the whole catalog |
The cheapest tier that lets you re-audit at the cadence you actually want is the right tier. If your sitemap has 3,000 URLs and you want a Monday-morning snapshot every week, that's 12,000 audits/month — Pro covers it with room to spare.
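The quota math above generalizes to any site size and cadence; a two-line helper makes it easy to sanity-check before picking a tier (one audit per URL per run, four weeks per month as in the example):

```python
def audits_per_month(pages, runs_per_week, weeks_per_month=4):
    # One audit per URL per run; four weeks/month matches the estimate above.
    return pages * runs_per_week * weeks_per_month


# The 3,000-URL sitemap with a weekly Monday snapshot:
audits_per_month(3000, 1)  # 12000
```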
## Handling sitemap indexes

Most large sites don't publish a flat sitemap.xml; they publish a sitemap index that points at child sitemaps. The script above breaks on those. A small recursive helper fixes it:
```python
def collect_urls(sitemap_url):
    xml = requests.get(sitemap_url, timeout=20).text
    root = ET.fromstring(xml)
    # Sitemap index — recurse
    if root.tag.endswith("sitemapindex"):
        urls = []
        for child in root.findall(".//sm:sitemap/sm:loc", ns):
            urls.extend(collect_urls(child.text))
        return urls
    # Regular sitemap
    return [loc.text for loc in root.findall(".//sm:url/sm:loc", ns)]


urls = collect_urls(SITEMAP_URL)
```
That's enough to handle WordPress (Yoast/Rank Math both publish indexes), Shopify (one index per resource type), and most enterprise CMS setups.
## Make the CSV actionable
Sorting by score gets you "worst pages first." That's a start, but the high-value moves are usually:
- High-traffic pages with mid-tier scores. A blog post with 12,000 pageviews/month and a score of 72 is worth fixing before a product page with 40 pageviews and a score of 41.
- Pages with a low category score even if overall is fine. A product page scoring 85 overall but 58 in accessibility is an ADA risk you don't want to ignore.
- Pages that just regressed. That's where historical tracking comes in — see the historical SEO score tracking post for the `/history` endpoint that adds month-over-month deltas to each row.
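The second kind of page is easy to pull out of the CSV the script above writes. A sketch, using the column names from that script; the 20-point threshold is an arbitrary starting point, not a recommendation:

```python
import csv

CATEGORIES = ["seo", "performance", "accessibility", "ai_readability"]


def flag_category_gaps(report_path="audit-report.csv", threshold=20):
    """Yield (url, category, gap) for pages whose category score trails
    the overall score by at least `threshold` points."""
    with open(report_path, newline="") as f:
        for row in csv.DictReader(f):
            overall = float(row["score"])
            for cat in CATEGORIES:
                gap = overall - float(row[cat])
                if gap >= threshold:
                    yield row["url"], cat, gap
```

Run against the example above, a page scoring 85 overall but 58 in accessibility comes back flagged with a 27-point gap.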
To join your audit report to traffic data, export GA4 or Search Console to CSV and merge in pandas:
```python
import pandas as pd

audit = pd.read_csv("audit-report.csv")
traffic = pd.read_csv("ga4-pages.csv")  # columns: url, pageviews
joined = audit.merge(traffic, on="url", how="left").fillna(0)
joined["impact"] = (100 - joined["score"]) * joined["pageviews"]
joined.sort_values("impact", ascending=False).head(50).to_csv("priority.csv", index=False)
```
That gives you a 50-row priority list ranked by expected impact of a fix, not just by raw score. The pages that show up at the top are the ones that are bad and matter.
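One practical snag with that merge: analytics exports rarely match sitemap URLs byte-for-byte (trailing slashes, uppercase hosts, stray fragments), and a left join fails silently on every mismatch. Normalizing both `url` columns before joining avoids that; a sketch, with the normalization rules being our own choices:

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(u):
    """Lowercase the scheme and host, drop fragments, and strip trailing
    slashes so sitemap URLs and analytics-export URLs join cleanly."""
    parts = urlsplit(u.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


# Apply to both frames before merging:
# audit["url"] = audit["url"].map(normalize_url)
# traffic["url"] = traffic["url"].map(normalize_url)
```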
## Scheduling it
Once the script works, the obvious next step is running it weekly. A cron line on any server:
```
0 8 * * 1 /usr/bin/python3 /home/you/audit.py >> /var/log/seoaudit.log 2>&1
```
Or as a GitHub Action that posts the diff to Slack:
```yaml
name: Weekly SEO audit

on:
  schedule:
    - cron: "0 8 * * 1"

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: {python-version: "3.11"}
      - run: pip install requests
      - run: python audit.py
        env:
          SEOSCORE_API_KEY: ${{ secrets.SEOSCORE_API_KEY }}
      - uses: actions/upload-artifact@v4
        with:
          name: audit-report
          path: audit-report.csv
```
The agency-monitor-setup post has a more involved Slack-alerting variant if you want to skip the artifact and just get pinged when scores drop.
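If you only want a ping without that post's full setup, here's a minimal sketch that sends the worst rows to a Slack incoming webhook. The webhook URL, env var name, and message format are all our assumptions, not anything the audit API provides:

```python
import json
import os
import urllib.request


def worst_pages_message(rows, limit=5):
    """Build a Slack text payload from the worst-first `rows` list
    the audit script produces."""
    lines = [f"{r['url']}: {r['score']} ({r['grade']})" for r in rows[:limit]]
    return {"text": "Weekly SEO audit, worst pages:\n" + "\n".join(lines)}


def post_to_slack(payload):
    # SLACK_WEBHOOK_URL comes from Slack's "Incoming Webhooks" app config.
    req = urllib.request.Request(
        os.environ["SLACK_WEBHOOK_URL"],
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=10)
```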
## What you do not want to do
A few traps we see people fall into:
- Auditing the homepage and assuming the rest follows. Templates differ. Product pages and blog posts on the same site routinely score 15+ points apart. If you haven't sampled the long tail, you haven't audited the site.
- Hammering the API with 500 separate `GET /audit` calls instead of using batch. It's slower, it hits rate limits, and on Pro you'll eat through your monthly cap five times faster than you needed to.
- Treating the CSV as a static deliverable. The first audit is a baseline. The value compounds when you run it weekly and watch the trend — which is exactly what the historical endpoints are for.
## Getting started

Pull your sitemap URL, copy the script above, set `SEOSCORE_API_KEY`, and run it. If your sitemap has more than a few hundred URLs, grab a Basic key so you've got room to re-run the audit on a weekly cadence. The first run gives you a baseline; the fourth run is where the trend becomes useful.
If you've got an enterprise sitemap with 10,000+ URLs and want help architecting the right batch size and cadence, the Ultra tier ships with that headroom built in.