llms.txt is a plain-text file you place at the root of your domain that gives large language models structured guidance about your site — what to read, what to skip, and how to understand your content. It is to AI crawlers what robots.txt is to search engine bots, except instead of blocking, it guides.

What problem does llms.txt solve?

When ChatGPT, Perplexity, Claude, or Gemini crawl the web to build their knowledge or answer queries, they have no structured way to understand which pages on your site matter most, what your site is about, or how your content is organized. They treat every page equally and rely entirely on their own parsing. llms.txt gives them a direct signal — a curated index of your most important content written in a format they can process efficiently.

Without it, AI systems may index low-value pages, miss important documentation, or misrepresent what your site does. With it, you have a direct channel to tell AI crawlers: here is what matters, here is how it fits together, and here is the authoritative version of what we do.

What is the llms.txt format?

llms.txt uses a simple Markdown structure. The file lives at https://yourdomain.com/llms.txt and follows this pattern:

# Your Site or Product Name

> One-sentence description of what your site does and who it's for.

## Section Name
- [Page Title](https://yourdomain.com/page): One-line description of what this page covers.
- [Page Title](https://yourdomain.com/other-page): One-line description.

## Another Section
- [Page Title](https://yourdomain.com/docs): One-line description.

The spec is intentionally minimal. A heading gives the AI the site name, a blockquote gives a short description, and then sections list the most important URLs with human-readable descriptions. AI crawlers can fetch this file first and use it to decide which pages to read, understand what the site is about, and see how the pieces fit together.
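
Because the format is so regular, the file can also be generated from a curated page inventory rather than written by hand. Here is a minimal sketch in Python; the site name, description, and PAGES list are hypothetical placeholders, and the output follows the pattern shown above:

# Build an llms.txt file from a hand-curated page inventory.
# SITE_NAME, SITE_DESCRIPTION, and PAGES are hypothetical placeholders.
SITE_NAME = "Your Site or Product Name"
SITE_DESCRIPTION = "One-sentence description of what your site does and who it's for."

# (section, page title, url, one-line description), curated by hand
PAGES = [
    ("Docs", "API Documentation", "https://yourdomain.com/docs", "Full API reference and authentication."),
    ("Docs", "Quickstart", "https://yourdomain.com/quickstart", "Get a key and make your first request."),
    ("Product", "Pricing", "https://yourdomain.com/pricing", "Plans, limits, and overage rules."),
]

def build_llms_txt() -> str:
    # Group pages by section while preserving the curated order.
    sections: dict[str, list[str]] = {}
    for section, title, url, desc in PAGES:
        sections.setdefault(section, []).append(f"- [{title}]({url}): {desc}")
    lines = [f"# {SITE_NAME}", "", f"> {SITE_DESCRIPTION}", ""]
    for section, entries in sections.items():
        lines += [f"## {section}", *entries, ""]
    return "\n".join(lines).rstrip() + "\n"

with open("llms.txt", "w", encoding="utf-8") as f:
    f.write(build_llms_txt())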

What should you include in llms.txt?

A well-structured llms.txt file covers the pages that define what you do, not every page on your site. Good candidates to include are:

  • Your homepage — the primary statement of what your product or service is
  • Product or feature pages that explain your core offering
  • Documentation or API reference that technical users and AI systems both need to understand your capabilities
  • Pricing page — LLMs answering "how much does X cost" need this
  • About or team page — for E-E-A-T and entity attribution
  • Key blog posts or guides that define your domain expertise
  • A changelog or what's new page if you ship frequently

Pages to leave out: tag pages, search result pages, user-generated content pages with thin content, login or account pages, and any page you would not want indexed.

How does llms.txt differ from robots.txt and sitemap.xml?

File           Purpose                                           Audience
robots.txt     Allow or block crawlers from accessing pages      All web crawlers
sitemap.xml    List all indexable URLs with metadata             Search engine crawlers
llms.txt       Curate and describe your most important content   LLM crawlers and AI systems

robots.txt is a permission file. sitemap.xml is an exhaustive index. llms.txt is a curated guide. A site can have all three serving different purposes. robots.txt controls access, sitemap.xml handles discovery at scale, and llms.txt provides context that neither of the others can.
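
As a concrete illustration, the three files can sit side by side at the domain root. Below is a sketch of a robots.txt that allows common AI crawlers and points to the sitemap, with llms.txt living at /llms.txt alongside it. The user-agent names shown are the ones these vendors currently publish, but they change over time, so verify them against each crawler's documentation before relying on this exact list:

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /account/

Sitemap: https://yourdomain.com/sitemap.xml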

How does llms.txt affect your AIO score?

SEO Score API checks for llms.txt as part of the AIO (AI Optimization) score, which measures whether AI systems can discover, parse, trust, and cite your content. The llms.txt check is worth 4 out of 39 possible AIO points, making it the second-highest-weighted check in the AIO category after AI bot access in robots.txt.

A missing llms.txt does not block AI crawlers from your site, but it means they have no structured guidance. Sites with an llms.txt file consistently score higher on AIO because the file signals active intent to be AI-readable, provides direct context for entity attribution, and helps retrieval systems prioritize your best content when building answers.

How do you write a good llms.txt for a SaaS product?

Here is an example for a developer API product:

# SEO Score API

> A developer API for instant on-page SEO audits. Returns a score, grade, and prioritized issue list for any URL. Covers 82 checks across SEO, performance, accessibility, and AI readability.

## Getting Started
- [API Documentation](https://seoscoreapi.com/docs): Full API reference, authentication, and endpoint details.
- [Quickstart](https://seoscoreapi.com/#signup): Sign up for a free API key with 5 audits per day.
- [Pricing](https://seoscoreapi.com/#pricing): Plans from free to $99/mo with 25,000 audits.

## Features
- [SXO, AEO, and AIO Scores](https://seoscoreapi.com/blog/sxo-aeo-aio-scores): Three extended scoring dimensions bundled into every audit on paid plans.
- [Shareable Reports](https://seoscoreapi.com/blog/white-label-seo-reports): Public report pages and SVG badges for every audited domain.
- [SEO Monitoring](https://seoscoreapi.com/blog/seo-monitoring-alerting): Scheduled audits with email alerts on score drops.

## Integrations
- [Python SDK](https://pypi.org/project/seoscoreapi/): pip install seoscoreapi
- [Node.js SDK](https://www.npmjs.com/package/seoscoreapi): npm install seoscoreapi
- [n8n Integration](https://seoscoreapi.com/blog/n8n-seo-automation): No-code SEO automation workflows.

Keep descriptions factual and specific. Avoid marketing language — AI systems extract meaning better from precise, literal descriptions than from adjectives. "Returns a score, grade, and prioritized issue list" is more useful to an LLM than "powerful, industry-leading SEO insights." One sentence per URL is enough; the goal is orientation, not persuasion.
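
Once written, the file only needs to be reachable at the domain root as plain text. On a static host, that usually means dropping llms.txt into the public directory. If the site is served by an application framework instead, a route like the following does the job; Flask is used here purely as an illustrative assumption, and any server that returns the file at /llms.txt behaves the same:

# Serve llms.txt from the domain root with a text/plain content type.
# Flask is only an example framework; the route path is the real requirement.
from flask import Flask, Response

app = Flask(__name__)

@app.route("/llms.txt")
def llms_txt():
    with open("llms.txt", encoding="utf-8") as f:
        return Response(f.read(), mimetype="text/plain")

if __name__ == "__main__":
    app.run()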

Should every website have llms.txt?

Any site that wants to appear in AI-generated answers — product comparisons, how-to queries, definitions, recommendations — should have one. That is increasingly every commercial website. The file takes under 30 minutes to write, requires no infrastructure, and has no downside. The risk of not having one is that AI systems either skip your best content or misrepresent what you do.

For content-heavy sites, agencies, SaaS products, and documentation-driven developer tools, the value is highest because there is more content to curate and more risk of AI crawlers landing on the wrong pages.


Frequently Asked Questions

What is llms.txt?

llms.txt is a plain-text Markdown file placed at the root of a domain that gives AI crawlers and large language models structured guidance about a site's content — what pages matter most, what the site does, and how the content is organized. It is an emerging standard proposed to help AI systems index and cite web content more accurately.

Does llms.txt affect SEO or Google rankings?

No. llms.txt is not read by Googlebot and has no direct effect on traditional search rankings. Its purpose is specifically to guide LLM-based retrieval systems — ChatGPT's browsing, Perplexity's indexing, Claude's web access — rather than search engine crawlers. It is complementary to, not a replacement for, traditional SEO signals.

How long should llms.txt be?

There is no strict limit, but shorter is better. A file covering 10 to 30 of your most important URLs with clear one-line descriptions is more useful than an exhaustive list of every page. AI systems use it as a curated starting point, not a complete sitemap. If you have more content to describe, use sections to organize it logically.

Is llms.txt an official standard?

llms.txt was proposed in 2024 and has been adopted by a growing number of sites and platforms. It is not an IETF or W3C standard, but major AI companies have indicated support for the format. The spec is stable enough to implement now — the cost of adopting it is low and the potential benefit of better AI visibility is real.

How do I check if my llms.txt is working?

Run an AIO audit via SEO Score API — it checks for the presence of llms.txt at your domain root and reports whether it was found. You can also verify directly by visiting https://yourdomain.com/llms.txt in a browser. For deeper validation, check that the file is served with a text/plain content type and is accessible without authentication.
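
If you want to script those checks, here is a small sketch using only the Python standard library. It fetches the file, prints the status code and content type, and confirms the body starts with a Markdown H1:

# Fetch a domain's llms.txt and report the basics worth verifying.
import sys
import urllib.request

def check_llms_txt(domain: str) -> None:
    url = f"https://{domain}/llms.txt"
    req = urllib.request.Request(url, headers={"User-Agent": "llms-txt-check/1.0"})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            print(f"URL:            {url}")
            print(f"Status:         {resp.status}")
            print(f"Content-Type:   {resp.headers.get('Content-Type', 'missing')}")
            print(f"Starts with H1: {body.lstrip().startswith('# ')}")
    except Exception as exc:
        print(f"Could not fetch {url}: {exc}")

if __name__ == "__main__":
    check_llms_txt(sys.argv[1] if len(sys.argv) > 1 else "example.com")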