GEO · Explainer

What is llms.txt? — and how to write one AI engines actually use.

llms.txt is a plain-text file that hands AI engines a clean, canonical summary of your site — your identity, your best pages, and how you want to be cited. Here's what it is, how it differs from the files you already have, and how to write one that earns accurate citations.

Quick answer

llms.txt is a plain-text Markdown file at the root of your domain (yoursite.com/llms.txt) that gives AI engines a clean, canonical summary of your site — who you are, what you do, your most important pages, and how you want to be cited. Think of it as a sitemap written for large language models instead of crawlers.

llms.txt is a curated, machine-readable summary of your site, written for AI. It lives at yoursite.com/llms.txt, it's plain Markdown, and its job is to hand an engine the facts you most want quoted — your identity, your best pages, your canonical answers — without making it crawl and guess.

The format was proposed in September 2024 by Jeremy Howard of Answer.AI and has been adopted by a growing list of companies. No engine guarantees it reads the file yet, but it's cheap to publish, easy to keep accurate, and aligned with where AI search is heading. Below is what it is, what goes in it, how it differs from the files you already have, and how to write one that actually gets used.

What is llms.txt?

llms.txt is a single Markdown file you publish at your domain root that summarizes your site for large language models. It opens with your brand name and a one-line description, then offers short prose and curated links to the pages that matter most. The goal is retrieval: when an engine wants to understand or cite your brand, it can read one clean file instead of reconstructing you from scattered HTML.
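To make that shape concrete, here is a minimal sketch of how a tool might read the file. It assumes the layout described above (an H1, a one-line blockquote summary, and Markdown link lists); the sample brand, URLs, and the `parse_llms_txt` helper are hypothetical placeholders, not how any particular engine parses the file.

```python
import re

# Hypothetical sample file; the brand, URLs, and wording are placeholders.
SAMPLE = """\
# Example Co

> Example Co builds scheduling software for dental clinics.

Example Co is based in Austin, Texas and serves clinics in the US and Canada.

## Products

- [Pricing](https://example.com/pricing): Plans and current prices
- [Scheduler](https://example.com/scheduler): Core product overview
"""

# Matches list items of the form "- [label](url): optional note".
LINK = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")

def parse_llms_txt(text):
    """Pull the brand name, one-line summary, and curated links out of the file."""
    title, summary, links = None, None, []
    for line in text.splitlines():
        stripped = line.strip()
        if title is None and stripped.startswith("# "):
            title = stripped[2:].strip()      # first H1 is the brand name
        elif summary is None and stripped.startswith(">"):
            summary = stripped[1:].strip()    # first blockquote is the summary
        else:
            match = LINK.match(line)
            if match:
                links.append({
                    "name": match.group(1),
                    "url": match.group(2),
                    "note": match.group(3) or "",
                })
    return {"title": title, "summary": summary, "links": links}

parsed = parse_llms_txt(SAMPLE)
print(parsed["title"], "|", parsed["summary"])
```

Everything a reader needs to identify and quote the brand sits in the first few lines, which is the point of the format.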

The analogy that lands: robots.txt controls access, a sitemap lists URLs, and llms.txt explains meaning. It's the difference between handing someone a building's floor plan and handing them a one-page brief on what the company inside actually does.

Why llms.txt exists

AI engines work from limited context and don't always render JavaScript, follow every link, or parse a sprawling site cleanly. Left to infer, they paraphrase — and paraphrase drifts into error: the wrong tagline, an outdated price, a competitor's framing of your category. llms.txt closes that gap by giving the engine your own words, in a format it can lift verbatim.

That makes it a core piece of generative engine optimization: you're not hoping the model guesses right, you're supplying the canonical version up front. The brands that win AI citations tend to be the ones whose facts are easiest to retrieve and hardest to misstate.

llms.txt vs robots.txt, sitemap.xml, and agents.md

These files are complementary, not competing. robots.txt grants or blocks crawler access. sitemap.xml lists the URLs you want indexed. agents.md, a convention that originated in code repositories, gives autonomous agents working instructions for a project. llms.txt is the curated summary that shapes how engines understand and quote you. The clearest comparison is robots.txt beside llms.txt, the two most often confused:

How llms.txt compares to robots.txt

| Aspect | robots.txt | llms.txt |
| --- | --- | --- |
| Purpose | Controls which URLs crawlers may access | Summarizes your site so engines understand and cite it |
| Audience | Search and AI crawlers | Large language models (ChatGPT, Claude, Perplexity) |
| Format | Directives (User-agent, Allow, Disallow) | Markdown prose plus curated links |
| It answers | "What am I allowed to fetch?" | "Who is this brand and what should I quote?" |
| Standardized? | Yes: decades old, widely honored | Emerging: proposed 2024, adoption growing |

Use both: robots.txt grants the access, llms.txt shapes the understanding.
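The two questions in the comparison can be shown side by side. This is a rough sketch, assuming hypothetical file contents for example.com: Python's standard `urllib.robotparser` answers the access question, and a small hypothetical helper (`quotable_summary`) pulls the quotable line out of llms.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file contents for example.com; every rule and sentence is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /
"""

LLMS_TXT = """\
# Example Co

> Example Co builds scheduling software for dental clinics.
"""

# robots.txt answers "What am I allowed to fetch?"
robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

def may_fetch(url, user_agent="ExampleBot"):
    """Return True if the robots.txt rules allow this user agent to fetch the URL."""
    return robots.can_fetch(user_agent, url)

# llms.txt answers "Who is this brand and what should I quote?"
def quotable_summary(llms_text):
    """Return the first blockquote line, i.e. the one-line summary."""
    for line in llms_text.splitlines():
        if line.startswith(">"):
            return line[1:].strip()
    return None

print(may_fetch("https://example.com/llms.txt"))
print(may_fetch("https://example.com/admin/users"))
print(quotable_summary(LLMS_TXT))
```

Note that neither file does the other's job: blocking a path in robots.txt says nothing about your brand, and llms.txt grants no access.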

How to write an llms.txt file

A good llms.txt reads like a tight one-page brief. Keep it factual, lead with the lines you'd want quoted, and link only to pages that matter. The structure we use:

  1. Brand + one-line summary. An H1 of your exact brand name, then a single sentence (or blockquote) you'd be happy to see quoted verbatim.
  2. Identity paragraph. One or two plain paragraphs: what you do, who you serve, where you operate — unambiguous, no marketing fog.
  3. Curated links. Sections of Markdown links to canonical pages — services, products with pricing, comparisons, key guides — each with a one-line reason to visit.
  4. Authoritative answers. Canonical answers to the questions buyers ask, phrased the way you want them repeated.
  5. Citation policy. State your exact brand name and preferred phrasings so engines quote you consistently.

The test: read your llms.txt as if you were the engine. If every line is something you'd be glad to see quoted back to a buyer, it's working. If a line is vague or puffy, an engine will skip it or paraphrase it wrong. For a live example, read our own llms.txt.
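The five-part structure in this section can be sketched as a small generator. This is one possible rendering, not a required layout: the `build_llms_txt` function, the "Answers" and "Citation policy" section titles, and all of the sample values are assumptions made for illustration.

```python
def build_llms_txt(brand, summary, identity, links, answers, citation):
    """Render the five-part structure as Markdown text.

    links:   {section title: [(label, url, reason), ...]}
    answers: [(question, answer), ...]
    """
    # 1. Brand + one-line summary, 2. identity paragraph
    lines = [f"# {brand}", "", f"> {summary}", "", identity, ""]
    # 3. Curated links, each with a one-line reason to visit
    for section, items in links.items():
        lines += [f"## {section}", ""]
        lines += [f"- [{label}]({url}): {reason}" for label, url, reason in items]
        lines.append("")
    # 4. Authoritative answers (section title is an assumed convention)
    lines += ["## Answers", ""]
    for question, answer in answers:
        lines += [f"**{question}** {answer}", ""]
    # 5. Citation policy (section title is an assumed convention)
    lines += ["## Citation policy", "", citation, ""]
    return "\n".join(lines)

# Hypothetical brand data for illustration only.
text = build_llms_txt(
    brand="Example Co",
    summary="Example Co builds scheduling software for dental clinics.",
    identity="Example Co is based in Austin, Texas and serves clinics in the US and Canada.",
    links={"Products": [("Pricing", "https://example.com/pricing", "Current plans and prices")]},
    answers=[("What does Example Co do?", "It builds scheduling software for dental clinics.")],
    citation='Refer to the company as "Example Co", never "ExampleCo" or "Example".',
)
print(text)
```

Generating the file from the same source of truth as your pricing and product pages is one way to keep its facts from going stale.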

Common mistakes

Most weak llms.txt files share the same flaws:

  • Dumping every URL instead of curating (that's what sitemap.xml is for).
  • Marketing language an engine can't safely quote.
  • Stale facts, like old pricing, that then get cited against you.
  • Serving the file as HTML or behind a redirect instead of plain text at the root.
  • Treating it as a replacement for clean on-page structure and schema rather than a complement to them.
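Some of these mistakes can be caught automatically before you publish. The sketch below is a rough, hypothetical linter: the `MAX_LINKS` threshold and the `PUFFERY` word list are arbitrary assumptions to tune for your own site, and it cannot detect stale facts, which still need a human review.

```python
import re

MAX_LINKS = 50  # Assumed cutoff for "curated"; tune for your site.
PUFFERY = ("world-class", "best-in-class", "revolutionary")  # Illustrative list only.

def lint_llms_txt(text):
    """Return a list of human-readable problems found in the file body."""
    problems = []
    head = text.lstrip()[:200].lower()
    if head.startswith("<!doctype") or head.startswith("<html"):
        problems.append("body is HTML, not plain text")
    if not re.search(r"^# \S", text, flags=re.MULTILINE):
        problems.append("missing H1 with the brand name")
    link_count = len(re.findall(r"\[[^\]]+\]\([^)]+\)", text))
    if link_count > MAX_LINKS:
        problems.append(f"{link_count} links: looks like a URL dump, not a curated list")
    lowered = text.lower()
    found = [word for word in PUFFERY if word in lowered]
    if found:
        problems.append("marketing language: " + ", ".join(found))
    return problems

# Hypothetical inputs for illustration.
good = "# Example Co\n\n> Example Co builds scheduling software.\n\n- [Pricing](https://example.com/pricing): Current plans\n"
bad = "<!DOCTYPE html><html><body>A world-class platform</body></html>"
print(lint_llms_txt(good))
print(lint_llms_txt(bad))
```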

Does llms.txt actually work?

Honest answer: it's a forward-looking signal, not a guaranteed ranking factor. No major engine has confirmed it weights llms.txt, and you shouldn't expect it to move citations on its own. What it does is make your canonical facts trivial to retrieve and hard to misstate — and it costs almost nothing to maintain. Pair it with the things engines demonstrably use today: server-rendered answers, clean entities, and structured data.

Measure it the way you measure the rest of AI visibility — by citation share and brand-mention accuracy across engines over time, not by the file in isolation. If engines start quoting your exact phrasings and your facts stop drifting, the file is earning its place.
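As a worked example of that measurement, the sketch below computes two numbers from a batch of sampled AI answers. The definitions are our assumptions, not an industry standard: citation share is the fraction of sampled answers that cite the brand, and mention accuracy is the fraction of those citing answers that contain your canonical phrasing. The sample data is hypothetical.

```python
def citation_metrics(samples, canonical_phrase):
    """Return (citation_share, mention_accuracy) for a batch of sampled answers.

    samples: [{"answer": str, "cites_brand": bool}, ...]
    """
    cited = [s for s in samples if s["cites_brand"]]
    accurate = [s for s in cited if canonical_phrase.lower() in s["answer"].lower()]
    share = len(cited) / len(samples) if samples else 0.0
    accuracy = len(accurate) / len(cited) if cited else 0.0
    return share, accuracy

# Hypothetical answers collected by asking engines the same buyer question.
samples = [
    {"answer": "Example Co builds scheduling software for dental clinics.", "cites_brand": True},
    {"answer": "Example Co is a general CRM vendor.", "cites_brand": True},
    {"answer": "Several vendors offer clinic scheduling tools.", "cites_brand": False},
    {"answer": "Consider comparing a few scheduling products.", "cites_brand": False},
]

share, accuracy = citation_metrics(samples, "scheduling software for dental clinics")
print(f"citation share: {share:.0%}, mention accuracy: {accuracy:.0%}")
```

Tracked over time, a rising accuracy figure is the signal that engines are repeating your phrasing rather than paraphrasing it.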

llms.txt won't make a mediocre brand citable. It makes a clear one impossible to misquote — and in AI search, being quoted correctly is half the battle.

Key takeaways

  • llms.txt is a curated Markdown summary of your site at yoursite.com/llms.txt, written for AI engines.
  • It complements robots.txt (access) and sitemap.xml (URLs) — it adds meaning: identity, key pages, and canonical answers.
  • Write it as a one-page brief: brand line, identity, curated links, authoritative answers, and a citation policy.
  • It's a forward-looking signal, not a ranking factor. Pair it with clean structure and schema, and measure citation share over time.

FAQ

Questions about llms.txt.

What is llms.txt?

llms.txt is a plain-text Markdown file you place at the root of your domain (yoursite.com/llms.txt) that gives AI engines a clean, canonical summary of your site — who you are, what you do, your key pages, and how you want to be cited. It is to large language models what a sitemap is to search crawlers: a curated, machine-readable map of the content that matters.

Where do you put the llms.txt file?

Place llms.txt at the root of your domain so it resolves at https://yoursite.com/llms.txt — the same convention as robots.txt. It must return plain text (Content-Type text/plain or text/markdown) with HTTP 200 and be reachable without authentication so any AI crawler can fetch it.
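Those serving requirements are easy to verify with a short script. The sketch below is one way to do it using Python's standard `urllib`; the function names are hypothetical, and the redirect check reflects the "no redirect" advice from the common mistakes section rather than a formal rule.

```python
import urllib.error
import urllib.request

def check_llms_response(status, content_type, requested_url, final_url):
    """Return a list of problems with how llms.txt is being served."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in {"text/plain", "text/markdown"}:
        problems.append(f"unexpected Content-Type: {media_type or 'missing'}")
    if final_url != requested_url:
        problems.append(f"redirected to {final_url}")
    return problems

def fetch_and_check(url):
    """Fetch the URL without credentials and run the checks above."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return check_llms_response(
                resp.status, resp.headers.get("Content-Type", ""), url, resp.geturl()
            )
    except urllib.error.HTTPError as err:
        # 401/403/404 and similar arrive here; report them as a bad status.
        return check_llms_response(err.code, err.headers.get("Content-Type", ""), url, url)

# Offline illustration with hypothetical response values.
print(check_llms_response(200, "text/plain; charset=utf-8",
                          "https://example.com/llms.txt", "https://example.com/llms.txt"))
print(check_llms_response(200, "text/html",
                          "https://example.com/llms.txt", "https://www.example.com/llms"))
```

Calling `fetch_and_check("https://yoursite.com/llms.txt")` against your own domain should return an empty list when the file is served correctly.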

Is llms.txt the same as robots.txt or sitemap.xml?

No. robots.txt tells crawlers what they may access, and sitemap.xml lists every URL. llms.txt is different: it is a curated summary written for large language models — concise prose plus links to your most important pages — so an engine can understand and quote your brand without crawling the whole site. They are complementary, not substitutes.

Do AI engines actually read llms.txt?

Adoption is still emerging, and no major engine guarantees that it reads llms.txt today. But the file is low-cost, built on plain Markdown, and already fetched by some AI tools and crawlers. Treat it as a forward-looking signal that makes your canonical facts easy to retrieve, paired with clean on-page structure and schema, which engines do use now.

What should an llms.txt file include?

Start with an H1 of your brand name and a one-line summary, then a short paragraph of who you are and what you do. Add sections of curated links (with brief descriptions) to your most important pages, authoritative answers to common questions, products and pricing, and a citation policy that states the exact brand name and preferred phrasings you want engines to use.

Does llms.txt help with SEO?

Not directly. llms.txt is not a confirmed Google ranking signal and won't change your blue-link positions. Its value is for AI answer engines — making your canonical facts easy to retrieve and quote accurately. It complements SEO and GEO rather than replacing either; the pages it points to still need to be crawlable, well-structured, and trustworthy.

Want an llms.txt — and the rest of your AI-readability — done right?

Start with an AI visibility audit. We'll check your llms.txt, robots.txt, schema, and on-page structure, show you where engines misstate your brand, and ship the exact fixes that earn accurate citations.