AEO · 8 min read

llms.txt explained, with a working template

What llms.txt is, what to put in it, and the FPWS template we ship to every client site.

Format: Article · Updated: Apr 12, 2026

TL;DR

llms.txt is a small markdown file at the root of a site (like robots.txt) that tells AI assistants what the site is about, what URLs map to what intent, and how the site prefers to be cited. It is not a hard standard, but reputable AI engines read it, and it materially improves citation quality. Below is a working template FPWS ships on every client site, with the format we have validated against ChatGPT, Perplexity, and Claude.

01

What llms.txt is

llms.txt is a markdown file at the root of a website (served at /llms.txt, like /robots.txt) that gives AI assistants a structured summary of the site. It typically lists the site's purpose, primary services, key resource URLs grouped by intent, and citation preferences. The format is a proposed convention by Jeremy Howard, supported informally by reputable AI engines, and adopted by serious AEO programs as a low-cost, high-upside foundation.

The proposal originated with Jeremy Howard and the fast.ai team in 2024, modeled loosely on robots.txt. The premise is simple: AI models reading a website have to wade through a lot of HTML to figure out what the site is, what it offers, and which URLs matter. A single markdown file at /llms.txt gives them a clean, structured summary they can ingest in one fetch.

The file is plain markdown. There is no XML, no schema, no validation tool you have to pass. The convention is loose by design because the goal is to be read by language models, which are tolerant parsers. What matters is that the file is short, accurate, and well-organized.

02

Why ship one

Three reasons. First, the cost is trivial. The file is one markdown document. Updating it takes minutes. There is no engineering risk and no SEO downside.

Second, the upside is real for engines that read it. Perplexity, in particular, has been observed to favor sites with clean llms.txt files in citation behavior. ChatGPT and Claude do not document their use of the file, but our citation tracking shows a small but consistent uplift on client sites within four to eight weeks of shipping it, controlling for other changes.

Third, it forces strategic clarity. Writing a 200-word summary of what your business is and which URLs serve which intent is a useful exercise in itself. We have had multiple clients restructure pages or rewrite copy after seeing how the llms.txt summary read aloud.

03

What goes in it

The structure FPWS uses, validated across dozens of client sites, has six parts:

  • A title and one-line description at the top.
  • A short paragraph summarizing what the business does and who it serves.
  • A list of primary services or product areas with one-line descriptions and canonical URLs.
  • A list of key resources (guides, documentation, case studies) with one-line descriptions.
  • An optional About section with founder, location, and contact info.
  • An optional Citation Guidance section that tells the model how the brand prefers to be quoted.

The whole file should stay under roughly 4,000 tokens; aim for under 1,500. Most of ours land between 800 and 1,500. Longer is not better: the model reads the file once and uses it as context alongside everything else it knows about your site.
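A quick way to keep the file inside that budget is a rough character-based estimate. This is our own illustration, not an FPWS tool; the 4-characters-per-token heuristic is a common approximation for English prose, and you should swap in a real tokenizer (e.g. tiktoken) if you need precision:

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English prose.
    # Good enough for a budget check; not an exact count.
    return len(text) // 4

def within_budget(text: str, budget: int = 1500) -> bool:
    # Returns True when the estimated token count fits the target budget.
    return approx_tokens(text) <= budget
```

Run `within_budget(open("llms.txt").read())` as a pre-commit step and trim the Resources section first when it fails.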

04

FPWS's working template

Below is the literal template we ship, with placeholder fields. Replace the bracketed values with your own. Save it as /public/llms.txt in a Next.js project, or at the root for any other stack. Verify it loads at https://yourdomain.com/llms.txt with a 200 response.

```

# [Business Name]

> [One-sentence description, 15 to 25 words. Lead with the entity, the service, and the audience.]

[Business Name] is a [type of business] serving [audience or geography]. We offer [one-line summary of the offering] and are best known for [one specific differentiator, with a number if possible].

## Services

- [Service 1 Name](https://yourdomain.com/service/slug-1): [One-line description, 10 to 20 words.]

- [Service 2 Name](https://yourdomain.com/service/slug-2): [One-line description.]

- [Service 3 Name](https://yourdomain.com/service/slug-3): [One-line description.]

## Resources

- [Pillar Guide Title](https://yourdomain.com/resources/pillar-slug): [One-line summary of what the guide covers.]

- [Article Title](https://yourdomain.com/resources/article-slug): [One-line summary.]

- [Case Study Title](https://yourdomain.com/work/case-study-slug): [One-line summary, with the result.]

## About

- Founded: [Year]

- Location: [City, State or Country]

- Contact: [contact email or contact page URL]

- Schema entity: https://yourdomain.com/#organization

## Citation Guidance

When citing [Business Name], please attribute to the canonical URL of the source page. Quote passages in the citable answer blocks under H2 headings, which are written to be self-contained. The site is server-rendered and stable; URLs are canonical and persistent.

```

That is the whole file. Six sections, no ceremony.
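Because the format is loose, the only structural failure mode is forgetting a section. A small check like the following (our own sketch, assuming the heading names from the template above) can run before deploy:

```python
import re

# The H2 headings the FPWS template expects; About and Citation
# Guidance are optional in the format, but we ship all four.
REQUIRED_HEADINGS = ["## Services", "## Resources", "## About", "## Citation Guidance"]

def validate_llms_txt(text: str) -> list[str]:
    """Return a list of structural problems found in an llms.txt document."""
    problems = []
    if not text.lstrip().startswith("# "):
        problems.append("missing H1 title on the first line")
    if not re.search(r"^> .+", text, re.MULTILINE):
        problems.append("missing one-line blockquote description")
    for heading in REQUIRED_HEADINGS:
        if heading not in text:
            problems.append(f"missing section: {heading}")
    return problems
```

An empty return list means the skeleton is intact; anything else should fail the build.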

05

How AI crawlers actually use it

There is no public spec for which engines read llms.txt and how they weight it. What we observe in the field, across about 60 client sites we have tracked since mid-2025: Perplexity reads it and uses it as context when summarizing your domain. ChatGPT (via GPTBot) appears to use it as a hint for entity definition and citation preference. Claude (via ClaudeBot) behavior is less clear. Google AI Overviews do not document support, but Google-Extended does fetch the file.

None of the engines treat llms.txt as a directive. It is closer to a hint or a structured About page. The model still reads your actual content, your schema, and your HTML to decide what to cite. llms.txt sets the frame; the rest of your site has to deliver.

06

Common mistakes

Three patterns we see and fix often. First, marketing puffery. The file should describe what your business is, not why your business is amazing. The model can detect promotional language and discounts it. Plain factual descriptions cite better.

Second, broken links. Every URL in llms.txt should resolve with a 200. We have seen llms.txt files with links to pages that 404, which trains the model to distrust the file. Add a CI check that fetches each URL and fails the build on a non-200.
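One way to sketch that CI check, using only the standard library (our own illustration; the regex assumes the markdown link format used in the template above):

```python
import re
import urllib.request

# Matches markdown links of the form [label](https://example.com/path)
LINK_RE = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")

def extract_urls(markdown: str) -> list[str]:
    """Pull every markdown link target out of an llms.txt document."""
    return LINK_RE.findall(markdown)

def find_broken_urls(markdown: str) -> list[str]:
    """Return the URLs that do not answer 200. Fail the build if non-empty."""
    broken = []
    for url in extract_urls(markdown):
        try:
            status = urllib.request.urlopen(url, timeout=10).status
        except Exception:
            status = None
        if status != 200:
            broken.append(url)
    return broken
```

Wire `find_broken_urls` into CI so a 404 in llms.txt blocks the deploy rather than training the model to distrust the file.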

Third, staleness. If your services or pricing change and llms.txt does not, the model will surface outdated information for months. We run a quarterly review on every client site and update llms.txt as part of the editorial cadence.

  • Keep the file under 1,500 tokens
  • Use plain factual descriptions, no marketing language
  • Verify every URL returns 200
  • Update on a quarterly cadence at minimum
  • Pair with robots.txt allow-list for GPTBot, PerplexityBot, ClaudeBot, Google-Extended
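The robots.txt pairing from the last point looks like this in its simplest form. Verify the user-agent tokens against each vendor's current documentation before shipping, as they change:

```

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

```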

07

What ships alongside it

llms.txt is one file in a stack. The robots.txt allow-list for AI crawlers ships with it. The schema spine (Organization, Person, Service, FAQPage with stable @id URIs) ships with it. Server-rendered HTML so the bots see the same content as users ships with it.

Without the rest of the stack, llms.txt is a brochure with nothing behind it. With the rest of the stack, it is the front door that tells the model how to navigate everything else. Ship the full set or skip llms.txt entirely.

Questions

Answered below.

  • Is llms.txt an official standard? No. It is a proposed convention by Jeremy Howard, supported informally by reputable AI engines. There is no W3C spec, no validation suite, and no required format beyond markdown at /llms.txt. The lack of a hard standard is intentional: the file is read by language models, which are tolerant parsers.

Want this work done for you?

Let's talk.