Markdown for AI

Markdown is the native language of large language models. They were trained on it, they write it back to you, and a growing set of AI instruction files is written in it. Here is why, and how to use it well.

Covers prompt structure, token efficiency, and four files that AI workflows increasingly rely on: README.md, AGENTS.md, llms.txt, and CLAUDE.md.

The big idea

Markdown is the lingua franca between you and the model

They were trained on it

Models ingested vast amounts of Markdown: GitHub READMEs, Stack Overflow answers, docs, forum posts. The syntax is deeply familiar, so they read and produce it reliably.

It is token-efficient

A heading is one # character; bold is a pair of asterisks on each side of the text. The same structure in HTML needs opening and closing tags. Fewer tokens means more of your context budget goes to actual content.

It encodes structure

Headings, lists, and code fences map directly to the document hierarchy a model reasons over. Plain text loses that structure; Markdown keeps it human-readable and machine-parseable at once.

Why LLMs Prefer Markdown

Ask ChatGPT, Claude, or Gemini a question and the answer comes back with headings, bold text, bullet lists, and fenced code. That is not a coincidence. Markdown sits at the intersection of three things a model cares about: it is familiar from training, it is cheap in tokens, and it carries structure that plain text throws away.

Tokens: HTML vs Markdown

Every character you send to a model costs tokens, and your context window is finite. HTML wraps each element in opening and closing tags; Markdown encodes the same structure with one or two characters. The two snippets below describe the exact same content — but the Markdown version is markedly smaller, which means more of the budget is left for the content that matters.

HTML (heavier, more tokens)
<h2>Installation</h2>
<p>Run the following command:</p>
<pre><code>npm install acme</code></pre>
<ul>
  <li>Requires Node 18+</li>
  <li>Works on macOS and Linux</li>
</ul>

Markdown (lighter, fewer tokens)
## Installation

Run the following command:

```
npm install acme
```

- Requires Node 18+
- Works on macOS and Linux

Rule of thumb: the same document is often 30–50% smaller in tokens as Markdown than as HTML, though the exact saving depends on how dense the markup is and on the tokenizer. That is why many tools that feed web pages to models (scrapers, RAG pipelines, AI browsers) convert HTML to Markdown first.
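To make the comparison concrete, here is a minimal Python sketch that counts rough "tokens" in the two snippets above. The `rough_token_count` helper is a hypothetical stand-in, not a real model tokenizer: it simply counts words and punctuation marks, so treat the numbers as a proxy only. The Markdown fence is built with `"`" * 3` so the sample can live inside a fenced block.

```python
import re

FENCE = "`" * 3  # a literal triple-backtick fence, built indirectly

HTML_SNIPPET = """<h2>Installation</h2>
<p>Run the following command:</p>
<pre><code>npm install acme</code></pre>
<ul>
  <li>Requires Node 18+</li>
  <li>Works on macOS and Linux</li>
</ul>"""

MARKDOWN_SNIPPET = "\n".join([
    "## Installation",
    "",
    "Run the following command:",
    "",
    FENCE,
    "npm install acme",
    FENCE,
    "",
    "- Requires Node 18+",
    "- Works on macOS and Linux",
])


def rough_token_count(text: str) -> int:
    """Count words and individual punctuation marks.

    This is an assumption-laden proxy: real tokenizers split text
    differently, so only the relative sizes are meaningful.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


if __name__ == "__main__":
    html_tokens = rough_token_count(HTML_SNIPPET)
    md_tokens = rough_token_count(MARKDOWN_SNIPPET)
    saving = 1 - md_tokens / html_tokens
    print(f"HTML: {html_tokens}, Markdown: {md_tokens}, saving: {saving:.0%}")
```

For real budgeting, swap the proxy for the tokenizer of the model you actually call; the ratio will differ from this estimate.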

Structuring Prompts with Markdown

For anything beyond a one-line question, structure helps. Use a top-level heading for the goal, sub-headings for each part of the task, bulleted lists for rules, and fenced blocks or blockquotes for the content the model should act on. This separates instructions from content and gives the model an outline to follow.

A well-structured prompt
# Role

You are a senior technical editor.

## Task

Rewrite the text below to be clearer and more concise.

## Rules

- Keep the original meaning
- Use active voice
- Target a 9th-grade reading level

## Text to edit

> The utilization of verbose terminology frequently
> impedes comprehension on the part of the reader.

How the model sees the structure

Role

You are a senior technical editor.

Task

Rewrite the text below to be clearer and more concise.

Rules

  • Keep the original meaning
  • Use active voice
  • Target a 9th-grade reading level

Text to edit

The utilization of verbose terminology frequently
impedes comprehension on the part of the reader.

Why it works: the headings tell the model what the goal is versus what the rules are; the blockquote clearly marks the text to edit, so the model does not mistake it for another instruction.
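If you assemble prompts in code, the same layout can be generated rather than typed by hand. The sketch below is one possible way to do it; `build_prompt` and its parameter names are illustrative, not part of any library.

```python
from typing import Iterable


def build_prompt(role: str, task: str, rules: Iterable[str], text: str) -> str:
    """Assemble a Markdown prompt: headings for sections, bullets for
    rules, and a blockquote around the text the model should act on."""
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    # Prefix every line so the whole passage stays inside the blockquote.
    quoted = "\n".join(f"> {line}" if line else ">" for line in text.splitlines())
    return (
        f"# Role\n\n{role}\n\n"
        f"## Task\n\n{task}\n\n"
        f"## Rules\n\n{rule_lines}\n\n"
        f"## Text to edit\n\n{quoted}\n"
    )


if __name__ == "__main__":
    print(build_prompt(
        role="You are a senior technical editor.",
        task="Rewrite the text below to be clearer and more concise.",
        rules=["Keep the original meaning", "Use active voice"],
        text="The utilization of verbose terminology frequently\n"
             "impedes comprehension on the part of the reader.",
    ))
```

Keeping the instructions in headings and the user-supplied text in a blockquote means the two never share a paragraph, whatever the input contains.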

The AI Markdown Files

A new family of Markdown files has appeared specifically to talk to AI. They share one idea: a predictable filename and plain Markdown content that a model can read without guessing. Here is how the main ones fit together.

| File | Where it lives | Read by | Purpose |
| --- | --- | --- | --- |
| README.md | Repository root | Humans first, agents as fallback | Project overview, quick start, and contribution basics. |
| AGENTS.md | Repo root + nested per package | Coding agents (Codex, Cursor, Copilot, Jules, Aider…) | Build, test, and style instructions an agent needs to work on the code. |
| CLAUDE.md | Repo root and ~/.claude/ | Claude Code | Claude-specific project guidance. Claude Code also reads AGENTS.md. |
| llms.txt | Site root (/llms.txt) | AI search engines & chatbots at inference time | A curated, low-noise Markdown map of a site's most important pages. |
| .cursorrules / .windsurfrules | Repo root | Single-tool (legacy) | Tool-specific rules. Largely superseded by the shared AGENTS.md format. |

AGENTS.md — a README for coding agents

AGENTS.md is a plain Markdown file at the root of a repository that tells AI coding agents how to work on your project: the build and test commands, code-style conventions, and any gotchas. There are no required fields — use whatever headings make sense. Over 60,000 open-source projects ship one, and it is read by Codex, Cursor, Copilot, Jules, Aider, and most other agents.

Example AGENTS.md
# AGENTS.md

## Project overview
A Next.js app for tracking habits. TypeScript everywhere.

## Build & test
- `npm run dev` — start the dev server
- `npm run build` — production build (must pass before commit)
- `npm test` — run the Vitest suite

## Code style
- Use functional components and hooks
- Prefer named exports
- Run `npm run lint` before finishing

## PR instructions
- Title format: [area] Short description
- All tests and lint must be green

Monorepo tip: drop a nested AGENTS.md inside each package. Agents read the nearest file in the directory tree, so the AGENTS.md closest to the file being edited takes precedence, and an explicit chat instruction overrides everything.
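The "closest file wins" rule can be sketched as a walk up the directory tree. This is an illustration of the lookup described above, not the code any particular agent runs; `nearest_agents_file` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def nearest_agents_file(edited_file: Path, repo_root: Path) -> Optional[Path]:
    """Return the AGENTS.md closest to `edited_file`, searching upward
    from its directory and stopping at `repo_root`."""
    repo_root = repo_root.resolve()
    directory = edited_file.resolve().parent
    while True:
        candidate = directory / "AGENTS.md"
        if candidate.is_file():
            return candidate
        # Stop at the repository root (or the filesystem root, in case
        # the file lives outside the repository).
        if directory == repo_root or directory == directory.parent:
            return None
        directory = directory.parent


if __name__ == "__main__":
    # Hypothetical layout: a file under packages/api resolves to
    # packages/api/AGENTS.md if it exists, otherwise to the root file.
    print(nearest_agents_file(Path("packages/api/src/main.ts"), Path(".")))
```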

llms.txt — a curated map for AI search

llms.txt is a Markdown file at your site root that gives AI tools a short, hand-picked list of your most useful pages — small enough to fit in a context window. The spec is precise: an H1 with the project name (the only required part), an optional blockquote summary, then H2 sections containing lists of links. A special Optional section marks links that can be skipped when a shorter context is needed.

Example /llms.txt
# Acme Docs

> Acme is an open-source toolkit for building
> data pipelines in Python.

## Docs

- [Quick start](https://acme.dev/start.md): Install and run your first pipeline
- [API reference](https://acme.dev/api.md): Every public class and method

## Examples

- [ETL walkthrough](https://acme.dev/etl.md): A complete end-to-end pipeline

## Optional

- [Changelog](https://acme.dev/changelog.md): Full release history

Not a sitemap: a sitemap lists every page for search engines. llms.txt is the opposite: a deliberately small, curated overview meant to be read at inference time, when a chatbot is answering a question about your site. It is a proposal (from Answer.AI, 2024), not yet an official standard, but it is cheap to add and supported by a growing number of documentation tools.
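Because the layout is so regular (one H1, an optional blockquote, H2 sections of links), a small parser is enough to read it. The following is a minimal sketch under that description only; it is not a reference implementation of the proposal, and `parse_llms_txt`, `LlmsTxt`, and `essential_links` are hypothetical names.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Matches "- [Title](url): optional note"
LINK = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$")
Link = Tuple[str, str, str]  # (title, url, note)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    sections: Dict[str, List[Link]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    doc = LlmsTxt()
    current = None           # name of the H2 section being read
    summary_lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith(">") and current is None:
            summary_lines.append(line.lstrip("> ").strip())
        elif current is not None:
            match = LINK.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    if not doc.title:
        raise ValueError("llms.txt needs an H1 with the project name")
    doc.summary = " ".join(summary_lines)
    return doc


def essential_links(doc: LlmsTxt) -> List[Link]:
    """Drop the 'Optional' section, for when a shorter context is needed."""
    return [link for name, links in doc.sections.items()
            if name != "Optional" for link in links]


SAMPLE = """# Acme Docs

> Acme is an open-source toolkit for building
> data pipelines in Python.

## Docs

- [Quick start](https://acme.dev/start.md): Install and run your first pipeline

## Optional

- [Changelog](https://acme.dev/changelog.md): Full release history
"""

if __name__ == "__main__":
    parsed = parse_llms_txt(SAMPLE)
    print(parsed.title, "|", parsed.summary)
    print(essential_links(parsed))
```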

Markdown in AI Tools

Whether you notice it or not, the AI tools you use every day run on Markdown end to end.

Chat assistants

ChatGPT, Claude, and Gemini write their replies in Markdown, and the chat interface renders it. That is where the headings, bold text, and code blocks in their answers come from. Send Markdown in, and the model reads your structure the same way.

Coding agents

Cursor, Copilot, Codex, and Claude Code read Markdown instruction files (AGENTS.md, CLAUDE.md) and write Markdown commit messages, PR descriptions, and docs.

RAG & scrapers

Retrieval pipelines and AI web tools convert HTML to clean Markdown before sending it to a model — it is smaller, structured, and free of navigation and ad noise.
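As a rough illustration of that conversion step, the sketch below turns a tiny HTML subset into Markdown using only Python's standard `html.parser`. It assumes flat, well-formed input and handles just headings, paragraphs, list items, and `pre` blocks; the `NOISE` tag set and the `html_to_markdown` name are assumptions for this example. Production pipelines typically rely on a dedicated converter instead.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BLOCKS = HEADINGS | {"p", "li", "pre"}
NOISE = {"script", "style", "nav", "footer"}  # assumed "noise" containers
FENCE = "`" * 3


class MiniMarkdown(HTMLParser):
    def __init__(self):
        super().__init__()
        self.lines = []       # finished Markdown lines
        self.buf = []         # text of the block being read
        self.block = None     # tag name of the open block, if any
        self.noise_depth = 0  # > 0 while inside a noise container

    def handle_starttag(self, tag, attrs):
        if tag in NOISE:
            self.noise_depth += 1
        elif tag in BLOCKS and self.noise_depth == 0:
            self.block, self.buf = tag, []

    def handle_data(self, data):
        if self.block and self.noise_depth == 0:
            self.buf.append(data)

    def handle_endtag(self, tag):
        if tag in NOISE:
            self.noise_depth = max(0, self.noise_depth - 1)
        elif tag == self.block:
            self._emit("".join(self.buf))
            self.block = None

    def _emit(self, text):
        if self.block == "pre":
            body = f"{FENCE}\n{text.strip()}\n{FENCE}"
        else:
            text = " ".join(text.split())  # collapse whitespace
            if not text:
                return
            if self.block in HEADINGS:
                body = "#" * int(self.block[1]) + " " + text
            elif self.block == "li":
                body = "- " + text
            else:
                body = text
        # Consecutive list items stay adjacent; other blocks get a blank line.
        tight = self.block == "li" and self.lines and self.lines[-1].startswith("- ")
        if self.lines and not tight:
            self.lines.append("")
        self.lines.append(body)


def html_to_markdown(html: str) -> str:
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


if __name__ == "__main__":
    page = ("<nav><ul><li>Home</li></ul></nav>"
            "<h2>Installation</h2><p>Run the following command:</p>"
            "<pre><code>npm install acme</code></pre>"
            "<ul><li>Requires Node 18+</li><li>Works on macOS and Linux</li></ul>")
    print(html_to_markdown(page))
```

Nested structures (a paragraph inside a list item, inline links, tables) are out of scope for this sketch and would need extra handling.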

AI search

Perplexity, AI Overviews, and chatbot search read page content as text. A clean Markdown version (and an llms.txt) makes your site easier to quote accurately.

Best Practices

The patterns that make Markdown work well for AI, and the habits to drop.

Do: Lead with a clear heading hierarchy
Avoid: A wall of unstructured text

Headings let the model find the section it needs and tell the goal apart from the details. Use # for the top-level goal and ## for sub-tasks.

Do: Put rules and constraints in a bulleted list
Avoid: Burying constraints inside a paragraph

A list of short, parallel bullets is easier for a model to follow item-by-item than the same rules embedded in prose.

Do: Fence code, data, and quoted text
Avoid: Pasting code inline with the instructions

Triple-backtick fences and blockquotes signal "this is content, not an instruction", which reduces the risk of the model treating pasted text as a command. They are not a complete defense against prompt injection.
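One practical wrinkle: pasted content may itself contain triple backticks, which would close your fence early. A common remedy, sketched below, is to use a fence longer than any backtick run inside the content. The `fence` helper is a hypothetical name for this illustration.

```python
import re


def fence(content: str, info: str = "") -> str:
    """Wrap `content` in a backtick fence that the content cannot close.

    The fence is at least three backticks and always one longer than the
    longest backtick run found in the content.
    """
    longest_run = max((len(run) for run in re.findall(r"`+", content)), default=0)
    marks = "`" * max(3, longest_run + 1)
    return f"{marks}{info}\n{content}\n{marks}"


if __name__ == "__main__":
    inner = fence("npm install acme")  # a normal three-backtick fence
    print(fence(inner, "markdown"))    # the outer fence grows to four backticks
```

This keeps a pasted README, with its own code blocks, inside a single outer block.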

Do: Use one AGENTS.md, not five tool-specific files
Avoid: Maintaining .cursorrules, .windsurfrules, and more in parallel

AGENTS.md ships in 60k+ projects and is read by most major agents. One file keeps every tool in sync.

Do: Keep llms.txt curated and short
Avoid: Dumping every URL like a sitemap

llms.txt is meant to fit in a context window. List only the pages that genuinely help a model understand your site.

Quick Reference

The AI Markdown files at a glance — what to create, where it goes, and who reads it.

| File | Location | Read by |
| --- | --- | --- |
| README.md | Repository root | Humans first, agents as fallback |
| AGENTS.md | Repo root + nested per package | Coding agents (Codex, Cursor, Copilot, Jules, Aider…) |
| CLAUDE.md | Repo root and ~/.claude/ | Claude Code |
| llms.txt | Site root (/llms.txt) | AI search engines & chatbots at inference time |
| .cursorrules / .windsurfrules | Repo root | Single-tool (legacy) |

Frequently Asked Questions

Why do AI models like ChatGPT and Claude output Markdown?

Large language models were trained on enormous amounts of Markdown — GitHub READMEs, documentation, Stack Overflow, Reddit, and forum posts all use it. Markdown became the de facto format for technical writing on the web, so models learned to both read and produce it fluently. It is also token-efficient and preserves document structure (headings, lists, code) in plain text, which makes it the natural output format for an assistant.

Does using Markdown in my prompt actually improve responses?

Often, yes. Structuring a prompt with headings (# Role, ## Task, ## Rules), bulleted constraints, and fenced code or data gives the model an explicit hierarchy to follow. It separates instructions from content, reduces ambiguity, and makes long prompts easier for the model to parse. The effect is largest on complex, multi-part prompts; for a one-line question it makes little difference.

What is the difference between AGENTS.md and README.md?

README.md is written for humans: what the project is, how to install it, and how to contribute. AGENTS.md is written for coding agents: the exact build commands, test commands, code-style rules, and conventions an AI needs to make correct changes. Keeping them separate lets the README stay concise while giving agents a predictable place to look for machine-relevant instructions.

What is llms.txt and how is it different from robots.txt or sitemap.xml?

llms.txt is a Markdown file at the root of a website (yourdomain.com/llms.txt) that gives AI tools a curated, low-noise map of your most important pages. robots.txt tells crawlers what they may access; sitemap.xml lists every indexable page for search engines. llms.txt is different: it is a short, hand-picked summary meant to be read at inference time — when a chatbot or AI search engine is answering a question about your site — and it is small enough to fit in a context window.

Is llms.txt an official standard?

It is a proposal, not an official web standard. It was introduced by Jeremy Howard of Answer.AI in September 2024 and has been adopted by hundreds of sites and documentation tools. Major AI crawlers do not all consume it yet, but it is low-cost to add and is widely supported by docs frameworks like VitePress and Docusaurus through plugins.

Do I need a CLAUDE.md if I already have an AGENTS.md?

Not necessarily. Claude Code reads AGENTS.md, so a single AGENTS.md covers the most common case. A CLAUDE.md is useful when you want Claude-specific guidance that differs from what other agents should see, or when you want global instructions in ~/.claude/CLAUDE.md that apply across all your projects. When both exist, treat AGENTS.md as the shared baseline and CLAUDE.md as Claude-specific overrides.

How does Markdown save tokens compared to HTML?

Markdown uses single characters where HTML uses paired tags. A heading is # in Markdown versus <h1>…</h1> in HTML; a list item is - versus <li>…</li>. Across a full document the savings add up — the same content is typically 30–50% smaller in tokens as Markdown. Fewer tokens spent on markup means more of the context window is available for real content and a lower cost per request.

Should I write my website content in Markdown for AI search?

Your published pages can stay as HTML — that is what browsers render. The pattern that helps AI tools is to also expose a clean Markdown version: an llms.txt file at your root, and optionally a .md version of key pages at the same URL with .md appended (e.g. /docs/intro.html.md). This gives crawlers and chatbots a parse-friendly copy without changing the human-facing site.
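The URL convention can be expressed in a few lines. This sketch follows the ".md appended" pattern from the answer above; the handling of directory-style URLs (appending `index.html.md`) reflects my reading of the llms.txt proposal and should be treated as an assumption, as should the `markdown_url` name.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the URL where a Markdown copy of `page_url` would live."""
    parts = urlsplit(page_url)
    path = parts.path or "/"
    if path.endswith("/"):
        # Assumption: URLs without a file name map to index.html.md.
        path += "index.html.md"
    else:
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(markdown_url("https://example.com/docs/intro.html"))
    print(markdown_url("https://example.com/docs/"))
```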

Which AI coding tools read AGENTS.md?

A growing list, including OpenAI Codex, Cursor, GitHub Copilot's coding agent, Google's Jules and Gemini CLI, Aider, Amp, Zed, Windsurf, Devin, and many more — over 60,000 open-source projects ship one. It is now stewarded by the Agentic AI Foundation under the Linux Foundation, which is why it has become the shared format rather than each tool inventing its own.

Learn More Markdown

The syntax behind every example on this page is standard Markdown. Brush up on the building blocks: