robots.txt Format
“Format” is about the shape of the file rather than the meaning of each rule: how groups are structured, what encoding to use, and what a clean, valid robots.txt looks like end to end.
The overall shape
A robots.txt file is a sequence of groups plus file-level lines (Sitemap directives and comments). Each group is a block of text with one directive per line. Blank lines separate groups for readability but aren't required.
# 1. file-level comment
User-agent: * # start of the default group
Allow: /
Disallow: /admin/
User-agent: GPTBot # a crawler-specific group
Disallow: /
Sitemap: https://example.com/sitemap.xml
Formatting rules that matter
- One directive per line, in the form Field: value.
- Field names are case-insensitive (User-agent, user-agent, USER-AGENT all work).
- Group order doesn't change behavior — each crawler picks its most specific group regardless of position.
- Rule order within a group doesn't matter either: under RFC 9309 and Google's parser, the longest matching path wins, not the rule that appears first.
- Sitemap and comments can appear anywhere; Sitemap is always file-level.
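The rules above can be sketched as a small parser. This is a minimal illustration, not a complete RFC 9309 implementation: the `Group` class and `parse_robots` function are hypothetical names introduced here, and the sketch assumes that any `#` starts a comment and that unknown fields can be skipped.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Group:
    """One group: its User-agent tokens and its (field, path) rules."""
    user_agents: list[str] = field(default_factory=list)
    rules: list[tuple[str, str]] = field(default_factory=list)


def parse_robots(text: str) -> tuple[list[Group], list[str]]:
    """Split robots.txt text into groups and file-level Sitemap URLs."""
    groups: list[Group] = []
    sitemaps: list[str] = []
    current: Group | None = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue  # blank, comment-only, or malformed line
        name, value = line.split(":", 1)
        name, value = name.strip().lower(), value.strip()  # case-insensitive field
        if name == "sitemap":
            sitemaps.append(value)  # file-level, never part of a group
        elif name == "user-agent":
            # Consecutive User-agent lines share a group; a User-agent
            # line that follows rules starts a new group.
            if current is None or current.rules:
                current = Group()
                groups.append(current)
            current.user_agents.append(value)
        elif name in ("allow", "disallow") and current is not None:
            current.rules.append((name, value))
    return groups, sitemaps
```

Because field names are lowercased before comparison, `USER-AGENT: GPTBot` and `User-agent: GPTBot` produce the same group, and a Sitemap line is collected wherever it appears.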
Encoding and delivery
- Encode as UTF-8. Avoid a leading byte-order mark (BOM).
- Serve with Content-Type: text/plain and a 200 status.
- Either LF or CRLF line endings are fine.
- Keep it under 500 KiB — Google ignores anything past that limit.
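One way to check these delivery rules is a pure function over a response that has already been fetched. The sketch below is illustrative only: `check_delivery` and its parameters are hypothetical names, and fetching the file is left to whatever HTTP client you use.

```python
MAX_BYTES = 500 * 1024  # 500 KiB, the limit cited above
UTF8_BOM = b"\xef\xbb\xbf"


def check_delivery(status: int, content_type: str, body: bytes) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if status != 200:
        problems.append(f"status is {status}, expected 200")
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"Content-Type is {media_type!r}, expected 'text/plain'")
    if body.startswith(UTF8_BOM):
        problems.append("file starts with a UTF-8 byte-order mark")
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("body is not valid UTF-8")
    if len(body) > MAX_BYTES:
        problems.append(f"body is {len(body)} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```

A response with a `text/html` Content-Type at the robots.txt URL is flagged by the media-type check, which is usually the first sign that the server returned an HTML page instead of the file.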
Beware the HTML error page
A safe default template
Formatting mistakes
Multiple directives on one line
User-agent: * Disallow: / is invalid; each directive needs its own line.
Indented directives
Leading whitespace can confuse some parsers; keep directives flush-left.
Smart quotes / rich text
Save as plain text, not from a word processor that inserts curly quotes.
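The three mistakes above can be flagged with a per-line heuristic. This is a rough sketch rather than a full validator: `lint_line` is a hypothetical helper, and the regular expression only recognises the four field names used in this guide, so a path that happens to contain one of those names followed by a colon would be a false positive.

```python
import re

# Known field names followed by a colon; used to spot two directives on one line.
FIELD = re.compile(r"\b(user-agent|allow|disallow|sitemap)\s*:", re.IGNORECASE)
SMART_QUOTES = "\u2018\u2019\u201c\u201d"  # curly single and double quotes


def lint_line(line: str) -> list[str]:
    """Return the formatting issues found on a single robots.txt line."""
    issues = []
    content = line.split("#", 1)[0]  # ignore trailing comments
    if content.strip() and content[0] in " \t":
        issues.append("indented directive")
    if len(FIELD.findall(content)) > 1:
        issues.append("multiple directives on one line")
    if any(ch in SMART_QUOTES for ch in content):
        issues.append("smart quotes")
    return issues
```

Running it over `text.splitlines()` and reporting the line number alongside each issue is usually enough to find a file that was pasted from a word processor.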
What should a robots.txt file look like?
A series of groups, each beginning with one or more User-agent lines followed by Allow/Disallow rules, plus a Sitemap line. One directive per line, plain UTF-8 text. See the template above for a safe default.
Does the order of rules matter in robots.txt?
No. Crawlers select the most specific user-agent group and then apply the longest-matching path rule — neither depends on the order lines appear in the file.
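A minimal sketch of longest-match precedence, assuming plain prefix matching (no `*` or `$` wildcards) and that Allow wins a tie, as Google and RFC 9309 describe. `is_allowed` is a hypothetical helper that takes the (field, path) pairs of the group already selected for the crawler.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Apply longest-match precedence to (field, pattern) pairs."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for field, pattern in rules:
        if not pattern or not path.startswith(pattern):
            continue  # an empty pattern matches nothing
        is_allow = field.lower() == "allow"
        # The longer pattern wins; on a tie, Allow wins.
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len = len(pattern)
            allowed = is_allow
    return allowed
```

Reversing the rule list gives the same answer for every path, which is the sense in which line order does not matter.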
Can robots.txt have comments?
Yes. Anything after a # is a comment, on its own line or after a directive. Comments are ignored by crawlers but help maintainers.
Where does the Sitemap line go?
Anywhere in the file — it's file-level, not part of any group. Putting it at the top or bottom is purely a style choice.
Robots.txt Studio Editorial · Technical SEO & crawling
We build robots.txt tooling and parse thousands of real-world files. Guides are written by practitioners and reviewed against the Google and RFC 9309 specifications.