The robots.txt File

The robots.txt file is a single UTF-8 text file served from /robots.txt at your domain root. This guide covers exactly what belongs in it, what a healthy file looks like, and how to create one whether you run WordPress, a static site, or a custom app.

Robots.txt Studio Editorial · Updated June 8, 2026 · Reviewed against Google Search Central and RFC 9309

Where the file must live

Crawlers only check one location per host: the root. The file must be reachable at https://yourdomain.com/robots.txt and returned with a 2xx status and a text/plain content type.

  • https://example.com/robots.txt — correct.
  • https://example.com/blog/robots.txt — ignored; subdirectory files don't count.
  • https://blog.example.com/robots.txt — required separately; each subdomain has its own.
  • http vs https and www vs non-www count as separate origins; each must answer at /robots.txt, usually by redirecting to the canonical host's file.
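
As a rough illustration of the one-file-per-host rule, the Python sketch below derives the robots.txt URL that governs any given page URL. The function name robots_url is our own choice for this example, not part of any crawler's API.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url.

    Scheme and host (including any port) are kept as-is; the path,
    query string, and fragment are discarded.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Each scheme/host combination maps to its own file.
for url in (
    "https://example.com/blog/post?page=2",
    "https://blog.example.com/archive/",
    "http://www.example.com/",
):
    print(url, "->", robots_url(url))
```

Note that the blog post on example.com and the archive on blog.example.com resolve to two different files, which is why a subdomain needs its own robots.txt.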

What goes in a robots.txt file

A complete, healthy file usually contains four things:

  1. A default group (User-agent: *) with any site-wide disallows.
  2. Optional crawler-specific groups (e.g. blocking AI training bots).
  3. A Sitemap line pointing to your XML sitemap.
  4. Comments (lines starting with #) explaining non-obvious rules.

A practical robots.txt

# Default policy
User-agent: *
Allow: /
Disallow: /cart/
Disallow: /checkout/

# Keep AI training crawlers out
User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
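
To see how the sample file maps onto the four components listed above, here is a minimal Python sketch that scans a robots.txt string and reports which pieces are present. The summarize function and its result keys are illustrative names, and the parsing is deliberately simplified (it does not validate rule syntax).

```python
SAMPLE = """\
# Default policy
User-agent: *
Allow: /
Disallow: /cart/
Disallow: /checkout/

# Keep AI training crawlers out
User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""


def summarize(robots_txt: str) -> dict:
    """Collect user-agent tokens, sitemap URLs, and a comment-line count."""
    agents, sitemaps, comments = [], [], 0
    for raw in robots_txt.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            comments += 1
            continue
        # Drop any trailing comment, then split "field: value" on the first colon.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            agents.append(value)
        elif field == "sitemap":
            sitemaps.append(value)
    return {
        "has_default_group": "*" in agents,
        "specific_agents": [a for a in agents if a != "*"],
        "sitemaps": sitemaps,
        "comment_lines": comments,
    }


print(summarize(SAMPLE))
```

For the sample file this reports a default group, one crawler-specific group (GPTBot), one sitemap, and two comment lines.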

Not sure what to disallow for your platform? The examples page has copy-paste files for blogs, WordPress, and ecommerce.

How to create and upload it

  • Static / custom sites: create a file named robots.txt and place it in your public/web root. Next.js, Astro, and most frameworks serve files from a public directory or a generated route.
  • WordPress: a virtual robots.txt exists by default; edit it via an SEO plugin (Yoast, Rank Math) or upload a physical file to the web root to override it.
  • Shopify: robots.txt is generated automatically; customize it with the robots.txt.liquid template.
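
For a custom app that serves robots.txt from a generated route rather than a static file, the idea looks roughly like the Python sketch below. It uses the standard library's WSGI support; the rule set in ROBOTS_TXT, the function names, and port 8000 are placeholders for illustration, and a real deployment would sit behind your framework or web server of choice.

```python
from wsgiref.simple_server import make_server

# Placeholder rules for the example; substitute your own file contents.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/

Sitemap: https://example.com/sitemap.xml
"""


def app(environ, start_response):
    """Serve /robots.txt as UTF-8 text/plain; answer 404 for anything else."""
    if environ.get("PATH_INFO") == "/robots.txt":
        body = ROBOTS_TXT.encode("utf-8")
        status = "200 OK"
    else:
        body = b"Not found\n"
        status = "404 Not Found"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def serve_locally(port: int = 8000) -> None:
    """Run a local test server until interrupted (not for production use)."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Calling serve_locally() and requesting http://127.0.0.1:8000/robots.txt should return the rules with a text/plain content type, while any other path, including /blog/robots.txt, returns a 404.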

Generate it instead of hand-writing

The Generator builds a valid file from presets and crawler toggles, so you don't have to memorize the syntax or risk a typo that blocks your whole site.

Always validate before you ship

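A single misplaced slash can block crawling of an entire site: Disallow: / shuts out every URL, while Disallow: /admin/ shuts out only one directory. After creating the file, validate the syntax and test a few real URLs before deploying.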

Run it through the robots.txt checker and confirm a key page resolves the way you expect in the URL Tester.
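
If you want to script a quick check of your own, the Python sketch below tests paths against a file using the longest-match rule described in RFC 9309 (the most specific matching path wins, and Allow wins a tie). It is a simplified illustration: parse_groups and is_allowed are our own names, and the matcher handles plain path prefixes only, with no support for * or $ wildcards or percent-encoding normalization. We roll our own matcher here because Python's built-in urllib.robotparser appears to apply rules in file order rather than by longest match, which can give a different answer for files that start a group with Allow: /.

```python
SAMPLE = """\
# Default policy
User-agent: *
Allow: /
Disallow: /cart/
Disallow: /checkout/

# Keep AI training crawlers out
User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""


def parse_groups(robots_txt: str) -> dict:
    """Map each lowercased user-agent token to a list of (allow, path) rules."""
    groups = {}
    agents = []             # agents named by the group currently being read
    reading_agents = False  # True while consecutive User-agent lines continue
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                agents = []  # a rule line ended the previous group
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
            reading_agents = True
        elif field in ("allow", "disallow"):
            reading_agents = False
            if value:  # an empty value matches nothing, so skip it
                for agent in agents:
                    groups[agent].append((field == "allow", value))
    return groups


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Apply the longest matching rule; Allow wins ties; no match means allowed."""
    groups = parse_groups(robots_txt)
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    best_len, allowed = -1, True
    for allow, prefix in rules:
        if path.startswith(prefix):
            if len(prefix) > best_len or (len(prefix) == best_len and allow):
                best_len, allowed = len(prefix), allow
    return allowed


for agent, path in (
    ("Googlebot", "/products/widget"),
    ("Googlebot", "/cart/item/42"),
    ("Googlebot", "/cart"),
    ("GPTBot", "/products/widget"),
):
    print(agent, path, "allowed" if is_allowed(SAMPLE, agent, path) else "blocked")
```

In this sketch /cart/item/42 is blocked for the default group but /cart (no trailing slash) is not, because the rule is Disallow: /cart/. That is exactly the kind of slash-level difference worth testing before you deploy.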

Common file mistakes

  • Wrong location

    Placing robots.txt anywhere but the root means it's never read.

  • Wrong content type

    Serving text/html at /robots.txt (e.g. an HTML error page returned with a 200 status) leaves crawlers with no usable rules, and some may discard the response entirely.

  • BOM / wrong encoding

    Save as UTF-8 without a byte-order mark; some parsers mishandle a leading BOM.
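
The three mistakes above can be checked mechanically once you have a response in hand. The Python sketch below takes the request path, status code, Content-Type header, and raw body as plain arguments (so it works with whatever HTTP client you use) and returns a list of problems. The function name find_problems and the message wording are illustrative assumptions, not output from any real tool.

```python
import codecs


def find_problems(url_path: str, status: int, content_type: str, body: bytes) -> list:
    """Return human-readable problems for a robots.txt response (empty if none)."""
    problems = []
    if url_path != "/robots.txt":
        problems.append("wrong location: crawlers only request /robots.txt at the root")
    if not 200 <= status < 300:
        problems.append(f"unexpected status {status}: expected a 2xx response")
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"wrong content type {media_type!r}: expected 'text/plain'")
    if body.startswith(codecs.BOM_UTF8):
        problems.append("leading UTF-8 byte-order mark: re-save without a BOM")
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("body is not valid UTF-8")
    return problems


good = find_problems(
    "/robots.txt", 200, "text/plain; charset=utf-8", b"User-agent: *\nDisallow:\n"
)
bad = find_problems(
    "/blog/robots.txt", 200, "text/html; charset=utf-8", codecs.BOM_UTF8 + b"<html>"
)
print(good)
print(bad)
```

The first call returns an empty list; the second reports the wrong location, the wrong content type, and the leading BOM.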

Frequently asked questions

What should be in a robots.txt file?

At minimum, a User-agent: * group with any paths you want to disallow, and a Sitemap line. Add crawler-specific groups (for example, to block AI bots) as needed. If you want everything crawled, a file with just a sitemap is fine.

How do I create a robots.txt file?

Create a plain-text file named exactly robots.txt, add your rules, and place it at your web root so it's served from /robots.txt. The Generator can produce a valid file for you to download.

What file format and encoding should robots.txt use?

Plain text, UTF-8 encoded, served as text/plain. Line endings can be LF or CRLF. Avoid a leading byte-order mark (BOM).

Can a site have more than one robots.txt file?

One per host. Each subdomain (and each protocol/host variant) is treated separately and needs its own robots.txt at its root.


Robots.txt Studio Editorial · Technical SEO & crawling

We build robots.txt tooling and parse thousands of real-world files. Guides are written by practitioners and reviewed against the Google and RFC 9309 specifications.