What Is robots.txt?

robots.txt is a plain-text file at the root of your domain that tells web crawlers which parts of your site they may request. It is the first thing search engines and AI bots look for before crawling — and one of the easiest files to get subtly wrong.

Robots.txt Studio Editorial · Updated June 8, 2026 · Reviewed against Google Search Central and RFC 9309

The short definition

A robots.txt file is a set of instructions for automated clients (crawlers, bots, spiders). It lives at one fixed location — the root of your domain — and uses a simple, line-based format defined by the Robots Exclusion Protocol (standardized in 2022 as RFC 9309).

A minimal robots.txt
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://example.com/sitemap.xml

In plain English, that file says: “every crawler may access the whole site, except anything under /admin/, and the sitemap is over here.”
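One way to check that reading is to feed the same file to a parser. The sketch below uses Python's standard-library urllib.robotparser. The crawler name ExampleBot and the two test paths are made up for illustration. One assumption to keep in mind: this parser applies the first matching rule in file order rather than the longest-match rule described in RFC 9309. For this particular file both approaches give the same answers.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content already held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler; it has no group of its own,
# so it falls under the "User-agent: *" group.
admin_allowed = rules.can_fetch("ExampleBot", "https://example.com/admin/users")
blog_allowed = rules.can_fetch("ExampleBot", "https://example.com/blog/post")
sitemaps = rules.site_maps()  # site_maps() requires Python 3.8 or newer

print(admin_allowed)  # False: the path is under /admin/
print(blog_allowed)   # True: everything else is allowed
print(sitemaps)       # ['https://example.com/sitemap.xml']
```

A local check like this only shows how one parser interprets the file. Each crawler ships its own parser, so treat the result as a sanity check rather than a guarantee.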

What robots.txt actually controls

robots.txt controls crawling: whether a well-behaved bot will request a URL. That's it. It is a crawl-management file, not a security or privacy mechanism. Typical uses:

  • Tell search engines which directories to skip (e.g. internal search, faceted filters, admin areas).
  • Point crawlers to your XML sitemap.
  • Allow or block AI crawlers like GPTBot, ClaudeBot, and PerplexityBot.
  • Reduce wasted crawling on low-value URLs (helping crawl budget on large sites).

What robots.txt does NOT do

robots.txt is not a security control

Anyone can read your robots.txt, and listing a path there advertises it. Never use Disallow to “hide” private URLs; protect them with authentication instead.

  • It does not guarantee a page stays out of Google. A disallowed URL can still be indexed (without its content) if other pages link to it. To keep a page out of the index, use a noindex meta tag or header and leave the page crawlable, because a crawler that is blocked from a page never sees its noindex.
  • It does not stop malicious bots. Disreputable scrapers simply ignore it.
  • It does not remove already-indexed pages by itself.

This distinction trips up almost everyone. If your goal is to keep a page out of search results, see our robots.txt for SEO guide, which covers when to use robots.txt versus noindex.

The anatomy of a rule

robots.txt is organized into groups. Each group starts with one or more User-agent lines (which crawler the rules apply to) followed by Allow and Disallow rules (which paths).

Two groups: one for Googlebot, one for GPTBot
User-agent: Googlebot
Disallow: /private/

User-agent: GPTBot
Disallow: /

For a directive-by-directive reference, read robots.txt syntax. To see exactly how a crawler chooses which group applies, read how robots.txt works.
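To make the grouping concrete, here is a small sketch that parses the two-group file above with Python's standard-library urllib.robotparser and asks what each crawler may fetch. OtherBot and the test paths are hypothetical. The stdlib parser matches user agents by case-insensitive substring, which is an implementation detail of that library and not necessarily how any given search engine selects a group.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on the (example) host."""
    return parser.can_fetch(agent, "https://example.com" + path)


results = {
    # Googlebot's group blocks only /private/.
    ("Googlebot", "/private/report"): allowed("Googlebot", "/private/report"),
    ("Googlebot", "/blog/"): allowed("Googlebot", "/blog/"),
    # GPTBot's group blocks the whole site.
    ("GPTBot", "/blog/"): allowed("GPTBot", "/blog/"),
    # OtherBot (hypothetical) matches no group, and the file has no "*" group,
    # so nothing restricts it.
    ("OtherBot", "/private/report"): allowed("OtherBot", "/private/report"),
}

for (agent, path), ok in results.items():
    print(f"{agent:10} {path:18} {'allowed' if ok else 'blocked'}")
```

Note that the rules in one group do not carry over to another: the /private/ rule applies to Googlebot only, so a crawler with no matching group is not restricted by it.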

Common misconceptions

  • “Disallow hides my page from Google.”

    It blocks crawling, not indexing. To remove a page from results, use noindex and keep the page crawlable so the noindex can be read.

  • “I need robots.txt for security.”

    It's public and advisory. Use authentication for anything sensitive.

  • “Blocking CSS/JS speeds up my site.”

    Disallowing CSS and JavaScript does not make pages load faster for visitors. It can stop Google from rendering your pages correctly, which can hurt rankings.

Frequently asked questions

Does every website need a robots.txt file?

No. If you're happy for crawlers to access everything, you can omit it — a missing robots.txt is treated as “allow all.” But most sites benefit from one to declare a sitemap and control AI crawlers.

What happens if I don't have a robots.txt file?

Crawlers receive a 404 and assume they may crawl the entire site. Nothing breaks, but you lose the chance to point them to your sitemap or restrict low-value paths.
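The 404 case is one branch of a broader rule. As a rough sketch of how RFC 9309 describes the fetch result, the function below maps the final HTTP status of the robots.txt request (after any redirects) to a crawl policy. The names CrawlPolicy and policy_for_status are invented for this example, and individual crawlers may deviate from the RFC in specific cases.

```python
from enum import Enum


class CrawlPolicy(Enum):
    PARSE_FILE = "parse the returned rules"
    ALLOW_ALL = "no usable file: the whole site may be crawled"
    DISALLOW_ALL = "file unreachable: assume the whole site is disallowed"


def policy_for_status(status: int) -> CrawlPolicy:
    """Map the final HTTP status of a robots.txt fetch to a crawl policy.

    Follows the categories in RFC 9309: a successful response is parsed,
    4xx means the file is "unavailable", 5xx means it is "unreachable".
    Redirects (3xx) are expected to be followed before calling this.
    """
    if 200 <= status < 300:
        return CrawlPolicy.PARSE_FILE
    if 400 <= status < 500:
        return CrawlPolicy.ALLOW_ALL
    if 500 <= status < 600:
        return CrawlPolicy.DISALLOW_ALL
    raise ValueError(f"unexpected final status for robots.txt: {status}")


for code in (200, 404, 503):
    print(code, "->", policy_for_status(code).value)
```

The practical takeaway is that a missing file is harmless, while a robots.txt that returns server errors can cause compliant crawlers to stop crawling the site until it recovers.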

Is robots.txt the same as a robots meta tag?

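No. robots.txt controls crawling for a whole host from a single file. The robots meta tag (and X-Robots-Tag header) controls indexing at the page level. To keep a page out of Google, use the meta tag or header and do not disallow that page in robots.txt, because a crawler that cannot fetch the page never reads its noindex.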
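The two mechanisms interact: a crawler has to fetch a page before it can read a noindex on it. The sketch below, using Python's standard-library urllib.robotparser, checks whether a given robots.txt would let a crawler reach a URL at all. The function name noindex_can_be_seen, the /drafts/ path, and both sample files are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def noindex_can_be_seen(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` and could therefore read a noindex on it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical file that blocks the page we also want de-indexed.
BLOCKING = "User-agent: *\nDisallow: /drafts/\n"
# Hypothetical file that leaves the page crawlable.
CRAWLABLE = "User-agent: *\nAllow: /\n"

url = "https://example.com/drafts/post"
print(noindex_can_be_seen(BLOCKING, url))   # False: the noindex is never read
print(noindex_can_be_seen(CRAWLABLE, url))  # True: the crawler can read the noindex
```

If the first case describes your site, remove the Disallow for that path and rely on the noindex alone.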

Where is the robots.txt file located?

Always at the root of the host: https://example.com/robots.txt. A robots.txt in a subdirectory is ignored. A file applies only to the protocol, host, and port it is served from, so each subdomain needs its own.
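Because the location is fixed, the robots.txt URL for any page can be derived from the page URL by keeping the scheme, host, and port and replacing everything else. A minimal sketch, where robots_txt_url is a name chosen for this example:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs `page_url`."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # netloc carries the host and any explicit port; path, query,
    # and fragment are discarded.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post?id=7"))
# https://example.com/robots.txt
print(robots_txt_url("https://shop.example.com/cart"))
# https://shop.example.com/robots.txt
```

The second call shows why a subdomain is not covered by the main site's file: it resolves to a different robots.txt URL.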

Robots.txt Studio Editorial · Technical SEO & crawling

We build robots.txt tooling and parse thousands of real-world files. Guides are written by practitioners and reviewed against the Google and RFC 9309 specifications.