robots.txt Examples

Real, copy-paste robots.txt files for the most common situations. Adapt the domain and paths to your site, then validate before deploying. Every example explains what it does and why.

Robots.txt Studio Editorial · Updated June 8, 2026 · Reviewed against Google Search Central and RFC 9309

Allow everything (safe default)

The simplest useful file: let every crawler in and point to your sitemap.

robots.txt
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
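
One way to check a file like this before deploying is to parse it with Python's standard-library urllib.robotparser. This is a minimal sketch: the file text is inlined rather than fetched, and the crawler name and test URL are made-up examples.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Any crawler should be allowed to fetch any path (hypothetical test URL).
allowed = parser.can_fetch("Googlebot", "https://example.com/any/page")

# site_maps() returns the Sitemap URLs declared in the file (Python 3.8+).
sitemaps = parser.site_maps()

print(allowed, sitemaps)
```

If the Sitemap line were missing, site_maps() would return None, which makes it an easy thing to assert on in a deploy check.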

Block everything (staging sites)

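Tells every compliant crawler to stay away from the whole site, which suits staging or pre-launch environments. Remove it before going live, or search engines won't be able to crawl your pages and you'll stay out of search.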

robots.txt
User-agent: *
Disallow: /

The most expensive typo in SEO

A leftover Disallow: / from staging is one of the most common causes of a site accidentally dropping out of search. Double-check that production never ships this.
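
A small guard in your deploy pipeline can catch this. The sketch below uses Python's standard-library urllib.robotparser; the function name blocks_site_root and the two sample files are illustrative assumptions, not part of any tool.

```python
from urllib.robotparser import RobotFileParser

def blocks_site_root(robots_txt: str, agent: str = "*") -> bool:
    """Return True if `agent` is not allowed to fetch the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, "/")

# Hypothetical file contents for the two environments.
staging = "User-agent: *\nDisallow: /\n"
production = "User-agent: *\nAllow: /\n"

if blocks_site_root(production):
    raise SystemExit("robots.txt blocks the whole site; refusing to deploy")
```

Run it against the file you are about to publish, not the one in your repository template, since the two can differ.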

Blog

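Let crawlers reach your posts, but keep internal search result pages out of the crawl. The rules below cover /search and WordPress-style ?s= queries. Tag and author archives are not blocked here, so add Disallow lines for those paths if you want them excluded too.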

Blog robots.txt
User-agent: *
Allow: /
Disallow: /search
Disallow: /*?s=

Sitemap: https://example.com/sitemap.xml

WordPress

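Block the admin area but allow admin-ajax.php, which themes and plugins call from the front end. The Allow rule wins for that one file because crawlers that follow RFC 9309 apply the longest matching rule. Don't block /wp-content/: Google needs its CSS and JS to render pages.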

WordPress robots.txt
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/wp-sitemap.xml
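
To see why the Allow line overrides the Disallow line for admin-ajax.php, here is a minimal sketch of longest-match evaluation as RFC 9309 describes it, for plain prefix rules only (no wildcards). The function is_allowed and the test paths are illustrative assumptions. Python's own urllib.robotparser applies rules in file order instead, so the comparison is written by hand here.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Longest-match evaluation for plain prefix rules (no wildcards).

    `rules` is a list of ("allow" | "disallow", path_prefix) pairs.
    """
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for kind, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = kind == "allow"
        # The longer prefix wins; on a tie, prefer Allow.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed

WORDPRESS_RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]

for path in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php", "/wp-content/style.css"):
    print(path, is_allowed(WORDPRESS_RULES, path))
```

Because the winner is chosen by length, the order of the two lines in the file does not matter to a crawler that follows this rule.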

Ecommerce

Keep carts, checkout, accounts, and faceted filter URLs out of the crawl to protect crawl budget and avoid duplicate content. Note that /*?sort= and /*?filter= only match when that parameter comes first in the query string.

Ecommerce robots.txt
User-agent: *
Allow: /
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /*?sort=
Disallow: /*?filter=

Sitemap: https://example.com/sitemap.xml
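
Wildcard rules are easy to get subtly wrong, so it helps to test them against real URLs from your site. The sketch below translates a robots.txt pattern into a regular expression, assuming the usual semantics where * matches any run of characters and a trailing $ anchors the end. The helper names and sample URLs are hypothetical.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    `*` matches any run of characters; a trailing `$` anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def matches(pattern: str, url_path: str) -> bool:
    # re.match anchors at the start, mirroring prefix semantics.
    return pattern_to_regex(pattern).match(url_path) is not None

print(matches("/*?sort=", "/shoes?sort=price"))            # sort is the first parameter
print(matches("/*?sort=", "/shoes?color=red&sort=price"))  # sort comes second
print(matches("/*&sort=", "/shoes?color=red&sort=price"))  # extra rule for that case
```

The second check comes back False: /*?sort= does not cover URLs where sort is not the first parameter. If your site produces such URLs, a companion rule such as Disallow: /*&sort= is one way to cover them.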

Block AI crawlers

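Ask AI crawlers to stay away while remaining fully visible in search. This blocks the listed AI user agents and leaves Googlebot and Bingbot untouched. robots.txt is advisory, so it only works for crawlers that choose to honor it. Google-Extended is a control token rather than a separate crawler: it tells Google not to use your content for its AI models without changing how Googlebot crawls.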

Block AI training crawlers
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
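
To confirm that the grouped User-agent lines do what you expect, you can replay the file against each bot name. This is a minimal sketch using Python's standard-library urllib.robotparser; the page URL is a made-up example, and the check only shows how a compliant parser reads the file, not what any given crawler actually does.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

PAGE = "https://example.com/blog/post"  # hypothetical URL
AI_AGENTS = ("GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "PerplexityBot")
SEARCH_AGENTS = ("Googlebot", "Bingbot")

blocked = [agent for agent in AI_AGENTS if not parser.can_fetch(agent, PAGE)]
allowed = [agent for agent in SEARCH_AGENTS if parser.can_fetch(agent, PAGE)]

print("blocked:", blocked)
print("allowed:", allowed)
```

All five AI user agents share the single Disallow: / rule because consecutive User-agent lines form one group.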

Want per-crawler control with explanations of what each bot does? Use the AI Crawler Manager.

Before you copy-paste

  • Forgetting to change the domain

    Update the Sitemap URL to your real domain — a placeholder sitemap helps no one.

  • Blocking assets

    Never Disallow /wp-content/, CSS, or JS; it breaks Google's rendering.

  • Shipping the staging file

    Confirm production doesn't contain Disallow: / from a staging template.
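
The three checks above can be roughed out as a simple line scan. This is a sketch under stated assumptions: lint_robots is a hypothetical helper, mysite.test stands in for your real domain, and the asset check only looks for /wp-content and rules ending in .css or .js. It also flags every Disallow: /, including intentional ones inside a bot-specific group, so treat the output as prompts for review rather than errors.

```python
def lint_robots(robots_txt: str, real_domain: str) -> list[str]:
    """Return warnings for three common copy-paste mistakes."""
    warnings = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "sitemap" and real_domain not in value:
            warnings.append(f"Sitemap does not point at {real_domain}: {value}")
        elif field == "disallow":
            if value == "/":
                warnings.append("Blanket 'Disallow: /' found")
            elif value.startswith("/wp-content") or value.endswith((".css", ".js")):
                warnings.append(f"Rule may block render assets: {value}")
    return warnings

sample = (
    "User-agent: *\n"
    "Disallow: /\n"
    "Disallow: /wp-content/\n"
    "Sitemap: https://example.com/sitemap.xml\n"
)
for warning in lint_robots(sample, "mysite.test"):
    print(warning)
```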

Frequently asked questions

What is a good default robots.txt?

User-agent: * with Allow: / and a Sitemap line. It lets everything be crawled and tells search engines where your sitemap is — a safe baseline for most sites.

How do I block all crawlers?

Use User-agent: * followed by Disallow: /. This is appropriate for staging sites, but on a live site it stops all compliant crawling and your pages will drop out of search, so use it carefully.

How do I allow everything?

Either omit robots.txt entirely, or use User-agent: * with Disallow: (empty) or Allow: /. All three mean “crawl anything.”
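
You can check that equivalence with Python's standard-library urllib.robotparser. In this sketch an empty string stands in for the missing file, which is an assumption made for the test (a real crawler treats a 404 for robots.txt as "no restrictions"). ExampleBot and the test path are made-up names.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_txt: str, path: str = "/any/page") -> bool:
    """Parse the given file text and ask whether ExampleBot may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("ExampleBot", path)

variants = {
    "no rules (stands in for a missing file)": "",
    "empty Disallow": "User-agent: *\nDisallow:\n",
    "Allow: /": "User-agent: *\nAllow: /\n",
}

results = {name: can_crawl(text) for name, text in variants.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```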

How do I block AI crawlers like GPTBot?

Add a group listing the AI user-agents (GPTBot, ClaudeBot, CCBot, Google-Extended, PerplexityBot) with Disallow: /. This asks them to stay out without affecting search engine crawlers such as Googlebot and Bingbot.

