robots.txt Examples

Real, copy-paste robots.txt files for the most common situations. Adapt the domain and paths to your site, then validate before deploying. Every example explains what it does and why.

Robots.txt Studio Editorial · Updated June 8, 2026 · Reviewed against Google Search Central and RFC 9309

Allow everything (safe default)

The simplest useful file: let every crawler in and point to your sitemap.

robots.txt

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml

Block everything (staging sites)

Useful on staging or pre-launch sites. Remove it before going live, or you'll stay out of search.

robots.txt

User-agent: *
Disallow: /

The most expensive typo in SEO

A leftover Disallow: / from staging is one of the most common causes of a site accidentally dropping out of search. Double-check that production never ships this.
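
One way to catch this before release is a small check in the deploy pipeline. The sketch below is only an illustration of the idea, not part of any particular tool: the function name `blocks_everything` is hypothetical, and it deliberately looks only at groups that name `User-agent: *`, ignoring per-bot groups and any Allow overrides.

```python
def blocks_everything(robots_txt: str) -> bool:
    """Return True if a `User-agent: *` group contains a blanket `Disallow: /`.

    Simplified sketch (hypothetical helper): ignores Allow overrides
    and groups that target specific bots only.
    """
    in_star_group = False   # is the current group addressed to "*"?
    seen_rule = False       # has the current group already listed a rule?
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:
                # A rule came before this line, so a new group starts here.
                in_star_group = False
                seen_rule = False
            if value == "*":
                in_star_group = True
        elif field in ("allow", "disallow"):
            seen_rule = True
            if in_star_group and field == "disallow" and value == "/":
                return True
    return False
```

Run against the staging file above it returns True, and against the allow-everything file it returns False, so a release script could refuse to deploy whenever the production file makes it return True.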

Blog

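Let crawlers reach every post, but keep internal search result pages out of the crawl. The file below covers a /search path and ?s= search queries; tag and author archives are not blocked here and need their own Disallow lines if you want them excluded.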

Blog robots.txt

User-agent: *
Allow: /
Disallow: /search
Disallow: /*?s=

Sitemap: https://example.com/sitemap.xml

WordPress

Block the admin area but allow admin-ajax.php (themes and plugins need it). Don't block /wp-content/ — Google needs CSS and JS to render pages.

WordPress robots.txt

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/wp-sitemap.xml
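
The Allow line works because RFC 9309 and Google's documentation both say the most specific (longest) matching rule applies, with Allow winning a tie. A minimal sketch of that precedence, assuming plain prefix rules without wildcards; `is_allowed` is a hypothetical helper, not a library API.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Apply longest-match precedence to (directive, path_prefix) rules."""
    best_len = -1
    allowed = True   # no matching rule means the path may be crawled
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue   # empty or non-matching rules are ignored
        is_allow = directive.lower() == "allow"
        # Longer prefix wins; on equal length, Allow beats Disallow.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# The two rules from the WordPress file above.
wordpress_rules = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]
```

With these rules, `/wp-admin/admin-ajax.php` is allowed because the Allow prefix is longer than the Disallow prefix, `/wp-admin/options.php` is disallowed, and `/wp-content/themes/site.css` is allowed because no rule matches it.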

Ecommerce

Keep carts, checkout, accounts, and faceted filter URLs out of crawling to protect crawl budget and avoid duplicate content.

Ecommerce robots.txt

User-agent: *
Allow: /
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /*?sort=
Disallow: /*?filter=

Sitemap: https://example.com/sitemap.xml
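
One caveat worth testing: /*?sort= only matches when sort is the first query parameter, because the pattern requires a literal ? immediately before sort=. The sketch below translates robots.txt patterns (* for any run of characters, a trailing $ for end of URL) into regular expressions so this can be checked. `pattern_matches` is a hypothetical helper, and the /*&sort= pattern mentioned afterwards is one possible extension, not part of the file above.

```python
import re


def pattern_matches(pattern: str, url_path: str) -> bool:
    """Match a robots.txt path pattern against a URL path plus query string."""
    anchored = pattern.endswith("$")   # a trailing $ pins the end of the URL
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then let every * stand for any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the string, which gives prefix semantics.
    return re.match(regex, url_path) is not None
```

Under this sketch, `pattern_matches("/*?sort=", "/shoes?sort=price")` is True, while `pattern_matches("/*?sort=", "/shoes?color=red&sort=price")` is False. If sort or filter can appear after other parameters on your site, an additional line such as Disallow: /*&sort= would cover that case.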

Block AI crawlers

Ask AI crawlers to stay away from your content while remaining fully visible in search. This disallows the major AI user agents and leaves Googlebot and Bingbot untouched. Compliance is voluntary: robots.txt only affects crawlers that choose to honor it.

Block AI training crawlers

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml

Want per-crawler control with explanations of what each bot does? Use the AI Crawler Manager, or read the full guide: How to Block AI Crawlers.
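
To sanity-check which bots the shared group applies to, Python's standard-library `urllib.robotparser` can parse the file from a string. This is a minimal sketch: the `can_crawl` helper and the /blog/post URL are assumptions for illustration. Note that this parser does plain prefix matching and applies the first matching rule in file order, so it suits this file but not files that rely on wildcards or on an Allow overriding an earlier Disallow, such as the WordPress example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())   # parse from memory, no network request


def can_crawl(agent: str, url: str = "https://example.com/blog/post") -> bool:
    """Hypothetical helper: may `agent` fetch `url` under ROBOTS_TXT?"""
    return parser.can_fetch(agent, url)
```

Here `can_crawl("GPTBot")` is False and `can_crawl("Googlebot")` is True, which confirms that the consecutive User-agent lines share the single Disallow: / rule while every other crawler falls through to the * group.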

Before you copy-paste

Forgetting to change the domain
Update the Sitemap URL to your real domain — a placeholder sitemap helps no one.
Blocking assets
Never Disallow /wp-content/, CSS, or JS; it breaks Google's rendering.
Shipping the staging file
Confirm production doesn't contain Disallow: / from a staging template.

Frequently asked questions

What is a good default robots.txt?

User-agent: * with Allow: / and a Sitemap line. It lets everything be crawled and tells search engines where your sitemap is — a safe baseline for most sites.

How do I block all crawlers?

Use User-agent: * followed by Disallow: /. This is appropriate for staging sites, but on a live site it stops search engines from crawling your pages and will push the site out of search results, so use it carefully.

How do I allow everything?

Either omit robots.txt entirely, or use User-agent: * with Disallow: (empty) or Allow: /. All three mean “crawl anything.”

How do I block AI crawlers like GPTBot?

Add a group listing the AI user-agents (GPTBot, ClaudeBot, CCBot, Google-Extended, PerplexityBot) with Disallow: /. This keeps them out without affecting search engines.


Robots.txt Studio Editorial · Technical SEO & crawling

We build robots.txt tooling and parse thousands of real-world files. Guides are written by practitioners and reviewed against the Google and RFC 9309 specifications.