robots.txt Examples
Real, copy-paste robots.txt files for the most common situations. Adapt the domain and paths to your site, then validate before deploying. Every example explains what it does and why.
Allow everything (safe default)
The simplest useful file: let every crawler in and point to your sitemap.
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
Block everything (staging sites)
Useful on staging or pre-launch sites. Remove it before going live, or you'll stay out of search.
User-agent: *
Disallow: /
The most expensive typo in SEO
Blog
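Let crawlers reach posts, but keep internal search results out of the crawl. Tag and author archives can be excluded the same way by adding a Disallow line for each archive path (for example /tag/ and /author/, if your site uses those paths).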
User-agent: *
Allow: /
Disallow: /search
Disallow: /*?s=
Sitemap: https://example.com/sitemap.xml
WordPress
Block the admin area but allow admin-ajax.php (themes and plugins need it). Don't block /wp-content/ — Google needs CSS and JS to render pages.
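The Allow line in this example works because of rule precedence: under RFC 9309 the longest matching path wins, and Allow wins a tie. A minimal Python sketch of that precedence, covering plain path prefixes only (no wildcards); `is_allowed` and the rule tuples are illustrative names, not part of any library:

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under one user-agent group.

    `rules` is a list of (kind, prefix) pairs, kind being "allow" or "disallow".
    """
    best_len = -1
    allowed = True  # no matching rule means the path is crawlable
    for kind, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty rule matches nothing
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


wordpress_rules = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]

print(is_allowed(wordpress_rules, "/wp-admin/options.php"))        # False
print(is_allowed(wordpress_rules, "/wp-admin/admin-ajax.php"))     # True
print(is_allowed(wordpress_rules, "/wp-content/themes/site.css"))  # True
```

Not every parser follows this rule. Python's standard `urllib.robotparser`, for instance, evaluates rules in file order, so it may report admin-ajax.php as blocked for the same file.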
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://example.com/wp-sitemap.xml
Ecommerce
Keep carts, checkout, accounts, and faceted filter URLs out of crawling to protect crawl budget and avoid duplicate content.
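The filter rules in this example use the `*` wildcard, which matches any run of characters. A rough Python sketch of how such a pattern could be translated into a regular expression, following the wildcard rules in RFC 9309 (`*` for any characters, a trailing `$` for end of URL); `pattern_matches` is a hypothetical helper name and the URLs are made up:

```python
import re


def pattern_matches(pattern, path):
    """Check a robots.txt path pattern against a URL path plus query string."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, like a robots.txt path prefix.
    return re.match(regex, path) is not None


print(pattern_matches("/*?sort=", "/shoes?sort=price"))            # True
print(pattern_matches("/*?sort=", "/shoes?color=red&sort=price"))  # False
print(pattern_matches("/cart/", "/cart/items"))                    # True
```

The second call shows a limit of `/*?sort=`: it only catches `sort` when it is the first query parameter. If your site can place it after other parameters, an extra pattern such as `/*&sort=` would be needed to cover that case.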
User-agent: *
Allow: /
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /*?sort=
Disallow: /*?filter=
Sitemap: https://example.com/sitemap.xml
Block AI crawlers
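Opt out of AI training crawls while staying fully visible in search. The first group tells the listed AI crawlers to stay out of the whole site; the second leaves Googlebot, Bingbot, and every other crawler untouched. robots.txt is a request rather than access control, so it only affects crawlers that choose to honor it.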
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
Want per-crawler control with explanations of what each bot does? Use the AI Crawler Manager.
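One way to sanity-check the grouping in the AI crawler file is Python's standard `urllib.robotparser`, which handles plain path rules like these (it does not implement `*` wildcards inside paths). A short sketch that feeds it the file and asks about a hypothetical URL:

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/blog/some-post"  # hypothetical page
for agent in ("GPTBot", "ClaudeBot", "CCBot", "Googlebot", "Bingbot"):
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent}: {verdict}")
```

The three AI agents should come back blocked and the two search crawlers allowed. This confirms how the file is parsed; it says nothing about whether a given crawler actually honors it.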
Before you copy-paste
Forgetting to change the domain
Update the Sitemap URL to your real domain — a placeholder sitemap helps no one.
Blocking assets
Never Disallow /wp-content/, CSS, or JS; it breaks Google's rendering.
Shipping the staging file
Confirm production doesn't contain Disallow: / from a staging template.
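These three checks are easy to automate before a deploy. A minimal Python sketch, assuming the file is available as a string; `lint_robots`, the default placeholder host, and the asset markers it looks for are illustrative choices, and the parsing is deliberately simple (it does not evaluate Allow overrides or wildcards):

```python
def lint_robots(text, placeholder_host="example.com"):
    """Return a list of warnings for the three copy-paste mistakes above."""
    warnings = []
    agents = []       # user-agents of the group currently being read
    in_rules = False  # True once the current group has a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a rule line closed the previous group
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field != "disallow":
                continue
            if value == "/" and "*" in agents:
                warnings.append("Disallow: / applies to all crawlers")
            lowered = value.lower().rstrip("$")
            if "/wp-content" in lowered or lowered.endswith((".css", ".js")):
                warnings.append(f"asset path blocked: {value}")
        elif field == "sitemap" and placeholder_host in value:
            warnings.append(f"placeholder sitemap URL: {value}")
    return warnings


staging = """\
User-agent: *
Disallow: /
Sitemap: https://example.com/sitemap.xml
"""
for warning in lint_robots(staging):
    print(warning)
```

Run against the staging template, this reports both the blanket Disallow and the placeholder sitemap. A group that blocks only named bots (such as the AI crawler example) does not trigger the blanket warning, because `*` is not among its user-agents.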
What is a good default robots.txt?
User-agent: * with Allow: / and a Sitemap line. It lets everything be crawled and tells search engines where your sitemap is — a safe baseline for most sites.
How do I block all crawlers?
Use User-agent: * followed by Disallow: /. This is appropriate for staging sites but will remove a live site from search, so use it carefully.
How do I allow everything?
Either omit robots.txt entirely, or use User-agent: * with Disallow: (empty) or Allow: /. All three mean “crawl anything.”
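The equivalence can be checked with Python's standard `urllib.robotparser`. In this sketch the three variants are inline strings, a missing file is modeled as empty content (an assumption made for the demo), and the URL is a made-up example:

```python
from urllib.robotparser import RobotFileParser

VARIANTS = {
    "no rules at all": "",
    "empty Disallow": "User-agent: *\nDisallow:\n",
    "Allow: /": "User-agent: *\nAllow: /\n",
}


def allows(robots_txt, agent, url):
    """Parse robots.txt content and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for name, body in VARIANTS.items():
    print(name, allows(body, "Googlebot", "https://example.com/any/page?x=1"))
```

Each variant should print True for any agent and URL you try.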
How do I block AI crawlers like GPTBot?
Add a group listing the AI user-agents (GPTBot, ClaudeBot, CCBot, Google-Extended, PerplexityBot) with Disallow: /. Crawlers that honor robots.txt will stay out, and search engines are unaffected because they fall under the separate User-agent: * group.
Robots.txt Studio Editorial · Technical SEO & crawling
We build robots.txt tooling and parse thousands of real-world files. Guides are written by practitioners and reviewed against the Google and RFC 9309 specifications.