robots.txt vs sitemap
They're often mentioned together, but they do opposite jobs. robots.txt restricts what crawlers may fetch; a sitemap suggests what they should fetch. One is a gate, the other is a map — and they work best together.
Restriction vs suggestion
robots.txt is a set of rules at /robots.txt telling crawlers which paths they may or may not request. An XML sitemap is a list of URLs you want crawled and indexed, with optional metadata like last-modified dates. robots.txt takes URLs out of the crawl; a sitemap puts URLs forward for it. They aren't alternatives: most sites need both.
| | robots.txt | XML sitemap |
|---|---|---|
| Purpose | Restrict crawling | Advertise URLs to crawl |
| Format | Plain text directives | XML list of URLs |
| Location | /robots.txt (fixed) | Any URL, e.g. /sitemap.xml |
| Effect | Blocks/allows fetching | Hints at discovery & freshness |
| Guarantees indexing? | No | No — it's a suggestion |
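To make the "XML list of URLs" column concrete, here is a minimal sketch of generating a sitemap with Python's standard library. The function name build_sitemap, the example URLs, and the last-modified date are illustrative assumptions, not part of any real site.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML for (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:  # lastmod is optional metadata
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical pages, used only for illustration.
sitemap_xml = build_sitemap([
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/blog/", None),
])
print(sitemap_xml)
```

The result would typically be saved as /sitemap.xml, though as the table notes, any URL works.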
How they work together
robots.txt can advertise your sitemap with a Sitemap directive, so crawlers discover it without you submitting anything. This is the one place the two files directly connect.
User-agent: *
Allow: /
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
Don't contradict yourself
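A sitemap should not advertise URLs that robots.txt disallows. One rough way to check is to run every sitemap URL through the robots.txt rules. The sketch below is a simplified matcher, not a full parser: it assumes a single User-agent: * group, ignores wildcards, and applies the longest-match rule described in RFC 9309. The helper names and the sitemap URLs are hypothetical.

```python
from urllib.parse import urlsplit

ROBOTS_TXT = """\
User-agent: *
Allow: /
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

def parse_rules(robots_txt):
    """Collect (is_allow, path_prefix) pairs, assuming one `User-agent: *` group."""
    rules = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field in ("allow", "disallow") and value:
            rules.append((field == "allow", value))
    return rules

def is_allowed(rules, url):
    """Longest matching prefix wins; Allow wins ties; no match means allowed."""
    path = urlsplit(url).path or "/"
    matches = [(len(prefix), allow) for allow, prefix in rules if path.startswith(prefix)]
    if not matches:
        return True
    return max(matches)[1]

# Hypothetical URLs as they might appear in the sitemap.
sitemap_urls = [
    "https://example.com/",
    "https://example.com/blog/post",
    "https://example.com/admin/login",
]
rules = parse_rules(ROBOTS_TXT)
conflicts = [u for u in sitemap_urls if not is_allowed(rules, u)]
print(conflicts)
```

Here only the /admin/login URL is reported, because Disallow: /admin/ is a longer match than Allow: /. The sketch does its own matching because Python's built-in urllib.robotparser evaluates rules in file order, which would let Allow: / win for this file.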
Why you usually need both
- robots.txt alone tells crawlers what to avoid, but not what's important — large or new sites get discovered slowly without a sitemap.
- A sitemap alone can't stop crawlers wasting budget on internal search, filters, or admin URLs.
- Together: the sitemap focuses crawlers on your best URLs; robots.txt keeps them off the noise.
Examples
The Sitemap directive must be an absolute URL — relative paths are ignored:
Sitemap: https://example.com/sitemap.xml
Multiple sitemaps (or a sitemap index) are fine; list one per line:
Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-blog.xml
Common mistakes
Thinking you must choose one
They're complementary. Most sites benefit from both a robots.txt and a sitemap.
Listing blocked URLs in the sitemap
Don't include URLs your robots.txt disallows — it's contradictory.
A relative Sitemap directive
Sitemap: /sitemap.xml is invalid. Use the full absolute URL, including scheme and host, such as https://example.com/sitemap.xml.
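This mistake is easy to catch automatically. The sketch below scans a robots.txt string and flags any Sitemap value that lacks a scheme or host. The function name check_sitemap_directives and the sample file are assumptions made for illustration.

```python
from urllib.parse import urlsplit

def check_sitemap_directives(robots_txt):
    """Return (value, is_absolute) for each Sitemap line in a robots.txt string."""
    results = []
    for line in robots_txt.splitlines():
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            value = value.strip()
            parts = urlsplit(value)
            # Absolute means both a web scheme and a host are present.
            is_absolute = parts.scheme in ("http", "https") and bool(parts.netloc)
            results.append((value, is_absolute))
    return results

# Hypothetical robots.txt with one relative and one absolute directive.
sample = """\
User-agent: *
Disallow: /admin/
Sitemap: /sitemap.xml
Sitemap: https://example.com/sitemap.xml
"""
results = check_sitemap_directives(sample)
for value, ok in results:
    print("ok     " if ok else "INVALID", value)
```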
Relying on the Sitemap directive alone
The directive helps crawlers discover your sitemap, but it gives you no feedback. Also submit the sitemap in Search Console so you get reporting on it.
What is the difference between robots.txt and a sitemap?
robots.txt restricts which URLs crawlers may fetch; an XML sitemap lists the URLs you want them to fetch and index. robots.txt is a gate, a sitemap is a map. They serve opposite but complementary roles.
Do I need both robots.txt and a sitemap?
Usually yes. A sitemap helps crawlers discover and prioritize your important pages, while robots.txt keeps them off low-value or private sections. Together they make crawling efficient.
Should the sitemap go in robots.txt?
Adding a Sitemap directive to robots.txt is recommended because it lets crawlers discover your sitemap automatically. The value must be an absolute URL (scheme and host included), and you can list more than one.
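As a rough illustration of that automatic discovery, the sketch below reads Sitemap directives the way a simple crawler might, using Python's urllib.robotparser (its site_maps() method requires Python 3.8 or later). The robots.txt content is a made-up sample, and real crawlers may handle edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with two Sitemap directives.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-blog.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap lines are present.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```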
Can robots.txt and the sitemap conflict?
Yes. If your sitemap lists URLs that robots.txt blocks, you send crawlers a contradictory signal. Keep them consistent: don't advertise URLs you've disallowed.
Robots.txt Studio Editorial · Technical SEO & crawling
We build robots.txt tooling and parse thousands of real-world files. Guides are written by practitioners and reviewed against the Google and RFC 9309 specifications.