Googlebot in robots.txt
Googlebot is Google's crawler. Here's exactly what it does, whether it respects robots.txt, and the rules to control it. To apply a policy in one click, use the AI Crawler Manager.
What Googlebot is
Googlebot is the crawler behind Google Search — the single most important bot for almost every website. It fetches and renders your pages so they can be indexed and ranked. If you accidentally block Googlebot in robots.txt, you can disappear from Google entirely.
| Property | Value |
|---|---|
| User-agent | Googlebot |
| Operator | Google |
| Category | Search engines |
| Honors robots.txt | Yes |
| Affects search ranking | Yes |
Official documentation: Google crawler docs.
What Googlebot does
- Discovers and re-crawls your pages to keep Google's index current.
- Runs as two main variants — Googlebot Smartphone (primary, mobile-first) and Googlebot Desktop.
- Renders JavaScript via an evergreen Chromium, so blocking CSS/JS can hurt how pages are understood.
Why site owners care
- Blocking Googlebot removes you from Google Search — the most common catastrophic robots.txt mistake.
- robots.txt controls crawling, not indexing: a disallowed URL can still be indexed without a snippet if it's linked elsewhere. Use noindex (not robots.txt) to keep a page out of the index.
- Blocking resource files (CSS/JS) can degrade rendering and rankings.
How to allow or block Googlebot
Add a group targeting the Googlebot user-agent. Disallow: / blocks it from your whole site; an empty Disallow: allows it.
Block:
User-agent: Googlebot
Disallow: /

Allow:
User-agent: Googlebot
Disallow:

This crawler affects your search ranking.
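Before deploying either group, it can help to check how a parser reads it. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the helper name `googlebot_allowed` and the `example.com` URL are illustrative assumptions, and the standard-library parser may differ from Google's own matching in edge cases such as wildcards.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets Googlebot fetch url.

    Hypothetical helper for illustration; it parses the text locally
    and never touches the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# The two groups described above: block everything vs. allow everything.
BLOCK_ALL = """\
User-agent: Googlebot
Disallow: /
"""

ALLOW_ALL = """\
User-agent: Googlebot
Disallow:
"""

print(googlebot_allowed(BLOCK_ALL, "https://example.com/page"))  # False
print(googlebot_allowed(ALLOW_ALL, "https://example.com/page"))  # True
```

Running a check like this against a draft file is a cheap guard against shipping an accidental Disallow: / for Googlebot.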
How to verify Googlebot
Don't trust the user-agent string alone — it is easily spoofed. Verify with a reverse DNS lookup on the request IP (it must resolve to googlebot.com or google.com and forward-confirm), or match against Google's published crawler IP ranges.
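The reverse-and-forward DNS check described above could be sketched roughly as follows. This is one possible approach, not Google's reference implementation: the function name `is_verified_googlebot` and the injectable `reverse_lookup` / `forward_lookup` parameters are assumptions made so the logic can be exercised without live DNS. Note that `socket.gethostbyname_ex` only returns IPv4 addresses, so IPv6 requests would need a different forward lookup.

```python
import socket

# Hostname suffixes named in the text above. The leading dot prevents
# look-alike domains such as "notgooglebot.com" from matching.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(
    ip,
    reverse_lookup=socket.gethostbyaddr,
    forward_lookup=socket.gethostbyname_ex,
):
    """Return True if ip reverse-resolves to a Google host that
    forward-resolves back to the same ip."""
    try:
        # Step 1: reverse DNS (PTR) lookup on the request IP.
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False

    # Step 2: the hostname must sit under googlebot.com or google.com.
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # Step 3: forward-confirm by resolving the hostname again.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    return ip in addresses
```

In a request handler, you would pass the connecting IP address (not a client-supplied header value) and typically cache the result, since DNS lookups are slow relative to serving a page.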
Does it honor robots.txt?
Yes. Googlebot follows the robots.txt rules that apply to it.
Recommendation
Recommended: Allow
Will blocking Googlebot remove my site from Google?
Yes. Disallowing Googlebot in robots.txt stops Google from crawling your pages, which removes them from Search over time. Only block specific paths you genuinely don't want crawled, and never use Disallow: / under User-agent: Googlebot or User-agent: *.
Does robots.txt stop Google from indexing a page?
No. robots.txt blocks crawling, not indexing. A blocked URL can still appear in Google (without a description) if other pages link to it. To keep a page out of the index, allow crawling and use a noindex meta tag or X-Robots-Tag header instead.
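In HTML, the meta tag form is `<meta name="robots" content="noindex">`. For the header form, here is a minimal sketch of a WSGI middleware that adds `X-Robots-Tag: noindex` to selected paths. The names `noindex_middleware`, `should_noindex`, and `demo_app`, and the `/private/` prefix, are all hypothetical; most frameworks and web servers offer their own way to set response headers.

```python
def noindex_middleware(app, should_noindex):
    """Wrap a WSGI app so that matching paths get X-Robots-Tag: noindex.

    should_noindex is a caller-supplied function taking the request path
    and returning True when the page should stay out of the index.
    """
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            if should_noindex(environ.get("PATH_INFO", "")):
                # Copy the list so the wrapped app's headers are not mutated.
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Tiny placeholder app used only to demonstrate the middleware."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>hello</p>"]


# Keep everything under /private/ out of the index. These paths must
# remain crawlable in robots.txt, or the header is never seen.
app = noindex_middleware(demo_app, lambda path: path.startswith("/private/"))
```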
How do I verify a request is really from Googlebot?
Run a reverse DNS lookup on the requesting IP; genuine Googlebot resolves to a googlebot.com or google.com host and forward-confirms. You can also match Google's published IP ranges. The user-agent string alone can be faked.
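Matching against published ranges can be sketched with Python's `ipaddress` module. The document shape assumed here (a `prefixes` list whose entries carry an `ipv4Prefix` or `ipv6Prefix` key) is an assumption about the published file, and the sample prefixes below are reserved documentation ranges, not real Google addresses; check both against Google's crawler documentation before relying on this.

```python
import ipaddress


def load_networks(ranges):
    """Build network objects from an already-parsed ranges document.

    Assumed shape (verify against the real file):
    {"prefixes": [{"ipv4Prefix": "..."}, {"ipv6Prefix": "..."}]}
    """
    networks = []
    for entry in ranges.get("prefixes", []):
        prefix = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if prefix:
            networks.append(ipaddress.ip_network(prefix))
    return networks


def ip_in_ranges(ip, networks):
    """Return True if ip falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


# Placeholder data using documentation-only ranges (RFC 5737 / RFC 3849).
SAMPLE_RANGES = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv6Prefix": "2001:db8::/32"},
    ]
}

networks = load_networks(SAMPLE_RANGES)
print(ip_in_ranges("192.0.2.10", networks))    # True
print(ip_in_ranges("198.51.100.1", networks))  # False
```

Because published ranges change over time, a real deployment would refresh the file periodically rather than hard-coding prefixes.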
Robots.txt Studio Editorial · Technical SEO & crawling
We build robots.txt tooling and parse thousands of real-world files. Guides are written by practitioners and reviewed against the Google and RFC 9309 specifications.