Free Robots.txt Testing Tool

Test, validate, and audit robots.txt rules for Googlebot, Bingbot, GPTBot, ChatGPT-User, and other crawlers.

What this tool does

The Free Robots.txt Testing Tool tests and validates robots.txt directives against Google's published Robots Exclusion Protocol (REP) guidelines. You can test crawl rules for specific URLs, check whether each URL is allowed or blocked, review the matched rule, and identify potential issues before deploying changes.

How it works

  1. Input robots.txt: Paste/edit robots.txt content directly in the editor to test draft changes. If needed, open your site's robots.txt URL in a browser and copy it into the editor.
  2. Test URL: Enter a specific URL to test and select a user agent (Googlebot, Bingbot, GPTBot, etc.). The tool applies Google's documented matching logic to determine whether the URL is allowed or disallowed.
  3. Rule Matching: The tool identifies the most specific user-agent group, finds the longest matching path rule, and applies Google's precedence rules (longest path wins; if equal length, Allow wins over Disallow).
  4. Validation & Linting: The tool performs comprehensive validation checks based on Google's official guidelines, flagging errors, warnings, and passed checks with detailed explanations.
  5. Visual Feedback: The matched rule is highlighted in the editor (green for allowed, red for disallowed) with line numbers, making it easy to see which rule applies to your test URL.
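
The rule-matching step above can be sketched in a few lines of Python. This is a minimal illustration of the documented precedence (longest matching path wins; on a tie, Allow wins), not this tool's actual implementation. The function names (`parse_groups`, `rules_for`, `pattern_to_regex`, `is_allowed`) and the sample file are hypothetical, and user-agent selection is simplified to an exact, case-insensitive name match with a fallback to `*`.

```python
import re
from urllib.parse import urlsplit


def parse_groups(robots_txt):
    """Split robots.txt text into (agents, rules) groups (hypothetical helper)."""
    groups, agents, rules = [], [], []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed lines are skipped in this sketch
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a User-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


def rules_for(groups, user_agent):
    """Rules of the groups naming this agent, else the rules of the '*' groups."""
    ua = user_agent.lower()
    if any(ua in agents for agents, _ in groups):
        return [r for agents, rules in groups if ua in agents for r in rules]
    return [r for agents, rules in groups if "*" in agents for r in rules]


def pattern_to_regex(path):
    """Translate '*' (any characters) and a trailing '$' (end of URL) to a regex."""
    anchored = path.endswith("$")
    body = path[:-1] if anchored else path
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(robots_txt, user_agent, url):
    """Return (allowed, matched_rule); matched_rule is None when nothing matches."""
    parts = urlsplit(url)
    target = (parts.path or "/") + ("?" + parts.query if parts.query else "")
    best = None  # (path length, is_allow, rule text)
    for kind, path in rules_for(parse_groups(robots_txt), user_agent):
        if path and pattern_to_regex(path).match(target):
            candidate = (len(path), kind == "allow", f"{kind}: {path}")
            # Longer path wins; at equal length True > False, so Allow wins.
            if best is None or candidate[:2] > best[:2]:
                best = candidate
    return (True, None) if best is None else (best[1], best[2])


SAMPLE = """
User-agent: *
Disallow: /private/
Allow: /private/press/
Disallow: /*.pdf$

User-agent: GPTBot
Disallow: /
"""

print(is_allowed(SAMPLE, "Googlebot", "https://example.com/private/press/kit.html"))
# (True, 'allow: /private/press/')
print(is_allowed(SAMPLE, "GPTBot", "https://example.com/"))
# (False, 'disallow: /')
```

Returning the matched rule alongside the verdict mirrors the idea of highlighting the rule that decided the result; a URL with no matching rule is treated as allowed.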

What it audits & lints for

  • Syntax & Format: Malformed lines, missing colons, invalid directives, UTF-8 encoding, file size limits (500 KiB)
  • Path Validation: Paths must start with "/" (warnings for wildcard-first paths like "*/pattern"), proper wildcard usage (* and $), empty path handling
  • User-Agent Grouping: Rules must be associated with User-agent declarations, proper group separation, missing global groups (User-agent: *)
  • Sitemap & Crawl-delay: Valid sitemap URLs, appropriate crawl-delay values, proper placement
  • Best Practices: Duplicate rules, overly broad wildcards, conflicting Allow/Disallow directives
Note: This tool uses backend services that may take up to a minute to start up on first use. If your request doesn't process immediately, please wait a moment and try again.
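
A few of the audit categories listed above can be approximated with a short linter. The sketch below is illustrative only and covers a subset of the checks (file size, missing colons, unknown directives, rules outside a group, path prefixes, sitemap URLs, crawl-delay values, and a missing global group). The function name `lint`, the severity labels, and the message wording are assumptions, not this tool's output format.

```python
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}
MAX_BYTES = 500 * 1024  # 500 KiB size limit mentioned in the list above


def lint(robots_txt):
    """Return a list of (line_number, severity, message); line 0 means file-level."""
    findings = []
    if len(robots_txt.encode("utf-8")) > MAX_BYTES:
        findings.append((0, "error", "file is larger than 500 KiB"))
    in_group = False
    saw_global = False
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments
        if not line:
            continue
        if ":" not in line:
            findings.append((number, "error", "missing colon"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN_FIELDS:
            findings.append((number, "warning", f"unknown directive '{field}'"))
        elif field == "user-agent":
            in_group = True
            saw_global = saw_global or value == "*"
        elif field in ("allow", "disallow"):
            if not in_group:
                findings.append((number, "error", "rule appears before any User-agent line"))
            if value.startswith("*"):
                findings.append((number, "warning", "path starts with a wildcard"))
            elif value and not value.startswith("/"):
                findings.append((number, "error", "path should start with '/'"))
        elif field == "sitemap":
            if not value.lower().startswith(("http://", "https://")):
                findings.append((number, "error", "sitemap should be an absolute URL"))
        elif field == "crawl-delay":
            try:
                valid = float(value) >= 0
            except ValueError:
                valid = False
            if not valid:
                findings.append((number, "warning", "crawl-delay is not a non-negative number"))
    if not saw_global:
        findings.append((0, "warning", "no 'User-agent: *' group"))
    return findings


DRAFT = (
    "Disallow: /tmp\n"
    "User-agent: Googlebot\n"
    "Disallow private/\n"
    "Allow: docs/\n"
    "Sitemap: /sitemap.xml\n"
    "Crawl-delay: fast\n"
)

for finding in lint(DRAFT):
    print(finding)
```

Each finding carries a line number so it can be shown next to the offending line in an editor; file-level findings use line 0 in this sketch.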

Test URL automatically fetches robots.txt for the URL's domain. If fetching is blocked, paste robots.txt here and test again.

Robots.txt testing FAQs

A robots.txt tester is a tool that checks whether a specific URL is allowed or blocked by the rules in a robots.txt file. This robots.txt tester can fetch a live file, validate robots.txt syntax, test crawl rules for a selected user agent, and show the matched rule that explains the verdict.

This tool combines a robots.txt checker, robots tester, and auditor in one workflow. You can test robots.txt from a live domain or pasted draft, evaluate Googlebot, Bingbot, GPTBot, ChatGPT-User, and other crawlers, review linting warnings, and see whether a URL is allowed or blocked with the exact matched rule.

The core robots.txt protocol uses User-agent and Disallow directives, and modern crawlers commonly support Allow and Sitemap as well. This tool also surfaces Crawl-delay, wildcard patterns, end-of-path anchors, malformed lines, duplicate rules, missing user-agent groups, and other practical validation signals.

Allow and Disallow rules tell crawlers which URL paths they can fetch. When multiple rules match the same URL, major crawlers such as Google apply the most specific rule, meaning the one with the longest matching path. If an Allow rule and a Disallow rule match with the same path length, Allow typically wins, so testing the final matched rule is the safest way to confirm the result.

Some AI crawlers say they honor robots.txt, but behavior varies by crawler and use case. You can test crawl rules for bots such as GPTBot and ChatGPT-User here, but robots.txt is not an access-control or security layer. For private content, use authentication or server-side restrictions instead.
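
For a quick offline check of per-bot rules, Python's standard-library `urllib.robotparser` can evaluate a draft file for several user agents. This sketch is independent of this tool, and the sample rules and the `verdicts` helper are hypothetical. The standard-library parser does not necessarily follow Google's longest-match precedence, so the example sticks to rules that do not overlap. It reports only what the file requests, not whether a given crawler actually complies.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft: block GPTBot everywhere, block ChatGPT-User from /drafts/.
DRAFT_RULES = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /drafts/

User-agent: *
Allow: /
"""


def verdicts(robots_txt, url, agents):
    """Map each user agent to True (allowed) or False (blocked) for the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


bots = ["GPTBot", "ChatGPT-User", "Googlebot"]
print(verdicts(DRAFT_RULES, "https://example.com/drafts/post", bots))
# {'GPTBot': False, 'ChatGPT-User': False, 'Googlebot': True}
print(verdicts(DRAFT_RULES, "https://example.com/blog/post", bots))
# {'GPTBot': False, 'ChatGPT-User': True, 'Googlebot': True}
```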