Glossary

Understanding The X-Robots-Tag: How It Affects Your SEO And Website Ranking

The X-Robots-Tag is a powerful HTTP header that lets you control how search engines index and crawl non-HTML resources—like PDFs, images, and APIs—so you can prevent unwanted indexing, manage crawl budget, and protect sensitive content. Understanding how to implement and use directives such as noindex, nofollow, and noarchive via X-Robots-Tag helps you fine-tune SEO for all file types, improve site visibility, and ensure search engines treat non-HTML assets the way you intend.

X-Robots-Tag

X-Robots-Tag: an HTTP response header that instructs search engine robots how to index or crawl a resource (supports directives like noindex, nofollow, none, noarchive, nosnippet, unavailable_after, max-snippet, max-image-preview, max-video-preview), usable for HTML and non-HTML files and set per-response or per-file via server configuration.

What Is the X-Robots-Tag?

X-Robots-Tag is an HTTP response header that tells search engine crawlers how to treat a specific resource (indexing, following links, caching, snippets, previews). Unlike the meta robots tag, which only works inside HTML pages, X-Robots-Tag can be applied to any file served over HTTP—PDFs, images, videos, JSON, APIs, and binary files—because it’s delivered with the server response for that resource.
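

For example, a PDF served with the header might return a response whose headers include (values illustrative):

HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex, nofollow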



Common directives supported include:



  • noindex

  • nofollow

  • none (equivalent to noindex + nofollow)

  • noarchive

  • nosnippet

  • unavailable_after

  • max-snippet

  • max-image-preview

  • max-video-preview


Multiple directives can be comma-separated in a single header or split across multiple headers.



Where it’s set:



  • Server configuration (Apache .htaccess, Nginx config)

  • Application code that sends HTTP responses

  • CDN or edge rules



Example header forms:


X-Robots-Tag: noindex, nofollow
X-Robots-Tag: noindex
X-Robots-Tag: max-image-preview:large
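
The same directives can also be split across repeated headers, or scoped to a single crawler by prefixing its user agent name:

X-Robots-Tag: noindex
X-Robots-Tag: noarchive
X-Robots-Tag: googlebot: nofollow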


Why use it:



  • Control indexing of non-HTML assets

  • Enforce consistent rules across file types

  • Protect sensitive or duplicate resources

  • Manage crawl budget for large sites or media-heavy pages



Interaction with other signals: if both a meta robots tag and an X-Robots-Tag header apply to the same URL, search engines combine the directives and honor the most restrictive one. For non-HTML resources, the X-Robots-Tag header is the only way to deliver these directives, because it travels with the HTTP response itself.

How to Use the X-Robots-Tag

When and where to apply



  • Use X-Robots-Tag on non-HTML files (PDFs, images, videos, API responses) or when you need per-response control that HTML meta robots cannot provide.

  • Apply globally, by file type, by path, or per response depending on goals (block indexing, prevent snippets, control previews, set unavailable_after).



Common directives and combiners



  • noindex — prevent indexing

  • nofollow — prevent following links on the resource

  • none — equivalent to noindex, nofollow

  • noarchive — prevent cached snapshot

  • nosnippet — prevent text/video/image snippets

  • unavailable_after: — expire the resource from the index after the date

  • max-snippet:, max-image-preview:, max-video-preview: — limit text snippet length, image preview size, and video preview duration

  • Combine with commas: X-Robots-Tag: noindex, noarchive, nosnippet



Examples by server / platform



Apache (.htaccess or server config)



  • For a single file (requires mod_headers; place in the .htaccess of the file's directory):


<Files "file.pdf">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>


  • For all PDFs:



<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, noarchive"
</FilesMatch>



Nginx



  • For a specific location or file type:


location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex, noarchive";
}


  • For a single file:


location = /files/secret.pdf {
    add_header X-Robots-Tag "noindex, nofollow";
}


IIS (web.config)



  • Add a response header:
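

A typical site-wide rule, assuming IIS 7+ and the standard customHeaders element in web.config:

<configuration>
  <system.webServer>
    <httpProtocol>
      <customHeaders>
        <add name="X-Robots-Tag" value="noindex, noarchive" />
      </customHeaders>
    </httpProtocol>
  </system.webServer>
</configuration>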


  • Use URL Rewrite or conditions to target file types/paths.



PHP / application-level



  • Send the header conditionally:


header('X-Robots-Tag: noindex, noarchive');


  • Use runtime logic to set per-response directives (user session, auth, query parameter).
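

A rough sketch of that kind of conditional logic; the staging-host check is only an illustration, and the header must be sent before any output:

// Keep a staging host out of the index while leaving production untouched.
if (($_SERVER['HTTP_HOST'] ?? '') === 'staging.example.com') {
    header('X-Robots-Tag: noindex, nofollow');
}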



AWS S3 / CloudFront



  • S3 by itself cannot serve a true X-Robots-Tag header: user-defined object metadata is returned with an x-amz-meta- prefix (for example x-amz-meta-x-robots-tag), which crawlers do not read as a robots directive.

  • Add the header in front of S3 instead, typically with CloudFront: attach a response headers policy with a custom header (behavior → Response headers policy → Add X-Robots-Tag: noindex, noarchive), or set it from a CloudFront Function or Lambda@Edge.



Netlify



  • _headers file:


/files/*.pdf
X-Robots-Tag: noindex, noarchive


  • Or use Netlify Edge Functions to set headers dynamically.
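

A minimal Edge Function sketch, assuming Netlify's current Edge Functions API; the file name, path pattern, and directives are illustrative:

// netlify/edge-functions/x-robots.js
export default async (request, context) => {
  // Let the request continue, then copy the response so its headers can be modified.
  const response = await context.next();
  if (!new URL(request.url).pathname.endsWith(".pdf")) {
    return response;
  }
  const modified = new Response(response.body, response);
  modified.headers.set("X-Robots-Tag", "noindex, noarchive");
  return modified;
};

// Limit the function to the paths that need it.
export const config = { path: "/files/*" };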



Cloudflare



  • Use Transform Rules or Workers to add/modify X-Robots-Tag per path or content type.
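

For illustration, a minimal Workers sketch using the modules syntax; the .pdf check and directive values are assumptions:

export default {
  async fetch(request) {
    // Forward the request to the origin, then decide whether to decorate the response.
    const response = await fetch(request);
    if (!new URL(request.url).pathname.endsWith(".pdf")) {
      return response;
    }
    // Responses from fetch() have immutable headers, so copy into a mutable Response.
    const modified = new Response(response.body, response);
    modified.headers.set("X-Robots-Tag", "noindex, noarchive");
    return modified;
  },
};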



CDN considerations



  • Ensure the CDN preserves or injects the header consistently; prefer edge rules for uniform behavior.

  • Avoid conflicting headers between origin and edge — the final response header wins.



Per-file vs per-response decisions



  • Per file (server config) for static assets and consistent behavior.

  • Per response (application) when control depends on user state, auth, or dynamic conditions.



Testing and verification



  • Use curl -I <url> and check the response headers for the X-Robots-Tag value.

  • Use browser DevTools Network tab to inspect response headers.

  • Use Google Search Console: the URL Inspection tool shows whether indexing is allowed and which noindex signal it detected (meta tag or X-Robots-Tag header), and the Page indexing (Coverage) report lists URLs excluded by noindex.

  • Use third-party header checkers and log analysis to ensure crawlers respect directives.
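

For example (the URL is a placeholder):

curl -sI https://www.example.com/files/report.pdf | grep -i x-robots-tag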



Best practices



  • Prefer X-Robots-Tag for non-HTML assets; use meta robots for HTML.

  • Avoid accidental noindex: test after deploying header rules.

  • Use noindex for sensitive or duplicate non-HTML assets; use noarchive/nosnippet to protect content previews.

  • Combine with robots.txt and authentication where appropriate: robots.txt disallows crawling but not indexing of URLs discovered via links, while X-Robots-Tag noindex actually prevents indexing. Keep in mind that a crawler must be able to fetch the URL to see the header, so do not block crawling of resources you want removed from the index.

  • Document header rules and keep them discoverable for future site changes.



Examples quick reference



  • Block indexing of PDFs sitewide: X-Robots-Tag: noindex on *.pdf

  • Prevent cached copies and snippets for a video: X-Robots-Tag: noarchive, nosnippet

  • Expire an old asset after a date: X-Robots-Tag: unavailable_after: Wed, 31 Dec 2025 23:59:59 GMT

  • Limit snippet size: X-Robots-Tag: max-snippet:50



Rollback and monitoring



  • Remove or change headers carefully; monitor index status and search traffic after changes.

  • Track server logs and Search Console to confirm crawler behavior and index changes.


When Should You Use the X-Robots-Tag?



  1. Use server-level control over indexing and crawling for non-HTML files.

    • Apply to PDFs, images, videos, XML, JSON, ZIP, and other assets that cannot include meta robots tags.

    • Example header: X-Robots-Tag: noindex.




  2. Block indexing of dynamically generated or programmatically served content.

    • Use on pages created by back-end scripts or APIs where editing HTML is not feasible.

    • Useful for temporary or environment-specific pages (staging, testing, internal tools).




  3. Control search engine behavior for entire file types or directories.

    • Set responses via server configuration (Apache, Nginx) to cover many files without altering each file.

    • Example: Block all PDFs in /private/ with a server rule.




  4. Prevent indexing while still allowing crawling and link following.

    • Combine directives as needed: noindex, follow, noarchive, nosnippet, unavailable_after.

    • Example: X-Robots-Tag: noindex, noarchive.




  5. Apply precise, conditional control with HTTP headers only.

    • Use on 200 responses for resources you do not want indexed; ensure directives are present on final responses, not only redirects.

    • For 404/410 responses, consider site goals—search engines typically drop 404/410 content naturally; the header can reinforce this.




  6. Control indexing across different user agents.

    • Use user-agent-specific directives if different crawlers need different rules.

    • Example: X-Robots-Tag: googlebot: noindex.




  7. Implement temporary measures or phased rollouts.

    • Apply the header for limited-time suppression (e.g., product launch delays, legal takedowns).

    • Remove the header when ready and verify via indexing tools.




  8. When not to use the X-Robots-Tag (use alternatives).

    • Do not use it as the primary method for HTML pages when you can edit meta robots tags directly; meta tags are clearer and easier for editors.

    • Do not rely on it to block access: the header controls indexing, not crawling or access. Use robots.txt to stop crawling and authentication to restrict access.

    • Avoid using only noindex on pages that should still pass link equity via internal links—noindex may stop the page from passing signals reliably.




  9. Best practices.

    • Test headers using curl or header checkers and verify in Google Search Console (URL Inspection) and Bing Webmaster Tools.

    • Ensure the header appears on the final HTTP response, not only on intermediate redirects.

    • Use server configuration management (templates, environment variables) to avoid accidental site-wide noindex.

    • Document and audit header rules regularly to prevent unintended SEO impacts.