Danny Guo | 郭亚东

How to Prevent a Website Page From Showing Up in Search Results


To prevent a website page from showing up in search results, either set a robots meta tag or send an X-Robots-Tag HTTP header.

For the first option, add this tag to the page:

<meta name="robots" content="noindex" />

Or send this header for the page:

X-Robots-Tag: noindex

One benefit of the header approach is that you can use it for non-HTML content, like a PDF or JSON file.
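As a concrete sketch of the header approach, here is how you might send it from nginx for every PDF on a site (assuming an nginx server; the location pattern is just an illustration):

```nginx
# Illustrative nginx config: attach X-Robots-Tag to all PDF responses
# so crawlers that fetch them know not to index them
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex";
}
```

Most web servers and frameworks have an equivalent way to set a response header, so the same idea carries over.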

The noindex value tells search engine crawlers, such as Googlebot and Bingbot, not to index the page, so it won't show up in search results.

Don’t Use robots.txt

You might think to use the robots exclusion standard (i.e. robots.txt) to disallow crawling, but that doesn't work because then the crawlers can't see your directive to not index the page. You've instructed them not to look at the page at all! So if other websites link to your page, a crawler can still discover the page through those links and index it, even without crawling it.

The robots.txt file is for controlling crawling, not indexing.
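To make the pitfall concrete, a robots.txt like this (the path is purely illustrative) stops compliant crawlers from fetching the page at all, which means they can never see a noindex directive on it:

```
# Blocks crawling of /secret-page, but the URL can still be
# indexed if other sites link to it
User-agent: *
Disallow: /secret-page
```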

Directives

There are many possible directive values, such as noindex, nofollow, noarchive, and nosnippet, and you can specify more than one by separating them with commas.
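For example, this tag asks crawlers both to skip indexing the page and to not follow any of its links:

```html
<meta name="robots" content="noindex, nofollow" />
```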

However, not all crawlers support all values. For example, see the respective documentation from Google, Bing, and Yandex.

Specifying Crawlers

If you want to use different directives based on the specific crawler, you can specify the user agent in the meta tag’s name:

<meta name="googlebot" content="noindex" />
<meta name="bingbot" content="nofollow" />

Or in the header value:

X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: nofollow




