Danny Guo | 郭亚东

How to Prevent a Website Page From Showing Up in Search Results


To prevent a website page from showing up in search results, either set a robots meta tag or send an X-Robots-Tag HTTP header.

For the first option, add this tag to the page:

<meta name="robots" content="noindex" />

Or send this header for the page:

X-Robots-Tag: noindex

One benefit of the header approach is that you can use it for non-HTML content, like a PDF or JSON file.
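As a concrete sketch of the header approach, here is how you might send it from nginx for every PDF on a site (assuming an nginx server; the location pattern is just an illustration):

```nginx
# Illustrative nginx config: attach X-Robots-Tag to all PDF responses
# so crawlers that fetch them know not to index them
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex";
}
```

Most web servers and frameworks have an equivalent way to set a response header, so the same idea carries over.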

The noindex value tells search engine crawlers, such as Googlebot and Bingbot, not to index the page, so it won't show up in search results.

Don’t Use robots.txt

You might think to use the robots exclusion standard (i.e. robots.txt) to disallow crawling, but that doesn't work because then the crawlers can't see your directive to not index the page. You've instructed them not to look at the page at all! So if other websites link to your page, a crawler can still discover the page through those links and index it, even without crawling it.

The robots.txt file is for controlling crawling, not indexing.
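To make the pitfall concrete, a robots.txt like this (the path is purely illustrative) stops compliant crawlers from fetching the page at all, which means they can never see a noindex directive on it:

```
# Blocks crawling of /secret-page, but the URL can still be
# indexed if other sites link to it
User-agent: *
Disallow: /secret-page
```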

Directives

There are many possible directive values, such as noindex, nofollow, noarchive, and nosnippet, and you can specify more than one by separating them with commas.
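For example, this tag asks crawlers both to skip indexing the page and to not follow any of its links:

```html
<meta name="robots" content="noindex, nofollow" />
```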

However, not all crawlers support all values. For example, see the respective documentation from Google, Bing, and Yandex.

Specifying Crawlers

If you want to use different directives based on the specific crawler, you can specify the user agent in the meta tag’s name:

<meta name="googlebot" content="noindex" />
<meta name="bingbot" content="nofollow" />

Or in the header value:

X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: nofollow




