Everyone knows Google is the leading search engine. But there are times when you don’t want your WordPress site to show up in Google’s search results.
So, how do you prevent Google from indexing your site? There are several ways to do it, but not every method suits every website; which one you choose depends on your specific goals. In this blog post, we will explore how to prevent Google from indexing a single web page as well as an entire website. Before covering the methods, though, it helps to understand why people want to do this in the first place.
Sometimes marketers or website owners get organic traffic on pages that visitors don't need to land on directly, such as thank-you pages, cancellation pages, and other utility pages. That traffic skews conversion goals and confuses analytics tracking, so it is worth blocking Google from seeing these pages, or simply deindexing them.
Blocking indexing for specific web pages can also improve a site's SEO: when more of your traffic lands on the main pages, the site's rank tends to go up.
Methods to Prevent Google from Indexing Your WordPress Site
You can find both popular and niche methods to stop Google from indexing a WordPress website. The first and essential task is to identify the web pages that are not important. We add various types of pages over time to organize or design a website, but not all of them actually need search traffic, as discussed above.
So, before learning how to prevent Google from indexing a WordPress site, list those pages. The list will help you stop the search engine from indexing them, temporarily or permanently, and you can rearrange it later as needs change. Now, let's look at the techniques you can use to control website indexing.
Editing the robots.txt File
Editing the robots.txt file lets you hide web pages from search engines; if a hosting service maintains your site, you can ask them to modify it for you. robots.txt is a plain text file that follows the Robots Exclusion Standard, and it lives in the root of your domain. You can edit the file yourself to allow or block any crawler from listing your domain or a subdomain.
To apply this technique, download a copy of your robots.txt file, edit it, and upload it back to the root of the domain. The file may contain one or more rule groups. For example, the following blocks Googlebot from crawling any URL under the /nogooglebot/ path:
User-agent: Googlebot
Disallow: /nogooglebot/
This way, you can also prevent other agents from indexing a page or an entire website. Remember, the file must be named robots.txt, and your site must have only one robots.txt file. There are some other rules too; Google's robots.txt documentation covers the details.
NOTE: Don't rely on a robots.txt Disallow rule to deindex a page that carries a noindex tag. If crawlers are disallowed from fetching the page, they will never see the noindex directive, and Google no longer honors noindex rules placed inside robots.txt itself.
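Before uploading an edited robots.txt, it can help to test the rules locally. Python's standard library ships a parser for the Robots Exclusion Standard; the sketch below uses the rule group above with hypothetical example.com URLs:

```python
# Sanity-check robots.txt rules locally before uploading the file.
# The rules and URLs below are illustrative, not from a real site.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: Googlebot
Disallow: /nogooglebot/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Googlebot is blocked from anything under /nogooglebot/ ...
blocked = rp.can_fetch("Googlebot", "https://example.com/nogooglebot/page.html")
# ... but may still fetch everything else.
allowed = rp.can_fetch("Googlebot", "https://example.com/blog/post-1/")
print(blocked, allowed)  # False True
```

Running this before deployment catches typos in paths or user-agent names that would otherwise silently expose (or over-block) parts of the site.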
Using Google Search Console (Webmaster Tools)
Google Search Console, formerly known as Google Webmaster Tools, supports blocking a page or URL for a fixed period and helps you maintain a site in many other ways. You can easily prevent Google from indexing your site using its Removals (remove URLs) tool. To apply the method, follow these steps:
- Check that the URL belongs to a verified property in Search Console.
- Choose Temporarily Hide and add the page URL.
- Next, select Clear URL from cache and remove it from Search.

Copying and pasting the page URL is easy, but nested page URLs can sometimes complicate the process. For instance, several posts' URLs may point to the same page, like:
http://www.example.com/greenery/thread/007
http://www.example.com/greenery/post/156
http://www.example.com/greenery/thread/007?post=156
The solution is to submit extra removal requests: one separate request for every URL variant you see.
NOTE: Don’t forget it is a temporary page blocking method. It lasts only about six months.
Adding a Noindex Meta Tag

There are two ways to use a noindex meta tag; review both implementations below and pick whichever is more convenient. To stop most search engines from indexing a page, put the following meta tag into the <head> section of the page:
<meta name="robots" content="noindex">
And if you wish to prevent only Google web crawlers from indexing a page, use-
<meta name="googlebot" content="noindex">
A related attribute is rel="nofollow", which tells crawlers not to follow a specific link or pass ranking credit through it. Note that it does not deindex the linked page; it only affects how crawlers treat the link itself:

<a href="example.html" rel="nofollow">Example</a>
Also, you can add the following tag to all pages to prevent Google from both indexing your site and following its links:
<meta name="googlebot" content="noindex, nofollow">
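If you want to verify that the tag actually made it into a page's <head>, you can scan the HTML with Python's standard library. This is an illustrative sketch (the sample HTML is made up), not a WordPress feature:

```python
# Check a page's HTML for a robots/googlebot noindex meta tag.
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Sets .noindex to True if a robots or googlebot meta tag says noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        name = (a.get("name") or "").lower()
        content = (a.get("content") or "").lower()
        if name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True

# Illustrative page markup, not fetched from a real site.
html = """<html><head>
<meta name="googlebot" content="noindex, nofollow">
</head><body>hidden page</body></html>"""

parser = RobotsMetaParser()
parser.feed(html)
print(parser.noindex)  # True -> Google should drop this page
```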
X-Robots-Tag HTTP Header
This is an alternative to meta tags: you can choose either a robots meta tag or the X-Robots-Tag. An X-Robots-Tag is sent as part of the HTTP header response for a given URL, which means you can use it to block Google from indexing non-HTML files such as images, videos, and PDFs. Any directive that works in a robots meta tag also works in an X-Robots-Tag. A response carrying a noindex (or none) instruction looks like this:
HTTP/1.1 200 OK
Date: Sun, 14 Aug 2022 20:42:40 GMT
(…)
X-Robots-Tag: noindex
(…)
If you don’t want Google to generate a cached page then apply this code noarchive X-Robots-Tag with an unavailable_after X-Robots-Tag.
HTTP/1.1 200 OK
Date: Sun, 14 Aug 2022 20:42:40 GMT
(…)
X-Robots-Tag: noarchive
X-Robots-Tag: unavailable_after: 25 Jun 2010 15:00:00 PST
(…)
You can also combine multiple X-Robots-Tag headers in one response and target specific crawlers by prefixing a user agent, for example X-Robots-Tag: googlebot: noindex, so different search engines receive different instructions.
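As a sketch of how such headers are read, the function below (a hypothetical helper, deliberately simpler than Google's actual parsing) checks a list of X-Robots-Tag values for an indexing block, allowing for an optional user-agent prefix:

```python
def blocks_indexing(header_values):
    """Return True if any X-Robots-Tag value carries noindex or none.

    Simplified sketch: ignores which user agent a prefixed value
    targets and ignores all other directives.
    """
    for value in header_values:
        tokens = [t.strip().lower() for t in value.split(",")]
        # Strip an optional "agent:" prefix from the first token,
        # e.g. "googlebot: noindex" -> "noindex". The colon inside
        # "unavailable_after: <date>" is not an agent prefix.
        if tokens and ":" in tokens[0] and not tokens[0].startswith("unavailable_after"):
            tokens[0] = tokens[0].split(":", 1)[1].strip()
        if "noindex" in tokens or "none" in tokens:
            return True
    return False

# noarchive/unavailable_after limit caching and lifetime, not indexing.
print(blocks_indexing(["noarchive", "unavailable_after: 25 Jun 2010 15:00:00 PST"]))  # False
# An agent-prefixed noindex still blocks indexing (for that agent).
print(blocks_indexing(["googlebot: noindex"]))  # True
```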
Stopping Search Engines Manually From the Dashboard
Many people don’t know that preventing search engines from ranking a site is also possible from the WordPress dashboard. Maybe, many of you see the option but don’t exactly know its purpose. For example, if your website is new, you have lots of last-minute tasks to finish and need time. You will not want search engines to list that site in this case. So, this method is suitable for you.
From your dashboard: Settings → Reading → Search engine visibility → check the box labeled "Discourage search engines from indexing this site"
This little task can keep your content out of search results. Behind the scenes, it adjusts the site's robots.txt output and adds a related noindex meta tag to the header. Thus, checking the box discourages Google from indexing the site.
NOTE: People often forget to uncheck this box when they take the site live. Leaving the box checked for a long time keeps the site invisible to Google, so it will never rank.
A Partially Effective Way: Skipping the Sitemap or Blocking Google From Indexing It
A sitemap helps Googlebot discover pages and rank a WordPress website. It lists the URLs of all the pages on a site, and during crawling, sitemaps help search engines find new pages, so there is little chance of content being missed. If you want search engines to overlook your new site for a while, don't add an XML sitemap.
However, a sitemap does not guarantee that Google will index every page; that depends on various other factors. But you can slow down the indexing process by not providing a sitemap.
If you have already included one or more XML sitemaps, you can prevent Google from indexing those files, either with a plugin or by editing your site's .htaccess file.
Using a plugin to prevent Google from indexing the sitemap: There are many SEO plugins today, and some, such as Yoast SEO and Rank Math, include sitemap-indexing options. If your site runs an SEO plugin, look for that option and turn it OFF.
Editing the .htaccess file to prevent Google from indexing the sitemap: Add an X-Robots-Tag to your site's HTTP responses through the configuration file; X-Robots-Tag: noindex stops Google from indexing the sitemap. Note that the Header directive comes from Apache's mod_headers module.
For a single sitemap file, the code is:

<IfModule mod_headers.c>
  <Files "sitemap.xml">
    Header set X-Robots-Tag "noindex"
  </Files>
</IfModule>
For more than one sitemap file, the code is:

<IfModule mod_headers.c>
  <Files ~ "^(sitemap1|sitemap2|sitemap3)\.xml$">
    Header set X-Robots-Tag "noindex"
  </Files>
</IfModule>
Now, Google and other search engines will not index the XML sitemap files themselves. Since a sitemap lists the URLs of all your web pages, keeping it out of the index makes the site's pages harder to discover, though crawlers that already know a URL can still fetch it directly.
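To see why the sitemap is such a rich discovery source, here is a short Python sketch (with a small, made-up sitemap) that extracts every listed URL the way a crawler would:

```python
# A sitemap hands a crawler every page URL in one file.
# The sitemap content below is illustrative.
import xml.etree.ElementTree as ET

sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about/</loc></url>
  <url><loc>https://example.com/thank-you/</loc></url>
</urlset>"""

# Sitemap elements live in the sitemaps.org namespace.
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap)
urls = [loc.text for loc in root.findall("sm:url/sm:loc", ns)]
print(urls)  # every page URL, in one pass
```

One fetch of this file reveals the entire site structure, which is exactly what withholding or noindexing the sitemap slows down.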
How You Can Re-index Your Website
You cannot ignore how important Google rankings are for a WordPress site; you have to promote it and attract new traffic. So, after finishing the remaining site-building tasks and blocking Google from indexing certain pages, you need to work on SEO, and good SEO means getting the right pages indexed and reindexed:
- Uncheck the Search engine visibility box in your dashboard.
- Check the site's root directory for any noindex directive or robots.txt block you may have mistakenly applied to an important page.
- If you deliberately told bots not to index certain content or pages and now want them reindexed, repeat step two carefully and remove those instructions.
- Add an XML sitemap to your WordPress website.
- If you used a plugin to prevent Google from indexing your sitemap, uninstall it or simply turn the XML sitemap indexing option back on.
- Use the URL Inspection tool in Google Search Console. It reports what Google sees for a given URL; you can inspect a specific URL, submit it, and request that Google index it.
- Finally, keep doing regular, solid SEO on your reindexed pages to increase overall traffic to your website.
These are some ways to prevent Google from indexing your site; choose your method wisely before applying it to your WordPress site. In practice, website owners usually struggle to get their sites indexed, and many don't realize that going overboard with SEO can backfire, or that deindexing a page has legitimate uses. Sometimes you should prevent search engines from indexing parts of your site precisely so that the traffic you do get lands on the pages that matter. This article has covered indexing, deindexing, and reindexing web pages; hopefully, you can now apply these methods as your situation demands.