The file robots.txt

Search engines have transformed the world of websites. A site's popularity depends largely on whether it ranks near the top of the search results, or at least appears on the first page. There is a popular joke about this: the safest place to hide a dead body is on the second page of the Google search results :D

As search engines grew in popularity, websites adopted a file called robots.txt. Most of the information on the Internet is public, and anyone can search for and retrieve information from any website. We can do this because search engines automatically crawl and index the pages of a website, and it is on the basis of that index that the pages are listed in the search results. This is all good, but sometimes we might not want all of a website's content to show up there. Companies might have confidential information on their website that they would not like to reveal to the outside world.

In this case, the solution is to stop the search engine from indexing that part of the website, and the way to do that is to tell it so. robots.txt is a file, placed at the root of the website, in which we give instructions to search engines about which areas of the site they should not index. This matters because as soon as a search engine indexes a page, anyone can find it through a search.

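robots.txt contains the “Disallow” keyword followed by a path. The path is the location that we don’t want the search engine to index. Rules are grouped under a “User-agent” line that names the crawler they apply to, where * means all crawlers.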

For example, a statement in the robots.txt file could be Disallow: /wp-admin/images. This statement means we don’t want the search engine to index anything under that path, which here holds the images of the website. The search engine will index everything except that directory.
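
To see how a crawler might interpret such a rule, here is a minimal sketch using Python's standard urllib.robotparser module. The User-agent line, the crawler name ExampleBot, the helper is_allowed and the two sample paths are assumptions made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the "User-agent: *" line is an
# assumption so that the rule applies to every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/images
"""


def is_allowed(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # A path under the disallowed directory, and one outside it.
    print(is_allowed(ROBOTS_TXT, "/wp-admin/images/logo.png"))
    print(is_allowed(ROBOTS_TXT, "/blog/hello-world"))
```

Under these assumptions the first path is reported as blocked and the second as allowed, which matches the idea that everything except the listed directory stays open to the crawler.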

One common reason that a website does not rank in a search engine, even though it is doing well business-wise, is that its robots.txt file is not configured properly. Sometimes the statement is Disallow: / . This means that the search engine is not allowed to index any part of the website, because the rule names the root directory, which includes every other directory.
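
A quick way to catch this mistake is to check whether the home page itself is blocked. The sketch below again uses Python's urllib.robotparser. The function name blocks_home_page and the crawler name ExampleBot are hypothetical, and testing only the path / is a rough smoke test rather than a full audit of the file.

```python
from urllib.robotparser import RobotFileParser


def blocks_home_page(robots_txt: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules forbid `agent` from fetching the root path "/"."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # If even "/" is off limits, the site is most likely hidden from this crawler.
    return not parser.can_fetch(agent, "/")


if __name__ == "__main__":
    # Hypothetical file contents used only for this demonstration.
    misconfigured = "User-agent: *\nDisallow: /\n"
    reasonable = "User-agent: *\nDisallow: /wp-admin/images\n"
    print(blocks_home_page(misconfigured))
    print(blocks_home_page(reasonable))
```

Running a check like this before publishing a new robots.txt is one inexpensive guard against accidentally hiding the whole site.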

Take care when configuring the robots.txt file if you want a good rank in the search results. A minor change in the file can drastically affect the site's performance in the search results.
