Wednesday, July 18, 2012

Why Robots.txt Is Important for Websites

Website owners use the /robots.txt file to give web robots instructions about their site; this is called the Robots Exclusion Protocol.

In a robots.txt file, the line “User-agent: *” means the section applies to all robots.
The line “Disallow: /” tells the robot that it should not visit any pages on the site.
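
As a rough illustration of those two directives, the sketch below feeds them to Python's standard urllib.robotparser module and asks whether a robot may fetch a page. The robot name ExampleBot and the www.example.com address are placeholders chosen for this example, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# The two directives described above, as they would appear in robots.txt.
rules = [
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# "ExampleBot" and the URL are hypothetical placeholders.
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
print(allowed)  # False: every path starts with "/", so nothing may be fetched
```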

Why do I need a robots.txt file?
Creating a robots.txt file will not improve your search engine ranking, but it does tell robots which files you do not want crawled and indexed by the search engines.

When a robot crawls your site, it looks for the robots.txt file. If it doesn’t find one, it assumes that it may crawl and index the entire site.
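
The decision a well-behaved robot makes here can be sketched roughly as follows. This is an illustration only: the function name may_crawl is made up for this example, None stands in for "no robots.txt was found", and the rule checking is delegated to Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

def may_crawl(robots_lines, user_agent, url):
    """Return True if the robot may fetch the URL.

    robots_lines is a list of robots.txt lines, or None when the site
    has no robots.txt file at all (hypothetical convention for this sketch).
    """
    if robots_lines is None:
        # No robots.txt found: the robot assumes the whole site is open.
        return True
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)

page = "http://www.example.com/private/page.html"  # placeholder URL
print(may_crawl(None, "ExampleBot", page))
print(may_crawl(["User-agent: *", "Disallow: /private/"], "ExampleBot", page))
```

The first call represents a site without a robots.txt file, the second a site that disallows the /private/ directory.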

Not having a robots.txt file can also create avoidable 404 errors in your server logs, making it harder to track “real” 404 errors.

Assuming you want your whole site indexed and only want to stop the unnecessary 404 errors, you have a couple of options.

  • Upload a blank robots.txt file to the root directory of your domain.
  • Upload a simple robots.txt file to the root directory of your domain.

What is a simple robots.txt file?
Please note: This will allow all robots to crawl and index all files.

This allows all robots to crawl all files.
User-agent: *
Disallow:
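
To check that a blank file and the simple file behave the same way, here is a small sketch using Python's standard urllib.robotparser module. The helper name allows, the robot name ExampleBot, and the URL are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

def allows(robots_txt, url, agent="ExampleBot"):
    """Parse robots.txt content given as a string and test one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

blank_file = ""                              # option 1: an empty robots.txt
simple_file = "User-agent: *\nDisallow:\n"   # option 2: the simple file above

url = "http://www.example.com/any/page.html"  # placeholder URL
results = (allows(blank_file, url), allows(simple_file, url))
print(results)  # (True, True)
```

An empty Disallow value excludes nothing, so either file leaves the whole site open to crawling.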


What if I don’t want a particular file crawled?
Please note: Disallowing a specific file keeps compliant robots from crawling it, which generally keeps it out of the search engines' results. A disallowed URL can still be listed, though, if other pages link to it.
However, this is only effective for friendly robots.
Robots can choose to ignore your instructions.

This allows all robots to crawl all files except those in the images directory.
User-agent: *
Disallow: /images/

This allows all robots to crawl all files except those in the images directory and the stats directory.

User-agent: *
Disallow: /images/
Disallow: /stats/
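
A quick way to see the effect of these rules is to test a few paths against them. The sketch below uses Python's standard urllib.robotparser module; the file names and the www.example.com address are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /images/
Disallow: /stats/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Hypothetical paths: two inside the disallowed directories, one outside.
results = {}
for path in ("/images/logo.png", "/stats/report.html", "/about.html"):
    results[path] = parser.can_fetch("ExampleBot", "http://www.example.com" + path)

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

Each Disallow value is matched as a prefix of the path, which is why everything under /images/ and /stats/ is blocked while /about.html is not.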

What if I want to disallow a particular robot?
Occasionally you may find that you would like to keep specific robots from crawling your site, or limit which files they may access.

This denies Googlebot-Image access to any files in your domain.

User-agent: Googlebot-Image
Disallow: /

This specifically denies Googlebot-Image access to your images directory.

User-agent: Googlebot-Image
Disallow: /images/
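
The difference between the two Googlebot-Image examples can be sketched with Python's standard urllib.robotparser module. The helper name can_fetch, the second robot name ExampleBot, and the URLs are assumptions for this example, and Python's matching of robot names may not be identical to how a given search engine matches them.

```python
from urllib.robotparser import RobotFileParser

def can_fetch(robots_txt, agent, url):
    """Parse robots.txt content given as a string and test one robot and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

block_everything = "User-agent: Googlebot-Image\nDisallow: /\n"
block_images = "User-agent: Googlebot-Image\nDisallow: /images/\n"

photo = "http://www.example.com/images/photo.jpg"  # placeholder URLs
page = "http://www.example.com/index.html"

checks = {
    "first file, page": can_fetch(block_everything, "Googlebot-Image", page),
    "second file, photo": can_fetch(block_images, "Googlebot-Image", photo),
    "second file, page": can_fetch(block_images, "Googlebot-Image", page),
    "second file, other robot": can_fetch(block_images, "ExampleBot", photo),
}
for name, allowed in checks.items():
    print(name, "allowed" if allowed else "blocked")
```

Because neither file has a “User-agent: *” section, robots other than Googlebot-Image are not restricted at all.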

How do I create a robots.txt file?
Simply create a text document and save it as robots.txt. Do not use an HTML editor to create the file unless it has the ability to save a plain text (ASCII) document. On most Windows computers you can create a text document using Notepad.

  • Right click on your Desktop
  • Choose new
  • Choose text document
  • Open the document you just created
  • Insert instructions to robots
  • Click on save as
  • Save document as robots.txt
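
If you would rather not do this by hand, the same plain text file can be produced with a few lines of Python. This is only a sketch: the function name write_robots_txt is made up for this example, and the demo writes into a temporary directory rather than a real web root.

```python
import tempfile
from pathlib import Path

def write_robots_txt(directory, lines):
    """Write the given directives to <directory>/robots.txt as plain ASCII text."""
    path = Path(directory) / "robots.txt"
    path.write_text("\n".join(lines) + "\n", encoding="ascii")
    return path

# Demo in a throwaway directory; on a real site the file belongs in the
# root directory of the domain.
with tempfile.TemporaryDirectory() as folder:
    saved = write_robots_txt(folder, ["User-agent: *", "Disallow: /images/"])
    written = saved.read_text(encoding="ascii")

print(written)
```

Writing with the ascii encoding raises an error if a non-ASCII character slips into the directives, which matches the plain text requirement above.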

How do I know if I have done everything correctly?
Once you have uploaded the file to the root directory of your domain, it’s a good idea to use a robots.txt validator to confirm that everything is correct. You can search Google for a free robots.txt validator.
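
For a rough idea of what such a validator looks for, here is a minimal sketch of a syntax check in Python. It is not a substitute for a real validator: the function name check_robots_txt and the list of accepted field names are assumptions made for this example (Allow, Sitemap and Crawl-delay are common extensions that not every robot honours).

```python
# Field names this sketch accepts; an assumption, not an official list.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}

def check_robots_txt(text):
    """Return a list of human-readable problems found in robots.txt content."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            problems.append(f"line {number}: missing ':' separator")
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN_FIELDS:
            problems.append(f"line {number}: unknown field '{field}'")
        elif field == "user-agent":
            if not value:
                problems.append(f"line {number}: User-agent has no value")
            seen_user_agent = True
        elif field in ("disallow", "allow") and not seen_user_agent:
            problems.append(f"line {number}: rule appears before any User-agent line")
    return problems

print(check_robots_txt("User-agent: *\nDisallow: /images/\n"))
print(check_robots_txt("Disallow /images/\nDissallow: /stats/\n"))
```

The first call checks a well-formed file; the second contains a missing colon and a misspelled field name, so it reports two problems.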

What if I need more information about robots.txt files?
This page is intended to cover creating a very simple robots.txt file. If you require a more detailed robots.txt file for your website, there are many help resources available on the net.

2 comments:

Web Designing Company Bangalore said...

The main purpose of adding a robots.txt file to a website is to keep Google's crawlers away from the site's important pages.

Unknown said...

Thanks for the valuable information, and thanks for sharing such an informative post.
