 Using Robots.txt Files To Feed The Spiderbots
Shawn Tracer
« Posted: February 18, 2008, 10:54:45 AM »


Using Robots.txt Files To Feed The Spiderbots
 by: Christian Whiting

It's a Thursday evening. You are looking at your website logs to determine where your hits are coming from. You notice you are getting a ton of 404 error records for a robots.txt file.

You might not even know what a robots.txt file is, let alone why it is missing from your website. Let's take a look at this mysterious file that seems to be missing and why it's important to have it.

Search engines like Google cruise the internet by sending out their spidering software. These programs are commonly known as spiderbots. The spiderbots visit websites all around the internet to include them in their index listings. The first thing they look for when they visit is a file called robots.txt. This file is normally found in the root directory of a hosted website.
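To make the "root directory" point concrete, here is a minimal Python sketch of how the robots.txt address can be worked out from any page on a site. The function name robots_url and the www.example.com address are made up for illustration; real spiders use their own code.

```python
from urllib.parse import urljoin


def robots_url(page_url: str) -> str:
    """Return the robots.txt location for the site hosting page_url."""
    # The leading slash makes urljoin drop the page's own path and
    # resolve against the root of the same scheme and host.
    return urljoin(page_url, "/robots.txt")


# Hypothetical page address, used only as an example.
print(robots_url("http://www.example.com/articles/seo/index.html"))
```

Whatever page the spider starts from, the request always goes to the same place at the top of the site, which is why the file does nothing if it is uploaded to a subdirectory.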

This file contains a set of rules that the spiders are programmed to obey based on standard protocol. These rules help the visiting spider determine which parts of your website to include or to ignore altogether.

The most common rule used in the robots.txt file is to deny the search engine spiders access to restricted areas of your website that you don't want them visiting and indexing for the whole internet to view.

These restricted areas normally contain your downloads, images, or a cgi-bin directory, which are used only by your website visitors or for the normal daily operations of your website.
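As a rough illustration of such a deny rule, the sketch below feeds a sample robots.txt to Python's standard urllib.robotparser module and asks whether a spider may fetch a few pages. The directory names come from the paragraph above; the bot name "ExampleBot", the helper is_allowed, and the www.example.com addresses are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

# Sample rules: every spider ("*") is asked to stay out of three directories.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /downloads/
Disallow: /images/
"""


def is_allowed(rules: str, agent: str, url: str) -> bool:
    """Check whether the given robots.txt text lets agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# "ExampleBot" is a hypothetical spider name.
print(is_allowed(RULES, "ExampleBot", "http://www.example.com/cgi-bin/form.pl"))
print(is_allowed(RULES, "ExampleBot", "http://www.example.com/index.html"))
```

The first check should come back False and the second True, since only the listed directories are off limits.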

What A robots.txt file is not....

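Keep in mind that a robots.txt file is not a method to keep your information secure and safe from prying eyes. It simply asks visiting spiders not to index areas of your website. Compliance is voluntary, and anyone can read the file itself, so it should never be the only thing guarding sensitive content.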

Note that using a robots.txt file does not speed up the process of search engines indexing and getting your website in their search directories. Also, a robots.txt file is not used to tell search engine spiders what to do, only what not to do.

Benefits of using a robots.txt file:

    * If you have parts of your website that are very similar, you can block them from being crawled to avoid being flagged as a spammer. This is especially useful if you have similar pages optimized for different web browsers or connection speeds.

    * You eliminate 404 errors for missing robots.txt from your server logs by using a robots.txt file. Just create a blank robots.txt file in a basic text file editing program and upload it to your root directory. A blank file places no restrictions on the spiders.

    * It can be used to block search engine spiders from indexing part or all of your website, saving valuable bandwidth.
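The first two benefits can be sketched in a few lines of Python with the standard urllib.robotparser module. The /lowband/ directory stands in for a hypothetical set of near-duplicate pages built for slow connections; it, the bot name "ExampleBot", and the helper can_crawl are all assumptions for the example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(rules: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the robots.txt text lets agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# A blank file: it stops the 404 errors but restricts nothing.
BLANK = ""

# Rules that keep spiders out of a hypothetical duplicate section.
SKIP_VARIANTS = """\
User-agent: *
Disallow: /lowband/
"""

print(can_crawl(BLANK, "http://www.example.com/lowband/index.html"))
print(can_crawl(SKIP_VARIANTS, "http://www.example.com/lowband/index.html"))
print(can_crawl(SKIP_VARIANTS, "http://www.example.com/index.html"))
```

With the blank file every page stays crawlable, while the second set of rules hides only the duplicate section and leaves the main pages alone.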

Creating A robots.txt

Creating a robots.txt file is not complicated, but you should be sure to do it correctly. If your file contains incorrect rules, it can completely block all spiders and prevent them from indexing your website.
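One common way this goes wrong is a single stray slash: "Disallow: /" shuts every spider out of the whole site, while "Disallow:" with nothing after it blocks nothing. A quick check along the following lines, using Python's standard urllib.robotparser, can catch the mistake before you upload. The helper name blocks_whole_site and the bot name "ExampleBot" are made up for this sketch.

```python
from urllib.robotparser import RobotFileParser


def blocks_whole_site(rules: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules keep agent away from the site's front page."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # If even "/" is off limits, nothing on the site can be crawled.
    return not parser.can_fetch(agent, "/")


BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

print(blocks_whole_site(BLOCK_ALL))
print(blocks_whole_site(ALLOW_ALL))
```

Running a check like this against your draft file is cheap insurance; if the first result is what you see for your own rules and you did not intend it, fix the file before it goes live.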

You can create a robots.txt file using a simple text editing program like Notepad, or you can generate one automatically using one of several software programs or online resources.
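If you would rather script it yourself, generating the file takes only a few lines. The following Python sketch is one possible approach, not the method used by any particular generator; the function name build_robots_txt and the directory names are assumptions for the example.

```python
def build_robots_txt(disallowed_dirs: list[str], agent: str = "*") -> str:
    """Build robots.txt text that asks agent to skip the given directories."""
    lines = [f"User-agent: {agent}"]
    if not disallowed_dirs:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    for directory in disallowed_dirs:
        name = directory.strip("/")
        if not name:
            # "/" on its own would block the entire site, so refuse it here.
            raise ValueError("refusing to emit a rule that blocks the whole site")
        # Normalise to a leading and trailing slash, e.g. "cgi-bin" -> "/cgi-bin/".
        lines.append(f"Disallow: /{name}/")
    return "\n".join(lines) + "\n"


print(build_robots_txt(["cgi-bin", "/downloads/", "images"]))
```

Save the returned text as a plain text file named robots.txt, for example with pathlib.Path("robots.txt").write_text(...), and it is ready to upload.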

For information and rules on how to manually create a robots.txt file
visit http://www.robotstxt.org/wc/exclusion.html#robotstxt

To create a robots.txt file online visit:
http://searchbliss.com/webmaster_tools/robots-txt-text-generator.htm

Once you have created a robots.txt file, upload it to the root directory of your website. Now you will be ready the next time the spiderbots come around.

About The Author

Christian Whiting is the publisher of Internet Profits, dedicated to bringing you the best tips, tools and resources to help you make more money online. http://internetprofits.bushido.net
