 Search engine spiders and their purpose
Stephen Taylor
« Posted: July 27, 2007, 01:44:55 PM »


Search engine spiders are one of the most useful developments of the last 10 years of the internet. They are useful not only to the search engines (Google and many others) that use them, but also to people who are searching for a particular site and to those who run web sites. Spiders allow your site to be seen by the millions of people who use search engines every day. In this newsletter, we will discuss what search engine spiders do, how they work, and how to set up a robots.txt file and upload it to your site to control which parts of the site spiders visit.

What are spiders and what purpose do they serve? Spiders are essentially programs that “crawl” sites and report back to their superior (Google or whatever search engine they were created for) what their findings are. Their purpose is to make it easy for sites to get listed in search engines.

You might be wondering, what does it mean to “crawl” a site? It means to visit a site and copy the information found there.

How do spiders work? Spiders work by finding links to web sites, visiting those web sites, going through the content of a web site and then reporting the content of the site back to the database of the site which they are working for. Google spiders, thus, crawl sites and report the information back to Google’s database. From there, the information is added to Google’s search engine, and the site then shows up in Google search results. Much the same process happens with any other search engine spider.
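
To make the find-links, visit, report-back loop concrete, here is a minimal sketch of a spider in Python. It is only an illustration and not how Google or any real search engine is built: the pages live in a made-up in-memory dictionary (PAGES, with invented example.test addresses) instead of being downloaded, and the "database" is just a dictionary that maps each address to the content that was copied.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web": address -> HTML. A real spider would download these.
PAGES = {
    "http://example.test/": '<a href="/news/">News</a> <a href="/about.html">About</a>',
    "http://example.test/news/": '<a href="/">Home</a> <a href="/news/story.html">Story</a>',
    "http://example.test/about.html": "<p>About us</p>",
    "http://example.test/news/story.html": '<a href="/about.html">About</a>',
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, pages):
    """Visit pages breadth-first by following links; return what was copied."""
    index = {}                    # stands in for the search engine's database
    queue = deque([start_url])    # addresses waiting to be visited
    seen = {start_url}            # addresses already queued, to avoid loops
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:          # a link that leads nowhere
            continue
        index[url] = html         # "report back" the page content
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            link = urljoin(url, href)   # turn relative links into full addresses
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl("http://example.test/", PAGES)
print(sorted(index))
```

Starting from the home page alone, the spider reaches every page in PAGES because each one is linked from a page it has already visited. A page with no links pointing to it would never be found.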

How can I keep spiders from visiting my site? You might be thinking, “Why would I want to keep such a useful thing from visiting my site?” Well, the short answer is, sometimes site owners don’t want the spider to crawl a particular part of their site. Some site owners don’t want spiders to crawl their site at all. The reasons vary: the pages may be private or unfinished, they may duplicate content found elsewhere on the site, or they may be admin and login areas that have no business showing up in search results.

If you’re one of those site owners, then you’ll want to create and upload something called a robots.txt file. We will briefly go over how to do this.

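A robots.txt file. The whole purpose of a robots.txt file is to tell a search engine spider not to crawl the site, or part of the site, on which the robots.txt file resides. Keep in mind that it is a request and not a lock: well-behaved spiders such as Google’s follow it, but nothing forces a spider to obey.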

Creating the file. Creating a robots.txt file that blocks out spiders is easy. First, open up notepad. Then, copy and paste the following:

User-agent: *
Disallow: /

Once you’ve done that, save the file as robots.txt. The name must be all lowercase, and the file must be plain text.

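Uploading the file. Next, upload the file to the root (top-level) folder of your site, the same folder that holds your index file, so that it can be reached at yoursite.com/robots.txt. Spiders only look for robots.txt in that one location; a copy placed in a subfolder such as yoursite.com/news/ is ignored. To keep spiders out of just one part of the site, leave the file in the root folder and name the folder in the Disallow line instead. For example, to block only yoursite.com/news/, the file would read:

User-agent: *
Disallow: /news/

To keep spiders away from the whole site, use Disallow: / as shown earlier. That’s all there is to it.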
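
If you would like to check a rule before uploading it, Python’s standard library includes a robots.txt parser. The sketch below is only an illustration: robots_url is a helper name made up for this example, and yoursite.com is the placeholder address used above. It shows where a spider looks for the file, which is always the root of the site no matter which page it is about to visit, and how a Disallow: /news/ rule in that root file affects two sample pages.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Return the one address where a spider looks for robots.txt (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path with /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The rules as they would appear in the root robots.txt file.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /news/"])

print(robots_url("http://yoursite.com/news/today.html"))
print(rules.can_fetch("*", "http://yoursite.com/news/today.html"))  # inside /news/
print(rules.can_fetch("*", "http://yoursite.com/about.html"))       # outside /news/
```

The first line printed is http://yoursite.com/robots.txt, even though the page being visited is inside the news folder. The news page is reported as off limits and the about page as allowed.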

Using the robots.txt file to make sure search engine spiders DO visit your site

Believe it or not, the robots.txt file can be used to both disallow and allow search engine spiders to crawl your site. Here’s how to create and upload such a file.

Creating the file
Open up notepad and copy and paste in the following:

User-agent: *
Disallow:

You’ll notice that the only difference between this and the earlier example is that Disallow: is not followed by /. If it were, that would tell spiders to go away. Once again, save the file as robots.txt.
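
The effect of that single / is easy to see side by side. This is a small sketch using the robots.txt parser from Python’s standard library; can_crawl, BLOCK_ALL and ALLOW_ALL are names made up for this example, and yoursite.com is a placeholder.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, agent="*"):
    """Parse the text of a robots.txt file and ask whether `agent` may visit `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())   # one rule per line
    return parser.can_fetch(agent, url)


BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # the slash shuts out the whole site
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # nothing after the colon: nothing is off limits

page = "http://yoursite.com/products.html"
print(can_crawl(BLOCK_ALL, page))
print(can_crawl(ALLOW_ALL, page))
```

Note that each rule sits on its own line. A file with both rules run together on one line would not be read the way you intend.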

Uploading the file
All you’ll do is upload the robots.txt file to the root folder of your site, right alongside the index file. The file always lives there, whether it blocks spiders or lets them in. And you’re done.
Creating and uploading a robots.txt file is fast and easy. Keep in mind that spiders crawl a site by default, so this file only confirms that nothing is off limits; spiders still have to find your site through links from other sites. So what are you waiting for? Create and upload that file now!


Terry Detty, 42 years old, finds internet marketing his passion. In addition to marketing he enjoys reading, and occasionally goes out for a short walk.

Copyright 2006-2023 TechnoWorldInc.com. All Rights Reserved. Privacy Policy | Disclaimer