Friday, 20 January 2012

Search Engine Spiders


Subject: [FIRSTNAME], What Spiders Do...


Dear [FIRSTNAME],


Search engine spiders are among the most useful things to come
around in the last 10 years of the internet. They are useful not
only to the web sites (Google and many others) that use them,
but also to people who are searching for a particular site and
to those who run web sites. Spiders allow your site to be seen
by the millions of people who use search engines every day. In
this newsletter, we will discuss what search engine spiders do,
how they work, and how to set up a robots.txt file and upload it
to your site, either to keep spiders away or to welcome them.


What are spiders and what purpose do they serve?


Spiders are essentially programs that crawl sites and report
their findings back to the search engine they were created for
(Google, for example). Their purpose is to make it easy for
sites to get listed in search engines.


You might be wondering, what does it mean to crawl a site?
Well, it means to visit a site and copy the information on it.


How do spiders work?


Spiders work by finding links to web sites, visiting those web
sites, going through the content of each site and then reporting
that content back to the database of the search engine they are
working for. Google's spiders, thus, crawl sites and report the
information back to Google's database. From there, the
information is added to Google's search engine, and the site
then shows up in Google search results. Much the same process
happens with any other search engine spider.
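

As a rough illustration of the loop described above (find links,
visit pages, copy the content), here is a minimal Python sketch.
It is not how Google or any real spider is built. The PAGES
dictionary is a made-up stand-in for a web site, so nothing is
fetched over the network, and the names LinkCollector and crawl
are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "site": page path -> HTML.
# A real spider would download each page over HTTP instead.
PAGES = {
    "/": '<a href="/news/">News</a> <a href="/about.html">About</a>',
    "/news/": '<a href="/">Home</a> <p>Latest headlines</p>',
    "/about.html": "<p>About this site</p>",
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start, pages):
    """Visit pages breadth-first and return {path: copied content}."""
    index = {}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        if path in index or path not in pages:
            continue  # already visited, or the link leads nowhere
        html = pages[path]
        index[path] = html  # "report back": store a copy of the page
        collector = LinkCollector()
        collector.feed(html)
        queue.extend(collector.links)  # follow the links found here
    return index


index = crawl("/", PAGES)
print(sorted(index))
```

Starting from the home page, the sketch reaches every page that
is linked from somewhere, which is why links to your site matter
so much for getting listed.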


How can I keep spiders from visiting my site?


You might be thinking, why would I want to keep such a useful
thing from visiting my site? Well, the short answer is,
sometimes site owners don't want the spider to crawl a
particular part of their site. Some site owners don't want
spiders to crawl their site at all. The reasons vary: the pages
may be private, unfinished, or copies of content found
elsewhere, or the owner may simply not want them showing up in
search results.


If you're one of those site owners, then you'll want to create
and upload something called a robots.txt file. We will briefly
go over how to do this.


A robots.txt file


The whole purpose of a robots.txt file is to tell a search
engine spider which parts of a site, if any, it should not
crawl.


Creating the file


Creating a robots.txt file that blocks out spiders is easy.
First, open up Notepad (or any plain text editor). Then, copy
and paste the following:


User-agent: *
Disallow: /


Once you've done that, save the file as robots.txt.
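

If you would like to check what these two lines mean to a
spider before uploading anything, Python's standard
urllib.robotparser module can read the rules directly. This is
only a sketch: "ExampleBot" and the yoursite.com addresses are
placeholders, and real spiders may interpret edge cases
differently from this module.

```python
from urllib.robotparser import RobotFileParser

# The two lines from above, as they would appear in robots.txt.
BLOCK_ALL = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(BLOCK_ALL)  # reads the rules; nothing is downloaded

# "ExampleBot" and the URLs below are placeholders for illustration.
home_allowed = parser.can_fetch("ExampleBot", "http://yoursite.com/")
news_allowed = parser.can_fetch(
    "ExampleBot", "http://yoursite.com/news/today.html"
)
print(home_allowed, news_allowed)
```

Both checks come back False, meaning a spider that follows the
rules would stay away from every page.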


Uploading the file


Next, upload the file to the root folder of your site, the one
that holds your index file, so that it can be reached at
yoursite.com/robots.txt. Spiders only look for the file there; a
copy placed in a subfolder is ignored. With the two lines above,
spiders are told to stay away from the whole site. If you only
want to keep them out of yoursite.com/news/, leave the file in
the root folder and change the second line to Disallow: /news/
instead. That's all there is to it.
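

Spiders request robots.txt from the top level of a site, so a
rule for a single folder is normally written as a path inside
that one file. The sketch below checks a Disallow: /news/ rule
with Python's urllib.robotparser. The helper name allowed_paths,
the agent name "ExampleBot" and the page paths are made up for
illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules that keep spiders out of the news folder only.
NEWS_ONLY = ["User-agent: *", "Disallow: /news/"]


def allowed_paths(rules, paths, agent="ExampleBot"):
    """Return {path: True/False} for whether a spider may crawl it."""
    parser = RobotFileParser()
    parser.parse(rules)
    return {
        path: parser.can_fetch(agent, "http://yoursite.com" + path)
        for path in paths
    }


result = allowed_paths(
    NEWS_ONLY, ["/", "/about.html", "/news/", "/news/today.html"]
)
for path, ok in result.items():
    print(path, "allowed" if ok else "blocked")
```

Only the two paths under /news/ are reported as blocked; the
rest of the site stays open to spiders.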


Using the robots.txt file to make sure search engine spiders DO
visit your site


Believe it or not, the robots.txt file can be used both to
disallow and to allow search engine spiders to crawl your site.
Here's how to create and upload such a file.


Creating the file


Open up Notepad and copy and paste in the following:


User-agent: *
Disallow:


You'll notice that the only difference between this and the
earlier example is that Disallow: is not followed by /. If it
were, that would tell spiders to go away. An empty Disallow:
line blocks nothing, so spiders may crawl the whole site. Once
again, save the file as robots.txt.
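

To see the effect of that single / side by side, here is a short
sketch using Python's urllib.robotparser. The helper can_crawl,
the agent name "ExampleBot" and the URL are assumptions made for
the example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(rules, url, agent="ExampleBot"):
    """Parse robots.txt lines and say whether the agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(agent, url)


ALLOW_ALL = ["User-agent: *", "Disallow:"]    # empty Disallow
BLOCK_ALL = ["User-agent: *", "Disallow: /"]  # Disallow followed by /
NO_RULES = []                                 # a file with no rules

url = "http://yoursite.com/news/today.html"  # placeholder address
print(can_crawl(ALLOW_ALL, url))
print(can_crawl(BLOCK_ALL, url))
print(can_crawl(NO_RULES, url))
```

In this module, the empty Disallow: file and the file with no
rules at all give the same answer: the page may be crawled. Only
the version with / turns spiders away.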


Uploading the file


All you'll do is upload the robots.txt file to the root folder
of your site, right alongside the index file, so that spiders
can find it at yoursite.com/robots.txt. And you're done.


To your success,
[YOUR NAME HERE]


P.S. Creating and uploading a robots.txt file to help make sure
spiders don't miss your site is fast and easy. So what are you
waiting for? Create and upload that file now!


 


