A Few Technicalities Involved In The Search Engines



Post by mizatick » Thu Aug 06, 2009 6:07 am

A software spider is an automated program operated by a search engine that surfs the Web, visits a Web site, records (saves to its hard drive) all the words on each page, and notes links to other sites. As the spider visits each page, it follows every link and reads, indexes, and stores the pages those links lead to. Google, however, will sometimes reference a page without actually visiting it (because several sites link to that page, Google knows the page exists). This is a simplified view of what happens, seen from 60,000 feet above the action. The actual process that turns a Web page into an entry on a results page is a highly sophisticated data warehousing and information retrieval scheme, which varies from engine to engine. In fact, this process of retrieving documents from a database is one key point of differentiation among search engines. The other lies in the additional services each engine offers and the partnerships it forms.
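To make the visit, record, and follow-links loop concrete, here is a minimal sketch in Python. It is not how any real search engine works. The `PAGES` dictionary is a made-up two-page site held in memory so the example runs without a network, and the names `PageParser` and `crawl` are my own. Links that point to pages the spider cannot fetch end up in `known_only`, which is a rough stand-in for a page that is known from links alone.

```python
from collections import deque
from html.parser import HTMLParser
import re

# Hypothetical site kept in memory so the sketch runs offline.
PAGES = {
    "/index.html": '<a href="/about.html">About</a> Welcome to the site',
    "/about.html": '<a href="/index.html">Home</a> '
                   '<a href="/missing.html">Gone</a> About our spiders',
}


class PageParser(HTMLParser):
    """Collects the words and the outgoing links of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Note every href found on an <a> tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Record every word in the visible text, lowercased.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def crawl(start, pages):
    """Breadth-first visit of every page reachable from `start`."""
    index = {}          # url -> list of words recorded on that page
    known_only = set()  # urls seen in links but never fetched
    queue = deque([start])
    seen = {start}
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            known_only.add(url)
            continue
        parser = PageParser()
        parser.feed(html)
        index[url] = parser.words
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index, known_only


index, known_only = crawl("/index.html", PAGES)
```

A real spider would also fetch pages over HTTP, resolve relative URLs, respect robots.txt, and limit its request rate. All of that is left out here to keep the loop itself visible.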

Due to the sheer volume and size of the documents indexed, each search engine has developed its own algorithm for deciding which pieces of data to store, along with its own compression methods that allow rapid searching and more economical storage of huge volumes of data.
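The engines do not publish their storage formats, so the following is only an illustration of the general idea, using two textbook techniques: an inverted index (word to list of document IDs) and gap encoding with variable-length bytes. The function names and the sample documents are made up for this sketch, and document IDs are assumed to be positive integers.

```python
def build_postings(docs):
    """Map each word to the sorted list of document IDs that contain it."""
    postings = {}
    for doc_id in sorted(docs):
        for word in set(docs[doc_id].lower().split()):
            postings.setdefault(word, []).append(doc_id)
    return postings


def encode_gaps(doc_ids):
    """Store the differences between sorted IDs, 7 bits per byte.

    The high bit of a byte is set when more bytes of the same gap follow.
    """
    out = bytearray()
    previous = 0
    for doc_id in doc_ids:
        gap = doc_id - previous
        previous = doc_id
        while gap >= 128:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_gaps(data):
    """Reverse encode_gaps and return the original list of IDs."""
    doc_ids = []
    current = 0
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += gap
            doc_ids.append(current)
            gap = 0
            shift = 0
    return doc_ids


# Hypothetical sample documents.
docs = {1: "web spider", 2: "web index", 5: "spider index web"}
postings = build_postings(docs)

ids = [3, 130, 131, 100000]
packed = encode_gaps(ids)
```

Small gaps between neighbouring IDs fit in a single byte, so the four IDs above take fewer bytes than four fixed-width integers would, and `decode_gaps(packed)` gives back the original list.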
