Thu, Feb 23, 2012

Search Engines. Who Are They To Judge Us?


The good news about the Internet and its most visible component, the World Wide Web, is that there are hundreds of millions of pages available, waiting to present information on an amazing variety of topics. The bad news about the Internet is that there are hundreds of millions of pages available, most of them titled according to the whim of their author, almost all of them sitting on servers with cryptic names. When you need to know about a particular subject, how do you know which pages to read? If you're like most people, you visit an Internet search engine.

A Web search engine is a tool designed to search for information on the World Wide Web. The search results are usually presented in a list and are commonly called hits. The information may consist of web pages, images, information and other types of files. Some search engines also mine data available in databases or open directories. Unlike Web directories, which are maintained by human editors, search engines operate algorithmically or are a mixture of algorithmic and human input.

As Defined by Wikipedia

Internet search engines are special sites on the Web that are designed to help people find information stored on other sites. There are differences in the ways various search engines work, but they all perform three basic tasks:

  • They search the Internet -- or select pieces of the Internet -- based on important words.
  • They keep an index of the words they find, and where they find them.
  • They allow users to look for words or combinations of words found in that index.
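
To make those three tasks concrete, here is a minimal sketch in Python. It is an illustration only, not how any real search engine is built. The page addresses, the sample text and the function names `build_index` and `lookup` are all made up for this example.

```python
from collections import defaultdict

# Hypothetical pages standing in for documents a search engine has already fetched.
PAGES = {
    "example.com/a": "search engines index words",
    "example.com/b": "spiders crawl the web for words",
}

def build_index(pages):
    """Map each word to the set of page addresses where it appears."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
print(lookup(index, "crawl words"))
```

The point to notice is that a search never touches the pages themselves. It only consults the index of words built beforehand.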

Exactly what is a search engine? Basically, a search engine is a software program that searches for sites based on the words that you designate as search terms. Search engines look through their own databases of information to find what you are looking for.

To understand more about how to structure your page and its content, and how you want to be "understood" by the search engines, it helps to understand the simple strategies that can be used to perform a search. Take a tour through Search Engine Strategies In Plain English.

Crawler-based search engines, such as Google, create their listings automatically. This is the class of search engine we are most interested in, yet have the least control over. They "crawl" or "spider" the web, and people then search through what they have found.

If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.

The other two classes of search engine are human-powered directories and hybrid search engines. A hybrid is a cross between the crawler-based and human-powered kinds.

Search Engine Spiders Crawl The Internet Every Day

Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being "spidered" or "crawled." The spider returns to the site on a regular basis, such as every month or two, to look for changes.
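
The link-following behavior can be sketched in a few lines of Python. This is a toy under stated assumptions: the `SITE` dictionary is a made-up, in-memory stand-in for a real website, and `LinkCollector` and `crawl` are names invented for this example. A real spider would fetch pages over the network and handle far more cases.

```python
from collections import deque
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical site: path -> HTML. A real spider would fetch these over HTTP.
SITE = {
    "/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "/about": '<a href="/">Home</a>',
    "/blog": '<a href="/blog/post-1">Post</a>',
    "/blog/post-1": "<p>No links here.</p>",
    "/orphan": "<p>Nothing links to this page.</p>",
}

def crawl(site, start="/"):
    """Visit pages breadth-first by following links; return the visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        path = queue.popleft()
        order.append(path)
        parser = LinkCollector()
        parser.feed(site.get(path, ""))
        for link in parser.links:
            # Only follow links to pages within the site, and only once each.
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return order

print(crawl(SITE))
```

Note that `/orphan` is never visited, because no other page links to it. A spider that finds pages by following links cannot reach a page nothing points to.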

Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with new information.

Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed." Until it is indexed -- added to the index -- it is not available to those searching with the search engine.

Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant.
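
As a rough illustration of "find matches and rank them," the sketch below scores each page by how often the search words appear in it. Counting words is only a simple stand-in for relevance, assumed here for the example. Real search engines weigh many more signals and do not publish their formulas. The pages and the `rank` function are hypothetical.

```python
# Hypothetical index contents: page address -> stored text.
INDEXED_PAGES = {
    "example.com/spiders": "spiders crawl pages and spiders follow links",
    "example.com/index": "the index stores pages",
    "example.com/cats": "cats sleep all day",
}

def rank(pages, query):
    """Score pages by how often the query words occur; best match first."""
    terms = query.lower().split()
    scored = []
    for url, text in pages.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:  # pages with no matching words are left out
            scored.append((score, url))
    # Highest score first; ties are broken by address so the order is stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in scored]

print(rank(INDEXED_PAGES, "spiders pages"))
```

Change the scoring line and the same pages come back in a different order, which is one reason different search engines can give different results for the same search.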

Spiders, or crawlers, cannot read images, Flash, JavaScript and the like. Spiders are blind, and for their purposes HTML is braille. Within that braille they can understand the content, alt text, title tags and so on. In other words, real language, human language. This is why the phrase rings so loud: Content Is King.
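
To see a page roughly the way a text-only spider would, the Python sketch below pulls out the title, the image alt text and the visible words, and skips anything inside script or style tags. It is a simplified model built on assumptions: the sample HTML and the `SpiderView` class are made up for this example, and real crawlers are considerably more sophisticated.

```python
from html.parser import HTMLParser

class SpiderView(HTMLParser):
    """Collect the text a simple text-only spider could read from HTML."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.alt_texts = []
        self.body_text = []
        self._in_title = False
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:  # an image with no alt text contributes nothing
                self.alt_texts.append(alt)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title += text
        else:
            self.body_text.append(text)

# Hypothetical page used only for this illustration.
SAMPLE_HTML = """<html><head><title>Search Basics</title>
<script>var hidden = "not read";</script></head>
<body><img src="spider.jpg" alt="Diagram of a spider following links">
<img src="decoration.jpg">
<p>Spiders read text.</p></body></html>"""

view = SpiderView()
view.feed(SAMPLE_HTML)
print(view.title, view.alt_texts, view.body_text)
```

The second image has no alt text, so in this model it leaves no trace at all. The first image is "seen" only through the words written for it.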

All crawler-based search engines have the basic parts described above, but there are differences in how these parts are tuned. That is why the same search on different search engines often produces different results. Some of the significant differences between the major crawler-based search engines are summarized on the Search Engine Features Page at SearchEngineWatch.com. Information on this page has been drawn from the help pages of each search engine, along with knowledge gained from articles, reviews, books, independent research, tips from others and additional information received directly from the various search engines.

