Search engine has become the predominant term for any search system or search site, but before reading any further, you need to understand the different types of search, um, thingies you’re going to run across.
Basically, you need to know about four thingies:
1. Search indexes or search engines: This is the predominant type of search tool you’ll run across. Originally, the term search engine referred to some kind of search index, a huge database containing information from individual Web sites. Google’s vast index (http://www.google.com/) contains over 3 billion pages, for instance. Large search-index companies own thousands of computers that use software known as spiders or robots (or just plain bots; Google’s software is known as Googlebot) to grab Web pages and read the information stored in them. These systems don’t always grab all the information on each page or all the pages in a Web site, but they grab a significant amount of information and use complex algorithms to index that information. Google is the world’s most popular search engine.
2. Search directories: A directory is a categorized collection of information about Web sites. Rather than containing information from Web pages, it contains information about Web sites. The most significant search directories are owned by Yahoo! (http://www.dir.yahoo.com/) and the Open Directory Project (http://www.dmoz.org/). Directory companies don’t use spiders or bots to download and index pages on the Web sites in the directory; rather, for each Web site, the directory contains information such as a title and description. The two most important directories, Yahoo! and Open Directory, have staff members who examine all the sites in the directory to make sure they are placed into the correct categories and meet certain quality criteria. Smaller directories often allow people submitting sites to specify which category should be used.
3. Non-spidered indexes: I wasn’t sure what to call these things, so I made up a name: non-spidered indexes. A number of small indexes, less important than the major indexes such as Google, don’t use spiders to examine the full contents of each page in the index. Rather, the index contains background information about each page, such as titles, descriptions, and keywords. In some cases, this information comes from the meta tags pulled off the pages in the index. In other cases, the person who enters the site into the index provides this information.
4. Pay-per-click systems: Some systems provide pay-per-click listings. Advertisers place small ads into the systems, and when users perform their searches, the results contain some of these sponsored listings, typically above and to the right of the free listings.
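To make the difference between the first three thingies a little more concrete, here is a rough Python sketch of what each one might store for the same page. Everything in it is made up for illustration: the sample page, the example.com address, the category path, and the function names are assumptions, and real search engines use far more complex algorithms than this. The point is only that a spidered index reads the words on the page, a non-spidered index keeps background information such as the title and meta tags, and a directory holds a short entry about the site that a person wrote.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# A made-up page, used only for illustration.
SAMPLE_PAGE = """<html><head>
<title>Rodent Racing Supplies</title>
<meta name="description" content="Gear for racing mice and rats.">
<meta name="keywords" content="rodent racing, mouse harness">
</head><body>
<p>We sell harnesses, tracks, and timing gear for rodent racing clubs.</p>
</body></html>"""


class PageReader(HTMLParser):
    """Collects the title, named meta tags, and body text of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.body_text = []
        self._in_title = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_body:
            self.body_text.append(data)


def read_page(html):
    reader = PageReader()
    reader.feed(html)
    return reader


def full_text_index(pages):
    """Spidered index: map every word in the page body to the URLs that contain it."""
    index = defaultdict(set)
    for url, html in pages.items():
        text = " ".join(read_page(html).body_text).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index[word].add(url)
    return index


def background_record(url, html):
    """Non-spidered index: keep only the title, description, and keywords."""
    page = read_page(html)
    keywords = page.meta.get("keywords", "").split(",")
    return {
        "url": url,
        "title": page.title.strip(),
        "description": page.meta.get("description", ""),
        "keywords": [k.strip() for k in keywords if k.strip()],
    }


# Directory: a hand-written entry about the site, filed under a category.
# The category path is hypothetical.
DIRECTORY_ENTRY = {
    "url": "http://example.com/",
    "category": "Recreation/Pets/Rodents",
    "title": "Rodent Racing Supplies",
    "description": "Retailer of equipment for rodent racing clubs.",
}

pages = {"http://example.com/": SAMPLE_PAGE}
index = full_text_index(pages)
record = background_record("http://example.com/", SAMPLE_PAGE)
```

In this sketch, a search for "timing" would find the page in the full-text index because the word appears in the body, while the background record and the directory entry know nothing about it. They can be matched only against the title, description, and keywords they hold.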