A web search engine is a software programme designed to search for information on the Internet. The search results are usually presented as a list and are commonly called hits. The results may consist of web pages, images and other types of files. Some search engines also gather information available in databases or open directories. Unlike web directories, which are maintained by human editors, search engines operate algorithmically or by a mix of algorithmic and human input.
Internet search engines work by storing data about many web pages, which they retrieve from the Internet. These pages are retrieved by a web crawler, also known as a spider: an automated web browser that follows every link it discovers. The content of each page is then analyzed to decide how it should be indexed. Words, for example, are extracted from titles, headings and subheadings, or from special fields called meta tags. Data about web pages is stored in an index for use in later queries. Some search engines, such as Google, store all or part of the source page (called a cache) as well as data about the page, while others, such as AltaVista, store every word of every page they find. The cached page always contains the text that was actually indexed, which may differ from the current version of the page. Consequently, it can be very helpful, as it may contain data that is no longer available elsewhere.
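The crawl-and-index steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PAGES` dictionary is a hypothetical in-memory stand-in for pages fetched over the network, the `example.test` URLs are made up, and the names `PageParser`, `tokenize` and `crawl_and_index` are invented for this example. Real crawlers also deal with politeness rules, relative URLs, errors and far larger scale.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re

# Hypothetical in-memory "web": URL -> HTML source (stands in for HTTP fetches).
PAGES = {
    "http://example.test/": (
        '<html><head><title>Search engines</title>'
        '<meta name="keywords" content="crawler, index"></head>'
        '<body><h1>How crawlers work</h1>'
        '<a href="http://example.test/spider">spider</a></body></html>'
    ),
    "http://example.test/spider": (
        '<html><head><title>Web spider</title></head>'
        '<body><h2>Following links</h2>'
        '<a href="http://example.test/">home</a></body></html>'
    ),
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class PageParser(HTMLParser):
    """Collects outgoing links and words from titles, headings and meta keywords."""

    INDEXED_TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []
        self._open_tags = []  # indexed tags we are currently inside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name") == "keywords":
            self.words += tokenize(attrs.get("content") or "")
        if tag in self.INDEXED_TAGS:
            self._open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.INDEXED_TAGS and self._open_tags:
            self._open_tags.pop()

    def handle_data(self, data):
        # Only text inside a title or heading is indexed in this sketch.
        if self._open_tags:
            self.words += tokenize(data)


def crawl_and_index(start_url, pages):
    """Breadth-first crawl from start_url; returns word -> set of URLs."""
    index = defaultdict(set)
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # link points outside our toy web
            continue
        parser = PageParser()
        parser.feed(html)
        for word in parser.words:
            index[word].add(url)
        for link in parser.links:  # follow every link not yet visited
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl_and_index("http://example.test/", PAGES)
```

The resulting `index` maps each extracted word to the set of pages it came from, which is the basic shape of an inverted index. Link text is deliberately not indexed here, so the word "spider" is recorded only for the page whose title contains it.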
Once a user has typed search words into the search field, the search engine checks its index and returns a list of the best-matching web pages according to its criteria, normally with the title of each document, a brief summary and sometimes excerpts from the text. Some search engines offer an advanced feature called proximity search, which allows users to specify the maximum distance between search terms.
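As a rough illustration of query matching and proximity search, the sketch below keeps the position of every word in every document, so that the distance between two terms can be checked. The sample `DOCS` and the function names `build_positional_index`, `match_all` and `proximity_search` are assumptions made for this example, not the interface of any particular search engine.

```python
from collections import defaultdict
import re

# Hypothetical sample documents.
DOCS = {
    "doc1": "a web crawler follows every link it discovers",
    "doc2": "the crawler stores pages so the search engine can build a web index",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_positional_index(docs):
    """Returns word -> document id -> list of word positions."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index


def match_all(index, terms):
    """Documents that contain every one of the given terms."""
    doc_sets = [set(index.get(term, {})) for term in terms]
    return set.intersection(*doc_sets) if doc_sets else set()


def proximity_search(index, first, second, max_distance):
    """Documents where the two terms occur within max_distance words of each other."""
    hits = set()
    for doc_id in match_all(index, [first, second]):
        if any(
            abs(p - q) <= max_distance
            for p in index[first][doc_id]
            for q in index[second][doc_id]
        ):
            hits.add(doc_id)
    return hits


positional_index = build_positional_index(DOCS)
adjacent = proximity_search(positional_index, "web", "crawler", 1)
```

Both documents contain "web" and "crawler", but only in `doc1` are the two words adjacent, so `adjacent` holds just `doc1`. Raising `max_distance` to 10 would match `doc2` as well.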
The usefulness of a search engine depends on the relevance of the results it returns. Since millions of web pages may include a particular word or phrase, some of them will be far more relevant than others. Most search engines therefore use ranking techniques so that the "best" results are listed first.
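One simple way to rank results is to score each document by how often it contains the query terms, giving less weight to terms that occur in many documents. The sketch below uses a basic TF-IDF style score. It is a stand-in chosen for illustration, not the method of any particular search engine, and `DOCS`, `tokenize` and `rank` are hypothetical names.

```python
from collections import Counter
import math
import re

# Hypothetical sample documents.
DOCS = {
    "doc1": "search engines rank search results by relevance",
    "doc2": "a web directory is maintained by human editors",
    "doc3": "results of a search are called hits",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, docs):
    """Returns ids of matching documents, highest score first."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    scores = {}
    for doc_id, words in tokens.items():
        counts = Counter(words)
        score = 0.0
        for term in tokenize(query):
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokens.values() if term in other)
            if df == 0 or not words:
                continue
            tf = counts[term] / len(words)     # term frequency in this document
            idf = math.log(1 + n_docs / df)    # rarer terms weigh more
            score += tf * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank("search results", DOCS)
```

Here `doc1` ranks above `doc3` because it mentions "search" twice, while `doc2` contains neither query term and is left out of the list altogether.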
The way web pages are selected and displayed is specific to each search engine. The methods also change over time, as the use of the Internet changes and new techniques are developed.