WWW FAQs: How do search engines rank websites?


2003-12-11: It depends on the search engine. A little bit of history is helpful here.

Early search engines, such as the original Excite and AltaVista, ranked pages by their content and formatting and by the content of their metadata tags. They paid special attention to bold text and to keywords and summaries in the head of the HTML document.

Of course, webmasters figured this out years ago. For several years, many of them packed their pages with bogus material to trick the search engines into giving them a high ranking. Web pages containing a single word repeated 10,000 times, displayed white on white so that the search engine would see it and the user would not, were very common! As a result, search engines no longer trust the content of your page to determine its rank, only its relevance to a particular topic. There is very little to be gained by setting keyword and summary metadata in your pages. You should still state the purpose of your page in all of the likely ways people may search for it, because search engines must still use the content of your page to determine its relevance. If your page does not contain the keyword the user is searching for, it will not come up in that search at all.
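To make the difference between rank and relevance concrete, here is a minimal Python sketch. It is an illustration only, not the code of any real search engine: the function names, the sample pages, and the idea of scoring by simple word counts are all assumptions made for the example. It shows why counting keywords is easy to fool, and why a keyword check is still useful for deciding whether a page is relevant at all.

```python
import re

def words(text):
    """Split text into lowercase words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def is_relevant(page_text, query):
    """A page is relevant only if it contains every word of the query."""
    page_words = set(words(page_text))
    return all(term in page_words for term in words(query))

def naive_score(page_text, query):
    """Old-style ranking: count how often the query words appear on the page."""
    page_words = words(page_text)
    return sum(page_words.count(term) for term in words(query))

# Hypothetical pages, invented for this example.
honest = "We sell hand-made oak furniture and repair antique chairs."
stuffed = "Cheap pills. " + "furniture " * 10000  # hidden white-on-white text
unrelated = "A guide to growing tomatoes."

for name, page in [("honest", honest), ("stuffed", stuffed), ("unrelated", unrelated)]:
    print(name, is_relevant(page, "furniture"), naive_score(page, "furniture"))
```

Under the naive score, the stuffed page beats the honest one by a wide margin, which is exactly the trick described above. The relevance check, on the other hand, still does its job: the page about tomatoes never comes up in a search for furniture.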

Things changed for the better when Google arrived. Google ranks your site by how frequently other sites link to it, not by what you do or don't say on it, although Google does of course use what is on your site to determine whether your page is relevant. Google's arrival was followed by a "honeymoon period" of several years during which searches were extremely effective, because the Google rank of a site was essentially based on its true popularity, as measured by the number of sites that thought it worthwhile to link to it. At the time, no one had figured out a way to trick Google into artificially raising a site's rank.
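Ranking by links can be sketched in a few lines of Python. What follows is a simplified version of the link-analysis idea described in the academic papers published by Google's founders, and it is certainly not Google's actual ranking code. The function name link_rank, the damping value of 0.85, the fixed number of passes, and the four-page sample web are all assumptions made for the example.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by who links to them.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores add up to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start every page with the same score.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small base score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical four-page web, invented for this example.
web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["d.example"],
    "d.example": [],
}

scores = link_rank(web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this sample, c.example outranks a.example and b.example because both of them link to it. Note also that d.example does well with only one incoming link, because that link comes from a page that itself scores highly. In a scheme like this one, a link is worth more when the page it comes from ranks well.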

However, in recent years, many advertisers have figured out that a direct, plain-text link from a site with a high Google rank will increase their own Google rank, and they have begun purchasing links from such sites. This has diluted Google's effectiveness. While Google is still more effective than any other search engine, there will often be several dubious entries in the top ten of any search for something that is commercially valuable enough to justify spending advertising money on direct links. Of course, Google's techniques are ever-changing, and Google can be expected to find ways to discount the effect of most such links. Websites that sell links must set them apart from their actual content clearly enough that users are not annoyed or offended, so Google will likely be able to take placement on the page and other factors into account when evaluating the importance of a link.

On many search engines, placement in the search results can be purchased directly. Currently this is not the case with Google, which clearly distinguishes "sponsored links" from search results, a policy that has generated a great deal of loyalty among users.

Other modern search engines use their own techniques which are not made known to the public. Although Google's exact technique for ranking sites is obviously not public, Google's basic approach is known thanks to academic papers published by Google's founders. Other search engine companies that are serious about competing with Google, such as Microsoft, are no doubt already incorporating many of Google's techniques to improve their own results.

Legal Note: yes, you may use sample HTML, Javascript, PHP and other code presented above in your own projects. You may not reproduce large portions of the text of the article without our express permission.
