CYBERSPACE, 2005 (D.O.T.) – Exactly how big is the World Wide Web? Many Internet engineers consider that question one of those imponderable philosophical questions, like how many angels can dance on the head of a pin.
But Yahoo says it has the answer. In fact, Yahoo announced that its search engine index - an accounting of the number of documents that can be located from its databases - had reached 19.2 billion!
Because that figure was more than twice the 8.1 billion documents currently reported by Google, Yahoo's fierce competitor, it set off a major dispute between the two giant search engine rivals.
Google questioned the way Yahoo was counting its numbers and suggested that the Yahoo index was inflated with duplicate entries in such a way as to cut its effectiveness despite its large size.
"The comprehensiveness of any search engine should be measured by real Web pages that can be returned in response to real search queries and verified to be unique," said a Google executive. "We report the total index size of Google based on this approach."
But Yahoo stood by its statement. "The number of documents in our index is accurate," said a Yahoo executive. "We're proud of the accomplishments of our search engineers and scientists and look forward to continuing to satisfy our users by delivering the world's highest-quality search experience."
The scope of Internet search engines, and thus indirectly the size of the Internet, has long been a lively area of computer science research and debate. Moreover, all camps in the discussion are quick to note that index size is only loosely related to the quality of results returned.
The major commercial search engines use software programs known as Web crawlers to scour the Internet systematically for documents and index them. The indexes themselves are maintained as computer data that permit the search engines to return lists of hundreds of answers in fractions of a second when Web users enter terms like "Britney Spears" or "Iraq and weapons of mass destruction."
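To make the idea of an index concrete, here is a toy sketch in Python of an inverted index, the general kind of structure that maps each search term to the documents containing it. It is an illustration only: neither Google nor Yahoo publishes its methods, and the function names (build_index, search) and the example.com addresses are made up for this example.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start with the matches for the first term, then narrow them down.
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Hypothetical crawled pages (invented for illustration).
pages = {
    "http://example.com/a": "Britney Spears tour dates",
    "http://example.com/b": "Britney Spears new album",
    "http://example.com/c": "album reviews",
}
index = build_index(pages)
hits = search(index, "Britney Spears")
index_size = len(pages)  # the "index size" is simply the document count
```

In this sketch the index size is three documents, and a search for "Britney Spears" matches two of them. If the same page were stored under two different addresses, the count would rise without adding anything new to the results, which is the kind of inflation Google alleged.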
Researchers attempted to shed light on the debate by performing a large number of random searches on both engines. They ran a random sample of 10,012 queries and concluded that Google, on average, returned 166.9 percent more results than Yahoo. In only three percent of the cases did the Yahoo searches return more results than Google. The group said the Yahoo index claim was suspicious.
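The arithmetic behind such a comparison can be sketched in a few lines of Python. This is an assumption about how the two figures might be computed, not the researchers' actual procedure: it averages the per-query percentage by which Google's result count exceeds Yahoo's, and the sample counts are invented.

```python
from statistics import mean

def compare_result_counts(counts):
    """counts: list of (google_hits, yahoo_hits) pairs, one per query."""
    # Skip queries where Yahoo returned nothing; the ratio is undefined there.
    usable = [(g, y) for g, y in counts if y > 0]
    # Percent by which Google's count exceeds Yahoo's, query by query.
    pct_more = [100.0 * (g - y) / y for g, y in usable]
    # Share of all queries on which Yahoo returned more results than Google.
    yahoo_wins = sum(1 for g, y in counts if y > g) / len(counts)
    return mean(pct_more), 100.0 * yahoo_wins

# Invented result counts for four queries: (Google, Yahoo).
sample = [(300, 100), (50, 25), (10, 40), (200, 100)]
avg_pct_more, yahoo_win_pct = compare_result_counts(sample)
print(avg_pct_more, yahoo_win_pct)  # 81.25 25.0
```

A different choice, such as dividing the total Google results by the total Yahoo results, would give a different average, which is one reason a figure like this depends on the method behind it.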
Neither Yahoo nor Google makes public the software secrets that underlie their information collection methods. In fact, those details are closely guarded, and are at the heart of heated competition now going on between Google, Yahoo and Microsoft over who can provide the most relevant answers to a Web user's question or search topic.
So how big is the Web? Independent search engine specialists say it’s best to remain skeptical of any estimate as long as the search engine companies keep their methods secret. Without access to those methods, the independent experts agree, outsiders have no good way of checking the numbers.
The one thing that is certain, however, is that you won’t get Google and Yahoo to agree on the size of the Web, not even if you run several billion Web searches!