July 26, 2003

List of Google PR 10 pages

So you're curious about what sort of pages are able to attain a PageRank of 10 in Google?

Google PageRank PR 10 Top List

Posted by Andrew at 07:40 AM | Comments (5)

July 16, 2003

Google World

Google World is a great source for all information about Google. The section on Google's issued and filed patents is particularly interesting.

Posted by Andrew at 10:40 AM

Google Papers

Google lists a large number of papers written by Google employees at labs.google.com.

Posted by Andrew at 10:30 AM

July 14, 2003

the most popular "the"

Tim Bray recently posted about stopwords in his blog. It inspired me to do a Google search for +the, which forces Google to return results for "the". I must say that I got a little smile when I found out that the Onion came out ahead of the White House.

Posted by Andrew at 08:06 PM

March 13, 2003

The Google Dance

This past weekend many website owners sat in front of their computers and pressed their browser's refresh button repeatedly as Google began its monthly update. The update, now widely known as the "Google Dance," has become a monthly online festivity where webmasters closely watch as Google's new index gradually comes online and either rejoice or despair over their new rankings.

Why is it called the "Dance"?

The update period, which typically lasts a few days each month, is known as the "Dance" because the results pages at the main Google site and its two test domains (www2.google.com and www3.google.com) fluctuate frequently as the new rankings gradually come online. Until the dance begins, the results from the main domain and the test domains are mostly stable.

What's happening during the Dance

The Dance is when Google adds a new index with the results from the month's earlier "deep crawl" of the Web. I'll say more about the "deep crawl" a little later. During the Dance, Google uses the previously mentioned www2 and www3 domains to test the new index for any abnormalities. Once the new index has been tested, Google steadily updates each of its datacenters with this new index. While this is happening, search results from the main Google domain can change on a minute-by-minute basis, since some searches will pull their results from an updated datacenter and others will pull from one that still has the old index.
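To make the minute-by-minute flicker concrete, here is a toy simulation. It is only a sketch of the idea described above, not a model of how Google actually routes searches: the number of datacenters, the random routing, and the function name `simulate_rollout` are all my own assumptions.

```python
import random


def simulate_rollout(num_datacenters, updated, num_queries, seed=0):
    """Return which index ("old" or "new") answers each of num_queries searches.

    Assumption: every search is routed to one datacenter picked at random.
    The first `updated` datacenters already carry the new index.
    """
    rng = random.Random(seed)
    indexes = ["new"] * updated + ["old"] * (num_datacenters - updated)
    return [rng.choice(indexes) for _ in range(num_queries)]


if __name__ == "__main__":
    # Hypothetical rollout: 4 of 9 datacenters updated so far.
    print(simulate_rollout(num_datacenters=9, updated=4, num_queries=10))
```

With none of the datacenters updated every search sees the old index, and with all of them updated every search sees the new one. Anywhere in between, repeated searches can land on either index, which is the fluctuation people watch for.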

Deep Crawl vs. Fresh Crawl

The Dance solely concerns itself with integrating the latest monthly deep crawl into the Google index. However, if you frequently search Google using the same keywords, you may notice that the search results fluctuate more than once a month. This is due to a relatively new feature of Google known as the "Fresh Crawl." The fresh crawl occurs almost continuously to spot frequently updated sites and to add the new content to the engine's index. This frequent fluctuation in the engine's search results has come to be known as "Everflux." Due to these slight variances, the accepted method to detect the beginning of the dance is to look for differences in the number of backlinks to major sites (such as Yahoo) between the main site and the test domains, and not by looking for changes in search results.

You can easily spot pages that have been fresh crawled by examining Google search results pages and looking for pages that have a date in the last line of their entry (between the page URL and the "cached" link). The fresh crawl uses its own spider, known as "freshbot," while the deep crawl uses "deepbot." The two spiders use different blocks of IPs and can easily be told apart, with deepbot using IPs that start with 216. and freshbot using IPs that start with 64.
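If you want to sort the Googlebot visits in your server logs by spider, the prefix rule above can be sketched in a few lines. This is a rough heuristic based only on the two prefixes mentioned here. It assumes you have already filtered your log down to Googlebot requests, since plenty of unrelated hosts also have addresses starting with 216. or 64. The function name and the sample addresses are made up for illustration.

```python
def classify_googlebot(ip: str) -> str:
    """Guess which Google spider an already-identified Googlebot IP belongs to.

    Uses only the prefix rule of thumb from this post:
    216.x.x.x -> deepbot, 64.x.x.x -> freshbot.
    """
    if ip.startswith("216."):
        return "deepbot"
    if ip.startswith("64."):
        return "freshbot"
    return "unknown"


if __name__ == "__main__":
    # Made-up example addresses, not real Googlebot IPs.
    for ip in ["216.0.0.1", "64.0.0.1", "10.0.0.1"]:
        print(ip, classify_googlebot(ip))
```

Matching on the string "216." with the trailing dot keeps an address such as 164.x.x.x from being mistaken for freshbot.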

With the emergence of freshbot, does the deep crawl still matter? Yes, it still matters quite a bit. The "fresh" results are not stable, with fresh pages popping in and dropping out of the results pages on a daily basis. The stable, persistent rankings are based solely on the deep crawl. Also, Google uses the deep crawl results to calculate PageRank, that magical numeric value representing a page's importance, which is displayed on the Google Toolbar via a little green bar.

Watching the Dance

As stated earlier, the surest way to know that the Dance has started is to check for differences in the number of backlinks between the main Google site and the test sites. You can do this by searching for "link:www.yahoo.com" at the main Google site and www2.google.com and www3.google.com and watching for any variance between the results. The Google Dance Tool streamlines this process by allowing you to search all three domains simultaneously, presenting all three results in a frameset. This tool also allows you to simultaneously search each of Google's datacenters.

During the Dance, you can use this tool to get a sneak peek at the new search rankings, watch for new pages' inclusion in the index, and check on the progress of the dance as the new results propagate to the separate datacenters. You can also get a sneak peek at a page's new PageRank by changing the IP to which the Toolbar points.
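The backlink check itself boils down to one comparison. Here is a minimal sketch, assuming you have already looked up the "link:www.yahoo.com" count on each domain by hand and typed the numbers in. The function name is my own and the counts in the example are invented.

```python
def dance_started(backlink_counts: dict) -> bool:
    """Return True if the domains disagree on the backlink count.

    backlink_counts maps a Google domain to the number of backlinks it
    reports for the same "link:" query.
    """
    return len(set(backlink_counts.values())) > 1


if __name__ == "__main__":
    # Invented counts for "link:www.yahoo.com", for illustration only.
    counts = {
        "www.google.com": 650000,
        "www2.google.com": 712000,
        "www3.google.com": 712000,
    }
    print(dance_started(counts))
```

When all three domains report the same number, the function returns False and there is nothing to see yet. Any disagreement suggests that at least one domain is already serving the new index.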

Further Reading:

Posted by Andrew at 05:43 PM | Comments (2)

March 01, 2003

ODP public forum

I've just registered at the ODP Public Forum, a place where the public can interact with ODP editors.

Posted by Andrew at 09:39 PM

February 23, 2003

Library Lookup

Jon Udell's Library Lookup lets you automatically check whether books mentioned on a web page are available at your local library.

A must for any bibliophile.

Posted by Andrew at 05:00 AM

January 26, 2003

PR Envy

In a WebmasterWorld thread concerning the current Google update, it was suggested that the term "PR envy" be used to express people's disappointment over the lack of improvement in their site's PageRank.

I'm still chuckling.

I love it!

Posted by Andrew at 08:07 PM

December 23, 2002

Yahoo acquires Inktomi

In case you've been asleep today :)

Yahoo acquires Inktomi

You can read more about it here, here, here, here, here, and here.

One has to wonder if MSN will continue to use Inktomi results for its search.

This will certainly be an interesting story to follow.

Posted by Andrew at 10:13 PM | Comments (0)