Sitekeepers - Webmaster's blog

Thursday, March 23, 2006

Bigdaddy news

Bigdaddy status update: almost there

We’re down to just 1-2 data centers left in the switchover to Bigdaddy. It’s possible that the Bigdaddy switchover will be complete in the next week or two. Just as a reminder, Bigdaddy is a software upgrade to Google’s infrastructure that provides the framework for a lot of improvements to core search quality in the coming months (smarter redirect handling, improved canonicalization, etc.). A team of dedicated people has worked very hard on this change; props to them for the code, sweat, and hours they’ve put into it.


Gone Supplemental

Some site owners over at WebmasterWorld have been discussing an issue where, on Bigdaddy data centers, their sites weren’t being crawled as much for the main index. That resulted in Google showing more pages from the supplemental results for those sites. GoogleGuy requested feedback with concrete details, and several people responded with enough details that we identified and changed a threshold in Bigdaddy to crawl more pages from those sites.

I checked in on that email queue tonight to see how the “gonesupplemental” feedback looked. I looked at an emergency responder site, a truck site, a ticket site, a karate site, a silver site, a T-shirt site, a site about memory, a site selling a type of document, a boating site, and a jewelry site. All were getting more pages crawled, and I expect over time that we’ll crawl more pages from these sites and similar sites that people mentioned. The biggest site that I saw had 711K pages reported, and I saw other sites with 40,400 and 52,700 estimated pages for a site: search.

So the upshot is that if you’re one of these people who was paying attention to this issue, I think it has already improved quite a bit, and I would expect to see more pages indexed in the coming week or two. Some sites may see improvements earlier than others because of where a site happens to be in Google’s crawl cycle.