Part 1: TimesSelect Buzz Rankings

Three weeks ago, Jeff Jarvis wrote on his blog: "TimesSelect cost the paper much more in the internet age: It took the Times columnists out of the conversation and reduced their influence in America and worldwide."

This sentiment was echoed by many and challenged by few. We'd like to look at the data, but first we must understand the terms.

To review: in the broadcast model (which includes popular blogs like Jarvis's BuzzMachine), conversation is not what we think of in everyday life, where a small group of people talk as equals among themselves. In the broadcast model, the "conversation" is where Big Media speaks and other people talk about it, but Big Media doesn't necessarily respond to everyone. Jarvis's post attracted 62 comments and trackbacks, yet Jarvis himself responded to only one comment, a riff on the economics of free pizza, and skipped the more substantive ones.

If people are talking with you, it's conversation; if they are talking about you, it's buzz. And buzz is not necessarily pursued for any noble reason of getting closer to truth, but for getting closer to speaking fees, consulting gigs, book sales, and the other things pundits need to sustain free writing.

Measuring mentions in blogs can be a good proxy for buzz, and a number of companies today offer blog post data that can be measured. I recently compared three of them: Technorati, BlogPulse, and Google BlogSearch. They all have data for the last 180 days. In analyzing a series of terms, I found that for phrases blogged about fewer than a thousand times a day, the numbers are pretty well correlated. Above that, Google's numbers get unreliable (that is, it takes too many clicks to get a number that looks rational).
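For readers who want to repeat that comparison, here is a minimal sketch of one way to check how well the services agree. It is not the exact procedure used for this post. The phrase names and every count in `daily_mentions` are invented placeholders, the function names are hypothetical, and Pearson correlation is simply one reasonable choice of measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(counts, services, max_per_day=1000):
    """Correlate services pairwise, using only phrases under the daily cutoff.

    counts maps phrase -> {service: average mentions per day}.
    A phrase is kept only if every service reports fewer than max_per_day.
    """
    kept = [c for c in counts.values()
            if all(c[s] < max_per_day for s in services)]
    result = {}
    for i, a in enumerate(services):
        for b in services[i + 1:]:
            result[(a, b)] = pearson([c[a] for c in kept],
                                     [c[b] for c in kept])
    return result


# PLACEHOLDER DATA: invented for illustration, not real measurements.
daily_mentions = {
    "phrase_a": {"technorati": 40, "blogpulse": 35, "google": 50},
    "phrase_b": {"technorati": 210, "blogpulse": 190, "google": 240},
    "phrase_c": {"technorati": 560, "blogpulse": 610, "google": 580},
    "phrase_d": {"technorati": 880, "blogpulse": 820, "google": 900},
    "phrase_e": {"technorati": 4200, "blogpulse": 3900, "google": 15000},
}

if __name__ == "__main__":
    services = ["technorati", "blogpulse", "google"]
    for pair, r in pairwise_correlations(daily_mentions, services).items():
        print(pair, round(r, 3))
```

With real counts in place of the placeholders, values near 1 for each pair would support treating the services as interchangeable below the cutoff. The `max_per_day` parameter mirrors the thousand-a-day threshold described above.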

Technorati was one of the first services to offer blog search, but it was never meant to offer research services: it doesn't even allow searches by date. Nielsen BuzzMetrics BlogPulse is marketed as a research service, but its public data goes back only six months. BuzzMetrics has shared its data for some public research papers, but I doubted that this small project would be worth their time (though I did ask them recently). Google BlogSearch is the easiest for any independent researcher to use: Google claims to have data going back to 2000, and it allows multiple types of queries, including by date. The downside is that Google's data collection only began in earnest in June 2005.

This effort is not meant to be the final effort, but the first. One could do this with BuzzMetrics BlogPulse data, or with link analysis, provided one tracks down all of the column links for the non-bloggers (BlogPulse, incidentally, calls its version the conversation tracker). I look forward to data challenging my results. Having pored over this data for weeks, I am fairly confident that it is a good indicator of buzz among the New York Times columnists and their pundit peers.

But before we go to the numbers, let's pick our players.