Mark Newman, of the University of Michigan, was in town to give a seminar. Newman is one of the new breed of physicists/applied mathematicians who are heavy into graph theory and social networking. Newman worked closely with Duncan Watts on characterizing various properties of large simulated and empirical networks.
For the most part, the talk was an overview for anyone who's been following the area. Towards the end, though, he got into recent developments in finding communities within networks. Social network analysts, and clustering aficionados, have tools that work on relatively small graphs. But in offline discussion, Mark seemed to think that his work, and extensions of it by others, could scale to sparse networks of millions of nodes.
That's using one desktop machine. With a lot of RAM. But still.
For perspective, Technorati is "only" watching somewhat over 1.8 million weblogs. blo.gs claims to be monitoring about 1.4 million. Throw a small server farm at the task and you could easily apply Newman's algorithms on a reasonable time scale. The algorithms aren't particularly difficult either. You can grab the papers from his site. If you do, there's a slightly tricky hidden point, I think, involving the determination of biconnected components, but it's actually a minor issue.
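If you want to experiment without implementing the papers yourself, here's a minimal sketch using networkx, whose greedy_modularity_communities is an implementation of the Clauset-Newman-Moore greedy modularity algorithm from that line of work. The karate club graph here is just a stand-in for a real weblog link graph, and pure-Python networkx won't hit million-node scale the way a tuned implementation would.

```python
# A sketch of Newman-style community finding, using networkx rather than
# the authors' own code; the karate club graph is a stand-in for a real
# weblog link graph.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.karate_club_graph()

# Greedy modularity maximization (Clauset-Newman-Moore), the fast
# agglomerative algorithm described in Newman's papers.
communities = greedy_modularity_communities(G)
for i, members in enumerate(communities):
    print(f"community {i}: {sorted(members)}")

# The "slightly tricky hidden point": networkx can also enumerate
# biconnected components directly, if your variant of the algorithm
# needs them.
for component in nx.biconnected_components(G):
    print(sorted(component))
```

The greedy approach runs in roughly O(n log² n) time on sparse graphs, which is what makes million-node networks plausible on a single well-provisioned machine.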
Instead of these uninformative power-law graphs, maybe we can start to get a better picture of what the blogosphere looks like.