Anderson: Mann, Wallace, Long Tail

I just have to link to Chris Anderson's juxtaposition of David Foster Wallace and The Long Tail, simply because I love the referred-to essay: E Unibus Pluram: Television and U.S. Fiction. The essay appears in the collection A Supposedly Fun Thing I'll Never Do Again, an exceedingly mind-bending series of essays, of which the eponymous one is ridiculously dense and convoluted. And one which I never finished.

Anyhoo, after reading E Unibus Pluram... you'll have a whole new perspective on TV. Suffice it to say my takeaway was that TV is an endless vortex of cooptation. TV culture inhales irony for breakfast and is essentially unassailable with that device. Rebel through conformity.

Haven't quite figured out what this has to do with the long tail though.


CiteULike: Cruising Along

Richard Cameron's CiteULike, previously mentioned on this here blog, has been piling up new connections and new features. They can be read about on the developer's weblog. CiteULike is now tied into a number of archives, including arXiv and CiteSeer. A group mechanism has been added. For example, a couple of folks at Northeastern have been busy little programming language squirrels. There's also a weighted list view of users' tag and author collections.

CiteULike points out how restricting the document domain helps in automatically generating semantic metadata. Every paper has authors, a publishing venue, publication date, etc. which can immediately become tags or taglike.
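
Here's a minimal sketch of that idea in Python. The record layout (`authors`, `venue`, `year`) and the `metadata_tags` helper are made up for illustration; they are not CiteULike's actual schema or code.

```python
def metadata_tags(paper):
    """Derive tag-like strings from a paper's bibliographic fields.

    `paper` is a plain dict; the field names are assumptions made
    for illustration, not CiteULike's data model.
    """
    tags = []
    for author in paper.get("authors", []):
        # Use the surname (last whitespace-separated token) as an author tag.
        tags.append("author:" + author.split()[-1].lower())
    if paper.get("venue"):
        tags.append("venue:" + paper["venue"].lower().replace(" ", "-"))
    if paper.get("year"):
        tags.append("year:" + str(paper["year"]))
    return tags


paper = {
    "authors": ["Brian Whitman", "Steve Lawrence"],
    "venue": "ICMC",
    "year": 2002,
}
print(metadata_tags(paper))
```

Because the domain is restricted to papers, every record is guaranteed to yield at least a few of these tags without the user typing anything.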

Best of all everything seems to spit out RSS. I wonder how much of it is captured by PubSub?


Anderson: The Long Tail Weblog

Chris Anderson, of Wired News, nailed a book contract to follow up on his Wired article about the long tail. This article, more than any other bit of attention, has popularized the concept that there's money to be made working the 80% of products/items that have minuscule sales/attention. Those long forgotten fantasies of Internet and Web infinite shelf space may be coming to fruition.

As is the case these days, there's an attendant weblog to go with the book writing.


NMH: Gaming Findory

Previously I posited a Findory API for self-reporting, noting that there might be a problem with accuracy. This would be useful for allowing other tools to tell Findory what I read.

As I thought about this more, I realized it might not be a problem. If you give Findory crappy data, what do you get? Crappy personalization. Presuming Findory doesn't generate globally visible results that depend on social behavior, there's not much incentive to be inaccurate about reports.


FeedBurner: The Long View

FeedBurner has announced a number of new enhancements. I was struck by the following comment in one of their feature explanations: "for some, the feed is becoming a first-class citizen in the world of content,...".

Blogs are the writing tool that helped birth Web syndication, but webfeeds are now an ecology unto themselves. An insufficiently studied ecology I might add.


NMH: pyxmpp and PubSub

I spoke a little too soon about the difficulty of using Python for XMPP. While the pyxmpp package is completely devoid of documentation, I managed to UTSL (verbing weirds language) my way to a prototype client that can connect to PubSub, get my subscriptions, and receive announcements. The design of pyxmpp is pretty easy to hook into, once you figure out what's what. And the entire kit and caboodle is pure Python, which means any clients built on top of it should be cross-platform.

Now back to fleshing out the remainder of a decent PubSub client. The major hurdle is adding support for subscription manipulation. After that, we'll be proceeding to some interesting hacks in watching the blogosphere.

Did I mention that a combination of PubSub and Gush is more addictive than crack?


Culture Online: History Headlines

Britain's Department for Culture, Media, and Sport commissioned History Headlines, a site that turns historical education into the task of editing a newspaper. Targeted at grade schoolers, it's another demonstration of how online interactivity lets folks construct their own artifacts, rather than just consuming them. From the buzz on the Web, the site seems to be a hit, and something a lot of media sites could learn a thing or two from.

Wonder what it would be like to have a good Department for Culture, Media, and Sport in the US?


Bosworth: Powerful, Sloppy Content

Finally put the time in and read Adam Bosworth's Service Oriented Computing Conference featured talk. The piece reads more like a manifesto though, embracing the wave around semi-structured personal content, cheap syndication, and loose coupling as a positive humanist trend in computing. The piece captures eloquently, but not in a confrontational manner, some of the excitement around all of this "blog and RSS" stuff.


2Entwine: PubSub Query Language

2Entwine's brief tutorial on the PubSub query language hints at the power of a real-time matching engine, basic query language, and push capabilities. With PubSub, you really can watch the conversation around your own or others' content.

The only downer is that programmatically, to tie into PubSub you need to use a protocol layered on top of XMPP. I spent a good chunk of this afternoon looking around for a way to conveniently hook in using Python, but only found promising leads, not four square hits. Maybe other XMPP/Jabber language bindings will work better, but that was a bit disappointing given the age of Jabber Publish/Subscribe.

idavoll looks like a viable alternative, but I've been trying to avoid the Twisted inhale.


Linden: Long Tail To Knowledge

Just piecing bits together. Ben Hyde knocks the fetishizing of the long tail. All we know is that it's a jungle out here.

Meanwhile, Greg Linden succinctly encapsulates the core problem with aggregators. People have to do a lot of work filtering out junk.

My bridge between the two is that folks will gravitate to aggregator like tools to keep track of all that niche content. Think of all those sniping tools for eBay.

Here's the rub. If you want to keep an eye on the fringe, is there enough context to make Findory style personalization tools effective? If it ain't linked to, commented on, or subscribed to, how well can you index the content coming from a source? Failing that, the fall back is to have people sift. Back to ground zero.

Oh yeah, and that's only for news or recently changing stuff. Despite certain protestations about content from traditional sources instantaneously becoming fishwrap, it actually does have value long term. We just have to wait for the lens of history to sort out what's what. But our tool builders are thinking little in this direction, as far as I can tell.


HeyPix: Photo Sharing Beta

The Web based photo sharing space has gotten a bit more crowded with HeyPix. Looks like the key sell is around a rich desktop interface and good blog integration. They've got the obligatory developer's blog, but it's currently content-free.

Could be competition for Flickr but the Canadians have a pretty good head start.


PubSub: LinkRank Details

I thought I mentioned PubSub's LinkRank earlier, but can't seem to find the post. In any event, thanks to BoingBoing, the topic came back onto my radar screen. In particular, the algorithmic details are available.

LinkRank doesn't rank posts but domains based upon weblog post links. Of course New Media Hack is way out in the long tail. The numeric formulas are interesting, since they don't follow the iterative approach that PageRank, and its poor imitators, use.

And just for the PubSub folks: http://psi.pubsub.com/20040413:linkranks:1


Crosbie: True Mobility

Vin Crosbie, of Digital Deliverance, has been seriously following and using mobile data solutions for years. His consultancy generates revenue telling news operations how to exploit mobile computing and communications. He travels a lot himself. He takes this stuff really seriously.

So when he says road warriors can ditch their laptops, I believe it. Now I only wish he'd talked about who he buys cellular service from. I'll ping him and see if I get a response.

Update. Vin did get back to me. I think I'm cool quoting his e-mail, but if not I'm sure I'll hear about it and end up retracting it.

"Like most Americans, I used the domestic U.S. carriers, which operate CDMA or TDMA networks. But I got tired of having an inoperable cell phone whenever I traveled while traveling outside North America. So, I switched to T-Mobile, shortly after it had bought VoiceStream in 2000."
-- Vin Crosbie


Zawodny: Inbox DIU

Cool idea from Jeremy Zawodny. Since your inbox probably gets a lot of links, automatically turn the inbox into a mini-version of del.icio.us. Bonus points for tying back into the del.icio.us mothership and/or including your browser history.

+2
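
A rough sketch of the first step, pulling links out of a message, using only the Python standard library. The `extract_links` helper and the sample message are my own illustration, not anything Zawodny published. It only looks at unencoded text/plain parts and makes no attempt to post anywhere.

```python
import re
from email import message_from_string

# Deliberately loose URL pattern; good enough for a sketch.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_links(raw_message):
    """Return (subject, [urls]) for one RFC 822 style message string."""
    msg = message_from_string(raw_message)
    urls = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            body = part.get_payload()
            for url in URL_PATTERN.findall(body):
                url = url.rstrip(".,;:)")  # trim trailing punctuation
                if url not in urls:
                    urls.append(url)
    return msg.get("Subject", ""), urls


raw = """From: friend@example.com
Subject: Worth a read

Check out http://example.com/long-tail, and also
http://example.org/feeds.
"""
subject, links = extract_links(raw)
print(subject, links)
```

The subject line makes a passable default description for each bookmark, and the sender could double as a tag.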


Holovaty: Links via Feedster

Adrian Holovaty is pushing the boundaries of kitting out Firefox again. He's written an extension which, for the Washington Post and NY Times, ships the current URL off to Feedster and places the results inline.

Cute! Although as he points out, you're at the mercy of the remote search service which is mildly problematic if you're depending on Technorati. Where's all that funding going anyhoo?


Steele: Serving Client Side Apps

Oliver Steele makes my head hurt. But in a good way!! Hard thinking combined with clear writing, in depth and with saucy figures to boot.

Steele's piece on Serving Client-Side Applications discusses a wide range of design choices in distributing functionality in Web apps. Even better, it helps place The Laszlo Toolkit in context.


Bosrup: overLIB

Need to make link and image hovers dance using only DHTML? Erik Bosrup's overLIB makes it easy.


Hofmann: advas

Link parkin': Can't judge the quality, but Frank Hofmann's advas, a (pure?) Python package for advanced search, might come in handy. Looks like it collects a number of handy things such as stemming, stop listing, and tf-idf.
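
For reference, stop listing and tf-idf fit in a few lines of plain Python. This is a generic sketch, not advas's API (which I haven't looked at); the stop list, the function names, and the particular tf-idf variant are assumptions, and stemming is left out entirely.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in"}  # tiny illustrative list


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency),
    one common tf-idf variant among several.
    """
    tokenized = [tokenize(doc) for doc in documents]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    n_docs = len(documents)
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["the long tail of blogs", "the long view of feeds"]
scores = tf_idf(docs)
print(scores[0])
```

Terms that show up in every document ("long" here) get a weight of zero, which is the whole point: they don't help tell documents apart.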


NMH: Findory API?

Just wishin. If Findory had an API for informing the service about what you've read, external aggregators and other webfeed/news applications could chime in. Obviously, this routes around Findory's front page which is probably an issue, but might help improve things like personalized search. Also, morons gaming the system, and the general unreliability of self-reporting, come into play.

Another way to approach this would be by using custom RSS feeds that link through Findory. Findory may already do this for all I know.


ASE: ChartDirector

So I was extolling the virtues of matplotlib previously, but I ran into a roadblock. I wanted to be able to generate imagemaps for some plots. So far I've been stumped in trying to recover the mapping from data coordinates to image coordinates.

Enter Advanced Software Engineering's ChartDirector, a commercial plotting library/component, with bindings for a bunch of languages. Even better, ChartDirector will generate an HTML imagemap directly from a plot. Beyootiful, although I'd actually like the raw coordinates for other purposes. But it's easy to reverse engineer that.

Relatively straightforward and inexpensive licensing to boot.
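
For the record, the mapping I was after is just an affine transform when the axes are linear and you know where the plot area sits in the image. The sketch below is plain Python, not matplotlib's or ChartDirector's API; `data_to_pixel`, `image_map`, and the box layouts are hypothetical names, and hrefs are not HTML-escaped.

```python
def data_to_pixel(x, y, data_box, pixel_box):
    """Map a data point to image pixel coordinates for linear axes.

    data_box  = (xmin, xmax, ymin, ymax) in data units
    pixel_box = (left, right, top, bottom) of the plot area in pixels
    Image y grows downward, so the y axis is flipped.
    """
    xmin, xmax, ymin, ymax = data_box
    left, right, top, bottom = pixel_box
    px = left + (x - xmin) / (xmax - xmin) * (right - left)
    py = bottom - (y - ymin) / (ymax - ymin) * (bottom - top)
    return round(px), round(py)


def image_map(points, data_box, pixel_box, radius=5, name="plot"):
    """Build an HTML imagemap with one circular hotspot per point.

    `points` is a list of (x, y, href) tuples.
    """
    areas = []
    for x, y, href in points:
        px, py = data_to_pixel(x, y, data_box, pixel_box)
        areas.append(
            '<area shape="circle" coords="%d,%d,%d" href="%s">'
            % (px, py, radius, href)
        )
    return '<map name="%s">\n%s\n</map>' % (name, "\n".join(areas))


data_box = (0, 10, 0, 100)      # x from 0 to 10, y from 0 to 100
pixel_box = (50, 450, 20, 320)  # plot area inside the rendered image
print(image_map([(5, 50, "/post/1")], data_box, pixel_box))
```

The hard part in practice is getting the plot area's pixel bounds out of the plotting library, which is exactly where I got stuck.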


Nova & Ortelli: rss4you

Nicolas Nova and Roberto Ortelli's project, rss4you, is prototyping social navigation in webfeed aggregators, something near and dear to my heart. I ran across the project earlier, but since the site's in French I couldn't make sense of it. Their 2004 FOAF workshop paper rectifies that situation.


NSF: HSD Awards

Earlier this year, a colleague and I took a flyer on a grant application to the National Science Foundation Human and Social Dynamics program. We were one of the (many) unlucky losers. Apparently it turned out to be a bit of a stampede.

The program managers have tightened up the submission process to limit the number of applications. They've also conveniently collected the HSD grant recipients in one place, with links to abstracts. It's an interesting mix of stuff, all over the map in terms of projects combining social science and computing. I do note though that no project really intersects computing, media, and social behavior. Northwestern managed to land two fairly large grants, one out of the Economics department, the other out of Mechanical Engineering. Congrats.

Your tax dollars at work!!


Linden: Findory Personalizing Search

On The Web, you are what you do.

Greg Linden reports that Findory is personalizing web, blog, and news searches. The personalization is based upon things you've clicked through on Findory. The deployment is still in its infancy, but this is a harbinger of the types of things that Google, Yahoo, Amazon, et al. can do with possession of (accurate or inaccurate) profiles of your behavior.

If it turns out that these searches are better than straight, unpersonalized, keyword search, it could be the hook that entices people further into a service like Findory. Previously I evinced some skepticism that Findory could get enough click through info to build a decent profile. But enticements like improved searches could prove me wrong.


NU Library: In The Spotlight

Shoot me! I didn't know the NU Library was running a weblog, "In The Spotlight", for major announcements. With RSS feed to boot!


Dash: MT & Photos

Link parkin': A cornucopia of links on how to integrate photos and MovableType.

Via Anil Dash on the SixApart Professional Network.


Trapani: DHTML FotoNotes

Gina Trapani has combined her ad hoc DHTML image annotation techniques with the documented FotoNotes format to create a DHTML FotoNotes Viewer.

Not quite sure what the license is on Trapani's code, but it's available. Between that and FotoBuzz there should be enough rope to hang oneself with this stuff.


Merholz: Castells Lecture

Peter Merholz did yeoman's duty taking notes on a lecture by Manuel Castells, given back in late October at UC Berkeley. The focus was on "Cities in The Information Age". It's taken me 3 or 4 reads just to digest Merholz's notes, much less the actual lecture.

There are lots of interesting nuggets, but I was struck by the numbers regarding the urbanization of the world, the continued importance of cities, the intertwingling of virtual and urban spaces, and last but not least, the two dominant models of building urban spaces. One, the Mexico City model, lawless, ad hoc informal construction of the metroplex. Two, the Barcelona model, managed urban development, with an emphasis on government development of quality public spaces.

Just a gut feeling, but the 'Net in general, and the Web in particular, seem to be moving from the first model to the second model, except under purely private commercial motivation, at least in the US. In the long run, this may actually contribute to a collapse of American leadership in technology innovation. Think of it as the malling of our virtual sphere. Not exactly a horrible death, but a slow slide into mediocrity.

Then again, I could be wrong.


Jones, Harrington, et al.: Document Aesthetics

Rochester Institute of Technology's Rhys Price Jones, Xerox's Steve Harrington, and a gaggle of folks, have been investigating visual design metrics for document aesthetics and intent. Independent of content, paper and online documents can be analyzed to understand the purpose that's being transmitted by the design.

More information regarding Harrington is on the Web, including links to two recent papers.


FotoNotes: Back To Life

A Turkey day bonus. From some reliable sources, FotoNotes is decloaking and the long promised open source release is on the loose. I know podcasting is hot, but it seems to me the activity swirling around photos and photo annotation makes the scene ripe for explosion in 2005.

Memo to self: ease back on the L-tryptophan today


Janes: J

I used to complain that there needed to be an open source base for pulling off whacky aggregator experiments. Blogmatrix J


Bulaong et al.: 24in48

Lia Bulaong's 24in48 project was a recent interesting social moblogging experiment. In a 48 hour period, 24 people in NY moblogged parts of their existence. I can't quite figure out if folks were given a mandate and/or if they had social connections, so the result is a bit opaque to me but a very interesting concept.

The grabber for me is that you can slice the results in multiple ways. The project only provides two, by person and by hour. All sorts of other whacky ideas get sparked in my head though. The images may have been GPS stamped. What about looking at the photos spatially distributed across a map of NY? If there are social connections between people and the photos somehow reify that, how about making the social network translucent in some fashion? A graph viz is the obvious first cut but there's got to be other interesting ways to show off the relationships.

Just thinking out loud.


Fletcher: Bloglines & Sleepycat

Dusting off another post stashed in my Bloglines aggregation, Mark Fletcher discussed why Bloglines doesn't use a relational DB underneath. Sleepycat, longtime purveyors of fine Berkeley DB style, associative, key indexed database software, is the engine that manages Bloglines data. Similar to the way you can punt an RDBMS if you don't really need all of its persistence and transaction management, if you're not doing relational queries, you can drop kick the model, and attendant overhead as well.
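
To make the point concrete, here's what key-indexed storage looks like using Python's standard `dbm` module as a stand-in for Berkeley DB. The key layout (one user name mapping to a JSON list of feed URLs) and the helper names are my own invention for illustration; I have no idea what Bloglines actually stores.

```python
import dbm
import json
import os
import tempfile


def save_subscriptions(db_path, user, feeds):
    """Store a user's feed list under a single key."""
    with dbm.open(db_path, "c") as db:
        db[user.encode("utf-8")] = json.dumps(feeds).encode("utf-8")


def load_subscriptions(db_path, user):
    """Fetch the feed list by key; no query planner, no joins."""
    with dbm.open(db_path, "c") as db:
        raw = db.get(user.encode("utf-8"))
    return json.loads(raw.decode("utf-8")) if raw is not None else []


# Throwaway database file in a fresh temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "subscriptions")
save_subscriptions(db_path, "alice", ["http://example.com/atom.xml"])
print(load_subscriptions(db_path, "alice"))
```

If every access is "give me the value for this key", that's the entire data layer. The price is that any question shaped like a join has to be answered in application code.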


NMH: Daily Me is Here

"The future is already here -- it's just unevenly distributed"
--William Gibson.

Between Findory.com, PubSub, focused crawlers, social translucence tools, and webfeed aggregators, Negroponte's concept of The Daily Me is here. There remains a bit of gluing, spitting, and polishing to make it happen big, but it will happen. However, I don't think the apocalyptic visions of people tuning out common ground sources are on target.

As the tools get better, people will lock onto sources out in the long tail. Niche sources with small audiences. Tracking friends and family. Staying on top of info regarding media objects (text, photos, music, urls, auctions, searches) people are passionate about, but can't be covered by the mainstream press. Keeping an eye on community discussions where community is tightly knit geographically or culturally.

Daily Me tools will be used to add and augment sources, not delete them. It'll be all about the changing stuff outside of the newspaper that people care about.


2entwine: FotoBuzz

I've been throwing students on the shoals of Flash based photo annotation. Looks like the fine Flash experts at 2entwine may have taken care of the issue for me with the FotoBuzz viewlet.

And of course it's got that stylish, 2entwine look.

Via Anil Dash on the SixApart Dev Network, said post seeming to indicate that Fotonotes has come back to life.


Hunter, et al.: matplotlib

John Hunter and a merry band of hackers have concocted a Matlab plotting library knockoff for Python: matplotlib. There are other nice Python modules for plotting: pyPloticus, Gnuplot.py, and disipyl. However, I can vouch for matplotlib since it's pure Python, works well, and seems to be an active project.


Whitman & Lawrence: Mining Music Metadata

Via plasticbag is an oldie but goodie that's been stashed in my aggregator for a bit. In the 2002 International Computer Music Conference, Brian Whitman and Steve Lawrence describe a scheme to determine artist similarity based upon community metadata.

In a nutshell, they take an artist name, ship it off to a music search engine, mine the top 50 pages for features using NLP techniques, and then cluster based upon the features. Evaluation is done using a "ground truth" of human compiled similarity lists.

An interesting approach to constructing context without much explicit, machine readable information. Parts of this are probably applicable to analyzing the blogospheres.
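
A toy version of the similarity step, to show the shape of the thing. This is not Whitman and Lawrence's method: plain word counts stand in for their NLP features, cosine similarity stands in for their clustering, and the "pages" are invented strings rather than real search results.

```python
import math
from collections import Counter


def term_vector(pages):
    """Collapse an artist's retrieved pages into one bag-of-words vector."""
    counts = Counter()
    for page in pages:
        counts.update(page.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented page text for three hypothetical artists.
pages = {
    "artist_a": ["loud guitar rock", "guitar solo"],
    "artist_b": ["guitar rock anthem"],
    "artist_c": ["quiet piano sonata"],
}
vectors = {name: term_vector(texts) for name, texts in pages.items()}
sim_ab = cosine_similarity(vectors["artist_a"], vectors["artist_b"])
sim_ac = cosine_similarity(vectors["artist_a"], vectors["artist_c"])
print(sim_ab, sim_ac)
```

Swap "artist" for "weblog" and "pages" for "posts that link to it" and you have a starting point for the blogosphere version.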


Cameron: CiteULike

Richard Cameron (I think) has cooked up a del.icio.us like system called CiteULike that has a vertical focus: academic papers from the sciences (biology, medicine, computing). Need to kick the tires on this one, but I wonder if it restricts URLs to the listed academic paper sources?

This led me to another whacky idea regarding metadata gathering in a social bookmark service. del.icio.us doesn't do anything special with your bookmarks, while services like Furl.net claim to archive entire documents. In between those two points, a service could analyze the documents (they're probably in PDF or Word), gather empirical observations about documents, users, and tags, and display that to users for browsing and navigation. For example, start doing the CiteSeer thing and mining bibliographies, but now you have all this other social translucence data. If you limit the domain of documents referred to, then you can probably start to make some interesting inferences about relevance and importance.

CiteULike would be a nice service to tie together with Google Scholar.


NMH: Decon/Recon textualization

Following up on the previous post regarding the effect of aggregation, there's also another effect on an author's content. The message is ripped from its original context and inserted into another context.

For example, "previous post" above makes sense if you're seeing that post in the context of my reverse chronological posts. If you get it in a webfeed aggregator, that might not make much sense.

A relatively straightforward and oft discovered observation. But when do we hit the point where people start writing more for aggregation than for the monolithic site? Or is good writing/content inherently aggregable (sp?)? I don't actually believe that but it's an avenue worth pursuing.

Somebody should get out there and do the experiment of being just a "webfeed" author and skip the weblog stuff.


Porter: Aggregation & Navigation

Joshua Porter elegantly discusses the tension between aggregation in general (search, webfeeds) and designing information architectures. In short, aggregation routes around home pages. Home page. What a quaint concept!

Maybe IA designers should start thinking of sites as rings, layers, semi-permeable borders? There may be an aggregation attractive fringe and then a less amenable core. Alternatively, every page can be a "home page". Either way, thinking regarding the structure and interface of a site becomes markedly different.


MSR SCG: Raindrop

Microsoft Research's Social Computing Group has a weblog entitled Raindrop. Not exactly fast moving, but I've met a few of the contributors in person so I'm waiting to hear what they have to say next.

Memo to self, ease back on the acronyms.

© Brian M. Dennis. Built using Pelican. Theme by Giulio Fidente on github.