Link parkin': nice accessible overview of the current state of Linux virtualization by M. Tim Jones.
According to Narayanan Shivakumar, those Googlers at the relatively new Kirkland, WA labs have been quite productive hackers (PDF):
Since then we've attracted many engineers who were tickled silly about working on large clusters of several thousands of machines, not to mention shipping web and client-based consumer apps used by millions of people. In the last two years, our Kirkland engineering team has conceived and launched a dozen products ranging from core search product improvements to Ads Optimization, Sitemaps and Webmaster Central, plus such consumer applications as Google Talk, Chat, Pack, Video, Music Trends, and mobile SMS.
You can automatically turn your del.icio.us bookmarks into a Google Custom Search Engine using Vik Singh's whizzy web application. Nice hack!!
[Via Google Blogscoped]
Greg Linden reports that the Findory web services API has been greatly expanded. To quote Greg, "In fact, there is enough here in the new Findory API that, with a database for caching data and for remembering reader's histories, you pretty much could build your own version of Findory with it."
Now to find time to do something interesting with it.
Link parkin': Steve Palmer's Vienna RSS Aggregator. Mac OS X only, but it could be The Emacs of Aggregators (TM).
[Via 0xDECAFBAD]
Robert Kosara is collecting lists of influences from distinctive leaders in the visualization community. His first subject is Pat Hanrahan, who provides a very interesting list of influential texts. I agree with Kosara: "if you don't know who he is, you better find out."
I wonder if it's an experiment, but I was offered a trend view of my aggregator reader habits in Google Reader. Things presented:
- Active and inactive feeds by items per day.
- Reading, starring, and sharing activity on a per feed basis.
- A tag cloud, generated from your tagged feeds, highlighting tags by items appearing (size) and items read (shade).
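For the curious, the size/shade encoding in that tag cloud can be sketched in a few lines of Python. This is only my guess at the idea, not how Google Reader does it; the function name, the point sizes, and the sample counts are all made up for illustration.

```python
def tag_cloud(stats, min_pt=10, max_pt=32):
    """Map per-tag counts to a font size (items appearing) and a shade (share read).

    stats: {tag: (items_appearing, items_read)}. All names and numbers here
    are illustrative assumptions, not anything taken from Google Reader.
    """
    most_items = max(appeared for appeared, _ in stats.values()) or 1
    cloud = {}
    for tag, (appeared, read) in stats.items():
        # Size scales linearly with how many items carried the tag.
        size = min_pt + (max_pt - min_pt) * appeared / most_items
        # Shade is the fraction of those items actually read (0.0 light, 1.0 dark).
        shade = read / appeared if appeared else 0.0
        cloud[tag] = (round(size), round(shade, 2))
    return cloud


sample = {"python": (120, 90), "viz": (60, 15), "search": (30, 30)}
for tag, (size, shade) in sorted(tag_cloud(sample).items()):
    print(f"{tag}: {size}pt, shade {shade}")
```

A busy tag you mostly skip comes out big and pale, while a quiet tag you read religiously comes out small and dark.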
Wei Lin's Python Cookbook recipe uses Python's ctypes module to hijack Winamp plug-ins and build a micro MP3 player using standard Python 2.5. Nice hack!
Okay GReader guys. It's been over a year now since you were sussed by Niall Kennedy. That's more than enough burn-in for your internal API. I think it's pretty clear it works fine. Write the docs, open the doors, and get it over with. Real artists ship!
Guess I'm in the blog tag club, Greg Linden called "you're it" on me.
Since I'm going to run off at the mouth, let me get to the five folks I'm tagging: Dan Ciruli (Go Bears!!), Alex Halavais, Josh Lucas, Adrian Holovaty, Lucas Gonze.
Here are five things you probably couldn't discern just by reading this blog. Apologies for being a bit more expansive than most, but I'm in a divulging mood. More after the break:
1) MIT's Walker Memorial Hall provided two formative experiences. First, as a freshman, I bussed tables in the dining hall to make some spending money. My family had just enough resources to pay the tuition, room and board, but I had to come up with my own party dough. Second, I did a hip-hop show on Walker Memorial Basement Radio (WMBR) called "The Dope Jamz", in that prime spot of 11:00 PM on Friday nights. My partner, and the real instigator, was Philadelphian Larry McKay, aka DJ Chameleon, self-described friend of Jazzy Jeff and acquaintance of Will, Fresh Prince, Smith. Chameleon always claimed that Smith did get into MIT but wasn't "all that."
2) House music is my listening form of choice, particularly DJ mixes. Concurrent with being a big hip hop head in the late 80's, I caught an even worse house music bug. In parallel with hip-hop, house and Detroit techno were exploding in the northeast US. When I got to UC Berkeley in '89, the rave scene blew up and sucked me in. I still have a pair of Technics SL 1200's, a couple of Stanton cartridges, a mixer, a lot of now classic 12" vinyl, and know how to use them. For a while, under the watchful eyes of Rob Doten, I got to abuse the tables at the now defunct Primal Records store, a central East Bay DJ hangout. Brush with greatness: Curtis A. Jones, aka Cajmere/Green Velvet did one year of chemical engineering grad school at Berkeley, before launching his recording career. We used to frequent a lot of late night SF SOMA (pre-ballpark) spots together.
3) I was a passable ultimate frisbee player, never quite being serious or athletic enough to join a "real" open team, but always being a solid grinder and role player. At the hat tournament, I was the guy who would make a couple of big plays, throw to the rookies, and wouldn't complain about PT. When I was at the top of my game, I mainly played coed before coed was cool. Peak moments were playing C bracket finals on a Sunday at Potlatch 1996, rushing to SEA-TAC, flying a redeye back to New York, going to work, and then playing in the Westchester Ultimate Summer League, which featured some of the best players in the country, followed by the 2001 Monkey Foo tournament in Grenoble, France. I was in the best shape of my life, my team, the Geneva Wizards won the spirit award, and my flight back from that trip landed on September 10th. I also played in 10 Hats, Hops, and Hucks.
4) I was a barista at Nefeli Cafe in Berkeley, CA for a little over a year, surrounded on either side by a couple of years of steady patronage. I caught the cafe bug a few years earlier in Berkeley, and to this day I am a complete cafe slut, doing my best to scout out coffee shops in whatever cities I visit. I don't think Pascal, one of Nefeli's two Greek owners, took to me until we had a spot inspection by an Illy coffee supervisor (we might have been the only place serving Illy coffee in Berkeley at the time) and I managed to make a cappuccino to his satisfaction. Creamy not frothy foam!!
5) I'm a bit of a sports junkie. Not a rabid sports fan, but my background noise of choice is sports talk radio and ESPN. I freely admit that this is a harmless, but horrible vice. I think the seed was planted by my dad listening to Ken Beatrice (YOOAH, NEXT!) but I got really hooked on KNBR. Heck, I even enjoy Jim Rome's radio show, a very acquired taste. The only show I had to give up on was Scott Ferrall's KNBR travesty. God was that unlistenable.
Berkeley, Greece, Santa Barbara, Vancouver, and Honolulu? Not to mention fun sounding courses like Practical Machine Learning and Online Journalism. Boy, those Cal graduate students have hard lives. Back in my day, we were lucky to get trips to Toronto, Seattle and Albuquerque!
Ha, ha. Only serious.
Good recent linkdumps [1],[2],[3] from Josh Lucas. Good stuff on web app architecture and search, not to mention the bonus collaborative filtering links.
Memo to Josh. I'll take LSU -whatever. Bowls with Notre Dame haven't been particularly pretty.
This old Georgia Tech visualization paper looks worth a read, especially given the boom in online audio and video:
We present a new user interface technique for the visualization and playback of long media streams decorated with significant events. Our Multi-Scale Timeline Slider allows users to precisely focus on a specific location in a very long media stream or set of streams based on significant events while also retaining the stream's entire context. KEYWORDS: Timeline slider control, multimedia streams, visualization, focus + context.
Authors: Heather A. Richter, Jason A. Brotherton, Gregory D. Abowd, Khai N. Truong
Via TomC's linklist
I swiped the title of Bryan O'Sullivan's blog post on doing collaborative filtering in 40 lines of code with Python. Moby hack!
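To give a flavor of how little code the basic idea takes, here is a minimal user-based sketch. To be clear, this is not the code from O'Sullivan's post; it is a generic cosine-similarity recommender, and the function names and the toy ratings are my own assumptions.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two {item: rating} dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings of others."""
    scores, weights = {}, {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], theirs)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in theirs.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    # Predicted rating is the weighted average over similar users.
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:top_n]]


# Toy data, invented for illustration.
ratings = {
    "ann": {"a": 5, "b": 4, "c": 1},
    "bob": {"a": 5, "b": 5, "d": 5, "e": 2},
    "cat": {"a": 1, "c": 5, "d": 1, "e": 5},
}
print(recommend(ratings, "ann"))
```

Here ann's tastes track bob's far more closely than cat's, so bob's favorite unseen item floats to the top.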
You may have noticed I had some Copious Spare Time (TM) recently, although that's fast changing. In any event, I spent some of that time playing with my PSP, a device that I have previously equated with a PARCTab. In that time, I upgraded my PSP with a 2GB Memory Stick. I've had a lot of fun throwing tons of media on it and using the PSP like an oversized iPod. Note though that said "iPod" has two forms of removable media (Memory Stick and UMD), not to mention built-in WiFi. And in exchange for the size differential, you actually get pretty good battery life when using the WiFi, which normally sucks with PDA-like devices.
I also discovered that my PSP headphones also serve as a little remote control. Combined with the multitude of game playing controls onboard, the PSP would make a great platform for developing ubicomp systems and applications.
Unfortunately, I spoke too soon about being open to development. The one tool I pointed to, Adventure Maker, really just creates HTML browser applications, and makes them easily accessible to the built-in PSP browser. Not a trivial achievement, but a pretty low ceiling. Apparently there are actually sophisticated tools and middleware for the PSP, but you have to be a licensed PSP developer to get a hold of them. I'm guessing that this is not an easy trick, and impossible if you can't demonstrate a multimillion dollar revenue stream for Sony.
I wonder, however, if a research group teamed with a previously licensed developer could make any headway. Hmmmmm....
I guess the fallback is the rich internet application route, depending on how well the PSP browser supports DHTML and/or Flash.
No, I haven't been tagged, a phenomenon seemingly started by Jeff Pulver, although there are hints that he was at least inspired by an existing wave of the game.
But, I note it as another challenge for all those blogosphere/social network analysis/search/social media hackers and researchers out there. Show me the shape of this blog tag network (tree?) and give me a picture of its dynamic growth. All the data's out there! This one's even easier in that folks are putting explicit links in their posts.
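Once the who-tagged-whom links are harvested, getting the shape of the thing is a plain breadth-first walk. A rough sketch follows, assuming the links have already been scraped into a dict; the only edges shown are the ones from my own tag post, "this blog" is just a stand-in label, and the function name is made up.

```python
from collections import deque


def tag_tree_levels(tagged_by):
    """Group bloggers by distance from the roots of a who-tagged-whom graph.

    tagged_by: {tagger: [people they tagged]}, assumed to be scraped from
    the explicit links in each post.
    """
    everyone_tagged = {p for targets in tagged_by.values() for p in targets}
    roots = [p for p in tagged_by if p not in everyone_tagged]
    depth = {r: 0 for r in roots}
    queue = deque(roots)
    while queue:
        person = queue.popleft()
        for target in tagged_by.get(person, []):
            if target not in depth:  # keep only the first (shortest) path
                depth[target] = depth[person] + 1
                queue.append(target)
    levels = {}
    for person, d in depth.items():
        levels.setdefault(d, []).append(person)
    return levels


edges = {
    "Greg Linden": ["this blog"],
    "this blog": ["Dan Ciruli", "Alex Halavais", "Josh Lucas",
                  "Adrian Holovaty", "Lucas Gonze"],
}
for level, people in sorted(tag_tree_levels(edges).items()):
    print(level, people)
```

Timestamp each edge and you could replay the levels filling in over time, which is the dynamic growth picture I'm after.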
PublicSquare is a new (to me) hosted CMS/publishing system, which looks like it might be a good fit for IT strapped journalism groups (read school projects). Incorporates workflow and community management features from the ground up.
Jeff Heer's prefuse toolkit just keeps getting better. There's a couple of new vizzes in the gallery that I hadn't noticed. Heer wrote a new viz of congressional spending, reworked the baby NameVoyager, and also reimplemented Ben Fry's zipdecode processing app. Warning, heavyweight Java applets at the end of those links. Also, Doantam Phan used prefuse to implement flowmaps.
I was wondering what was in Brad Dayley's Python Phrasebook. TechBookReport has a brief review, but it looks like a handy collection of Python idioms across a number of tasks. Could be useful.
Steve Rubel and a few other influential bloggers received an audience with Bill Gates.
I can see the post germinating in Seth Finkelstein's mind even as I type this. He's convinced this Z-lister that there's a new set of gatekeepers. However, I never bought into the blogging triumphalism, so I'm not that broken up about the situation. I normally wouldn't comment but this just seemed like the punctuation to his riff. Besides, discerning new media consumers need to keep the new/old context in mind.
Not to mention the whole Arrington, Jarvis, Winer, Denton, Calacanis scrum. At least they're entertaining gatekeepers.
Game Girl Advance reports on Microsoft's XNA Game Studio Express development platform for the Xbox 360, and an Ars Technica article has more details from a programmer's view. Basically you can develop on an XP desktop box and deploy over an Xbox 360 subscription network. The fees aren't trivial but aren't outrageous. Low cost enough that some enterprising computer clubs or informal youth media groups could cook up something interesting.
This could also be a vehicle for some interesting research-based software development platforms. The Chicago Python User Group already kicked around the idea of using some form of Python for building apps.
Ars Technica's Ryan Paul summarizes some things to look for in the upcoming Firefox 3.0. Looks like the big push is on incorporating Cairo for graphics rendering and improved standards support. Here's hoping someone works on the resource consumption. Firefox typically requires an order of magnitude more virtual memory and resident memory than any other application I use. And I'm an Emacs bigot!!
Matthew Hurst is on the program committee for the inaugural International Conference on Weblogs and Social Media, and reports that the ICWSM submissions were numerous and of high quality.
Tim Finin of UMBC is hinting that his group at UMBC has developed a new model of blog influence, with published results forthcoming. Looks like they employ some social network analysis and a nuanced view of social roles in the network. Combined with their previous work on splogs, this could turn out interesting.
[Via Steve Rubel]
Multicore programming is the wave of the future. That's what those Intel Duo (and other) chips enable, allowing lots of concurrent operations and radically improving performance. But programming with concurrency is hard. A potential aid to programmers in dealing with concurrency is transactional memory. This good article in ACM Queue provides a nice high-level overview of transactional memory, including pros, cons and alternatives.
[Via Lambda the Ultimate]
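For a feel of the transactional idea, here is a toy sketch in Python. It is not taken from the ACM Queue article and is nowhere near a real transactional memory system: it covers a single shared cell and shows only the optimistic read, compute, validate, commit loop that many software designs build on. The TVar and atomically names are my own.

```python
import threading


class TVar:
    """A shared cell with a version counter (toy transactional variable)."""

    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()


def atomically(tvar, update):
    """Optimistically apply update(value); retry if another commit got in first."""
    while True:
        seen_version = tvar.version        # snapshot the version first
        new_value = update(tvar.value)     # do the work without holding the lock
        with _commit_lock:
            if tvar.version == seen_version:   # nobody committed in the meantime
                tvar.value = new_value
                tvar.version += 1
                return new_value
        # Conflict detected: loop and redo the work against fresh state.


counter = TVar(0)


def worker():
    for _ in range(1000):
        atomically(counter, lambda v: v + 1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

The appeal is that the caller writes the update as if it were alone in the world, and conflicts turn into retries rather than hand-managed lock ordering.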
20 videos of invited talks given at Google, as selected by Peter Norvig and I assume others in Google Research. The videos cover a broad range of topics from chimp research, to Google history, to Python, to large distributed systems.
Jon Udell is moving from InfoWorld to Microsoft. His blog coordinates are moving from his long standing site, to a new personal location, for a short term sabbatical.
NVAC stands for the National Visualization and Analytics Center, a consortium of research groups interested in visual analytics for homeland security. About two years ago the organization produced a manifesto, "Illuminating the Path", to advance the state of the art in visual analytics. I haven't completely waded through the PDFs of the report, but at least the executive summary (PDF) makes for interesting reading. There's also a nine minute video about the production of the report, embedded in the page.
Link parkin': On Windows, SFTPDrive mounts remote filesystems using SFTP. Secure, cross platform, very convenient, probably worth the $39.
Digging through the EagerEyes archive, Robert Kosara demonstrates the little known square pie chart. I am not joking. Think pizza sliced square versus "normal" cutting. Kosara convincingly shows how a square pie chart makes it much easier to read the raw magnitude of a statistic.
Another thing that's apparent from the discussion: assuming the graphics are on the same scale, these charts support comparison quite well. Magnitude directly maps to area. Square pie charts also strike me as a fairly flexible basis for an interactive visualization, being pretty straightforward to generate, scalable, and quite button-like.
Link parkin': EagerEyes looks like a good blog and collection of resources regarding visualization.
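As for "straightforward to generate", a text-mode version fits in a dozen lines. This sketch assumes a 10x10 grid with one cell per percentage point, which is one common way to draw such a chart; the function name and the fill characters are arbitrary choices of mine.

```python
def square_pie(percent, filled="#", empty="."):
    """Render a 10x10 grid in which each cell stands for one percentage point."""
    cells = max(0, min(100, round(percent)))  # clamp to the 0..100 range
    rows = []
    for row in range(10):
        start = row * 10
        n_filled = max(0, min(10, cells - start))  # filled cells in this row
        rows.append(filled * n_filled + empty * (10 - n_filled))
    return "\n".join(rows)


print(square_pie(37))
```

Since every cell is the same size, two of these side by side can be compared by counting full rows and leftover cells.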
[Via Tom Carden's linkfeed]
Speaking of aggregators and information overload, GTP Solutions' Feeds 2.0 would seem to be the uber synthesis of many of the themes I reiterate:
Feeds 2.0 utilizes an advanced computational intelligence personalization learning engine. With personalization the system ranks the feeds according to sources a particular user likes, authors and topics he's interested in, and brings interesting articles first. These are ranked by a score the system has assigned based on what has learned about the user's preferences. The system creates a dynamic profile of the topics the user likes and the sources he reads most. It actually begins to learn immediately from the first couple of clicks in order to figure out the user's preferences but obviously the more he uses it the better it gets.
It even allegedly does item clustering.
Pretty ambitious. With all the moving parts, I wonder if it actually works.
A fresh wave of angst regarding webfeed information overload seems to be rippling through the blogosphere again. 37Signals' Mark Linderman proposes a few broad ways of filtering feeds: relying on publishers, relying on community, relying on friends, or relying on a smarter client aggregator. The extensive comments on Linderman's post are worthy of examination as well.
From reading these types of discussions for a few years, I have one observation and one comment. The observation is that there are quite observable variations in behavior for both publishing and reading. A rigorous academic study to come up with some usefully discrete points in the spectrum would be a good project. Thus, you can immediately dismiss statements of the form, "The solution to information overload is...". Well no, there is no singular solution. Any solution is really dependent on a reader's context.
My comment is that folks almost uniformly focus on filtering, or reducing the amount of content delivered. Assume that feeling overloaded is the normal state and can never be eliminated. There's always too much stuff. Then wouldn't devising new ways of organizing, presenting, and navigating the deluge be an easier pursuit of larger benefit? Besides, throwing stuff away might be harmful, potentially eliminating larger contextual signals.
I'll also argue that overload is normal because most people are always irrationally afraid that they will "miss" something, even though they can never get a guarantee they won't. To compensate, they subscribe to "too many" sources. Also, having many sources increases the potential for serendipity, which is another effect innately sought after.
Link parkin': Backgrounder on how Marshall Kirkpatrick managed his feed reading while working for TechCrunch. I didn't think it was particularly deep, but there's a couple of interesting nuggets to be had.
Matthew Hurst is pretty excited about Swivel. According to Swivel's about page, "Swivel is a Web site for curious people to explore data." They also have a cute overview of the site.
In general, I'm of a like mind with Hurst, and I'm on record advocating social visualization as an interesting research direction. However, datasets and graphs unmoored from any particular purpose or activity don't seem useful all that often. Besides, it's hard enough finding visual insights within a dataset, much less across them, which Swivel touts as a feature.
In short, I'm not hopeful for Swivel's success, but I'd be quite happy to be proven wrong.
Link parkin': JointRadio: "A mashup of bookmarks, mp3 files, rss, a flashplayer".
Hopefully MP3 blogging and other Web musical activities can carry XSPF, which just reached version 1.0, into the dominant standard for playlists, including video playlists. Plus, I just think the XSPF players are quite neat hacks, very "of the web".
[Via Lucas Gonze]
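Part of XSPF's charm is how little there is to it. Here is a minimal sketch that writes a version 1 playlist with Python's standard library; the build_xspf helper is my own invention and the track URLs are placeholders.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"


def build_xspf(title, tracks):
    """Return an XSPF version 1 playlist as a string.

    tracks: iterable of (track_title, location_url) pairs.
    """
    ET.register_namespace("", XSPF_NS)  # emit XSPF as the default namespace
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    ET.SubElement(playlist, f"{{{XSPF_NS}}}title").text = title
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for track_title, location in tracks:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        ET.SubElement(track, f"{{{XSPF_NS}}}location").text = location
        ET.SubElement(track, f"{{{XSPF_NS}}}title").text = track_title
    return ET.tostring(playlist, encoding="unicode")


# Placeholder URLs, for illustration only.
xml_text = build_xspf("Friday night mix", [
    ("Track one", "http://example.com/one.mp3"),
    ("Track two", "http://example.com/two.mp3"),
])
print(xml_text)
```

Because a track is just a location plus optional metadata, the same structure can point at video files as easily as MP3s.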
NewsCloud's Jeff Reifman announces that the source for the social news platform is freely available under GPL. This is another codebase an entrepreneurial newsroom could cheaply exploit.
I have to pile onto Findory with Paul Lamere. Sometimes Findory seems to get fixated on certain things and keep recommending them. For me it's not topics so much as sources. I don't want to name names, but I can easily rattle off 5 feeds that Findory keeps recommending items from, maybe even all the items, leading to a de facto subscription. I've clicked through to these sources once if at all, months ago, and can tell from the item summaries I'm not going to take further suggestions any time soon.
It's not that big a problem, since I don't have to do any work to ignore these items. I get my Findory suggestions as a custom RSS feed. On the other hand there's got to be a tiny incremental cost to suggesting unwanted things that adds up over time. Wonder if anyone in the recommender community has examined the cost of poor recommendations other than a minimal, "people didn't use our system until our recommendations didn't suck," effect.
On the third hand, I imagine it has to be tricky to take user input for this problem and yet prevent users from blowing their foot off.
My encounter with Nalanda hints at a focused crawling effort within the digital libraries community I haven't examined deeply enough. For example, Cornell's Donna Bergmark ran an extended collection building project that used the classic Mercator web crawler. Collection building aims to automatically build high quality, topic specific portals for online libraries. The group generated some interesting results, including a literature review of collection building (PDF).
As an addendum, Heritrix looks like the open source successor to Mercator, possibly the first openly documented high performance web crawler.