July 18, 2004
Personal information assistant

As previously mentioned, our personal information space is a shambles compared to the published information space of the web. One good reason for this is the thousands and thousands of people working to augment the public information space with searches, meta-searches, meta*-searches and so on. Another good reason is that you're all alone in linking your personal data with metadata, whereas you have the help of millions in making sense of the public space. Quite simply, the search engine companies have a lot more to go on when indexing public space than they do when indexing personal space.
Everybody wants that to change and everybody is waiting for the personal information killer app. Maybe MS Longhorn will be it, but personally I doubt that very much, since I think the latter problem is much larger than the former: the metadata quality is low.

By combining a local install of Apache, the Slogger Firefox extension, the Swish-E indexer and a little homespun Perl, I've been running a "Search my browsing history" service on my desktop for about a month, and I'm already drowning in data. Difficulty in ranking and the poor quality of the metadata (or just the difficulty of using the metadata there is) rapidly degrade the quality of the index.
To be fair, I spent very little time on this version 0 of the utility, and with only a few enhancements I could make the index work properly for much more browsing than it does currently. But there's no way it would nicely handle e.g. my > 1GB email collection without a major upgrade.
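To make the ranking problem concrete, here is a toy sketch in Python. It is not the Swish-E/Perl setup described above, only a hand-rolled inverted index over saved page text that ranks by raw term counts; the function names and the `pages` mapping of URL to text are made up for illustration. With nothing but word counts to go on, every page that mentions a word often looks equally important, which is roughly the problem described.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each term to a Counter of {url: occurrences} for the saved pages."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term][url] += 1
    return index


def search(index, query):
    """Return URLs containing every query term, ranked by summed term counts."""
    terms = tokenize(query)
    if not terms:
        return []
    # .get avoids creating empty entries in the defaultdict for unknown terms.
    postings = [index.get(term, Counter()) for term in terms]
    hits = set(postings[0]).intersection(*postings[1:])
    scored = {url: sum(p[url] for p in postings) for url in hits}
    # Highest score first; ties broken alphabetically by URL.
    return sorted(scored, key=lambda url: (-scored[url], url))
```

Calling `search(build_index(pages), "perl email")` returns only the URLs whose text contains both words. Anything smarter than a count (recency, revisits, links) would have to come from metadata the saved pages do not carry by themselves.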
Probably the key enhancement would be linking all metadata situationally by keeping an accurate record of time with all recorded information (as also suggested by Jon Udell in the references above), but my experience suggests that you will have much more limited situational recall than you expect. What you'll need is a situational equivalent of PageRank: some kind of indicator that a piece of information is actually among the <1% of the stuff you have read that stuck in your mind.
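As a rough sketch of what linking by time could look like, assume every recorded item (page visit, email, file) carries a timestamp. Items recorded close together in time can then be treated as belonging to the same situation. The `Record` fields, the function names and the 30-minute window below are all assumptions picked for illustration, not part of the utility described above.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Record:
    """One recorded item; the fields are hypothetical."""
    when: datetime
    kind: str   # e.g. "web" or "email"
    label: str  # e.g. a URL or a subject line


def build_timeline(records):
    """Return the records sorted by time, plus a parallel list of timestamps."""
    ordered = sorted(records, key=lambda r: r.when)
    return ordered, [r.when for r in ordered]


def same_situation(ordered, stamps, anchor, window=timedelta(minutes=30)):
    """Return the records within `window` of `anchor` on either side, excluding `anchor` itself."""
    lo = bisect_left(stamps, anchor.when - window)
    hi = bisect_right(stamps, anchor.when + window)
    return [r for r in ordered[lo:hi] if r is not anchor]
```

This only answers "what else was going on around then". It says nothing about which of those items actually stuck in your mind, and that is the harder, PageRank-like part.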

Posted by Claus at July 18, 2004 07:37 PM