Microsoft has plans:
The company has also announced plans for blog search, to be called BlogBot, and a natural language search engine, called AnswerBot.
But we already have one of those. Several, in fact: .com, .de, .net, and .org, for example.
We knew they cared about their IP, but the linking policy of the Fraunhofer Institute (of mp3 fame) is particularly lame:
The contracting party shall inform the Fraunhofer-Gesellschaft that the link has been inserted, or that the target page has been installed on a webserver, by sending an e-mail message to email@example.com within 24 hours of setting the link. This message must contain the path (URL) via which the webpage containing the link can be accessed.
Some time real soon this idea that you have any right to define how others link to your websites must be stopped.
If it's true that Apple is considering suing RealNetworks for making an iPod-compatible online music service, as reported here, then I'm so not getting an iPod. It's ironic and disheartening that the open-source-dependent, monopoly-squeezed PC maker Apple is using this kind of heavy-handed IP-rights tactic in the digital player market.
Peter Gemælke can see that the country's largest pig farmer has overproduced, but he cannot understand why it should be brought to an end just because it has been discovered.
Good principle. Let's apply it to drunk drivers as well (the blood alcohol limit is 20% below whatever you had), to speed limits (you can only be driving 20 km/h too fast, no matter how fast you are going) and last, but not least, to tax returns (you only have to pay back 70% of what you owe).
A producer of a Sunday comics section carried by some 38 newspapers suggested they agree to drop Doonesbury "because of the many complaints" (one assumes from irate dittoheads). Enough did that Doonesbury got dropped. This kind of commercial censorship is becoming more and more common as the political climate gets increasingly polarized. In other news: Governor Schwarzenegger will not apologize for calling Democrats girlie men.
What the hell is happening.
This is one you learn early if you work for a smallish company:
If you're staffing a project, make sure you have enough people who are adaptable to several types of work.
This is extremely true. In a small organisation there are quite simply more kinds of tasks than there are people to do them, so everybody (absolutely everybody) needs adaptability and flexibility as a basic skill.
People who think of specialization as their professional quality, and who take pride in a single-minded focus on their core competences to the point of refusing to go outside that core, might be suited to a bigger organization, but in a small group, where problems change all the time, they're just baggage.
The difficulty comes when you want to migrate that small, adaptable organization to a big, efficient organization. Then all of a sudden the skillset you needed in the little organization becomes a liability and the proud specialists are needed exactly because of the uncompromising attitude that made them unfit for the little team.
A couple of days ago I wrote that I thought that the msnbot was the dumbest bot in town, since it was the only bot fooled by my ilizer service. But I was wrong. The msnbot is no dumber than the rest of the robots. Have a look at this Google search. It is the world as seen through the eyes of the Bobby accessibility checker, and the googlebot really went for this one. I have no idea why the Bobby checker actually processes URLs in hyperlinks so that they also filter through Bobby, though - I don't really see the use (comic or otherwise).
Next question: What would be a useful heuristic to identify bots like this? I'm doubting there really is one. Most likely a filter would just be a long list of known cases, and probably there are just too many filters around to make that worthwhile. Presumably most serious filters implement the robot exclusion standard to save bandwidth and clock cycles.
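For what it's worth, the exclusion side is cheap to do. Here is a minimal sketch of a robots.txt for a rewriting service and of how a well-behaved crawler would read it, using Python's standard urllib.robotparser. The /ilizer/ path and the example.org host are made up for illustration; they are not the real setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a rewriting service mounted under /ilizer/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ilizer/
"""

def may_crawl(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    print(may_crawl("msnbot", "http://example.org/ilizer/?url=http://mit.edu/"))
    print(may_crawl("msnbot", "http://example.org/index.html"))
```

This only helps against bots that honour the standard, of course, and it does nothing to identify filter services you don't run yourself.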
It's still got it, the good old styrelizer, just look:
That there American Congress has, like, totally passed a resolution, uh, stating, uh, that genocide is taking place in Sudan, right there. Again! The USA is pushing for sanctions through the UN, uh, but, fuck, da rulers in Sudan advise the sweatiest international hood to stay the fuck out of it.
The main interest in the 9/11 commission report (downloads here) is whether or not it will help John Kerry beat Bush in the upcoming elections, but it's no wonder that CNN isn't leading with the political story but with the real life drama of the passengers who fought back.
It's a dramatic story and shows what an absolute nightmare September 11 was on board that plane. Once the passengers attacked, the hijackers first started rolling the plane and, when that proved ineffective, abruptly diving and climbing. The cabin must have been a complete mess. Furthermore, the pilots actually received a warning before the hijackers attacked the cockpit; it was just such an unexpected warning that the pilots asked for confirmation. A lot could have been different with locked, steel reinforced cockpit doors and with the warning being taken seriously. But at the time, the content of the warning was just too unimaginable to be accepted at face value without double checking.
What is a little harder to estimate is the political dynamite. Danish newspapers carried a list of some 10 early indications that something was afoot, where the same names cropped up again and again, but I have to say that I believe very much that this is one of those cases of 20/20 hindsight. I have no idea how many names and people the CIA, FBI, and NSA combined are trying to keep tabs on, but it must be thousands.
It's no simple task to make sure that all the information about any specific person or name gets collated and accumulated in one place, and once it does you have to recognize what level of threat there is.
Secondly, let's assume the engineering effort of large scale monitoring of intelligence targets can be solved; then there is the matter of public oversight. It's not a very pleasing alternative to have >100K staff tracking the whereabouts of millions of people on a daily basis. There has to be balance as well.
According to a Netcraft news story, RSS traffic is causing traffic spikes every hour on the hour because newsreaders have hourly feed refresh built in and everybody is just doing it at the top of the hour.
The solution to the problem is really, really simple: Randomize the timing of the update to an odd minute count.
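A minimal sketch of what that could look like in a newsreader, assuming each install has some stable identifier (the client_id here is a made-up name, not something from any real reader). Hashing the identifier gives every client its own fixed offset within the hour, so the load spreads out while each client still polls once per hour.

```python
import hashlib

def refresh_offset_seconds(client_id, period_seconds=3600):
    """Stable per-client offset in [0, period_seconds), derived from a hash."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % period_seconds

def next_refresh(now_epoch, client_id, period_seconds=3600):
    """Next epoch second, strictly after now_epoch, at which to poll."""
    offset = refresh_offset_seconds(client_id, period_seconds)
    period_start = now_epoch - (now_epoch % period_seconds)
    candidate = period_start + offset
    if candidate <= now_epoch:
        # This period's slot has already passed; use the next period's.
        candidate += period_seconds
    return candidate
```

Picking a random offset once at install time and storing it would work just as well; the hash just avoids having to persist anything.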
That completely reverses my opinion of the redesign. We probably wouldn't have the good links without it, meaning that the redesign was a good thing since it made Adrian Holovaty write the plugin.
Microsoft plans to pay out $75 Billion to shareholders over a 4 year period in the form of dividends and stock buy backs. That is a staggering amount of money for one company to sit on. It's comparable to the entire public sector budget of Denmark for a year. While Denmark is a small country, it is also a rich country and we have an enormous public sector.
To get another, non-inflation adjusted, perspective on that figure: The Marshall Plan paid out $15 Billion to resurrect Europe after the damages of WW2.
$15 Billion is also approximately the US foreign aid budget by the way, and speaking of aid - Bill Gates has announced that his portion of the dividend payouts, some $3 Billion, will go into his foundation.
I am switching to the Sage feed reading plugin for Mozilla Firefox from my previous feedreader Syndirella. The reason to switch was that Sage does even better what Syndirella also tried to do: Integrate feed reading with web browsing.
It's even better for the following reasons:
That's my conclusion on this finding that 44% of large American corporations eavesdrop on outgoing mail. No wonder they're so eager to outsource if they see no more value in the loyalty of their own employees. I like to think this percentage is lower in Scandinavia, and not because Scandinavian managers live in the stone age but because of the quality of the workforce.
He said it on the Gillmor Gang, in a very listenable way. And he has written it down as well. Jonathan Schwartz has a hardware maker's approach to open source. Free software is good since it drives the adoption of open standards (there is no incentive in free software to not follow standards, as there is in closed source), and since standardization enables even more widespread adoption of technology, it means a need for more hardware, which is good business for Sun.
The software maker's part of the equation is that when platform costs dwindle you can spend more time doing the business specific stuff for client X, meaning again more tech adoption, meaning more business - if you have a service approach to software.
There's another reason it's good: When all the stuff we're used to becomes a commodity, software makers will finally have to go elsewhere and innovate instead of just living off the fat that is The Standard Office Desktop.
From the Pizza Party man page:
pizza_party [-o|--onions] [-g|--green-peppers] [-m|--mushrooms] [-v|--olives] [-t|--tomatoes] [-h|--pineapple] [-x|--extra-cheese] [-d|--cheddar-cheese] [-p|--pepperoni] [-s|--sausage] [-w|--ham] [-b|--bacon] [-e|--ground-beef] [-c|--grilled-chicken] [-z|--anchovies] [-u|--extra-sauce] [-U|--user= username] [-P|--password= password] [-I|--input-file= input-file] [-V|--verbose] [-Q|--quiet] [-F|--force] [QUANTITY] [SIZE] [CRUST]
The pizza_party program provides a text only command line interface for ordering DOMINOS pizza from the terminal. This program is intended to aid in the throwing of PIZZA PARTIES which are also sometimes known as ZA PARTIES
pizza_party -pmx 2 medium regular
Orders 2 medium regular crust pizzas with pepperoni, mushrooms, and extra-cheese.
Your HTML comments are propagated to the browser client.
Case in point, esselte.com:
<!--DONT RELEASE THIS TO LIVE WITHOUT CHANGING OVER THE HACK BELOW!!!!! STAGE.ESSELTE.COM >>> WWW.ESSELTE.COM - RAE -->
...angermann2. The collage-like CSS styling with -label lookalike titles. The wild font sizes. The huge images. It looks like ... something else. It's cool in a cool way, not in the all too common "Look at me I'm imitating cool guy #5" way. Add to that high quality content and the sense to go away on a real summer holiday. Bookmarked.
Bonus feature: No content about blogging.
Order an ACLU Pizza for that Total Information Awareness experience (turn on your speakers).
a) Tim O'Reilly's data collection experience:
Heck, just recently, I was shopping in Bath, England, and made a large purchase in an antiquarian bookshop. Fifteen minutes later, I was four buildings down the street in a second bookshop, tried to make another purchase, and had my card rejected. Meanwhile, back in California, my wife was receiving a call, wondering if the card had been stolen. "Why would someone halfway around the world be spending so much on books?" they wanted to know.
or: How a search engine in beta transformed the internet as we know it!
[UPDATE: The other bots are equally stupid]
In June a rather stupid service here on classy.dk was a surprise hit, and I have Microsoft to thank for the experience.
Some time ago I made a web page transformation engine that converted the text on pages ti lingiige liki this - i.e. replacing all vowels with the vowel i instead. This is inspired by a Danish children's song where you repeat the same verse once for each vowel, using first only a's, then only e's, and so on. As a nice (but fatal) touch the service also rewrites hyperlinks so they are also redirected through the service, si yii cin livi iiir intiri lifi briwsing inli thi wirld widi wib.
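For the curious, the whole trick fits in a few lines. This is a rough sketch and not the actual code behind the service: the endpoint URL is made up, y is treated as a consonant, and only double-quoted href attributes are handled. A real version would also have to leave the tags themselves alone and ilize only the text between them.

```python
import re
from urllib.parse import quote

# Hypothetical endpoint; the real service URL is not given here.
SERVICE = "http://example.org/ilizer?url="

_VOWELS = re.compile(r"[aeou]", re.IGNORECASE)
_HREF = re.compile(r'href="([^"]+)"')

def ilize(text):
    """Replace every a, e, o and u with i, keeping the original letter case."""
    return _VOWELS.sub(lambda m: "I" if m.group().isupper() else "i", text)

def rewrite_links(html):
    """Point every double-quoted href back through the service."""
    def through_service(match):
        return 'href="%s%s"' % (SERVICE, quote(match.group(1), safe=""))
    return _HREF.sub(through_service, html)
```

The link rewriting is the fatal part: every page served contains only links back into the service, so a crawler that follows them never gets out.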
It is still going on. As of this writing msnbot has crawled some 65,000 URLs transformed through my service. And boy, has it gone far! The Wayback Machine, MIT, even competitor Giigli got a visit.
Naturally I had to check if Thi Intirnit had made it into MSN search's sandbox index. It had. A lot. And then some.
Google/MS battle round X: Microsoft buys Lookout, a "search your desktop" application that integrates with Outlook and - to emphasize the Google fight - has quirky, bouncing, colored double O's in the company logo.
This seems to try to be (half of) the required personal search space - searching files and email fast. I wonder what ranking system they have in place though. As mentioned, search is not enough.
[UPDATE: MS decided to keep making Lookout available: Here it is. That makes Microsoft an open source distributor - well, sort of. Lucene.Net is (as mentioned in comments) under an Apache-style license, so you can legally embed it in other apps.]
The following scene took place quite a while ago, past midnight, at a hot dog stand on Rådhuspladsen, on a good night out on the town. A drunk man steps up to the hot dog stand and orders a sausage with bread:
Hot dog man: Do you want anything on it?
Customer: I want some speed on it!
I promise the story is authentic. Thanks, Kresten, for telling it.
As previously mentioned, our personal information space is a shambles compared to the published information space of the web. A good reason for this is the thousands and thousands of people working to augment the public information space with searches, meta-searches, meta*-searches etc etc etc. Another good reason is that you're all alone in metadata linking your personal data, whereas you have the help of millions in making sense of the public space. That means quite simply that the search engine companies have a lot more to go on when it comes to indexing public space than they do when indexing personal space.
Everybody wants that to change and everybody is waiting for the personal information killer app. Maybe MS Longhorn will be it, but personally I have to say I doubt that very much, since I think the latter problem is much larger than the former. The metadata quality is low.
By combining a local install of Apache, the slogger Firefox extension, the Swish-E indexer and a little homespun Perl I've been running a "Search my browsing history" on my desktop for about a month, and I'm already drowning in data. Difficulty ranking and poor quality of metadata (or just the difficulty of using the metadata there is) rapidly degrades the content of the index.
To be fair, I spent very little time on this version 0 of the utility, and with only a few enhancements I could solve the problem so the index would work properly for much more browsing than it does currently, but there's no way it would nicely handle e.g. my > 1GB email collection without a major upgrade.
Probably the key enhancer would be linking all metadata situationally by keeping an accurate record of time with all recorded information (as also suggested by Jon Udell in references above), but my experience suggests that you will have much more limited situational recall than you expect. What you'll need is a situational equivalent of PageRank: some kind of indicator that a piece of information is actually among the <1% of the stuff you have read that stuck in your mind.
This local news report, on the tabloid super scoop of a young blonde female teacher having sex with a teenage student, looks exactly like it would in Springfield - home of the Simpson family.
Sensationalist of course, it manages to mix into that an amazing montage of a Smoothie King outlet, street signs of State Road 200, Interstate 75 as well as a local Best Buy outlet.
A community backlash against (free) required registration schemes: bugmenot is a database of working logins for websites that require you to register to view content. With convenient plugins (that pop up some usable credentials) available for Mozilla and Internet Explorer I have registered as user Cheddar Cheese from Osteby, a 93 year old Albanian female CEO in the $20.000 to $25.000 income bracket for the very last time.
Obviously we need to do just a little better: The plugins should actually load the login form itself automatically, just like Password Manager does in Mozilla if you ask it to. The "submit this login to bugmenot" phase is poorly supported by the plugins. Hmm, maybe it's time to tinker.
Everybody is linking to the recently created torrent search engine and I want to do that too, so here goes.
The RIAA and MPAA lawyers must be very happy: At last there's someone to sue over BitTorrent. Now there's a new torrent search engine at bitoogle.com and yes, it does link to torrents of what just *might* be copyrighted materials.
There - that's my bitoogle link.
English is the international language and all over the world companies like to add a little English to product names to enhance the international flavour and freshness of their brand. Here we have a few German examples:
Yes. It really is quite... corny. The image is slightly out of focus, but there's a blurb on the bar that says (in German) "NEW! With genuine USA peanuts!"
Germany is one of those countries where everyone is on a first name basis with each part of their entire digestive system (I just flew back from Germany and the yoghurt they served proudly claimed "Keeps both your stomach and your intestines in excellent condition! Enhances your intestinal function!") so I guess it is only natural that they consider toilets cosy.
[UPDATE 20040715: The blog feedback was fierce. AMG already fixed it so there's no longer a stupid warning and the site works in Mozilla now. Thank god for that. It's still slow as molasses though, and less good than the old site.]
The web's best resource on music - allmusic.com - just took a major step back through a major "upgrade". It's one thing that allmusic decided to start registering users to access some of the content, but the new site only works in IE 5.5 or higher. It seems like a strange time to do IE-only websites, now that IE's browser market share is actually dropping. They must not care about the growing base of Mozilla users or Safari users.
Furthermore the quality of the HTML is extremely low (the W3C validator reports over 200 errors on the front page) and the site is slower and dumbed down. Much less information is immediately available and you need to click more now to get the information you want.
This is simply a disaster of a remake, and it's sad because there is no comparable source of information.
Only good thing: Enhanced music previews.
Your new chip Dankort cannot be used on the internet. The reason is the simple and really stupid one that the validity period is now 8 years, so the standard credit card entry forms at many online shops (e.g. all the ones I happen to use), which assume a maximum validity of 2 years on a Visa card, will not accept the new cards.
In a really stupid statement in Computerworld it is suggested that one can simply get all the online shops to adapt to the new Danish rules. That'll happen.
The new card (which you have to put up with) was, as everyone knows, introduced primarily so the banks can make some more money on transactions. It is said that it also combats card fraud - but protection only increases in physical shops. Online payments are still protected merely by trusting whoever you are dealing with, and now also by simply not being possible.
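The failure is easy to picture. Here is a sketch of the kind of check such a payment form presumably makes, assuming it builds its list of accepted expiry years from a fixed horizon; the function names and the 2 year default are illustrative and not taken from any real shop.

```python
def expiry_year_choices(current_year, years_ahead):
    """Years a checkout form offers in its expiry dropdown."""
    return list(range(current_year, current_year + years_ahead + 1))

def accepts_expiry(expiry_year, current_year, years_ahead=2):
    """True if the form can represent a card expiring in expiry_year."""
    return expiry_year in expiry_year_choices(current_year, years_ahead)

if __name__ == "__main__":
    # A card issued in 2004 with 8 years of validity expires in 2012.
    print(accepts_expiry(2012, 2004))                  # 2 year horizon
    print(accepts_expiry(2012, 2004, years_ahead=8))   # 8 year horizon
```

The fix is a one-line change per shop, which is exactly why it is a problem: it has to be made in every shop.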
The open source .NET project Mono has gone 1.0, which presumably means that Mono now has full copies of all the major APIs included in the first .NET release, or equivalents. I wonder if the Novell acquisition of Ximian sped things along in a dramatic fashion or not.
It looks (from the choice of screenshot samples) as if the GUI libraries for Mono aren't really cross platform. I hope that's just me not paying attention.
(Another spin off of my obsession with Area Man)
Once again the weather got in the way - with several rain showers and a new helping of mud gravy underfoot. The big experiences first:
Fast, cheap, permanent, dense, energy-lean - those are the five requirements we have for the memory devices in our PCs and there's an upstart promising to do all five with carbon nanotubes. I have absolutely no clue if this story is vapor or in fact solid carbon news, but the dream is a nice one.
Jon Udell thinks about a Google OS - what automated metadata generation and filtering can do for our data drowning desktops. His thinking is interesting, and relates to the famous "metadata is crap" slogan. We need to accumulate metadata as a transparent, tacit activity, not a chore. It's unclear if Longhorn and WinFS are Microsoft getting this message or missing this message.
Good find by David Weinberger - a new CMP magazine called Managing Offshore devoted entirely to speeding along your outsourcing and/or offshoring to India and other places.
It offers product news of new contractors for call centers, software development or business processing, as well as helpful tips on multicultural management and on legal problems of accountability and security. We also learn some good biz speak, namely "captive operations", meaning those you have to actually run yourself and cannot or will not outsource. As one of Weinberger's commenters points out:
Ah, pleased to know I'm known in the trade as a captive worker because I work directly for my employer... I'd only just recently got used to being referred to as "a resource" rather than "a person".
Here's an example of the kind of thinking required to do software well as opposed to just doing it. As it turns out this particular example is also becoming quite fashionable as the XML backlash (aka the "XML as programming language" backlash) continues and terms like domain specific languages and little languages get thrown around more often.
The problem at hand is that of word stemming and the solution to the problem is the Snowball language. Stemming is the act of truncating search words to a root for use in search queries (e.g. "words" -> "word"), so that different forms of a word match the same documents. More than 20 years ago Martin Porter created the common standard algorithm in English language stemming, now known simply as The Porter Stemmer. Over the next many years a number of implementations appeared and most of them were in fact faulty. People simply weren't capable of implementing the stemming algorithm correctly. To solve this problem once and for all, Porter designed a little language specifically suited to the definition of stemming algorithms. Along with the language he designed a Snowball-to-C compiler so that the Snowball stemmers would be useful in common programming environments. This story is found in Porter's account of the creation of Snowball.
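To give a feel for what a stemming rule looks like, here is a sketch of just the first plural-stripping step (step 1a) of the Porter algorithm, written directly in Python. It is nowhere near a complete stemmer, and it is not how Snowball expresses the rules; the point is only that even this fragment depends on the order in which the suffixes are tried.

```python
def porter_step_1a(word):
    """Apply only step 1a of the Porter stemmer: strip plural endings.

    Rules are tried longest suffix first:
      sses -> ss, ies -> i, ss -> ss, s -> (nothing)
    """
    if word.endswith("sses"):
        return word[:-2]
    if word.endswith("ies"):
        return word[:-2]
    if word.endswith("ss"):
        return word
    if word.endswith("s"):
        return word[:-1]
    return word

if __name__ == "__main__":
    for w in ["words", "caresses", "ponies", "caress", "cats"]:
        print(w, "->", porter_step_1a(w))
```

Swap the order of the "ss" and "s" rules and "caress" loses an s, which is the kind of mistake the faulty implementations were full of.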
After the appearance of Snowball, stemmers have been submitted to the project for 11 additional languages. The brevity of the snowball stemming algorithms is testament to the usefulness of this particular little language, and the page describing the snowball implementation of the Porter stemmer from the original algorithm is good evidence as well.
So what has this got to do with how software should be done as opposed to how it is done? Simply this: Even relatively small self contained problems like word stemming take an enormous effort to do correctly. And note: By "correctly" I don't even mean "perfectly", since that is certainly not true of algorithmic word stemmers, I just mean "as intended by design". Only a very limited part of all software is written with that level of attention to detail or that amount of upfront design to guarantee a decent chance of success.
It also demonstrates quite exactly the promise of dynamic extensible languages: Good extensible languages afford the construction of little languages for specific tasks within their own programming environment, and little languages afford a clarity of implementation you can't get without domain specific languages.
Then I took the train home, and so unfortunately missed The Hives - who according to the aforementioned Juster had unstoppable energy and held up very well - but so it goes.
Largely a waste of time.
Then Blonde Redhead (the day's first NYC-based band with a "funny" impossible name in the "round square" genre), who didn't really have much to say to us. The sound was at times slightly muddy (pardon the meteorological pun), noisy trio rock in the contemplative genre - spiced here and there with automated loops and a bit of electronica-sounding synth (i.e. not the sound of a pianist, but more like sound textures). It was often hard to hear any actual melody, and the female Japanese singer irritatingly sat right on top of the synth and the backing the whole time, so you couldn't really hear her. The male singer sat better in the mix and had a good, insistent voice. In the middle half hour of the concert the sound was simpler, apparently from the band's latest album, and that was definitely the best part of the concert, with several excellent songs.
Walked past Diefenbach and heard 2-3 songs. They sounded like British eighties rock with elements of the Smiths, the Cure, and Echo & the Bunnymen. In other words, just assorted stylistic elements from British 80s rock. But they didn't really sound like anything you could call "themselves", and I moved on.
A short visit with Dropkick Murphys. Direct, semi-noisy rock with singalong-friendly, folk-like melodies. Not my cup of tea.
The day's most directly entertaining event was Scratch Perverts, who played a whole lot of hip hop and cut it up so it was a joy, in front of an audience that was completely familiar with the all-star hip hop hits they played and naturally delighted by the virtuosic scratch displays that spiced up the hits. They were cool.
Before I went home I saw the first half of TV on the Radio (the day's second NYC-based band with a "funny" impossible name in the "round square" genre). Unfortunately they were really boring to listen to, despite the newspaper hype. A noisy, but also rather lackluster, sound with a frankly irritating singer up front. He opened every song by standing and shout-humming "ååuuhuwhøøøviiaaaarrrjjj" into the microphone for a minute before he started to sing properly (which fortunately went better). Then, as the best stylistic element, there was a backing singer, rare for the genre. But it drowned in monotony and indifference all the same.
And then I went home.
While I'm waiting to go to Roskilde I can prepare myself by enjoying this muddy remake of Cartier-Bresson's classic photo
Deep linking complaints by (c) holder in blog comments, thank you.