Thursday, February 24, 2011

Talk or Technology?

Talk or technology -- which is more expensive? Talk, it seems.

I spent the weekend with a friend of mine who runs one of the bigger semantics companies (he's a pedlar of used meanings, I like to tell him -- a kind of wholesale supplier of double entendres). He's very active in the community, and he follows the fate of all the other startups and joint ventures that have sprung up over the last decade or so, and the machinations of their customers, the tech-savvy media, the analyst firms and so on. Several months ago he told me a story about Corporation X (let's call them Turnip for no particularly good reason), Startup Y (let's call them Cabbage) and a certain popular text processing framework, which, if you're sitting comfortably, I shall relay for your general delectation and personal improvement.

Now, Turnip are a megacorp, the biggest publisher of one sort or another, and supplier of diverse databases and data streams to the jobbing information worker. In common with pretty much every other publisher out there (except Cory Doctorow) Turnip can see the writing on the digitally revolutionary wall, and are casting around for ways to make their offerings more exciting than the competition. (Whether they can make their expensive, closed and stuffy stuff seem more attractive than the new free and open world is not something I'd bet the house on, but there you go.) One obvious route is to use text analysis to hook their text corpora up to conceptual models and bung the results into a semantic repository. Hey presto, all sorts of new and nifty search and browsing behaviours suddenly become possible. So publishers have been pretty keen customers of both the GATE team and my friend's company in recent years.

Turnip realised the importance of text processing in their collective future some time ago, and, after reporting work based on GATE up until a few years back, decided to take the function in-house. They bought Cabbage, one of the most active text analysis startups of the time. We assumed that they were going to use Cabbage tech to replace the stuff they'd done with GATE...

Fast forward to the present, and my friend was chatting to one of the people who run the publishing side of things at Turnip. Surprise surprise: the Cabbage stuff is nowhere to be seen, and they're still using the old and trusty Volkswagen Beetle of text processing.

Well who'd have thought it.

So, coming back to the question of talk vs. technology, we can conclude that the good people at Cabbage, who were big enough talkers to see themselves bought for a large chunk of readies by Turnip, had the right approach. Sad old technologists like me and mine just don't cut the mustard in the self-promotion stakes.

In fact, I've seen this pattern in a number of contexts. The people best at telling you about why you need them are generally too busy doing just that to really get to grips with all that inconvenient science and engineering that needs doing to actually make a practical difference. Therefore I have formulated Cunningham's Law: the quality of the work varies in inverse proportion to the quality of the slideware. (Next time you're unlucky enough to be bored to tears by one of my talks, please bear this in mind.)

To finish, a hint, free of charge, for those who have text processing problems to solve but would prefer not to spend large sums of money on cabbage and the like. You need open systems, you need to measure from the word go, and you need a process that incorporates robust mechanisms for task definition, quality assurance and control, and system evolution. And you need a pool of available users and developers, training materials, etc. etc. So mosey on over to http://gate.ac.uk/ :-)
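For the curious, here is roughly what "measure from the word go" can look like in its simplest form: score the system's annotations against a hand-annotated gold standard using precision, recall and F1. This is a minimal sketch in plain Python, not GATE's own evaluation tooling. The function name `prf`, the `(start, end, type)` tuple layout and the sample annotations are all made up for illustration, and it assumes strict matching (same offsets and same type).

```python
def prf(gold, predicted):
    """Strict-match precision, recall and F1 over two sets of annotations.

    Each annotation is a hypothetical (start, end, type) tuple; an
    annotation counts as correct only if all three fields match.
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: three gold annotations, three system annotations,
# one of which has the right span but the wrong type.
gold = {(0, 6, "Organization"), (21, 28, "Organization"), (40, 44, "Person")}
predicted = {(0, 6, "Organization"), (21, 28, "Person"), (40, 44, "Person")}

p, r, f = prf(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Run something like this on every change to the pipeline and you at least know whether things are getting better or worse, which is more than the slideware will tell you.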
