Computing Text: text mining

Tuesday, April 17, 2012

Open access journals

It has long been obvious that the days of closed scientific publishing are just as numbered as those of all restrictive practices. In the age of the free flow of bits, sharing information will only ever get easier (as Cory Doctorow is fond of pointing out).

As workers in text mining we find it frustrating, of course, that journal access restrictions often stop us applying our algorithms as widely as would be useful for the scientific users of our systems. (The results are real; see for example our recent contribution to a PLoS One paper about oral cancer, "Incorporation of prior information from the medical literature in GWAS of oral cancer identifies novel susceptibility variant on chromosome 4 - the AdAPT method", in press April 2012.)

A recent report suggests that the losses associated with these restrictions are more than £100 million per year:

Text mining, for example, is a relatively new research method where computer programmes hunt through databases of plain-text research articles, looking for associations and connections – between drugs and side effects, for example, or between genes and disease – that a person scouring through papers one by one may never notice.

In March, JISC, a government-funded agency that champions the use of digital technology in UK universities for research and teaching, published a report. This said that if text mining enabled just a 2% increase in productivity for scientists, it would be worth £123m-£157m in working time per year.

But the process requires research articles to be accessed, copied, analysed and annotated – all of which could be illegal under current copyright laws.

(The Guardian, 9th April 2012.)
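To make the Guardian's description a little more concrete, here is a minimal sketch of the "looking for associations" step: counting how often a drug name and a side-effect name turn up in the same sentence. Everything in it is an assumption made for illustration (the term lists, the toy article strings, and the function names are all invented), and it is far cruder than a real text mining pipeline, which would need proper tokenisation, entity recognition and some handling of negation.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical vocabularies and toy "articles", invented for illustration only.
DRUGS = {"drugx", "drugy"}
SIDE_EFFECTS = {"headache", "nausea"}

ARTICLES = [
    "Patients given drugx reported headache. Drugy was well tolerated.",
    "Headache was more common with drugx than with placebo.",
    "Some patients on drugy reported nausea. Drugx caused no nausea here.",
]


def sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def cooccurrences(articles, left_terms, right_terms):
    """Count sentences in which a term from each vocabulary appears together."""
    counts = Counter()
    for article in articles:
        for sentence in sentences(article):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            for pair in product(words & left_terms, words & right_terms):
                counts[pair] += 1
    return counts


counts = cooccurrences(ARTICLES, DRUGS, SIDE_EFFECTS)
for (drug, effect), n in counts.most_common():
    print(f"{drug} / {effect}: {n}")
```

Note that the last toy sentence ("caused no nausea") still counts as an association: plain co-occurrence knows nothing about negation. And note that none of this can even start unless the articles can be accessed and copied as plain text in the first place, which is exactly what the restrictions prevent.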

It is time to open up!

Tuesday, February 9, 2010

I love GATE users (though I couldn't eat a whole one).

Users. A bit of a nuisance. They insist on asking questions, testing limits, finding bugs. Around 5 years ago, after something like a decade of giving away software, the GATE team felt very like our old systems administrator, who had a habit of saying "the only secure network is one without any computers attached": we knew that our user community was a good idea in principle, but we really rather wished they'd all leave us alone. In fact we did our best to discourage GATE users: we stopped doing regular releases, we ignored the mailing list, and if we could have figured out how to take the thing out in the woods and bury it under a tree we probably would have.

We failed: GATE refused to die, people obstinately continued to use it, and, as we used it ourselves for all sorts of projects, more and more features were added, quality and functionality improved, and every time we decided it was all over someone would turn up with a pile of cash and a novel problem. So we conceded defeat and resolved to succeed. I think.

This is all a long-winded way of explaining our shift in emphasis over the past year or so: we are introverts no longer, but happy and well-adjusted user-friendly liveware. Text processing for ever! Forwards to world domination, comrades! Oops, wrong blog.

So now we're back to actively supporting our users and growing our community. We've upgraded the documentation, we're running regular training weeks and developer sprints, and we've built up several new products and services around the core GATE code to cater for more of the cases we've seen of people trying to deploy text processing over the years (15 of which, incredibly, have passed under the bridge since we first set metaphorical pen to digital paper for GATE version 0.1). We've also revamped the website and no longer look like something that might have been produced at CERN circa 1995.

So far the response has been quite astonishingly positive... so perhaps users aren't such a bad thing after all.

Hamish Cunningham