Thursday, December 2, 2010

NCBO Annotator versus MetaMap on GO concept detection

For the past little while I've been working on extracting candidate Gene Ontology (GO) annotations from the hypertext of articles in the Gene Wiki. In this work I have been using two of the premier tools for concept recognition in bioinformatics: MetaMap from the National Library of Medicine and the Annotator from the National Center for Biomedical Ontology. As elaborated on in this article, these systems work very differently and, depending on the input text, can yield substantially different results. Since I was at a loss when I first had to decide which tool to use, I thought I'd share some of the results of my experiments in the hope that they might help someone else with the same decision.

Input: text from about 10,000 Gene Wiki articles (both complete sentences and the titles of pages linked to from a gene page)
Output: concepts from the GO with some form of linguistic match to the text.

Current Rounded Results:
GO concepts detected: Annotator about 20,000, MetaMap about 35,000, Intersection about 14,000 (See diagram below)

In this case MetaMap is the clear winner in terms of recall.  Based on some casual manual inspections and some less casual comparisons to other GO annotation sources, the Annotator earns a very slight edge in terms of precision (it produces slightly fewer false positives).  For my application, I'm pooling the results of both tools.
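
For anyone who wants to do the same bookkeeping, here is a minimal sketch of the overlap arithmetic, using only the rounded counts above and assuming they refer to distinct matches. The function name and the handful of GO identifiers are placeholders chosen for illustration; this is not the code used for these experiments.

```python
def overlap_summary(n_a, n_b, n_both):
    """Summarize the overlap of two result sets given their sizes.

    n_a, n_b: number of items found by tool A and tool B
    n_both:   number of items found by both tools
    """
    union = n_a + n_b - n_both  # inclusion-exclusion
    return {
        "union": union,
        "only_a": n_a - n_both,
        "only_b": n_b - n_both,
        "jaccard": n_both / union,  # shared fraction of the pooled set
    }


# Rounded counts from this post: Annotator, MetaMap, intersection.
summary = overlap_summary(20_000, 35_000, 14_000)
print(summary)

# Pooling itself is just a set union. These IDs are illustrative only.
annotator_hits = {"GO:0006915", "GO:0005515"}
metamap_hits = {"GO:0005515", "GO:0008283", "GO:0016020"}
pooled = annotator_hits | metamap_hits
shared = annotator_hits & metamap_hits
```

With the rounded numbers, pooling gives roughly 41,000 matches, of which about 6,000 come only from the Annotator and about 21,000 only from MetaMap. Treat these as ballpark figures, since the inputs are themselves rounded.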

Another important consideration is speed of execution. For my experiments, a locally installed version of MetaMap took about twice as long to run the same jobs as the Annotator Web service, despite the network latency the Annotator has to deal with. The output from both tools was fairly easy to parse. (I found it much easier to simply parse the results myself than to work with the UIMA wrappers that are available for both systems.)

So there you go, same same but different.


Leon French said...

I'm curious, what score cutoff did you use? Also, some flags like abbreviations may matter.

For precision numbers you should check out:
"Comparison of concept recognizers for building the Open Biomedical Annotator"

Also, the Annotator uses a MetaMap-like tool from Michigan named Mgrep. It would be nice if that were open source like MetaMap.

Benjamin Good said...

I let everything through for this - no cutoff for either system.

Params for NCBO that might matter:
minTermSize: 3
withSynonyms: true
longestOnly: true
wholeWordOnly: true
stopwords: protein, gene, (+defaults)

The only non-default options I used for MetaMap were -z for single term processing (when sending single terms like wiki page titles) and -y to turn on word sense disambiguation (when sending in sentences). I also limited it to GO, FMA, and SNOMEDCT for this run, though that should only impact speed.

I did have a look at that paper you mentioned before getting into MetaMap but found it a bit inconclusive. (It was even linked to in the first paragraph of this post ;). Depending on the source of the input text, the precision and the apparent recall vary dramatically. I asked Nigam, the first author on that paper, and he said that Mgrep would generally give higher precision and lower recall, but that it was probably best to try them both and see for myself how they worked on my data.

Anonymous said...

Try to compare!