Comments on i9606: NCBO Annotator versus MetaMap on GO concept detection

2011-06-29T05:11:17.421-07:00

Try comparing against http://gopubmed.org/ as well!

2010-12-02T21:25:21.883-08:00

I let everything through for this - no cutoff for either system.

Params for NCBO that might matter:
minTermSize: 3
withSynonyms: true
longestOnly: true
wholeWordOnly: true
stopwords: protein, gene, (+defaults)
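For anyone wanting to reproduce these settings, here is a minimal sketch of how they might be bundled into a request body. The key spellings follow the list above, but the exact names the service expects, the "true" string encoding, and the `textToAnnotate` field are assumptions for illustration. The endpoint URL is deliberately left out.

```python
from urllib.parse import urlencode

# Settings as listed in the comment above. Key spellings and the
# "true" string encoding are assumptions, not a confirmed API contract.
ANNOTATOR_PARAMS = {
    "minTermSize": 3,
    "withSynonyms": "true",
    "longestOnly": "true",
    "wholeWordOnly": "true",
    "stopwords": "protein,gene",  # added on top of the default stop words
}


def build_query(text, params=ANNOTATOR_PARAMS):
    """Return a URL-encoded form body for one annotation request.

    "textToAnnotate" is a hypothetical field name for the input text.
    """
    return urlencode({**params, "textToAnnotate": text})
```

Calling `build_query("cell cycle arrest")` gives a string that could be sent as a form body to whichever annotator endpoint is in use.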

The only non-default options I used for MetaMap were -z for single-term processing (when sending single terms such as wiki page titles) and -y to turn on word sense disambiguation (when sending sentences). I also limited it to GO, FMA, and SNOMEDCT for this run, though that should only affect speed.
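A rough sketch of how that choice of flags could be scripted is below. The -z and -y flags come from the comment above. The `metamap` binary name and the use of `-R` to restrict the vocabularies are assumptions that should be checked against the installed MetaMap version. The function only builds the argument list and does not run anything.

```python
def metamap_args(single_term, sources=("GO", "FMA", "SNOMEDCT"), binary="metamap"):
    """Build a MetaMap argument list for one kind of input.

    single_term=True  -> -z (term processing, e.g. wiki page titles)
    single_term=False -> -y (word sense disambiguation, e.g. sentences)
    The binary name and the -R source restriction are assumptions.
    """
    args = [binary]
    args.append("-z" if single_term else "-y")
    if sources:
        args += ["-R", ",".join(sources)]
    return args
```

The resulting list can be passed to `subprocess.run` once the binary name and options have been confirmed locally.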

I did have a look at that paper you mentioned before getting into MetaMap but found it a bit inconclusive. (It was even linked in the first paragraph of this post ;) Depending on the source of the input text, the precision and the apparent recall vary dramatically. I asked Nigam, the first author on that paper, and he said that Mgrep would generally give higher precision and lower recall, but that it was probably best to try them both and see for myself how they worked on my data.
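To make "try them both and see" concrete, a small sketch of scoring one system's output against a hand-checked set of concepts follows. The function name and the idea of a gold set are illustrative assumptions, not how either tool or the paper computes its numbers.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for predicted concept IDs against a gold set.

    Recall here is only "apparent" recall: it is measured against whatever
    gold set is supplied, which may itself be incomplete.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Running this once per system on the same input text and the same gold set gives directly comparable numbers.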

2010-12-02T20:40:20.664-08:00

I'm curious, what score cutoff did you use? Also, some flags like abbreviations may matter.

For precision numbers you should check out:
"Comparison of concept recognizers for building the Open Biomedical Annotator"
http://www.biomedcentral.com/1471-2105/10/S9/S14

Also, the Annotator uses a MetaMap-like tool from Michigan named Mgrep. It would be nice if that were open source like MetaMap.