Wednesday, April 30, 2008

ED as a company

Too bad it wasn't ours...

This video of an interview with ZigTag sums up the basic ideas of the Entity Describer project very nicely. I kind of knew this was coming, because I thought it was such a good idea, but it is a touch heartbreaking to see something you've worked on produced, packaged, and sold by someone else. Makes me wonder why I ever went to grad school and what I'm still doing here.

My favorite quote from the interview was "we'll have a better understanding of the Web than Google does.." - nothing like being bold. It has a ring of truth, in that semantic tags added by people rather than by a text-indexing algorithm have some major advantages - notably higher precision and the ability to tag non-textual content. However, it's not going to know anything about the vast majority of the Web that remains untagged by people. There are ways their knowledge base can be used in that effort, but that's old news.

My least favorite quote from the interview relates to that knowledge base - "we want you to use tags that are actually defined by us". For most people that might be alright; as long as they can generally find the tags they are looking for and the service works for them, they will probably be happy. However, it seems that many people (especially developers) might like to have some control over both which semantic tags they have access to and what the semantics of those tags actually are. By hooking up to both the ontologies of the Semantic Web and the topics in Freebase, the Entity Describer concept keeps that aspect of the system completely open. That openness is important and remains, perhaps, the best reason to keep going with the project.


Pedro Beltrão said...

ouch :) well, it does say that you are on a very good path. I am sure that there are many questions open to explore and you are working on a particular niche in which you have a lot of knowledge.

Benjamin Good said...

I suppose so. Perhaps it will help keep us focused on specific problems related to bioinformatics. The aspect of user control over the vocabulary is pretty big for us, but likely not for them. I'm pretty sure ED is about the only way you can comfortably tag with Gene Ontology terms right now, and the fact that you can easily load your own terms or whole terminologies does distinguish it. Can't say I'd be too surprised if NCBO or related institutions eventually produce something like this, though.