Thursday, July 10, 2008

OntoLoki lives!


(Update June 8, 2012.  This paper was not accepted at first submission (see sad story), but can be accessed as Chapter 4 in my dissertation.  I've had enough interest in the concepts it contains that it is probably worth resubmitting it somewhere, someday, somehow...)


As noted in previous posts labeled with the tag OntoLoki, I've been working off and on for a few years now (yikes) on a program for automatic ontology evaluation.  Now we are getting ready to submit our first paper on the subject and would like to open things up for comments.  I labeled it as a technical report in hopes of starting a tradition of such things in our laboratory.  It seems like a good way to keep the locals on track and to get another round of reviews before things go out into the scary world of official peer review.  I suppose I could drop this into Nature Precedings again, but I'm tempted to wait until it's gone through more revision cycles first, since that is likely to form a more permanent record than I am really ready to commit to.


Here is the longish abstract to whet your appetite.  I was thinking of blogging the rest of the document as distinct posts for each section - thoughts on that?  
As always I really appreciate any time you spend here and any ideas that you choose to share.

Abstract
Background: The delineation of clear, logical definitions for each class in an ontology and the consistent application of these definitions to the assignment of instances to classes are important criteria for ontology evaluation. If ontologies are specified with formal, property-based restrictions on class membership, then such consistency can be checked automatically using existing technology. If no such logical restrictions are applied, however, as is the case with many current biological ontologies, there are no automated methods for measuring the semantic consistency of instance assignment on an ontology-wide scale, nor for inferring the patterns of properties that might define a particular class.

Objective: The aim of this study is to identify, implement, and test a new method for automatic, data-driven ontology evaluation that is suitable for the evaluation of ontologies with no formally defined restrictions on class membership. The method should quantify the consistency of instance classification within such an ontology based on patterns of properties found to be associated with the instances of particular classes.

Design: We constructed a program that takes as its input an OWL/RDF knowledge base containing an ontology, instances associated with each of the classes in the ontology, and properties of those instances. For each class, it outputs: 1) a rule for determining class membership based on the properties of the instances and 2) a quantitative score for the class that reflects the ability of the identified rule to correctly predict class membership for the instances in the knowledge base. To test the proposed method, we constructed a series of knowledge bases that varied from perfectly consistent through to completely random and evaluated each one using the implementation. In addition to this artificial control study, two other well-known biological ontologies were evaluated using public data to provide indications of the behavior of the system in realistic contexts.
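
To make the input/output shape in the Design paragraph concrete, here is a minimal sketch in Python. It is not the OntoLoki implementation: the abstract does not say which rule learner or scoring measure is used, so this sketch assumes the simplest possible stand-ins (a rule of the form "has property X" and an F1 score for how well that rule recovers the asserted class members). The knowledge base, the property names, and the function name are all hypothetical, and a plain dictionary stands in for the OWL/RDF input.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical stand-in for an OWL/RDF knowledge base:
# instance name -> (properties of the instance, classes it is asserted to belong to)
KnowledgeBase = Dict[str, Tuple[Set[str], Set[str]]]


def best_single_property_rule(
    kb: KnowledgeBase, target_class: str
) -> Tuple[Optional[str], float]:
    """Find the single property whose presence best predicts membership
    in target_class, and return (property, F1 score of that rule).

    Assumption: a one-property rule and F1 are illustrative choices only;
    the abstract does not specify the learner or the score.
    """
    all_properties: Set[str] = set().union(*(props for props, _ in kb.values()))
    best_prop: Optional[str] = None
    best_f1 = 0.0
    for prop in sorted(all_properties):  # sorted for deterministic tie-breaking
        tp = fp = fn = 0
        for props, classes in kb.values():
            predicted = prop in props
            actual = target_class in classes
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_prop, best_f1 = prop, f1
    return best_prop, best_f1


# A small, consistently classified toy knowledge base (made-up data).
consistent_kb: KnowledgeBase = {
    "p1": ({"has_kinase_domain", "binds_ATP"}, {"Kinase"}),
    "p2": ({"has_kinase_domain"}, {"Kinase"}),
    "p3": ({"binds_ATP"}, {"Transporter"}),
    "p4": ({"has_transmembrane_region"}, {"Transporter"}),
}

# The same instances with one assignment changed (p2 moved to Transporter),
# mimicking a less consistent knowledge base.
noisy_kb: KnowledgeBase = dict(consistent_kb)
noisy_kb["p2"] = ({"has_kinase_domain"}, {"Transporter"})

rule, score = best_single_property_rule(consistent_kb, "Kinase")
noisy_rule, noisy_score = best_single_property_rule(noisy_kb, "Kinase")
print(f"consistent: rule={rule!r}, score={score:.2f}")
print(f"noisy:      rule={noisy_rule!r}, score={noisy_score:.2f}")
```

In this toy setting the score for the class drops once an instance is reassigned, which is the kind of behavior the control study below is designed to check on a larger scale. A real evaluation would hold out data rather than scoring the rule on the same instances it was learned from.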

Results: In the first experiment, the method produced direct quantitative assessments of the different versions of the knowledge bases that correlated directly with the known level of consistency of instance assignment for each knowledge base. The evaluations of the other ontologies indicated that the method was successful at detecting relevant patterns associated with the instances of classes in real biological ontologies based on publicly available data.

Conclusion: The results indicate that the suggested method can be used to conduct objective, automatic, data-driven evaluations of biological ontologies without formal class definitions with regard to the property-based consistency of instance assignment. This inductive method complements existing, purely deductive approaches to automatic consistency checking, offering the potential to help not just in the ontology engineering process but also in the knowledge discovery process.

3 comments:

Anonymous said...

Hi Ben, looks interesting, have you thought of doing anything for OWLED this year along these lines?

Benjamin Good said...

Hi Duncan, probably not this year, but it would be an interesting place for this kind of work. For the moment my conference-going is really on hold until after I finish up. (I'm also on my honeymoon for the next four weeks so won't be submitting or doing anything new until the end of August).

Anonymous said...

Happy Honeymoon! (and happy scifoo too?)