OK, I'm mad and probably shouldn't write the following today. Oh well.
Now, here is why I am mad. I've had a paper rejected by the Journal of Biomedical Informatics on the basis of one review. It took two months to get this review. The review does not seem fair and certainly does not provide useful guidance about how to improve the quality of the science described in the paper. Here is JBI's response in its totality, with some embedded reactions from me in red.
Ms. No.: JBI-08-163
I requested that my advisor be the corresponding author on the paper because I would be traveling right after the submission and am hoping to relocate soon.
Title: OntoLoki: an automatic, instance-based method for the evaluation of biological ontologies on the semantic Web
Corresponding Author: Dr. Mark Denis Wilkinson
Authors: Benjamin M Good; Gavin Ha; Chi Kin Ho;
Dear Dr. Wilkinson,
Experts in the field have now reviewed your paper, referenced above. Based on their comments, we regret to inform you that we are unable to accept your manuscript for publication in the Journal of Biomedical Informatics. It's a little odd that we only got to see Reviewer 2's comments. I don't know if anyone else reviewed it or not.
We have attached the reviewers' comments below to help you to understand the basis for our decision. We hope that their thoughtful comments will help you in future submissions to the JBI and in your future studies.
Journal of Biomedical Informatics, Editorial Office
525 B Street, Suite 1900
San Diego, CA 92101-4495
Phone: (619) 699-6392
Fax: (619) 699-6211
Good and his colleagues present OntoLoki, a very interesting approach for data-driven ontology evaluation. The novel idea is that the quality of ontologies can be measured automatically even when the ontologies have no, or very few, formal restrictions on class membership. For poly-hierarchically organized classes, suitable datasets with positive examples - i.e. instances with properties - as well as negative examples are composed. Machine learning algorithms are used to determine empirically those rules (patterns of properties) that allow predicting class membership reliably. In other words: the ideal situation is to find instances of classes like "Cat" with properties like "furry" allowing their consistent assignment to the class "Cat" and their discrimination from neighbouring classes like "Bird". Yep, that pretty much sums up the general idea. So far, so fair.
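The cat/bird summary above maps directly onto a tiny rule learner. As a rough illustration (the data and function names here are mine, invented for this sketch - the actual OntoLoki implementation used standard machine-learning algorithms, not this toy code), a single-property version of the idea looks like this:

```python
# Toy sketch of the OntoLoki idea: learn a property-based rule that separates
# instances of a class from instances of its neighbour classes. All names and
# data are illustrative, not taken from the real system.

def learn_membership_rule(positives, negatives):
    """Return one property held by every positive instance and by no negative
    instance, or None if no such single-property rule exists. Instances are
    represented as sets of property names."""
    shared = set.intersection(*(set(p) for p in positives))
    for prop in sorted(shared):  # deterministic order for the sketch
        if not any(prop in neg for neg in negatives):
            return prop
    return None

cats = [{"furry", "four_legged"}, {"furry", "whiskers"}]        # class "Cat"
birds = [{"feathered", "winged"}, {"feathered", "two_legged"}]  # neighbour "Bird"

print(learn_membership_rule(cats, birds))  # prints: furry
```

A rule like "furry implies Cat" recovered this way is exactly the kind of candidate formal class restriction that, as argued later in the paper, could be fed back to ontology maintainers.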
There are a lot of inherent challenges with this approach that are addressed by the authors, e.g.: OK. At this point the reviewer has pointed out that we correctly identified challenges with empirical approaches to ontology evaluation and discussed them with respect to our approach in the paper. Both of these challenges, context-sensitivity and data dependency, are fundamental to any methodology that is based on the use of data to help answer a question. Keep in mind that the main point of the paper is to describe and evaluate a method. To do so, we explain it and then test it out in a variety of different scenarios (different ontologies and different datasets). In some cases it is successful and in others it is not. By describing the results from all of these experiments we faithfully represent the realities of applying the method.
- the dependence on the context (chapter 1) and on the way of determining instances and their properties (chapter 1.2.1)
- the problem of sufficient number of instances for every class for estimating a class predictor (chapter 1.2.4)
These are common problems when using empirical approaches. However, the reviewer has doubts about the suitability of the OntoLoki approach for evaluating ontologies. The authors themselves admit that especially the results of the Cellular Component experiment are suboptimal, see chapter 3.2.1 (only 17% is evaluated) and chapter 3.2.3 (results are not overwhelmingly illuminating).
The reviewer criticizes the method by pointing out our own admissions regarding problems encountered with the dataset assembled for the evaluation of the cellular component branch of the Gene Ontology, without actually saying anything about the method itself. Perhaps criticism could fairly be placed on our data collection methods for that particular ontology. However, the point was not to evaluate that ontology; it was to evaluate the proposed method. That 17% number resulted because we didn't collect enough instances to evaluate the other classes. If we had collected more data, the number would have been higher, but that is completely irrelevant to the utility of the method and our evaluation of it. In fact, by including data like that, we much more accurately present both the positive and negative aspects of the method. Perhaps next time we should simply obscure any negatives to avoid such criticism.
The MAIN PROBLEM the reviewer has with this approach: Now, here is where this starts to get ridiculous. "the ontologies should be enriched by such formal definitions". Well, we couldn't agree more! That is one of the main reasons we did this! OntoLoki provides a starting point for doing just that!
OntoLoki tries to solve a structural classification problem empirically that originates in poor defined ontologies. Instead of (suboptimally) trying to determine the consistency of ontologies with no formally defined restrictions on class membership the ontologies should be enriched by such formal definitions on class membership, see http://bioinformatics.oxfordjournals.org/cgi/content/abstract/22/14/e530.
As the reviewer seemed to understand in the summary of the paper above, the method is intended to be applied to ontologies (or whatever you want to call class polyhierarchies used in classification situations) that aren't necessarily formally defined. The rules that are learned could be used to suggest possibilities for formal class restrictions that are based on the data the classes are already associated with.
As a matter of fact, the OntoLoki method can actually be used on formally defined ontologies to identify candidate expansions of other definitions. We recognize the importance of these definitions and the reasoning they allow for, that is why the reference so generously provided above was one of the main citations in the paper! In fact, the ontology described in that paper was used as a benchmark of quality for other ontologies - thus providing us with a means to evaluate our method.
In the introduction Jeremy Rogers and later on Barry Smith are referenced as proponents of ontology evaluation. However, these and other researchers in the field of biomedical ontology are mostly concerned with the quality of explicit formal definitions and the structure of ontologies, see http://ontology.buffalo.edu/evaulation.html. What exactly does the "however" mean here? Indeed, both of these scholars are involved in ontology evaluation and I would say are, in fact, proponents of the idea. Why the contrasting "however"? The quality of formal definitions and the structure of ontologies (which can of course result directly through inference applied to those formal definitions) are certainly aspects of relevance to the domain of ontology evaluation.
The structure of ontologies must certainly have something to do with the inferred or asserted class hierarchies they produce. The OntoLoki method is designed for evaluating these hierarchies. So why this statement? Perhaps you could argue that the method is not useful in achieving the task, but it doesn't make any sense to say that the task is irrelevant, as seems to be implied here.
The very idea of ontologies is to have explicit criteria for deciding class membership of instances opposed to ambiguous language terms denoting those classes. If there are artefacts with no formally defined restrictions they should not be called ontology. Alright, now we've got to the essence of this so-called "review". The reviewer doesn't believe that the things the method was built to evaluate should be called ontologies. So they don't believe the Gene Ontology is an ontology, and they don't believe that most of the ontologies in the OBO Foundry are ontologies. OK, fine. Perhaps the reviewer should have suggested that we change the title and use a different word to describe whatever it is these things are. The complaint has absolutely nothing to do with the manuscript! The maddening thing is that we have been (sometimes very lonely) proponents of the expanded use of axiomatized, property-based definitions in biological ontologies for years and are still very much of this view. To be criticized for the community's fairly slow uptake of these methods makes my head feel like it's going to explode.
However, the machine learning methods are very interesting for supporting different purposes in the context of REAL ontologies WITH formal restrictions on class membership, see chapter "Making use of OntoLoki" in the discussion section. The whole paper should be rewritten oriented to those other supporting purposes in the context of developing, using and evaluating ontologies. Well, thanks. It seems that some of the applications of the method (and the software we developed) are "very interesting" but only in the context of "REAL ontologies". As it turns out, the method and implemented code could be applied directly to REAL ontologies without alteration. (Note that the capitalization is from the reviewer.)
This paper, submitted as a paper in the Biomedical Informatics Journal, is a copy of a Technical report, see http://bioinfo.icapture.ubc.ca/bgood/OntoLoki_14.pdf. That this is even mentioned as a presumed negative is outrageous. The report (which does in fact contain the same content as the submission) is not a peer-reviewed publication; it is simply a very informal pre-print. Posting it is perfectly in accordance with Elsevier's rules when it comes to pre-prints, rules the reviewer is clearly not aware of.
It is far too long and should conform to the editorial guidelines of the journal. First, I actually agree that it is probably a bit too long. We discussed this at some length before deciding to submit the full version and, in the end, decided that the length was warranted in this case in order to present the argument and experiments in full. We could shorten it, and likely will when we resubmit to a different (open-access) journal, but the absence of a length limit was actually one of the reasons we chose JBI - they explicitly state that there is no "arbitrary limit on the length of individual articles". The submission was well within the editorial guidelines of the journal - guidelines which the reviewer, again, was clearly not familiar with.
Ok, my rant is over now, the red has drained out of my face and I can no longer hear my heart beating in my ears, so I will switch back out of the red to conclude.
So, Reviewer #2, who are you?
One of the more impressive people I met at SciFoo told me that he has been signing his reviews for years to "keep himself in check". If reviewers had to sign their reviews, perhaps they would be forced to do a better job. Good-quality reviews (whether arguing for rejection or acceptance) would provide another form of publication - another way for scientists to get credit for the work that they do. Are you up to it? Sign your next review.