Thursday, 26 July 2012

Why choosing ontologies should not be like choosing Pepsi or Coke

I've just returned from the International Conference on Biomedical Ontology (ICBO) which is the biggest conference in the field of bio-ontology. I was invited to sit on a panel which was somewhat provocatively titled 'How to deal with sectarianism in biomedical ontology' during which we discussed how we might better get along and defuse some of the issues that have plagued the community over the last decade or so.

Fish-Tree-Pepsi-Coke. I know what you're thinking but
just go with me on this and read the post.
The range of views of the panelists was interesting, in part because they were not as extreme as one might have expected, or at least as they might have been five years ago. I'll attempt to summarise the panelists thoughts based on their initial two minute 'pitch' slide, I've included a link to the participants slide:

Alan Rector - Alan mentioned the need for humility and to understand what a given ontology is designed to do before we criticise it as they are can be made for different purposes. He also mentioned the need for proper evaluation.

Chris Stoeckert - Chris stated that sectarianism is inevitable and that he had chosen his sect which was BFO/realism. Ultimately, he said the biggest sect wins and that this is the OBO Foundry, which, as a community effort, we should join.

Barry Smith - Barry suggested that any ontology of any worth should be developed by an ontologist that has signed up to a 'code of ethics' which includes principles of reuse, aggressive testing in multiple real world applications and of  'thinking first' before adding a term or definition.

My own stance was that in general, I don't think a sectarian approach is very useful, not only because it causes political divides within our community, but because it also alienates us from other communities who, from the outside looking in, may be less likely to engage with us. And that hurts us because above all else we need users, more than they need us. I also think competition is fine. This is in general how science has worked for quite some time, moreover, if it didn't then we would never have made leaps forward by listening to the minority voices on issues such as evolution and Copernican heliocentrism. 

But underlying everything I said is my desire to see ontology engineering become a first class citizen and mature as a discipline. My job, in part, entails building ontologies for millions of data points with much diversity; 1,000 species, 30,000 experiments, 250,000 unique annotations. If people are willing to call out that I should be using ontology a instead of ontology b, then I need to know why, and this can not be based on subjective or political opinions. I want to see the development of formal, objective metrics to determine whether or not one ontology is better than another, so that we can really measure these artifacts and have something scientific to base our judgements on.

Alan Rector also rightly points out ontologies are built for different purposes so we need to factor that in. As Einstein said "if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid." If Amazon used an ontology to power their website, it would be hard to argue that particular fish is not a good artifact as the Amazon application seems to work pretty well.

I've also heard many comments from certain quarters about an 'ontology crisis' wherein ontologies of poor quality are now everywhere to be seen, polluting the pool. This sort of comment is similar to comments made during the software crisis of the 1960s, and, given that funding for ontologies can be hard to come by, we can ill afford to overrun. They reacted to this by developing software engineering processes and methods which, over time, helped enormously, though they did not resolve all the issues, cf. no silver bullet. Whatever your stance, it is hard to argue against wanting proper processes and methods for building in quality; nobody wants a blue screen of death on a plane's fly-by-wire system during a transatlantic flight. Similarly, nobody wants a medical system using an ontology to give incorrect results. An ontology for your photo collection, we care less so.

So what do we need? Here's my list;
  1. A formal set of engineering principles for systematic, disciplined, quantifiable approach to the design, development, operation, and maintenance of ontologies
  2. The use of test driven development, in particular using sets of (if appropriate, user collected) competency questions which an ontology guarantees to answer, with examples of those answers - think of this as similar to unit testing
  3. Cost benefit analysis for adopting frameworks such as upper ontologies, this includes aspects such as cost of training for use in development, cost to end users in understanding ontologies built using such frameworks, cost benefits measured as per metrics such as those above (e.g. answering competency questions) and risk of adoption (such as significant changes or longer term support).
In a sentence; making public judgements on ontologies should be a formal, objective and quantifiable process and less like deciding whether you'd prefer a Pespi or a Coke.

Incidentally, I prefer Coke.

No comments:

Post a Comment