Monday, 2 July 2012

Has the War of the Words alienated Google?

A great wordsmith once wrote:

What's in a name? that which we call a rose
By any other name would smell as sweet

This is taken from Shakespeare's play Romeo and Juliet, and it captures two themes I'd like to briefly explore in this post. Firstly, that the essence of what a thing is does not rely on its name alone. Secondly, that feuding about a thing can unintentionally damage it.

Your search did not match any documents.
Did you mean Brazillian Shakespeare Horse Jupiter Mountain?
'And what does that have to do with Google?', you might well ask. In May 2012 Google announced the Knowledge Graph search enhancement, which they headlined as 'things not strings'. This led to much discussion in various press outlets over the last month or so about how Google were going to use this new way of indexing web pages to give you more intelligent searching, such as dealing with homonyms - words that share a spelling but have different meanings (e.g. tire - car wheel, tire - sleepy). This sounds great and, in fact, I think it is. But the idea is nothing new to anyone who has been working in ontology, the semantic web or more specialist areas like biomedical data curation. It's one of the driving use cases of ontologies - I refer you to my blog on what an ontology does. So how does Google's new Knowledge Graph differ? I noted with particular dismay that Google's blog did not contain the words 'ontology' or 'semantic'. Various press stories which talk about this hint at it without saying it, and many proclaim this to be fundamentally new technology, with a tip of the hat to Yahoo's 2009 paper.
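To make the homonym case a little more concrete, here is a minimal sketch in Python of what 'things not strings' could look like. It is not Google's implementation, which I have no insight into; the identifiers, the `related` terms and the `lookup` function are all invented for illustration, and the only data used is the 'tire' example above.

```python
# Toy illustration only: every identifier and term below is hypothetical.
# Two distinct "things" share the same string label, 'tire'.
THINGS = {
    "thing:car_tire": {"label": "tire", "related": {"car", "wheel"}},
    "thing:tire_verb": {"label": "tire", "related": {"sleepy"}},
}


def lookup(label, context, things):
    """Return the ids of things with this label, best context match first."""
    context_words = {word.lower() for word in context}
    matches = [
        (thing_id, record)
        for thing_id, record in things.items()
        if record["label"] == label.lower()
    ]
    # Rank candidates by how many of their related terms appear in the context.
    ranked = sorted(
        matches,
        key=lambda match: len(match[1]["related"] & context_words),
        reverse=True,
    )
    return [thing_id for thing_id, _ in ranked]


if __name__ == "__main__":
    print(lookup("tire", ["car"], THINGS))
    print(lookup("tire", ["sleepy"], THINGS))
```

The string 'tire' alone is ambiguous, so the lookup returns both things; it is the surrounding terms that decide which one comes first. A real ontology would use typed relationships rather than a flat set of related words, but the principle is the same.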

Credit to Google though - they have actually implemented something and they are using it, which is more than a lot of practitioners do. But the question remains - why are the words 'ontology' and 'semantic web' missing from these articles, including Google's own? An ontology by any other name is still an ontology - concepts, relationships, graph nodes, edges, types, instances, whatever you call it.

I think the answer may lie in my second theme: the War of the Words. In biomedical ontologies, a field in which I am closely involved, there is an undercurrent of strong opinions and cutting debate with the aim of building consensus. Undercurrent is probably inaccurate because it's actually highly visible - it's more like a tsunami. See the 2010 Merrill and Smith papers for a peek at some of this. Ontologies, even from within the community, divide opinion, engender indignation and entrench viewpoints, and to those on the outside this must sometimes seem, well, problematic. It's not always this way of course - there are many great things happening in these communities and sometimes they unite opinion, reduce division and bridge viewpoints. Collaborative work from many different communities continues and I have been party to several such efforts, with mixed success, but then getting everyone to agree is intrinsically hard. The worry is that, perhaps, the success stories are overshadowed by the war of words. The punch is mightier than the handshake, sadly, and perhaps this is the root of my disappointment. I've often heard the words 'if ontologies/sem web were really that good Google would be using them' from those outside these communities.

There is also a feeling from certain quarters that the Semantic Web, as it was originally cast, has failed to live up to the hype and that what I consider to be a simplified version (avoiding grand gestures) - Linked Data - is similarly floundering. I should add that this is not a one-sided argument, and many believe it has succeeded and is succeeding, though perhaps it needs to do more. Indeed I personally believe that we are now in a better position than ever to exploit these technologies and I am already involved in a project here at EBI which is doing just that. I'll report on that in the future.

It would seem apparent that Google are using something akin to ontologies, and possibly Semantic Web technologies, but are unwilling to beat the drum about these overloaded and much-travelled buzzwords. These words come with baggage, high expectations and a strongly opinionated community. It may just be an accidental omission by Google, of course, and in the coming months they will begin to champion the cause and recognise the work that goes on in the ontology and semantic web communities. If they want a success story then I refer them to the Knowledge Graph that is the Gene Ontology, circa 1999.

Perhaps Google's Knowledge Graph is the killer app that everyone has been 'searching' for; the ontology is dead, long live the Knowledge Graph. Let us not, then, kill their efforts with semantics; they're just words after all and, in the end, what's in a name?

