Thursday, May 20, 2010

New home with Gene Wiki

Hello from my new home away from home at the Genomics Institute of the Novartis Research Foundation (henceforth known only as 'GNF')! I have come back into the warm, comfortable folds of mother Science as a postdoc in Andrew Su's group, where I will be working on the Gene Wiki (no, not WikiGenes, not WikiProteins, not WikiPathways, and not any of the other biowiki wonders).

The broad purpose of the Gene Wiki effort is to describe the function of all human genes. While most other thrusts in this direction emphasize structure and control for capturing these annotations, the Gene Wiki project is guided by the undeniable fact that:

"Data without structure is still valuable, but structure without data is not"
(Andrew Su)

Based in part on this premise, the creators of the Gene Wiki chose to start their work directly in the context of the mother of all wikis and one of the largest distinct sources of unstructured data on the Web - Wikipedia itself. This, of course, presents distinct advantages and disadvantages. On the plus side, there were already many users, pages, and, most importantly, editors before the Gene Wiki had a name; the Wikimedia Foundation handles all of the infrastructure; and the wiki articles profit from ridiculously high amounts of Google Karma - more or less ensuring that the vital flow of people through the pages will continue. On the minus side, the lack of direct control over the technical infrastructure sharply limits the introduction of new interfaces for adding or interacting with content beyond the fairly tight constraints of WikiText.

Now, you may find it odd that someone like myself, who has spent most of the past half decade trying to figure out how to add more structure to bioinformatics resources on the Web and complaining vociferously about the lack of interest in doing so (at least according to the latest Web standards), would now be working in the amoebic, almost structureless world of the wiki. But fear not - the central aim of my work here is to figure out how to get more structured data out of the articles in the Gene Wiki...