Thursday, December 3, 2009

Heading to China

Tomorrow I am departing to attend the Asian Semantic Web Conference in Shanghai, China.  I'll be manning the demo for my former labmate's project CardioSHARE (which I played no real part in building).

Looking forward to getting caught up on the latest from the semantic web research community.  Out of the accepted papers, I am most looking forward to hearing "Merging and Ranking answers in the Semantic Web: The Wisdom of Crowds".

Hope to see you on the other side of the Great Firewall.

Wednesday, November 18, 2009

Birth and Rebirth

It's now been 4 weeks, one hour and about 51 minutes since my son was born.  It's also been about 6 months since I finished my PhD and about 6 1/2 years since I had a 'real' job.  Perhaps it's time to get on with things and sort out what I'm really going to be doing with myself.  Here's an update on what I've been up to; opinions on my next steps would be welcome...

Having grown somewhat disillusioned with the academic world, I've spent the time since graduation working completely outside of it.  Working with my father and his partner I learned how to use the Google App Engine while building a website/database for their company TrueIDapps.  I've also been involved with the development of a semantic web based startup that is now going by the name FreeForm Information.  Sadly I have not done any work in the domain of bioinformatics for a long time now.

While my post-grad projects have provided some knowledge of fun new techniques (I love the Cloud like everyone else now) and I am beginning to get a glimmer of an understanding of the process of starting and running a business, I have yet to make a single penny on either project.  Now, faced with the screaming, squirming reality of responsibility, I am feeling the pressure to make some decisions about how I should be spending my - now much more limited - work time.

Do I :

  1. Continue to try to keep myself involved with both FreeForm and TrueIDapps - in the hope that one of them will eventually pan out?
  2. Focus my attention on TrueIDapps because I have two other full-time coworkers versus one part-time partner, because it always feels good to support the family business and because it seems that it is close to making money?
  3. Focus my attention on FreeForm because I find the project more interesting, more relevant to my past experience and more in line with the work I would be trying to do if I managed to find a real job?
  4. Drop both projects and figure out a path towards a real* job?
  5. Commit to being Dr. Daddy, buy some baby formula and get Dr. Mommy back to work?
I have the benefit of my wife's savings and great family support so I could certainly last for a while before I hit the financial danger point but..  the pressure is mounting. 

Your insights are most welcome!

*real job = a job with a salary, paid vacation, an office, - and a boss who tells you what to do.

Monday, October 12, 2009

mass perception

If you've been to a concert or other live performance in the last few years, you have probably noticed a phenomenon like the one in the photo to the right. As the show gets started, a glowing school of digital cameras emerges out of the night like a swarm of fireflies and persists until the lights come back on. (Not quite as pretty as lighters, but at least they don't burn your thumbs.) When I see this I wonder:

  1. What does the show look like from that guy's camera over there? I wish I could tune in and see, now.
  2. What kind of creation will emerge when it becomes possible for artists to access all of those different electronic eyes and ears at the same time?
When live, phone-to-phone-to-Web video streaming becomes widely used, I think we will see some very new takes on live performance and I'm really looking forward to it.

---
Notes:
  • I took the photo at a fireworks show in the Butchart Gardens near Victoria, British Columbia.
  • If you are interested in live streaming technology that can work from phone to phone, check out a friend's company ZygoDigital.
  • If you want to read some extraordinarily prescient stories involving near future technologies like this - the ubiquitous internet connectivity aspect in particular - see Vernor Vinge's "Rainbows End".
  • Thanks to the comment on my last post, I am trying out Zemanta on this one.

Sunday, September 27, 2009

Semantic media retrieval service? please?

Here is an application of semantic Web technologies that I would like to have. Please make it for me so that I don't have to. When I finish writing this post,
  1. I would like to press a button that said "Enhance?".
  2. When I pressed the button, the application would read through the text and identify terms, phrases, or other conceptual nuggets that it 'understood'.
  3. These concept nuggets would then be used to find stock / open access images (and videos, etc.)
  4. Where a likely candidate set of images was identified, they would be displayed such that I could quickly choose which, if any, I liked
  5. When I agreed to keep one, it would be embedded in a reasonable location in the text and I would very rapidly go on with my life, but with the added joy of having authored a much more entertaining piece of online personal history.
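As a rough illustration of steps 2 to 4 (not an existing service), here is a minimal Python sketch. The `MEDIA_INDEX` dictionary, its example.org URLs, and the function names are all hypothetical stand-ins; a real version would need proper concept recognition and a real stock-media search behind it.

```python
import re
from typing import Dict, List

# Hypothetical concept -> candidate media lookup. A real service would query
# a stock / open access media search instead of a hard-coded dictionary.
MEDIA_INDEX: Dict[str, List[str]] = {
    "failure": ["https://example.org/img/burnt-toast.jpg"],
    "success": [
        "https://example.org/img/summit.jpg",
        "https://example.org/img/trophy.jpg",
    ],
}


def find_concepts(text: str) -> List[str]:
    """Return the known concepts mentioned in the text, in order of first use."""
    concepts: List[str] = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in MEDIA_INDEX and word not in concepts:
            concepts.append(word)
    return concepts


def suggest_media(text: str) -> Dict[str, List[str]]:
    """Map each recognized concept to the candidate media the author could pick from."""
    return {concept: MEDIA_INDEX[concept] for concept in find_concepts(text)}


suggestions = suggest_media("Success came after failure, and then success again.")
for concept, candidates in suggestions.items():
    print(concept, "->", candidates)
```

The matching here is plain keyword lookup, which is the weakest possible reading of 'understood'; the interesting work would be in replacing `find_concepts` with something that recognizes phrases and ideas rather than single words.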
This thought crept into my mind after reading through Joey de Villa's post about joining Microsoft, which is shot through with entertaining media enhancements to the text - which likely took a not insignificant amount of time for him or his team of personal assistants to put together.

Pictures are indeed worth many words, but how many $$$'s? Perhaps you might even be able to make money with such an app by using it to sneakily sell professional photos and other content.

While you are at it, could you please provide the same text-to-media service in a non-embedded application, so that when I needed a clever portrayal of a concept like 'failure', 'success', or 'mass collaboration' for a presentation I could quickly look one up. I might even be willing to buy it if the content was good and the price was reasonable, even in a world where I could almost certainly find what I needed by spending a little more of my own valuable time looking for it.

Tuesday, September 15, 2009

bottle rockets and businesses

Last night I attended a Demo Ignite Camp hosted by Microsoft as part of their techdays conference, organized by Boris Mann of Bootup Labs, and headlined by Joey de Villa (Developer Evangelist at Microsoft in Toronto).

Joey initiated the event with the following video which I insist that you watch immediately.


For the (non-obvious) explanations of why this video was the perfect start to the session, see his article about joining Microsoft and the blog post about the video that inspired him to do so.



Monday, September 14, 2009

Strange reality of academic workspace

My sister forwarded this job posting to me.

We are looking for a researcher or consultant to participate in the
collaborative development of Neural ElectroMagnetic Ontologies (NEMO). This
NIH-sponsored position would involve developing ontologies for
representation of patterns in event-related brain potentials (ERP) data
that reflect various aspects of language processing.

The Associate would work with members of our International NEMO Consortium
(experts in EEG and MEG studies of language) to develop, manage, and curate
ontology and database structures for this project. Please see the website
for more information, and contact Gwen Frishkoff if you have any questions.
The ideal candidate would have a background in cognitive neuroscience,
experience with database design and curation, and/or familiarity with
Protege/OWL ontology development
software...
Sounds like an interesting job that matches my background pretty well, but read on..
Salary will be commensurate with experience in the range of $30-$40K per year.
Perhaps I'm just greedy...


Sunday, August 2, 2009

Freebase authors semantic web book

I just noticed this post on the Freebase dev blog - announcing the release of a new book about "programming the semantic web" written by Freebasers Toby Segaran, Colin Evans, and Jamie Taylor. I haven't picked it up yet, but based on the description there and in the Amazon reviews, it looks like a(nother) nice explanation of the concepts involved in the semantic web as well as a set of practical programming examples based on the W3C standards (OWL, RDF, SPARQL, etc.).


I found it surprising that Freebase was not mentioned anywhere in the brief description given in the post or on Amazon.

Friday, June 5, 2009

publisher removal

In my last post, I mentioned offhand that I could not access a PDF (about an ontology for autonomic license management) without paying a $29 fee to Springer.  Though the post was not a direct request for help getting around this paywall, I have now received the PDF from 5 different people - several of whom I have never met before.


Clearly, the (micro) community that read that post believe that research articles should be shared in an open-access fashion and that it is both wrong for publishers to charge access fees and right to sabotage the publishers via peer-to-peer exchange of such articles.

I'm wondering if this micro-community (that is you) would feel any differently if the fees paid for such articles went directly to the researchers that produced them rather than to an apparently irrelevant publisher?

?

(Also, I wonder about the Radiohead-style "tip jar" approach.  This would allow you to read the article first and then contribute a payment afterwards if you felt that the research in the article was worthy of supporting.)

Tuesday, June 2, 2009

licenses and linked data

One of the major downsides of working outside of academia is that I now have to pay much more attention to licenses.  No longer can I just grab whatever data I like, do something fun with it, and try to publish what I did and move on.   Now I need to know - very specifically - what I am allowed to do with what so I can reduce the possibility of being sued and so that I can put up appropriate "powered by bla bla" messages.  Not being fond of reading legal agreements, this is a drag.


So I thought to myself, "there should be a legal ontology for linked data!".  That way I could tell my data harvesting program to ignore (or hide I suppose) any data that I wasn't legally allowed to put into my new for-profit-maybe-someday-I-hope mashup and thus never have to bother reading those agreements again.
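To make the idea concrete, here is a minimal Python sketch of a harvester that filters on machine-readable license terms. Everything in it is an assumption for illustration: the `LICENSE_TERMS` table is a drastically simplified stand-in for a real legal ontology, and the record fields (`source`, `license`) are hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical, heavily simplified license vocabulary. A real legal ontology
# would carry far more detail than two boolean flags per license.
LICENSE_TERMS: Dict[str, Dict[str, bool]] = {
    "cc-by": {"commercial_use": True, "attribution_required": True},
    "cc-by-nc": {"commercial_use": False, "attribution_required": True},
    "public-domain": {"commercial_use": True, "attribution_required": False},
}


def usable_commercially(records: Iterable[dict]) -> List[dict]:
    """Keep only records whose license is known and permits commercial use."""
    kept = []
    for record in records:
        terms = LICENSE_TERMS.get(record.get("license"))
        if terms is None or not terms["commercial_use"]:
            continue  # unknown or non-commercial license: ignore the data
        kept.append(record)
    return kept


def attribution_lines(records: Iterable[dict]) -> List[str]:
    """Build the 'powered by' messages that the kept records require."""
    sources = {
        record["source"]
        for record in records
        if LICENSE_TERMS[record["license"]]["attribution_required"]
    }
    return [f"Powered by {source}" for source in sorted(sources)]


harvested = [
    {"source": "ExampleGeneDB", "license": "cc-by", "value": "gene A"},
    {"source": "ExamplePathways", "license": "cc-by-nc", "value": "pathway B"},
    {"source": "ExampleAtlas", "license": "public-domain", "value": "region C"},
    {"source": "MysterySite", "license": None, "value": "unknown terms"},
]

allowed = usable_commercially(harvested)
print([record["source"] for record in allowed])
print(attribution_lines(allowed))
```

Treating an unknown license as "not allowed" is the cautious default here; whether that is the right policy is exactly the kind of question the agreements themselves would have to answer.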

I am not the first to think of this of course.  Here is a nice-looking abstract attached to a paper that I would like to read but am no longer allowed to read for free.
"The license agreement can be seen as the knowledge source for a license management system. As such, it may be referenced by the system each time a new process is initiated. To facilitate access, a machine readable representation of the license agreement is highly desirable, but at the same time we do not want to sacrifice too much readability of such agreements by human beings. Creating an ontology as a formal knowledge representation of licensing not only meets the representation requirements, but also offers improvements to knowledge reusability owing to the inherent sharing nature of such representations. Furthermore, the XML-based ontology languages such as OWL (Web Ontology Language) can be user friendly for the non-developers who are often those responsible for implementing and managing such license agreements. This paper shows our use of ontology to represent the license agreement in a development prototype. The ultimate goal is to build ontology for the license management domain that will facilitate autonomic knowledge management. Knowledge based on such ontology can then be shared and utilized by many types of license management system. "
What do you think?  Is it worth it to pay Springer $29 for the 1592kb in that paper?  


Monday, May 11, 2009

CWA at the YMCA

Somewhere high in the air between New York and Minneapolis, my first stop on my way home, I feel compelled to explain a few things to myself.  Why on Earth have I just spent the last several nights living in the YMCA in Flushing, New York?  Why did I decide to go on my first self-funded professional excursion at a time when I have no income and very little savings?  What did I hope to get and what did the trip deliver? 

The inspiration for this minor adventure was the inaugural meeting of the Concept Web Alliance (CWA) at the New York Hall of Science.  The mission statement of the CWA (written partly at this meeting) is as follows:

To enable an open, collaborative environment to jointly address the challenges associated with high volume scholarly and professional data production, storage, interoperability, and analyses for knowledge discovery

The idea is to form an alliance of like-minded researchers and science publishers interested in sharing knowledge in a computationally accessible fashion (i.e. not plain text and such that information from multiple sources can easily be integrated and interacted with).  The basic building block envisioned for these efforts is the ‘triple’ – a Concept-Relation-Concept structure.  (The word ‘triple’ and the interesting new verb ‘triplification’ - meaning to convert some non-triple-structure like text into a set of triples - were almost certainly the most commonly uttered words in the presentations at the meeting.) 

For those familiar with semantic Web standards such as RDF (a generic triple-based language for representing and sharing information) and OWL (a set of languages for representing knowledge in the form of ontologies) it is perhaps most interesting to consider what is not present in a CWA triple and what was not discussed at all in the public portions of the meeting.  The following words never came up: ‘description logic’, ‘axiom’, ‘class’, ‘reality’.

The intended materialization of the Concept Web vision - at the moment - thus seems to be an open collection of informal (non logic-based) concept representations, identified by URIs retrievable on the Web, that can be linked together to form semantic networks.  This graph-structure could be queried (e.g. using SPARQL) for the ‘facts’ that it would contain where each such fact would be linked to extensive information about where it came from (who (or what algorithm) suggested it, when, and with what confidence).  Interestingly, this is very similar in its flexibility, lack of built-in reasoning, and its strong notion of provenance tracking to the Freebase model. 
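A minimal Python sketch of that structure, under stated assumptions: this is not the CWA's actual data model, the example.org URIs and the provenance field names are invented for illustration, and the `match` method is only a stand-in for a real SPARQL query.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass(frozen=True)
class Triple:
    """One Concept-Relation-Concept statement plus its provenance."""
    subject: str
    relation: str
    obj: str
    source: str        # who (or what algorithm) suggested it
    asserted: str      # when, as an ISO date string
    confidence: float  # with what confidence, from 0.0 to 1.0


class ConceptGraph:
    """An informal, non-logic-based collection of triples."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def add(self, triple: Triple) -> None:
        # No consistency checking: overlapping or conflicting claims coexist.
        self._triples.append(triple)

    def match(
        self,
        subject: Optional[str] = None,
        relation: Optional[str] = None,
        obj: Optional[str] = None,
        min_confidence: float = 0.0,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for t in self._triples:
            if subject is not None and t.subject != subject:
                continue
            if relation is not None and t.relation != relation:
                continue
            if obj is not None and t.obj != obj:
                continue
            if t.confidence < min_confidence:
                continue
            yield t


# Hypothetical URIs, for illustration only.
BRCA1 = "http://example.org/concept/BRCA1"
ASSOCIATED_WITH = "http://example.org/relation/associated_with"
BREAST_CANCER = "http://example.org/concept/breast_cancer"
HAIR_COLOUR = "http://example.org/concept/hair_colour"

graph = ConceptGraph()
graph.add(Triple(BRCA1, ASSOCIATED_WITH, BREAST_CANCER,
                 source="text-mining-run-42", asserted="2009-05-08", confidence=0.62))
graph.add(Triple(BRCA1, ASSOCIATED_WITH, BREAST_CANCER,
                 source="curator:alice", asserted="2009-05-09", confidence=0.95))
graph.add(Triple(BRCA1, ASSOCIATED_WITH, HAIR_COLOUR,
                 source="text-mining-run-42", asserted="2009-05-08", confidence=0.10))

facts = list(graph.match(subject=BRCA1, min_confidence=0.5))
for fact in facts:
    print(fact.obj, "suggested by", fact.source, "with confidence", fact.confidence)
```

Note that the same claim can appear twice with different sources and confidences; filtering happens at query time rather than at publication time.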

While some of you who like to work with reasoners and OWL or who think that it is better to talk about ‘universals’ and ‘particulars’ than it is to talk about ‘concepts’ may find this lack of formality a little disappointing, I am growing more and more enthusiastic about it because it fits the publish-then-filter nature of the Web perfectly.  We see again and again that once information is out there on the Web, its value increases tremendously.  (In fact, many very smart people seem to think that when there is enough text and other unstructured data online that is all we will really need to solve most of our information problems.) By providing a very low barrier to entry and then focusing computer science efforts on handling the noise, complexity, and the conflicts that will inevitably arise (the filter part), I think this triple-publishing approach has great potential to push research in a productive direction.  In particular, I think it will push people to spend more time working on other, more flexible modes of inference that don’t just die when a logical conflict is detected - all that squishy probability stuff that the semantic Web has managed to ignore for so long and that happens to be the stuff that makes almost all interesting AI-like technology work now.  Furthermore, the triple-focus absolutely does not stop groups that participate in the CWA from making use of approaches grounded in formal logics in their own development. 

While there may or may not be good reasons to take one philosophical stance over another when creating knowledge bases, the fact will always remain that there will be conflicts of opinion about this.  When dealing with a small group, it may be possible to convince or force acceptance of a particular world view, but it is not IMHO going to be possible to enforce something as arguable as the philosophy of the representation of the nature of being on the scale of the Web.  By focusing on the smallest possible units, the triples, and leaving the more precise formalizations and the philosophy out of the vision as much as possible, the CWA might make it possible for a diverse, interoperable ecology of knowledge bases to emerge and co-exist.  Ideally, those who wish to make use of, for example, description logic reasoning should be able to benefit from the pool of URIs in the Concept Web, if for nothing other than the many multi-lingual labels and textual definitions that will be associated with each of them. 

It is still very early days for the CWA – probably far too early to really speculate too far about the consequences of its basic technological approach as even this approach is still very much up for debate.  Still..  I’m not sure exactly how to express this, but the meeting smelled good.  There were enough capable, powerful, enthusiastic people together in that room, with enough of a shared vision, that I think something of that vision is very likely to come to life.

So, was it worth it?  I think that it was in the end.  I got an early look at something that might provide solutions to many of the problems that I’ve spent the past several years of my life thinking about (the social construction of a biosemantic web).  I got to reconnect with old friends and make some new ones.  I had a chance to see New York for the first time (the scale of which blew my mind).  And, last but not least, it just might be the last such academic event I get to take part in.  Depending on the choices I make and the dictates of the wheels of fate, I may be in the process of losing the privilege of working in the ivory tower.  If it was indeed my goodbye to the community of scholars, it was a good one.

So yes, it was a worthwhile trip and it remains an exciting time to be thinking about the Web - concept or otherwise.  (and the YMCA wasn’t really so bad in the end ;).

You can follow - and perhaps influence - the evolution of the Concept Web on their blog.