June 11, 2007

The Semantic Web Programming Service Provider

(some thoughts while doing a mental retrospective of the European Semantic Web Conference)

  • It seems obvious that there is an increasing trend towards the global integration of structured data. In my mind there is no doubt that this integration, already under way for years, will continue to an ever larger degree over the coming years.
  • It is unclear what kind of integration this will be: a closed, centralized approach (as exemplified by Google Base), a centralized but open one (like Freebase), or a decentralized and open one (the semantic web).
  • Assuming that the semantic web way is the right way, I'm not sure whether RDF is the right data model to base this on (yes, we could try to do it only with XML) - but it sure looks like it's worth trying.
  • For the semantic web to have any chance to take off, we need a semantic web programming service provider.

Programming Service Provider (PSP for short): the logical extension of "Application Service Provider" - instead of delivering applications, it offers the infrastructure to build, deploy and run applications. Ning and Yahoo Pipes are two existing programming service providers. The PSP model is very important for the semantic web because its decentralized nature imposes a burden on anyone who wants to build a semantic web application - she has to worry about network latency, crawling, keeping an index up to date etc. PSPs can take care of these problems.

So, what does a Programming Service Provider for the Semantic Web contain?

  • First of all: a local and (reasonably) up-to-date copy of the entire Semantic Web. This local copy needs to be ranked and as spam-free as possible.
  • An API to access this data (in particular, this includes a way to discover URIs based on lexical resources and a way to discover subgraphs that contain information about a particular URI).
  • An environment to create applications that use this API (although access should also be possible remotely) - similar, for example, to the Yahoo Pipes editor.
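
To make the two API operations from the list above a bit more concrete, here is a minimal Python sketch of what they might look like against a toy in-memory copy. Everything in it is an assumption made for illustration: the names LocalCopy, find_uris and describe are hypothetical, "lexical resources" is read narrowly as the words of rdfs:label values, and "subgraph about a URI" is taken to mean all triples that mention the URI as subject or object. A real provider would sit on a crawled, ranked index rather than a Python list.

```python
from collections import defaultdict
from typing import NamedTuple

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    literal: bool = False  # True when obj is a literal value, not a URI


class LocalCopy:
    """Hypothetical toy stand-in for a provider's local copy of the data."""

    def __init__(self):
        self._by_word = defaultdict(set)  # lower-cased label word -> URIs
        self._by_uri = defaultdict(list)  # URI -> triples that mention it

    def add(self, triple):
        self._by_uri[triple.subject].append(triple)
        # Index the object too, but only when it is itself a URI.
        if not triple.literal and triple.obj != triple.subject:
            self._by_uri[triple.obj].append(triple)
        # Assumption: labels are the only "lexical resources" we index.
        if triple.predicate == RDFS_LABEL and triple.literal:
            for word in triple.obj.lower().split():
                self._by_word[word].add(triple.subject)

    def find_uris(self, text):
        """Return the URIs whose label contains every word in `text`."""
        words = text.lower().split()
        if not words:
            return set()
        result = set(self._by_word.get(words[0], set()))
        for word in words[1:]:
            result &= self._by_word.get(word, set())
        return result

    def describe(self, uri):
        """Return the triples that mention `uri` as subject or object."""
        return list(self._by_uri.get(uri, []))


if __name__ == "__main__":
    store = LocalCopy()
    store.add(Triple("http://example.org/tim", RDFS_LABEL, "Tim Berners-Lee", True))
    store.add(Triple("http://example.org/tim", "http://example.org/invented",
                     "http://example.org/web"))
    store.add(Triple("http://example.org/web", RDFS_LABEL, "World Wide Web", True))

    for uri in store.find_uris("wide web"):
        for triple in store.describe(uri):
            print(triple.subject, triple.predicate, triple.obj)
```

The point of the sketch is only that an application writer gets both calls as cheap local lookups; crawling, ranking and spam filtering would all happen on the provider's side before the data reaches these indexes.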

The building blocks for this vision are starting to fall into place: PingTheSemanticWeb as a way to keep an index up to date, the Sindice lookup index presented at ESWC, the recent DERI work on joins in very large RDF stores, and the SWSE search engine ... But only if it all comes together will the semantic web have a chance to compete against Google Base/Freebase, because only then will it become simple to write applications that use the semantic web.

Sadly I'm currently not in a position to really contribute much work towards this vision - but I'll try.

But this post wouldn't be complete without a short discussion about what has no place in this vision.

  • There is no place for heavyweight ontologies (or rules for that matter). Sure, these technologies have their merits and an important role to play, and imho they will become important parts of database technology. However - there are no inference technologies available or even on the horizon that can deal with web-scale data, that is, with the size, rate of change and semantic heterogeneity to be expected on the web. This is true for rules just as much as it is for ontologies. It is an interesting research challenge to try to develop new kinds of inference mechanisms that some day could - but for now we don't even have an agreed-upon model of what should be inferred, much less a way to compute it in reasonable time. And worse still - there really is no compelling use case (at least none that is not AI-complete).
  • And Semantic Web Services and NLP ... I'll leave that for another time - now I need to figure out what happens BETWEEN 8:00 AM AND 9:00 AM :)
