February 20, 2008

Collaborative Knowledge Formalization Beyond Lightweight - Tackling the Curse of Prepayment; Part I

The Curse Of Prepayment
The Curse of Prepayment is often also referred to as the chicken-and-egg problem of Semantic Technologies: Semantic Technologies promise great functionality once a large amount of knowledge has been formalized. But because knowledge formalization is difficult, cumbersome and often poorly supported, you need to make a large up-front investment before you see any functionality. This insight is not new at all, and there are already numerous approaches that try to address it; of particular interest here are approaches that try to harness Web 2.0 ideas for this task. These Web 2.0 approaches to knowledge formalization can be roughly separated into two groups:

  1. The first group is based on the observation that lots of people are successfully creating structured data with tagging applications. These approaches try to extend such systems with a bit more structure, a bit more formality. Our own SOBOLEO system, GroupMe, Int.ere.st, Bibsonomy and gnizr are examples of this kind of system.
  2. The second group starts from the observation that people are spending large amounts of time creating semi-structured data in wikis. These systems try to give people the tools and the support they need to create data with more structure, more formality. Semantic MediaWiki, Freebase, IkeWiki and MyOntology are examples of this kind of system.

Making Every Penny Count, Immediately
What makes these systems interesting, and what gives them a chance to tackle the Curse of Prepayment, are five closely related properties:

  • Simple: Formalization is simple; it takes little training and little effort, and it can be done by people who are not logic experts.
  • Collaborative: Formalization can be done jointly in a group. In this way the cost is spread over several people, and the prepayment needed from each person is reduced.
  • Incremental: Not everything needs to be formalized at once; formalization can be done step by step.
  • Partial: The tools can work with data stores that are only partly formalized and that contain data at different levels of formality.
  • Immediate: Formalized data can be used right away and immediately brings some benefit to the user.

Together these five properties can be summed up as: "Making Every Penny Count, Immediately". There is an immediate benefit from formalizing even small parts, and because these systems are simple and collaborative, formalizing these small parts is relatively cheap.

The exact nature of this 'immediate benefit' differs between the systems mentioned above. Examples include:

  • Tables and less redundant data: This is the unique selling point of Semantic MediaWiki. As soon as just a few attribute values have been specified, they can be used to generate tables and overview pages that previously had to be maintained manually.
  • Hierarchical Organization: In systems like SOBOLEO or Bibsonomy, tags can be organized hierarchically. This allows for more effective maintenance of the tag repository as well as more effective navigation and retrieval, and it already pays off after adding just one such relation.
  • Advanced Search: In the SOBOLEO system, for example, adding just one synonym for a tag/concept already improves the search experience: a search for this synonym will then also return the documents annotated with that tag.
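
To make the last two benefits more concrete, here is a minimal Python sketch of a tag store in which a single synonym or a single hierarchical relation changes search results right away. It is only an illustration: the class TagStore, its method names and the sample tags are hypothetical and are not taken from SOBOLEO, Bibsonomy or any other system mentioned here.

```python
from collections import defaultdict


class TagStore:
    """Toy tag repository (hypothetical, not the API of any real system)."""

    def __init__(self):
        self.docs = defaultdict(set)      # tag -> documents annotated with it
        self.synonyms = {}                # alternative label -> tag
        self.narrower = defaultdict(set)  # tag -> directly narrower tags

    def annotate(self, doc, tag):
        self.docs[tag].add(doc)

    def add_synonym(self, label, tag):
        self.synonyms[label] = tag

    def add_narrower(self, broader, narrower):
        self.narrower[broader].add(narrower)

    def search(self, term):
        # Resolve a synonym to its tag, then collect the documents of
        # that tag and of every tag below it in the hierarchy.
        start = self.synonyms.get(term, term)
        results, stack, seen = set(), [start], set()
        while stack:
            tag = stack.pop()
            if tag in seen:
                continue
            seen.add(tag)
            results |= self.docs.get(tag, set())
            stack.extend(self.narrower.get(tag, ()))
        return results


store = TagStore()
store.annotate("doc1", "ontology")
store.annotate("doc2", "taxonomy")

before = store.search("ontologies")           # nothing is formalized yet
store.add_synonym("ontologies", "ontology")   # one synonym
after_synonym = store.search("ontologies")
store.add_narrower("ontology", "taxonomy")    # one hierarchical relation
after_hierarchy = store.search("ontologies")
```

In this sketch the first search returns nothing, the second finds doc1, and the third finds both documents. Each single piece of formalization pays off as soon as it is entered, and untouched tags keep working as plain tags.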

This post is the first in a series of three. The next post will focus on the challenges for collaborative knowledge formalization that we encounter when moving beyond the very lightweight formalisms currently employed in the tools mentioned above.
