March 5, 2008

How Much Is That Ontology?

Ontologies are expensive to build. By now that's known to everyone, and we have lots of people thinking about how they can justify the cost of building an ontology for their enterprise. Entirely the wrong question - most companies don't need an ontology at all; they should go and bug the data warehousing companies instead. Another misconception: when people think of 'expensive ontologies' they think it's the formalization that makes them costly. Nah - for all meaningful ontologies the expensive part is creating the shared model of the domain; writing it down doesn't add that much and might even help.


And I just realized that the 'machine' in 'shared machine understandable model of a domain' can mean so much more than just being able to use reasoners with it - just have a look at the project to create the International Barcode of Life (here at Wikipedia, or watch the TechTalk embedded below).




March 2, 2008

Rules as a Simple Way to Model Knowledge - Closing the Gap between Promise and Reality

There is a considerable gap between the potential of rule bases to be a simpler way to formulate high-level knowledge and the reality of tiresome and error-prone rule base creation processes.
Based on the experience from three rule base creation projects, this paper identifies reasons for this gap between promise and reality and proposes steps that can be taken to close it. An architecture for complete support of rule base development is presented.

A publication of mine accepted for this year's ICEIS conference; you can read the whole thing here.


Collaborative Knowledge Formalization Beyond Lightweight - Tackling the Curse of Prepayment; Part II

This is the second in a series of three posts - you may wish to start with the first.

'Knowledge' Does Not Equal 'Knowledge'

When the collaborative knowledge formalization community talks about 'knowledge' they mean something quite different from what most of the Uppercase Semantic Web community or the knowledge based systems community mean by it. The collaborative knowledge formalization community thinks of taxonomies, thesauri, SKOS or structured data; the other communities are thinking of Logic Programs, Description Logics, OWL or First Order Logic. Current collaborative knowledge formalization approaches just don't support the formalisms that are commonly associated with knowledge formalization.
Now you might argue that it has to be this way - that highly formal representations are just not well suited to being edited in the Web 2.0 style of collaboration that is the topic of the collaborative knowledge formalization community. Indeed this may be the case, but it's surely worth trying. There is no definite argument proving that highly formal representations cannot be edited in this way, and I believe that trying to bring knowledge formalization with more powerful and more complex formalisms to the crowd will at the very least bring advances in robust reasoning and usable knowledge formalization interfaces.

The Challenges Of Using More Heavyweight Formalisms

There are, however, many challenges entailed in moving to more heavyweight formalisms. Challenges such as:

  • Usability / Debuggability: Formalisms such as OWL or First Order Logic are harder to understand; in particular, errors are much harder to find.
  • Robustness: A single faulty statement added to a knowledge base with a million axioms may break everything. Unless this problem is tackled, open collaborative knowledge formalization is impossible.
  • Performance and the Language Expressivity / Performance tradeoff: Current reasoners for representation languages such as OWL or FOL could not dream of supporting a continuously updated knowledge base of even a fraction of the size of Wikipedia. Hence something would have to give: there would have to be restrictions on language expressivity, reasoning algorithms that do not achieve soundness and/or completeness, or languages that are not purely declarative would have to be used.
  • Mixed Formality: The kind of collaborative knowledge formalization approaches discussed here rely on incremental and partial formalization - hence the data store is never fully formalized and contains data at different levels of formality. Current reasoning approaches are not well suited to tackle this.
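
To make the robustness point a bit more tangible, here is a minimal sketch in Python. It is not how an OWL or FOL reasoner works internally; it is just a brute-force truth-table check over three made-up propositions (bird, flies, moon_is_cheese are illustration names, as is the entails helper). It shows the classical effect behind the problem: once a single contradicting statement is added, the knowledge base has no models left and therefore "entails" everything, including complete nonsense.

```python
from itertools import product


def entails(kb, query, symbols):
    """Classical entailment by enumeration: the query must hold in
    every truth assignment that satisfies all statements in kb."""
    for values in product([False, True], repeat=len(symbols)):
        world = dict(zip(symbols, values))
        if all(stmt(world) for stmt in kb) and not query(world):
            return False  # found a model of kb where the query fails
    return True


# Hypothetical toy vocabulary, chosen only for this illustration.
symbols = ["bird", "flies", "moon_is_cheese"]

kb = [
    lambda w: w["bird"],                       # "it is a bird"
    lambda w: (not w["bird"]) or w["flies"],   # "birds fly"
]

nonsense = lambda w: w["moon_is_cheese"]

# With the consistent knowledge base the nonsense is not entailed.
before = entails(kb, nonsense, symbols)

# One faulty statement that contradicts the rest of the knowledge base.
kb.append(lambda w: not w["flies"])            # "it does not fly"

# Now no truth assignment satisfies kb, so everything follows from it.
after = entails(kb, nonsense, symbols)

print("entailed before the faulty statement:", before)
print("entailed after the faulty statement: ", after)
```

With a million axioms written by many hands, statements like the third one are bound to slip in, which is why some form of inconsistency-tolerant reasoning would be needed before open collaboration on heavyweight formalisms becomes practical.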

The Curse of Prepayment - Again

All of the problems in the previous section are real and important - but there is one that trumps them all: what is the immediate benefit of formalizing even small parts of a data store? What do I get from spending time and/or money on bringing a part of my data store to a more formal level? Only once this question is answered can I decide on the tradeoffs needed to address the challenges described in the previous section.

Here the collaborative knowledge formalization community has the same problem as the wider Semantic Web community: "what exactly do I get in extra benefit from using OWL? And is this worth the effort?". I believe there is an answer to that question - but I'll describe it in the next installment of this series*.

* The first ever cliffhanger on this blog ;)
