Valentin's Blog: December 2006

Sporadic Link Post

Some links from the past week(s):

Simplicity (of Software) is highly overrated
Tesla, Tagging for the Desktop by Microsoft (link IE only)
Tag Your Desktop Stuff With Tag2Find
Results of Nature's open peer review trial
More than 90% of emails SPAM
Delta Scan: The Future of Science and Technology, 2005-2055
Everything You Hate About IT - Reader Poll Results

Google's Evil Scale
Industrial Light & Magic - Beautiful introduction to the role of computer images in film

And as usual: you can find all links at del.icio.us; the newest 15 are also always shown in the sidebar of this blog.

Labels: links

The Real Difference Between Semantic Web And Web 2.0

From the Swoogle homepage:

Q: Do you have any plans to commercialize Swoogle?

No. Swoogle is a research project. We have no interest in commercializing the ideas or technology.

Labels: SemanticWeb

RDF Views

There has been some work on views on RDF / ontology data, but most of it is not very useful and is complicated in a strange way. I'm mystified why nobody has ever simply transferred the ideas from relational databases (or maybe I just haven't found the work):

A view on an RDF graph is an RDF graph.
A view is defined by a SPARQL CONSTRUCT query.

Then we can either materialize these views or compute just the parts necessary to answer a query on the view. (The mechanisms needed to decide which parts of a view need to be computed should be a pretty straightforward extension of ideas from deductive databases.) And there is even a nice subset of SPARQL queries that could be used to define updateable views: queries in which every variable in the select part (the WHERE pattern) also appears in the construct part, and which do not contain unions and the like.
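To make the idea concrete, here is a toy sketch in plain Python, nowhere near a real implementation. It assumes a graph is just a set of string triples, a view is a hypothetical (construct, where) pair of triple patterns standing in for a SPARQL CONSTRUCT query, and terms starting with "?" are variables. All function names and the sample data are made up for illustration.

```python
# Toy model: an RDF graph is a set of (subject, predicate, object) triples.
# A view is a (construct, where) pair of triple patterns; "?x" is a variable.
# Everything here is illustrative, not a real SPARQL engine.

def is_var(term):
    return term.startswith("?")

def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended

def solutions(where, graph):
    """All variable bindings that satisfy every pattern in `where`."""
    bindings = [{}]
    for pattern in where:
        next_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings

def materialize(view, graph):
    """Evaluate a view against a graph; the result is again a graph."""
    construct, where = view
    result = set()
    for binding in solutions(where, graph):
        for template in construct:
            triple = tuple(binding.get(term, term) for term in template)
            if not any(is_var(term) for term in triple):
                result.add(triple)
    return result

def variables(patterns):
    return {term for pattern in patterns for term in pattern if is_var(term)}

def may_be_updateable(view):
    """Condition from the text: every WHERE variable is in the construct part."""
    construct, where = view
    return variables(where) <= variables(construct)

graph = {
    ("alice", "worksFor", "fzi"),
    ("bob", "worksFor", "fzi"),
    ("carol", "worksFor", "acme"),
    ("fzi", "locatedIn", "karlsruhe"),
}

# A view that joins two patterns and drops ?org.
works_in = ([("?p", "worksIn", "?city")],
            [("?p", "worksFor", "?org"), ("?org", "locatedIn", "?city")])
# A view that only renames a property.
employed_by = ([("?p", "employedBy", "?org")],
               [("?p", "worksFor", "?org")])
# A view defined on top of another view.
karlsruhe_people = ([("?p", "type", "KarlsruheWorker")],
                    [("?p", "worksIn", "karlsruhe")])

city_view = materialize(works_in, graph)
stacked_view = materialize(karlsruhe_people, city_view)
```

Because the result of `materialize` is itself a set of triples, views stack naturally: `karlsruhe_people` is evaluated against `city_view` exactly as if it were a base graph. The `may_be_updateable` check covers only the variable condition; the "no unions" part is trivially true here because this toy pattern language has no unions at all.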

Ahh well, but actually implementing this would take a while ... but I would love to have it :)

Labels: SemanticWeb

The Prize For The Strangest Workshop Page Goes To

... the Workshop on "Advances in Conceptual Knowledge Engineering". The page is here, and you need to have sound enabled to really appreciate it.

More On The DRM and IP madness ...

First an enjoyable and interesting Google Tech Talk about IP rights by James Boyle:

And then there is an interesting piece about the cost caused by Windows Vista's content protection here. Really a shame. I actually like quite a bit of Microsoft's software. I love the new Office 2007 - courageous of Microsoft to change the UI so much - but it really turned out well! After seeing what they can achieve, it's particularly sad to see these great engineers wasting their time trying to defend stupid IP ideas and making our lives as computer users miserable in the process.

Labels: Web

German Quaero Now Theseus?!

It seems the German - French cooperation to build an Internet search giant has been terminated. Apparently the French wanted to focus more on traditional search while the Germans wanted to focus on "Semantic Technologies". The German project (which - unlike the French part - hasn't officially started yet) looks set to go ahead anyway - but now named Theseus.

Strange. Even though the FZI is planning to participate in this project I hadn't heard anything about this before ...

Labels: SemanticWeb

Cyc Google TechTalk

Google Video has a video of a talk given by Douglas Lenat, the President and CEO of Cycorp. It's more than 70 minutes long, but worth the time of anyone interested in AI. I want to highlight two parts that I found particularly interesting:

It's been my belief for a while that general-purpose reasoners and theorem provers are only good for very few tasks (such as proving the correctness of a program) and that most real-world tasks instead need faster, task-specific reasoners or heuristics. For me this thought was always motivated by ideas from cognitive psychology (see for example the research into "Fast and Frugal heuristics" by the ABC Research Group in Germany). However, I always lacked good computer science arguments to back up this point. Now at least I can say that Cycorp sees it the same way:

There is a single correct monolithic reasoning mechanism, namely theorem proving; but, in fact, it's so deadly slow that, really, if we ever fall back on our theorem prover, we're doing something wrong. By now, we have over 1,000 specialized reasoning modules, and almost all of the time, when Cyc is doing reasoning, it's running one or another of these particular specialized modules. (~32:20)
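The architecture described in the quote can be sketched roughly as follows. This is only an illustration of the general pattern (try fast specialized modules first, fall back to a slow general mechanism last), not Cyc's actual design. The predicate names, the single `subclass_module`, and the trivial `general_prover` stand-in are all hypothetical.

```python
from collections import deque

# Hypothetical knowledge base of (predicate, subject, object) facts.
FACTS = {
    ("subclassOf", "Dog", "Mammal"),
    ("subclassOf", "Mammal", "Animal"),
    ("isa", "Rex", "Dog"),
}

def subclass_module(query, facts):
    """Specialized module: answers subclassOf queries by graph reachability.

    Returns True/False if it can handle the query, None if not applicable.
    """
    predicate, start, goal = query
    if predicate != "subclassOf":
        return None
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for p, s, o in facts:
            if p == predicate and s == node:
                if o == goal:
                    return True
                if o not in seen:
                    seen.add(o)
                    queue.append(o)
    return False

def general_prover(query, facts):
    """Stand-in for the slow general mechanism; here it only looks up facts."""
    return query in facts

# In a real system this list would hold many specialized modules.
MODULES = [subclass_module]

def ask(query, facts):
    """Try each specialized module first; use the general prover as last resort."""
    for module in MODULES:
        answer = module(query, facts)
        if answer is not None:
            return answer, module.__name__
    return general_prover(query, facts), "general_prover"

answer_fast = ask(("subclassOf", "Dog", "Animal"), FACTS)
answer_fallback = ask(("isa", "Rex", "Dog"), FACTS)
```

Each module signals "not my kind of query" by returning None, so adding a new specialized reasoner only means appending it to `MODULES`; the second element of the result records which mechanism actually produced the answer.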

I also think that humans are almost constantly reorganizing the knowledge structures in their heads, most of the time becoming more effective in reasoning and quicker in learning. An example of this process is the forming of "thought entities". There seems to be a limit on the number of thought entities that humans can manipulate in their short-term memory. This limit seems to be fixed for life and seems to be somewhere between 5 and 8. What does change with experience is the structure and complexity of these thought entities.

A famous example of the effect of experience on thought entities is the ability of expert chess players and amateurs to recall chess positions. If you show the positions of chess pieces from a normal game to expert chess players and amateurs, the expert players will be much better at recalling the exact positions. But when you place the pieces in a random manner, both will perform equally badly. The common explanation for this phenomenon is that the expert has more complex thought entities at her disposal. In normal chess positions she can find large familiar patterns, like "white has played opening A in variant B". These large and complex thought entities allow the expert to fit the positions of up to 32 chess pieces into the available 8 slots. When the chess pieces are placed in a random manner, these structures familiar to the expert no longer appear and the expert loses her advantage.

I have always wondered what the equivalents of this knowledge reorganization process could be in logic-based systems. Cyc has one interesting answer:

Often what we do in a case like this, if we see the same kind of form occurring again and again and again, is we introduce a new predicate, in this case a relational exists, so that what used to be a complicated looking rule is now a ground atomic formula, in this case a simple ternary assertion in our language (~21:15)

Labels: AI

Sporadic Link Post

I haven't made a link post for a long time, and the number of links amassed in this time is just too large to post them all here - so here is a selection (you can find all links at del.icio.us; the newest ones are always shown in the sidebar of this blog).

AT&T accurately predicts the future, incorrectly picks delivering company.
OLPC usability - a look at the UI of the $100 laptop (really quite different)
Riya's Like.com Is First True Visual Image Search.
Google Ad revenue 'to surpass TV'
Amazon's Elastic Compute Cloud
Pope says that AI researchers risk fate of Icarus

The Future of Entertainment? Very well written article on the making of Lonelygirl15 and what it might mean for the future of entertainment.

Labels: links

Ontology Maturing with Lightweight Collaborative Ontology Editing Tools

Another publication. Will be presented at the Workshop on Productive Knowledge Work : Management and Technological Challenges (ProKW), 4th Conference on Professional Knowledge Management - Experiences and Visions (WM 2007).

Authors are Simone Braun, Andreas Schmidt and me.

Ontology building is an important prerequisite for state-of-the-art semantic technologies for knowledge worker support. But ontology engineering methods have so far neglected the early phase of ontology building where a conceptualization only exists rather informally and underlies continuous evolution through collaboration and interaction within the community. We have to view ontology building as a maturing process that requires collaborative editing support and the integration into the daily work processes of knowledge workers. In spirit of current Web 2.0 applications, we present an AJAX-based lightweight ontology editor as a first approach to this problem.

I won't be at the conference, but the other two will be. My role in writing the paper was rather small anyway. I did, however, do most of the work in defining and implementing one "lightweight collaborative ontology editing tool" presented in the paper: a rather nice AJAX application, a collaborative editor for a subset of SKOS. The cool thing about the editor is that it really supports truly collaborative work - Google Spreadsheet style; i.e. two people can really change the same concept at exactly the same time and nothing will break. Users see almost realtime* updates of the changes other people make to the same taxonomy.

The paper is not yet online, but someday you'll find it here.

* depending on configuration and connection - but maybe a third of a second.

Labels: publication, SOBOLEO

Ask City And The Semantic Web

Ask City is the new local search portal released by Ask, and no - it's not a Semantic Web application. But it should be.

For me, one of the main new ideas I took home from this year's International Semantic Web Conference was that for many Semantic Web technologies there is only a limited window of opportunity to move into the mainstream. If the SW technologies don't make it in time, other technologies will have been used to solve most of the problems that they were conceived for. The other technologies may not solve the problems as completely or as elegantly - but their existence makes SW technologies a harder sell.

Take Ask City as an example. In a way it's a traditional mashup - it integrates data from (at least) Citysearch, Yelp, Judysbook, Ticketweb and Urban Mapping. This is exactly the kind of data integration challenge that the Semantic Web wanted to solve. However, it's probably not built with RDF or OWL because other technologies were more mature, more tools existed, people understood them better...

And there the "window of opportunity" is closing a little bit. SW technologies could solve this problem in a more elegant and flexible manner - but it just got a little bit harder to convince people of that. It's gotten a little bit harder to show a visible(!) added benefit when people already see large-scale web information integration happening without RDF.

Labels: SemanticWeb

The BAsAS Architecture For Semantic Web Annotations

A poster I presented at the 1st Semantic Web Authoring and Annotations Workshop at the ISWC 2006.

We describe a generic architecture for the (semi-automatic) creation, storage and querying for annotations of web resources. Our BAsAS architecture uses recent advances from the Semantic Web and Web 2.0 communities to make Semantic Web annotations a reality. The BAsAS architecture makes it easy for users to start to annotate and easy for developers to use the annotations that get created.

Besides describing the general architecture we will also detail an implementation of this architecture built for a Semantic Web community portal.

Think of it as Annotea but better. The presented system addresses some of the most important shortcomings of Annotea: that there are only plugins for the Firefox browser (shutting out the majority of web users) and that there is no query language for annotations.

Actually I'm still quite annoyed that it only got accepted as a poster. It was not "innovative enough", the changes to Annotea not big enough. Ahh well, I put it down to my bad writing. In a way I even agree that we don't need another Semantic Annotation paper - we need applications that come with a nice user interface and are usable "out of the box" (in particular without the need for the user to worry about finding a server - something you have to do with current Annotea tools).

The long version of the paper is here.

Labels: publication, SemanticWeb

It's been a long time ...

Yeah, it's been a long time since the last post. I've started to be a bit more serious about sports and ended up going to training/playing every day - which cut down the amount of time I can spend on stuff like this blog ... but such a long time without posts won't happen again ... probably

Labels: administration

Exploiting Usage Data for the Visualization of Rule Bases

In this paper we describe novel ideas and their prototypical implementation for the visualization of rule bases. In the creation of the visualization our approach considers not only the structure of a rule base but also records of its usage. We also discuss the challenges for visualization algorithms posed by rule bases created with high level knowledge acquisition tools. We describe the methods we employ to deal with these challenges.

Authors are Valentin Zacharias and Imen Borgi. It was published at the SWUI workshop at the ISWC 2006. The entire paper is here.

Labels: publication

An extendable Java Framework for Instance Similarities in Ontologies

We present the conceptual basis and a prototypical implementation of a technical framework for computing syntactical and semantical similarities between instances within an ontology. The focus of this work did not only comprise the implementation of specific, ontology-based similarity measures, but also their flexible and efficient combination and extensibility.

Authors are Mark Hefke, Valentin Zacharias, Ernst Biesalski, Andreas Abecker, Qingli Wang, Marco Breiter.

Published at the 8th International Conference on Enterprise Information Systems, 23-27 May 2006, Paphos, Cyprus.

The entire paper is here.

Labels: publication

A Metadata Registry For Community Driven E-Learning Sites

We present the architecture and the interface of a metadata registry for a large e-learning site. The metadata registry is very simple to integrate by content and application providers and thereby tries to motivate more members of the community to contribute. It takes its inspiration from currently successful semantic web architectures and aims to be an evolutionary change to the web – using long established standards where possible.

Author is just "Valentin Zacharias", published at IAWTIC 2005.

The entire paper is here.

Labels: publication

A Topic Hierarchy On The Web

We present the architecture and interface of a metadata registry for a large e-learning site. The metadata registry is very simple to integrate by both content and application providers. It takes its inspiration from currently successful metadata architectures and aims to be an evolutionary change to the web – using long established standards where possible.

Poster at the ISWC 2005, authors are Valentin Zacharias and Stephan Grimm.

The entire paper is here.

Labels: publication, SemanticWeb

Semantic Announcement Sharing

This paper stems from the idea that maybe the painstakingly slow adoption of the Semantic Web into the mainstream www can be accelerated by taking clues from the tiny Semantic Weblets already present today.
We have identified RSS as one particularly successful Semantic Weblet, formed an opinion on why it was successful and have then tried to include all its success factors into a new Semantic Web application.
This paper argues that in order to build a successful Semantic Web application, considering only technical aspects is not enough; economics, the motivation of the actors, necessary changes and available know-how are also important.

Authors are: Valentin Zacharias and Mike Sibler

Published in the Proceedings of the Fachgruppentreffen Wissensmanagement 2004.

The entire paper is here.

Labels: publication, SemanticWeb

Incremental broadcasting as a strategy for multi agent communication

This was actually my master project and I don't have a pdf of the paper version at hand. Not too proud of this work anyway.

Published in ASME Press Series on Intelligent Engineering Systems Through Artificial Neural Networks (ANNIE 2003), Smart Engineering System Design, eds C. Dagli et al, (2003) November, Missouri, USA

Authors: I.Valova, N.Gueorguieva, V.Zacharias

Labels: publication

KAON - Towards a large scale Semantic Web

The Semantic Web will bring structure to the content of Web pages, being an extension of the current Web, in which information is given a well-defined meaning. Especially within e-commerce applications, Semantic Web technologies in the form of ontologies and metadata are becoming increasingly prevalent and important. This paper introduces KAON - the Karlsruhe Ontology and Semantic Web Tool Suite. KAON is developed jointly within several EU-funded projects and specifically designed to provide the ontology and metadata infrastructure needed for building, using and accessing semantics-driven applications on the Web and on your desktop.

In Kurt Bauknecht and A. Min Tjoa and Gerald Quirchmayr, E-Commerce and Web Technologies, Third International Conference, EC-Web 2002, Aix-en-Provence, France, September 2-6, 2002, Proceedings, volume 2455 of Lecture Notes in Computer Science, pp. 304-313. Springer, 2002.
ISBN: 3-540-44137-9

Very long list of authors (there is actually a funny story behind one name in the authors list ... but not something to write on a website. ask me)

The entire paper is here.

Labels: publication, SemanticWeb

On Knowledgeable Unsupervised Text Mining

Text Mining is about discovering novel, interesting and useful patterns from textual data. In this paper we discuss several means that introduce background knowledge into unsupervised text mining in order to improve the novelty, the interestingness or the usefulness of the detected patterns. Germane to the different proposals is that they strive for higher abstractions that carry more explanatory power and more possibilities for exploring the input texts than is achievable by unknowledgeable means.

Andreas Hotho, Alexander Mädche, Steffen Staab and Valentin Zacharias: Text Mining Workshop Proceedings, Springer, 2002

The entire paper is here.

Labels: publication, SemanticWeb

Clustering Ontology-based Metadata in the Semantic Web

The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. Recently, different applications based on this vision have been designed, e.g. in the fields of knowledge management, community web portals, e-learning, multimedia retrieval, etc. It is obvious that the complex metadata descriptions generated on the basis of pre-defined ontologies serve as perfect input data for machine learning techniques. In this paper we propose an approach for clustering ontology-based metadata. Main contributions of this paper are the definition of a set of similarity measures for comparing ontology-based metadata and an application study using these measures within a hierarchical clustering algorithm.

A. Mädche and Valentin Zacharias, Proceedings of the Joint Conferences 13th European Conference on Machine Learning (ECML'02) and 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'02).

The entire paper is here.

(older paper, just posted it for completeness)

Labels: publication, SemanticWeb

Valentin's Blog

December 30, 2006

Sporadic Link Post

December 24, 2006

The Real Difference Between Semantic Web And Web 2.0

RDF Views

December 23, 2006

The Prize For The Strangest Workshop Page Goes To

December 22, 2006

More On The DRM and IP madness ...

December 18, 2006

German Quaero Now Theseus?!

December 12, 2006

Cyc Google TechTalk

December 8, 2006

Sporadic Link Post

Ontology Maturing with Lightweight Collaborative Ontology Editing Tools

December 7, 2006

Ask City And The Semantic Web

The BAsAS Architecture For Semantic Web Annotations

It's been a long time ...

Exploiting Usage Data for the Visualization of Rule Bases

An extendable Java Framework for Instance Similarities in Ontologies

A Metadata Registry For Community Driven E-Learning Sites

A Topic Hierarchy On The Web

Semantic Announcement Sharing

Incremental broadcasting as a strategy for multi agent communication

KAON - Towards a large scale Semantic Web

On Knowledgeable Unsupervised Text Mining

Clustering Ontology-based Metadata in the Semantic Web
