March 30, 2007

SOBOLEO

We present SOBOLEO, a system for the web-based collaborative engineering of SKOS ontologies and the annotation of web resources. SOBOLEO enables the simple creation, extension and maintenance of taxonomies. At the same time, it supports the annotation of web resources with concepts from this taxonomy.

And another publication. I talked about SOBOLEO before, but now it got its own demo paper. SOBOLEO is a rather nice tool that brings together Semantic Search, a lightweight annotation tool (an AJAX bookmarklet) and a collaborative real-time taxonomy editor. In the coming days SOBOLEO (and all the other demos) will also be available for the workshop participants to try out - I'm curious how that will work out, and how well it will be able to hold its own against the likes of Collaborative Protégé and Bibsonomy (I always like it when the other approaches that you dismiss in the "related work" section are actually present at the same workshop :)
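For readers who haven't worked with SKOS: below is a minimal sketch (my own illustration, not SOBOLEO's code) of the kind of data such an editor manages - a tiny concept hierarchy plus one annotated web resource, written here against the Jena RDF API; all URIs, labels and class names beyond the SKOS and Dublin Core vocabularies are made up.

```java
import com.hp.hpl.jena.rdf.model.*;
import com.hp.hpl.jena.vocabulary.RDF;

// Minimal sketch (not SOBOLEO's code): a tiny SKOS taxonomy plus one
// annotated web resource, built with the Jena API. All URIs are made up.
public class SkosSketch {
    static final String SKOS = "http://www.w3.org/2004/02/skos/core#";
    static final String DC   = "http://purl.org/dc/elements/1.1/";

    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();
        m.setNsPrefix("skos", SKOS);
        m.setNsPrefix("dc", DC);

        Resource conceptClass = m.createResource(SKOS + "Concept");
        Property prefLabel    = m.createProperty(SKOS, "prefLabel");
        Property broader      = m.createProperty(SKOS, "broader");

        // two concepts of a small, made-up topic taxonomy
        Resource semWeb = m.createResource("http://example.org/topics/semanticWeb")
                           .addProperty(RDF.type, conceptClass)
                           .addProperty(prefLabel, "Semantic Web");
        Resource ontoEng = m.createResource("http://example.org/topics/ontologyEngineering")
                            .addProperty(RDF.type, conceptClass)
                            .addProperty(prefLabel, "Ontology Engineering")
                            .addProperty(broader, semWeb);

        // annotating a (made-up) web resource with one of the concepts
        m.createResource("http://example.org/somePage.html")
         .addProperty(m.createProperty(DC, "subject"), ontoEng);

        m.write(System.out, "N3");
    }
}
```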

It's also published at the Workshop on Social and Collaborative Construction of Structured Knowledge @ WWW2007. The authors are the developers of SOBOLEO - Valentin Zacharias and Simone Braun. You can read the entire (demo) paper here.


Ontology Maturing: a Collaborative Web 2.0 Approach to Ontology Engineering

Most of the current methodologies for building ontologies rely on specialized knowledge engineers. This is in contrast to real-world settings, where the need for maintaining domain-specific ontologies emerges in the daily work of users. But in order to allow for participatory ontology engineering, we need a more realistic conceptual model of how ontologies develop in the real world. We introduce the ontology maturing process, which is based on the insight that ontology engineering is a collaborative, informal learning process, and for which we analyze characteristic evolution steps and the triggers that lead users to engage in ontology engineering within their everyday work processes. This model integrates tagging and folksonomies with formal ontologies and shows maturing pathways between them. As implementations of this model, we present two case studies and the corresponding tools.

This paper was accepted to the workshop on Social and Collaborative Construction of Structured Knowledge @ WWW2007 (authors are Simone Braun, Andreas Schmidt, Andreas Walter, Gabor Nagypal and me). You can read the complete text here.

It got pretty good reviews, so you might actually want to read it :)


March 21, 2007

Sporadic Link Post

Some particularly interesting/enjoyable links from the past weeks:

Another War We're Not Winning: Us vs. SPAM
Kids, the Internet, and the End of Privacy: The Greatest Generation Gap since Rock and Roll
Google exec confirms phone in the labs
Adobe Apollo launched
Game 3.0?
Alternatives To Second Life and Virtual worlds set for a shake up
The Wright Stuff (about Will Wright's revolutionary Spore game)

Expert system convicted for practicing law without a license

And as usual: you can find all links at del.icio.us; the newest 15 are also always shown in the sidebar of this blog.

(and yes, recently there has been a relatively large number of links related to online games / virtual worlds - but I do think that these developments will be a part of, and will shape, any future web, and so they fall squarely within the area of "Web Science")


Semantic Web Advertisements?

This post makes a pretty weak argument for why the Semantic Web will fail. One of its main arguments is that it relies on cooperation between businesses that just isn't going to happen. Others have already pointed out that this claim is clearly false (here and here), and I just want to point to RSS, ATOM, iCal and Sitemap as examples of data standards jointly supported by different companies - there's just no reason to assume that this list cannot grow.

However, I do agree that the Semantic Web community too often just naively assumes that everyone "wants to share", and that it ignores the business cases. For example, there is no major work on the question of how I can monetize an investment in ontology building - even though everyone agrees that a formal ontology is difficult and expensive to build. Wikipedia-like approaches will only get us so far - most metadata will only get created if its creator sees a monetary advantage in doing so. Finding that advantage is more difficult on the Semantic Web because the data will most probably be used by a computer agent - so I can't fund the data's creation by placing ads next to it.

So - what is the equivalent funding mechanism to ads that works for the Semantic Web? Or - alternatively - how can we place ads on the Semantic Web?

Readers interested in these questions may also want to look at my Semantic Announcement Sharing paper from 2004. There we take a holistic look at the factors that made RSS a success - including the motivation of people to contribute the data. We then use these factors to identify a different domain and to create a metadata standard for it (the sharing of information about events). Back then I lacked the time to follow through and actually promote this standard - but it's still the best standard for metadata about events, and events are still a great domain for Semantic Web technologies :)


March 16, 2007

Soon, We Can All Have Our Own TV-Station

From I, Cringely, about Neokast - an application that makes the broadcasting of streams several orders of magnitude cheaper by creating a peer-to-peer network between the computers receiving the stream.

Had there been no peers up and running other than mine, the video would have streamed straight from the server in Chicago, but with enough peers operating, the load on the originating server is several orders of magnitude less than for typical one-stream-per-user distribution.

For content creators this is key: the more people who watch your Neokast the more efficiently will your server bandwidth be utilized. According to Birrer, under normal circumstances the server bandwidth should plateau at 3-4 times that of a single stream NO MATTER HOW MANY VIEWERS ARE BEING SERVED.

But the news implications of somebody setting up a webcam from their window in Baghdad or Darfur and serving a truly global audience is what appeals to me.

That - the live broadcasting of events to a large audience from your home computer - really was the last (technical) frontier for citizen journalism. Now add a server that takes video calls from UMTS phones and sets up a Neokast for whatever is sent, and for a few hundred dollars we can all have "portable news links" - and the next revolution in citizen journalism.

Update: Alright, after reading through the comments at Cringely's site I must admit that similar technologies have been around for a while (for example PeerCast) - but still, Neokast looks like an idea whose time has come and one that might move into the IT and blogging mainstream.


March 8, 2007

On The Parallel Future Of Programming

I wrote about it before, but it deserves to be repeated a couple of times:

  1. Processors are not getting faster at executing single-threaded programs anymore. In the past you could be sure that the next CPU generation would execute any program faster - this is no longer true.
  2. CPU development now centers on adding more and more processing cores - hence every compute-intensive application that wants to be fast needs to be multithreaded (see the sketch after this list).
  3. Current programming languages and tools are mostly not well suited for writing concurrent programs. In the coming years we will see a lot of development to address this shortcoming.
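To make points 2 and 3 concrete, here is a minimal sketch of my own (not taken from the linked posts): summing a large array in parallel across all available cores with java.util.concurrent. It works, but note how much ceremony replaces a one-line sequential loop.

```java
import java.util.*;
import java.util.concurrent.*;

// Minimal sketch: summing a large array in parallel, one chunk per core.
public class ParallelSum {
    public static void main(String[] args) throws Exception {
        final long[] data = new long[10000000];
        Arrays.fill(data, 1L);

        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        // every core computes the sum of its own chunk
        List<Future<Long>> partialSums = new ArrayList<Future<Long>>();
        int chunkSize = data.length / cores;
        for (int i = 0; i < cores; i++) {
            final int from = i * chunkSize;
            final int to   = (i == cores - 1) ? data.length : from + chunkSize;
            partialSums.add(pool.submit(new Callable<Long>() {
                public Long call() {
                    long sum = 0;
                    for (int j = from; j < to; j++) sum += data[j];
                    return sum;
                }
            }));
        }

        // combine the partial results
        long total = 0;
        for (Future<Long> partial : partialSums) total += partial.get();
        pool.shutdown();

        System.out.println("sum = " + total);
    }
}
```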

At FZI we just bought our first quad-core machines - but obviously 4 is not going to be the limit - Intel has already demoed an 80-core chip.

To learn more about this you can read the posts at O'Reilly Radar here and here.

Google Video also has TechTalks about a proposal to add better control abstractions to Java (which could be a simple step to improve concurrent programming with Java) and about MapReduce - a control abstraction Google uses to more easily take advantage of multiple processors.
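The appeal of MapReduce is that the parallelism disappears behind a single control abstraction. The following is a rough sketch of that idea (my own simplification in plain Java, not Google's API): the caller supplies only a map and a reduce function and never touches a thread.

```java
import java.util.*;
import java.util.concurrent.*;

// Rough sketch of the MapReduce idea (my own simplification, not Google's API):
// the caller supplies only map and reduce functions; the threading is hidden inside.
public class MiniMapReduce {
    interface Mapper<I, O> { O map(I input); }
    interface Reducer<O>   { O reduce(O left, O right); }

    // assumes a non-empty input list
    static <I, O> O run(List<I> inputs, final Mapper<I, O> mapper, Reducer<O> reducer)
            throws Exception {
        ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        List<Future<O>> mapped = new ArrayList<Future<O>>();
        for (final I input : inputs) {                       // map phase, in parallel
            mapped.add(pool.submit(new Callable<O>() {
                public O call() { return mapper.map(input); }
            }));
        }
        O result = mapped.get(0).get();                      // reduce phase
        for (int i = 1; i < mapped.size(); i++) {
            result = reducer.reduce(result, mapped.get(i).get());
        }
        pool.shutdown();
        return result;
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = Arrays.asList("to be or not to be", "that is the question");
        // count words per line in parallel, then merge the counts
        Integer total = run(lines,
            new Mapper<String, Integer>() {
                public Integer map(String line) { return line.split("\\s+").length; }
            },
            new Reducer<Integer>() {
                public Integer reduce(Integer a, Integer b) { return a + b; }
            });
        System.out.println(total + " words");
    }
}
```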

There's also an enjoyable video about how a modern computer game takes advantage of multiple cores (about Alan Wake, the new game from the makers of Max Payne).  


Publication Blues

I thought that only ever publishing accepted papers gives a wrong impression of my work (and probably that of many other scientists). So, here: three papers that didn't get published (yet).

The Verification of Rule Bases - this one I put together rather hastily for the PING symposium. It tries to describe the software engineering challenges posed by rule languages and how my dissertation addresses them. It didn't work out; apparently it was completely impossible to understand what I wrote. Luckily it was only an extended abstract, so I'll just throw it away.

A Semi-Automatic Debugger for Subject Matter Experts (with Anthony Jameson) - well, the title says it all: a semi-automatic debugger for F-logic, created for Project Halo. In this case the workshop we submitted it to, AADEBUG, got canceled - apparently not that many people are interested in automated debugging systems these days (that will change; these kinds of tools really work better with Semantic Web languages than with procedural and object-oriented languages). The things described in this paper are still valid and new - I'm still looking for a venue to publish it. Sadly the evaluation is (so far) too weak to have a chance at a decent conference.

Finally, an ESWC submission about test-driven development of rule bases by domain experts (together with Michael Erdmann and Anthony Jameson). This was a really annoying reject, because a lot of work went into the paper and the reviews were superficial at best. The paper had its limitations and submitting it to ESWC was a long shot, but still ... you didn't get the impression that the reviewers took the time to understand it.

And now, hopefully, the next publication posts will only be about accepted papers :)


Query Refinement @ Google

Notice the "See results for: flora margarine" line in the search results screenshot.

So, this was a completely useless refinement proposal for me (I was looking for the flora inference engine) - but since when does Google offer such refinement proposals? I knew the "Did you mean" proposals, but hadn't seen this kind of offer to restrict the number of search results.

On second thought, this is probably built on the same basis as the "Did you mean" feature - they have a list of very common multiword phrases and simply propose one of these when someone enters a part of it ... but still.
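If that guess is right, the mechanism could be as simple as the following sketch (my own speculation, certainly not Google's code): keep a list of frequent multiword phrases and, whenever the query matches the leading word(s) of one of them, propose the full phrase as a refinement.

```java
import java.util.*;

// Sketch of the guessed mechanism (certainly not Google's code): propose a
// frequent multiword phrase as a refinement when the query is a prefix of it.
public class RefinementGuesser {
    private final List<String> frequentPhrases;

    public RefinementGuesser(List<String> frequentPhrases) {
        this.frequentPhrases = frequentPhrases;
    }

    public List<String> proposeRefinements(String query) {
        String q = query.trim().toLowerCase();
        List<String> proposals = new ArrayList<String>();
        for (String phrase : frequentPhrases) {
            // the query must match a whole leading word of the phrase,
            // e.g. "flora" matches "flora margarine" but not "floral arrangements"
            if (phrase.startsWith(q + " ")) {
                proposals.add(phrase);
            }
        }
        return proposals;
    }

    public static void main(String[] args) {
        RefinementGuesser g = new RefinementGuesser(
            Arrays.asList("flora margarine", "flora and fauna", "inference engine"));
        System.out.println(g.proposeRefinements("flora"));
        // prints: [flora margarine, flora and fauna]
    }
}
```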

March 5, 2007

Sporadic Link Post

Some particularly interesting/enjoyable links from the past weeks:

SecondLife: Please stop doing that to the cat.
Herding the Mop - social sites become more important and a more interesting target for spammers.  
Web 2.0 as a story to be destroyed by hackers.
Bookmarklets - the evil lurking in your browser - often underrated: what a bookmarklet can do.
Java 2007 - what lies ahead for the Java language? 
Video about evil Google master plan.

And as usual: you can find all links at del.icio.us; the newest 15 are also always shown in the sidebar of this blog.
