December 26, 2007

Accessing SPARQL Endpoints from within Yahoo Pipes

Well, at least until the 'Semantic Web Pipes' are ready for prime time: a web service that lets you query SPARQL endpoints from within Yahoo Pipes. Look at the example below: it shows a simple pipe that takes a name as input, uses it to query the DBLP SPARQL endpoint, and returns the result as a web page, JSON, and RSS. You can try the pipe here. Surely an RSS feed for the publications from DBLP could have been obtained without RDF-SPARQL-Pipes; however, we can now access all kinds of SPARQL endpoints and have the entire functionality of Yahoo Pipes at hand to combine them with other (possibly non-SemWeb) content.

[Screenshot of the 'sparqlr' pipe]

Let me quickly explain the pipe: the 'Please enter the name' element defines the 'name' input to the pipe. The 'String Builder' block uses this name to build a SPARQL query, and the 'Item Builder' combines the query and the endpoint URL (http://www4.wiwiss.fu-berlin.de/dblp/sparql, in this case) into an item that will be sent to the web service. The web service (which lives at http://soboleo.fzi.de:8080/PipesSparqlr/sparql [1]) takes the query and endpoint URL, sends the query to the endpoint, and translates the answer into a simpler JSON format [2]. Any error encountered is simply returned instead of a result, so you can see it in the debugger view of Yahoo Pipes. The last operator, the Regex element, strips everything but letters from each item's title. Sadly that's necessary because somewhere along the line the character encodings get mixed up, and this trips up Yahoo Pipes so badly that no result is returned as soon as one of the titles contains a character like the German 'ä' or 'ö'. I'll try to fix this someday. The source code for the web service (all ~100 lines of it ;) ) is available here - feel free to use it any way you like. You'll need the JSON library and Java 1.5+ to compile it, and a servlet container (I use Tomcat 5.5.something) to run it.
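To give an idea of what the 'String Builder' and 'Item Builder' steps produce, here is a small Java sketch of building the query and the request URL. The query shape and property URIs are hypothetical stand-ins for illustration, not the exact ones the pipe uses:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SparqlQueryBuilder {

    // Build a SPARQL query for publications by authors with the given name.
    // The vocabulary here (foaf/dc) is illustrative, not the exact DBLP schema.
    static String buildQuery(String name) {
        return "PREFIX foaf: <http://xmlns.com/foaf/0.1/> "
             + "SELECT ?pub ?title WHERE { "
             + "?author foaf:name \"" + name + "\" . "
             + "?pub <http://purl.org/dc/elements/1.1/creator> ?author . "
             + "?pub <http://purl.org/dc/elements/1.1/title> ?title . }";
    }

    // Combine endpoint URL and query into the GET request the service issues.
    static String buildRequestUrl(String endpoint, String query) {
        return endpoint + "?query="
             + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String q = buildQuery("Tim Berners-Lee");
        System.out.println(buildRequestUrl(
            "http://www4.wiwiss.fu-berlin.de/dblp/sparql", q));
    }
}
```

In the real pipe the equivalent of buildQuery happens inside Yahoo Pipes itself; only the sending and translating is done by the web service.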

[1]: Feel free to use this webservice but don't count on it staying there forever.
[2]: Just passing through the SPARQL query result XML caused problems with Yahoo Pipes, which expects either JSON or RSS.
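As a rough sketch of that translation step - my own simplified reconstruction, not the service's actual code; it ignores datatypes, language tags, and JSON string escaping - the SPARQL result XML can be flattened into a JSON array of objects, one per result row:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SparqlXmlToJson {

    // Flatten SPARQL result XML into a JSON array: one object per <result>,
    // mapping each binding's variable name to its text value.
    static String toJson(String sparqlXml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);  // results use the sparql-results# namespace
            Document doc = f.newDocumentBuilder().parse(
                new ByteArrayInputStream(sparqlXml.getBytes(StandardCharsets.UTF_8)));
            NodeList results = doc.getElementsByTagNameNS("*", "result");
            StringBuilder json = new StringBuilder("[");
            for (int i = 0; i < results.getLength(); i++) {
                if (i > 0) json.append(",");
                json.append("{");
                NodeList bindings = ((Element) results.item(i))
                    .getElementsByTagNameNS("*", "binding");
                for (int j = 0; j < bindings.getLength(); j++) {
                    Element b = (Element) bindings.item(j);
                    if (j > 0) json.append(",");
                    json.append("\"").append(b.getAttribute("name"))
                        .append("\":\"").append(b.getTextContent().trim())
                        .append("\"");
                }
                json.append("}");
            }
            return json.append("]").toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Feeding this a result document with one row binding ?title to "Pipes" yields [{"title":"Pipes"}] - flat enough for Yahoo Pipes to digest.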


December 18, 2007

Did You Know?

More than 50% of U.S. 21-year-olds have created content on the web
More than 70% of U.S. 4-year-olds have used a computer
Every day, more text messages are sent than there are people on this planet

These numbers and many more are in a video titled "Did you Know" available here and embedded below:

(via information aesthetics)


December 9, 2007

Defining How An Application Can Be Semantic

There seems to be quite a bit of confusion about the different meanings "Semantics" can have in computer science - as you can see, for example, from Read/Write's 10 Semantic Apps to Watch; an interesting article, but one that starts with a nonsensical classification of Semantic applications into 'top down' and 'bottom up'. So - an attempt to give a better classification of the different ways in which an application can be 'Semantic'.

A Semantic application is one that tries to improve some computing task by explicitly considering the meaning and context of the symbols it is manipulating. This is still very nonspecific, but will become clearer when we consider the four ways in which this can be instantiated:

  1. Semantics as in "The Semantic search engine Powerset". These approaches use natural language processing techniques to give context to words in texts; e.g. to understand that the string "SAP" in a document refers to the company and not to some other sense of the word.
  2. Semantics as in "The lowercase semantic web". These approaches try to build the web of data by using machine-understandable markup and establishing information interchange formats; e.g. by embedding <a href="http://technorati.com/tag/SemanticWeb" rel="tag"> in this page I can associate it with the topic Semantic Web, in this way giving the document some context. I've used microformats for this example, but many applications of RDF are semantic in this sense.
  3. Semantics as in "The Semantics of OWL 1.1". These approaches define the meaning of symbols by associating them with a mathematical theory that exactly defines what follows from any collection of symbols.
  4. Semantics as in "Semantic Portal". These approaches use technologies that allow data to be represented flexibly, without a fixed schema; technologies such as RDF that make it easy to represent diverse data that is interconnected in a myriad of ways. Twine is an example of an application that's semantic (mainly) in this sense; so are TripIt and Freebase.
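To make sense 3 concrete, here is the kind of entailment a model-theoretic semantics pins down exactly (a textbook description-logic example, not one taken from any of the applications above): given the axioms

\[ \mathit{Professor} \sqsubseteq \mathit{Person} \qquad \mathit{Professor}(\mathit{ana}) \]

every model satisfying them also satisfies \( \mathit{Person}(\mathit{ana}) \) - the semantics leaves no room for debate about whether this conclusion follows from the symbols.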

(and yes, many applications are semantic in more than one sense).


December 8, 2007

CfP - Social Aspects of the Web

I thought this might interest readers of this blog: the 2nd workshop on social aspects of the web - SAW 2008 (for which I happen to be on the PC) - is looking for contributions, to be submitted by the 12th of January. Topics include privacy in the social web, communities on the web, large-scale social web mining and empirical studies, social software on the Semantic Web... the full CfP is available here. The workshop is held in conjunction with the 11th International Conference on Business Information Systems (BIS2008), which - colleagues who've been there tell me - is a good conference.

The picture to the right is from the scenic Grossglockner Hochalpenstrasse - which is close to Innsbruck and hence to the location of the workshop and conference. It's the most beautiful mountain road I've ever driven.


The Real Significance Of Amazon's Kindle

Amazon's Kindle is the first truly transparently Internet-enabled appliance. It's the first mainstream appliance that has mobile Internet access without the user needing a contract or worrying about it in any way. For years we have been talking about the rise of ubiquitous computing and a myriad of connected devices, and now - finally - this is starting to become reality.

I do not have any insider knowledge about the cost of the necessary hardware or of the deal between Amazon and Sprint for supplying the network connection - but it's probably safe to say that it would not be too expensive to include similar chips in the next generation of cars, enabling them to relay data about their status and to receive software upgrades. What we seem to be witnessing is that it is becoming cheap to build and run devices that are potentially always connected but only rarely need to transmit data. Combine this with steadily decreasing costs for bandwidth, hardware, and cell-based location technology (e.g. MyLocation), and you get chips embedded in your heating system so it can automatically call a technician, $10 fire detectors capable of alerting the fire department, trash cans informing the central office that they are full ... all ideas that have been tossed around for a decade or so - but Amazon's Kindle reminds us that they are slowly becoming realistic. And maybe this development will even pick up speed considerably, if Google manages to buy its own spectrum for an 'open' mobile network.

If you want to read more about the Kindle, its possible impact and its relation to social media there is an interesting series of articles at O'Reilly's Radar: Kindle Fundamentals, Kindle Economics and Kith and Kindle.  You might also want to have a look at the 'Most Unusual Books' - from where I took the picture accompanying this post.

December 2, 2007

We Need A New RDF Schema Language

More or less everyone with less than a year of Semantic Web experience misunderstands RDF(S). They try to say something like "an animal can have an attribute number_of_legs of type (positive) integer" and end up saying something like "everything that has a positive number of legs is of type animal". The common response to such mistakes is to lecture them about logic, the open world assumption, and open architectures - when it should be to go and design a schema language that conforms to their expectations. I'm not saying that RDF(S) should be discarded, but that there is a clear need for another language, an RDF DTD, that allows one to restrict what an RDF document may look like. In a future of automatic interoperability through formalized background knowledge this may not be needed; but in the current state of the Semantic Web, where almost all applications rely on RDF data conforming to some schema in this DTD sense, such a language seems to be urgently needed.
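To illustrate the misunderstanding with a concrete snippet (a hypothetical vocabulary of my own making, not any real schema):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/> .

# What the newcomer means to say: "animals have a number_of_legs attribute".
ex:number_of_legs rdfs:domain ex:Animal .

# But rdfs:domain licenses inference rather than restricting data:
# from the triple below, an RDFS reasoner concludes that ex:myTable
# is an ex:Animal - no error, no constraint violation.
ex:myTable ex:number_of_legs 4 .
```

A schema language in the DTD sense would instead reject the second triple (or at least flag it), which is what newcomers expect rdfs:domain to do.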
