
Semantics scales in the Cloud: University of Koblenz wins Billion Triples Challenge

Everyone knows Web 2.0: Wikipedia, Google Maps, directory structures such as WordNet, Flickr, location names and much more are only a few clicks away. However, connecting such pieces of information remains tedious and difficult with traditional information systems. This is where the Semantic Web comes to the rescue: it interlinks information and makes it possible to ask intriguing questions, for example: Where can I find street art in Berlin, and what does it look like in photos?

However, semantic technologies have not had the best reputation so far. Common complaints concerned the complexity of the technology and its lack of speed. To show that recent developments in Semantic Web technologies have made such criticism outdated, Peter Mika (Yahoo!) and Jim Hendler (RPI) initiated the Billion Triples Challenge at the 7th International Semantic Web Conference. The challenge was to manage more than one billion ill-structured facts harvested from public sources such as Wikipedia and semantic home pages, and to make this information and its relationships available for easy access and intuitive interaction by lay users.

The research group “ISWeb – Information systems and Semantic Web” at the Universität Koblenz-Landau, led by Prof. Steffen Staab, outperformed the eight competing international teams. The ISWeb system “Semaplorer” offers traditional keyword search as well as content-based navigation along factual relationships. Searching for “Berlin”, for example, displays a map showing related information such as celebrities from Berlin and sights or points of interest in the city. From there, one may refine the search using further keywords or the semantics of terms: the “Reichstag” (the seat of the German parliament) is linked to photos taken in its vicinity, including information about their creators, as well as to information from Wikipedia about the political function of the Reichstag.

Because the sheer size of the information source and its multitude of heterogeneous structures and relationships threaten to overwhelm any single server, Simon Schenk from ISWeb developed a novel semantic technology, “networked graphs”, that allows the billion facts to be distributed across 25 computers or more. Since such computational power was not available locally, the team made use of cloud computing services from Amazon. Computing power could thus be provided partly by the University of Koblenz and partly by Amazon in the USA.
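To give a rough idea of what spreading facts over several machines can look like, here is a minimal Python sketch. It is not the networked graphs technology itself, whose internals this article does not describe; it swaps in plain hash partitioning by subject as a stand-in. Only the figure of 25 computers comes from the text above. The function names, the example triples and the in-memory "nodes" are hypothetical illustrations.

```python
import zlib
from collections import defaultdict

NUM_NODES = 25  # the article mentions 25 computers; everything else is illustrative


def node_for(subject, num_nodes=NUM_NODES):
    """Pick a node index from a stable hash of the triple's subject."""
    return zlib.crc32(subject.encode("utf-8")) % num_nodes


def distribute(triples, num_nodes=NUM_NODES):
    """Assign each (subject, predicate, object) triple to exactly one node."""
    nodes = defaultdict(list)
    for s, p, o in triples:
        nodes[node_for(s, num_nodes)].append((s, p, o))
    return nodes


def match(nodes, subject=None, predicate=None, obj=None, num_nodes=NUM_NODES):
    """Return triples matching a pattern; None acts as a wildcard."""
    if subject is not None:
        # A known subject lives on exactly one node, so only that node is asked.
        candidates = nodes.get(node_for(subject, num_nodes), [])
    else:
        # Otherwise every node has to be consulted.
        candidates = [t for part in nodes.values() for t in part]
    return [
        (s, p, o)
        for s, p, o in candidates
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Hypothetical example facts, loosely modelled on the Reichstag example.
triples = [
    ("Reichstag", "locatedIn", "Berlin"),
    ("Reichstag", "type", "Building"),
    ("photo42", "depicts", "Reichstag"),
    ("photo42", "creator", "alice"),
]
nodes = distribute(triples)
photos_of_reichstag = match(nodes, predicate="depicts", obj="Reichstag")
```

In this sketch, a lookup with a known subject touches a single node, while a lookup by predicate or object has to ask all of them. A real system at the scale of a billion facts would need indexes and network communication that are left out here.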

Prof. Staab commented on the success: “With Semaplorer we could demonstrate for the first time that the vision of Web 3.0, an advancement of Web 2.0 that includes semantics, actually works. So far, this had been done only with toy examples.” Initial commercial interest was even expressed during the Koblenz LocalBit fair, where the Semaplorer system was presented to the public at the same time as it was presented at the Billion Triples Challenge.

The “Semaplorer” project team includes the leader of the ISWeb research group, Prof. Dr. Steffen Staab; the project coordinator and multimedia expert Dr. Ansgar Scherp; the semantics experts Simon Schenk and Carsten Saathoff; and the computer science students Anton Baumesberger, Frederic Jochum and Alexander Kleinen. The prize was awarded by an international jury during the 7th International Semantic Web Conference, which, with more than 620 attendees, was the most successful in the series. The prize includes a check for €1000.

The research on the networked graphs infrastructure, the semantic annotation platform and the semantic multimedia expertise was funded by the EU through the ICT projects NeOn, k-space and WeKnowIt, respectively.

Further information is available at:

http://btc.isweb.uni-koblenz.de


Date of news Oct 30, 2008 08:00 AM
last modified Feb 18, 2010 04:17 PM
