Software Engineering Colloquium: SemaPlorer - Winning the Billion Triple Challenge
Carsten Saathoff, Simon Schenk
SemaPlorer is an easy-to-use application that allows end
users to interactively explore and visualize a very large, mixed-quality
and semantically heterogeneous distributed semantic data set in real time.
Its purpose is to help users acquaint themselves with a city, tourist area,
or other area of interest. By visualizing the data using a map, media,
and different context views, we clearly go beyond simple storage and retrieval
of large numbers of triples. The interaction with the large data
set is driven by the user. SemaPlorer leverages different semantic data
sources such as DBpedia, GeoNames, WordNet, and personal FOAF files.
It also connects to a large Flickr data set converted
to RDF. SemaPlorer's storage infrastructure is based on Amazon's
Elastic Compute Cloud (EC2) and Simple Storage Service (S3).
We apply NetworkedGraphs as an additional layer on top of EC2, serving as a
large, federated data infrastructure for semantically heterogeneous data
sources from within and outside of the cloud. Therefore, the application
is scalable with respect to the number of distributed components working
together as well as the number of triples managed overall. Hence,
SemaPlorer is flexible enough to leverage almost arbitrary additional
data sources that might be added in the future for exploration.
Contributors: Simon Schenk, Carsten Saathoff, Anton Baumesberger, Frederik Jochum,
Alexander Kleinen, Steffen Staab, and Ansgar Scherp