
Extracting Knowledge from the Web, Web 2.0, and Semantic Web

In the context of this work, the extraction of knowledge from different sources such as the Web, Web 2.0, and Web 3.0 shall be studied and applied. The different data sources will be analyzed and the relevant information extracted. The information to be extracted includes events that occur in the real world, the persons and objects participating in those events, roles, organizations (and their units), dates and times, places, and so on. Data sources to be considered include, among others:

  • Web 1.0: Professional content such as BBC, CNN, etc.
  • Web 2.0: User-generated content (UGC) such as Last.fm, blogs, YouTube, Yahoo! Answers, Flickr, etc.
  • Web 3.0 + Linked Open Data: Semantic description of Web 3.0 content by means of RDFa and Microformats. Systematic analysis of data sources provided on the Web such as DBpedia and GeoNames, as well as professional content such as BBC Programmes (http://linkeddata.org/).


Different tools shall be applied for extracting information from the Web, such as entity detectors and tools that create semantic networks from natural-language text. These tools can be applied in sequence and connected into a process chain. For representing the extracted information, existing ontologies for multimedia data and events shall be used.
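The idea of a process chain can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the names `Document`, `detect_entities`, `link_entities`, and `run_chain` are hypothetical, the regular expression is a toy stand-in for a real entity detector, and the `co_occurs_with` relation is a placeholder for the predicates an actual event or multimedia ontology would provide.

```python
import re
from dataclasses import dataclass, field
from itertools import combinations
from typing import Callable, List, Tuple


@dataclass
class Document:
    """Carries the text and the results added by each step of the chain."""
    text: str
    entities: List[str] = field(default_factory=list)
    triples: List[Tuple[str, str, str]] = field(default_factory=list)


# A step takes a document and returns it enriched with new annotations.
Step = Callable[[Document], Document]


def detect_entities(doc: Document) -> Document:
    # Toy detector (assumption): treat runs of capitalized words as entities.
    # A real detector would also handle sentence-initial words, dates, etc.
    doc.entities = re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", doc.text)
    return doc


def link_entities(doc: Document) -> Document:
    # Toy semantic network (assumption): connect every pair of entities
    # found in the same text with a placeholder relation.
    doc.triples = [
        (a, "co_occurs_with", b) for a, b in combinations(doc.entities, 2)
    ]
    return doc


def run_chain(doc: Document, steps: List[Step]) -> Document:
    # Apply the tools one after another; each sees the previous step's output.
    for step in steps:
        doc = step(doc)
    return doc


result = run_chain(
    Document("Barack Obama visited Berlin."),
    [detect_entities, link_entities],
)
print(result.entities)
print(result.triples)
```

Because each step has the same signature, individual tools can be swapped or reordered without changing the rest of the chain, which is the main reason for modelling the tools this way in the sketch.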

last modified Sep 29, 2009 07:08 PM
