Extracting Knowledge from the Web, Web 2.0, and Semantic Web
This work considers and applies the extraction of knowledge from different sources such as the Web, Web 2.0, and Web 3.0. The different data sources will be analyzed and relevant information extracted from them. The extracted information comprises events that occur in the real world, persons and objects participating in those events, roles, organizations (and their units), dates and times, places, and so on. The data sources considered include, among others:
- Web 1.0: Professional content such as BBC, CNN, etc.
- Web 2.0: User-generated content (UGC) such as Last.fm, Blogs, YouTube, Yahoo! Answers, Flickr, etc.
- Web 3.0 + Linked Open Data: Semantic descriptions of Web 3.0 content by means of RDFa and Microformats. Systematic analysis of data sources published on the web, such as DBpedia and GeoNames, as well as professional content such as the BBC programme data (http://linkeddata.org/).
Different tools shall be applied to extract the information from the web, such as entity detectors and tools that create semantic networks from natural-language text. These tools can be applied in sequence and connected into a process chain. To represent the extracted information, existing ontologies for multimedia data and events shall be used.
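The idea of a process chain can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Document` container, the small gazetteer, the date pattern, the event identifier `event:1`, and the predicate names are hypothetical and are not taken from any existing tool or ontology. A real chain would plug in actual entity detectors and map the results to an existing event or multimedia ontology.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Document:
    """Text plus the annotations added by each step of the chain."""
    text: str
    entities: List[Tuple[str, str]] = field(default_factory=list)      # (surface form, type)
    triples: List[Tuple[str, str, str]] = field(default_factory=list)  # (subject, predicate, object)


# Hypothetical gazetteer; a real entity detector would be far richer.
GAZETTEER = {"BBC": "Organization", "London": "Place", "Berlin": "Place"}
# Assumption: dates appear in ISO form (YYYY-MM-DD).
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

# Hypothetical predicate names, standing in for terms of an existing ontology.
PREDICATES = {
    "Organization": "involvesOrganization",
    "Place": "takesPlaceAt",
    "Date": "occursOn",
}


def detect_entities(doc: Document) -> Document:
    """Step 1: toy entity detection by gazetteer lookup and a date pattern."""
    for name, etype in GAZETTEER.items():
        if re.search(rf"\b{re.escape(name)}\b", doc.text):
            doc.entities.append((name, etype))
    for match in DATE_RE.finditer(doc.text):
        doc.entities.append((match.group(), "Date"))
    return doc


def build_network(doc: Document) -> Document:
    """Step 2: link every detected entity to a single placeholder event node."""
    for surface, etype in doc.entities:
        doc.triples.append(("event:1", PREDICATES[etype], surface))
    return doc


def run_chain(doc: Document, steps: List[Callable[[Document], Document]]) -> Document:
    """Apply the steps one after another, passing the document along."""
    for step in steps:
        doc = step(doc)
    return doc


result = run_chain(
    Document("The BBC reported on a concert in London on 2010-05-14."),
    [detect_entities, build_network],
)
for triple in result.triples:
    print(triple)
```

Because each step takes a `Document` and returns one, further tools could be appended to the list passed to `run_chain` without changing the existing steps.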