Abstract
There is a wide gap between the information that people and computers can operate with online. Because most of the web is plain text while the Semantic Web requires structured information (RDF), bridging the two worlds is an important current research topic. Here we propose a web service that uses a Random Indexing (RI) semantic space trained on the plain text of the one million most central Wikipedia concepts. The space provides vectors for each of the equivalent DBpedia concepts and for any text or webpage. It can also provide a hashed version of the RI vector that works as a unique handle, much as URIs do, but with the added advantage that it represents the text's meaning. As a result, any page (previously readable only by humans) is integrated into the Semantic Web graph through links to one of its most central parts, DBpedia.
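The abstract names two mechanisms without implementation details: accumulating an RI semantic space from plain text, and hashing an RI vector into a compact identifier that still reflects meaning. The sketch below is not from the paper; the dimensionality, window size, sparse index vectors, and the SimHash-style random-hyperplane hash are all illustrative assumptions about how such a pipeline could be built.

```python
import numpy as np

RNG = np.random.default_rng(42)
DIM = 1000    # dimensionality of the semantic space (assumed, for illustration)
NONZERO = 10  # number of non-zero (+1/-1) entries per sparse index vector

def index_vector(dim=DIM, nonzero=NONZERO):
    """Sparse ternary index vector: a few random +1/-1 entries, rest zero."""
    v = np.zeros(dim)
    pos = RNG.choice(dim, size=nonzero, replace=False)
    v[pos] = RNG.choice([-1.0, 1.0], size=nonzero)
    return v

class RandomIndexingSpace:
    """Minimal Random Indexing: each word gets a fixed sparse index vector;
    a word's context vector accumulates the index vectors of the words
    co-occurring with it inside a sliding window."""

    def __init__(self, window=2):
        self.window = window
        self.index = {}    # word -> fixed sparse index vector
        self.context = {}  # word -> accumulated context vector

    def _iv(self, word):
        if word not in self.index:
            self.index[word] = index_vector()
        return self.index[word]

    def train(self, tokens):
        for i, word in enumerate(tokens):
            ctx = self.context.setdefault(word, np.zeros(DIM))
            lo = max(0, i - self.window)
            hi = min(len(tokens), i + self.window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx += self._iv(tokens[j])

    def text_vector(self, tokens):
        """Vector for an arbitrary text: sum of its words' context vectors."""
        v = np.zeros(DIM)
        for w in tokens:
            if w in self.context:
                v += self.context[w]
        return v

def simhash(vector, n_bits=64):
    """Hash a vector to n_bits via random hyperplane projections (sign of
    each dot product): similar vectors yield similar bit strings. This is
    a standard LSH scheme, assumed here, not necessarily the authors'."""
    planes = np.random.default_rng(0).standard_normal((n_bits, len(vector)))
    bits = (planes @ vector) >= 0
    return int("".join("1" if b else "0" for b in bits), 2)
```

A usage sketch: train on tokenized article text, then derive a 64-bit meaning-bearing handle for any new text.

```python
space = RandomIndexingSpace(window=2)
space.train("dbpedia extracts structured data from wikipedia".split())
vec = space.text_vector("structured wikipedia data".split())
print(f"{simhash(vec):016x}")  # compact identifier reflecting the text's meaning
```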
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 47-58 |
| Number of pages | 12 |
| Journal | CEUR Workshop Proceedings |
| Volume | 611 |
| State | Published - 2010 |
| Externally published | Yes |
| Event | 2nd Workshop on Inductive Reasoning and Machine Learning on the Semantic Web, IRMLeS 2010, held in conjunction with the 7th Extended Semantic Web Conference, ESWC 2010, Heraklion, Greece, May 31 2010 |
Keywords
- Identifiers
- Literals
- RDF
- Resources
- Statistical semantics
- Structured information
- Text mining
ASJC Scopus subject areas
- General Computer Science