Entity Extraction: From Unstructured Text to DBpedia RDF Triples
In this paper, we describe an end-to-end system that automatically extracts
RDF triples describing entity relations and properties from unstructured
text. This system is based on a pipeline of text processing modules that includes a
semantic parser and a coreference solver. By using coreference chains, we group
entity actions and properties described in different sentences and convert them
into entity triples. We applied our system to over 114,000 Wikipedia articles and
extracted more than 1,000,000 triples. With an ontology-mapping system
bootstrapped from existing DBpedia triples, we mapped 189,000
extracted triples onto the DBpedia namespace. These extracted entities are available
online in the N-Triples format.
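
The grouping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the input structures (`chains`, `relations`), the function names, and the `http://example.org/` namespace are all hypothetical, and the predicates are assumed to be already extracted by some upstream parser. The sketch only shows how mentions in different sentences can be resolved to one entity through a coreference chain and then serialized as N-Triples lines.

```python
def escape_literal(text):
    # Escape backslashes and double quotes for an N-Triples string literal.
    return text.replace("\\", "\\\\").replace('"', '\\"')


def chains_to_ntriples(chains, relations, base="http://example.org/"):
    """Resolve subject mentions to entities and emit N-Triples lines.

    chains:    hypothetical dict, entity name -> list of mention ids
    relations: hypothetical list of (subject mention id, predicate, object text)
    base:      placeholder namespace, not the DBpedia namespace
    """
    # Invert the chains so each mention id points to its entity.
    mention_to_entity = {}
    for entity, mention_ids in chains.items():
        for mention_id in mention_ids:
            mention_to_entity[mention_id] = entity

    lines = []
    for mention_id, predicate, obj in relations:
        entity = mention_to_entity.get(mention_id)
        if entity is None:
            # The subject mention belongs to no chain, so it is skipped.
            continue
        lines.append(
            '<%sresource/%s> <%sproperty/%s> "%s" .'
            % (base, entity, base, predicate, escape_literal(obj))
        )
    return lines


# Two mentions ("m1", "m2") from different sentences refer to one entity.
chains = {"Luc_Besson": ["m1", "m2"]}
relations = [
    ("m1", "bornIn", "Paris"),
    ("m2", "directed", "Nikita"),
    ("m9", "wrote", "unresolved"),  # no chain contains m9
]
triples = chains_to_ntriples(chains, relations)
for line in triples:
    print(line)
```

Both emitted triples share the same subject even though their source mentions differ, which is the effect the coreference chains are meant to achieve. Mapping the placeholder predicates onto an ontology is a separate step that this sketch does not attempt.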
An archive of extracted entities in N3 format is available:
Exner, Peter, and Pierre Nugues. "Entity Extraction: From Unstructured Text to DBpedia RDF Triples." Proceedings of the Web of Linked Entities Workshop in conjunction with the 11th International Semantic Web Conference (ISWC 2012), Boston, USA, November 11, 2012.