Interlinking DBpedia with other Data Sets
Linked Data is a method to publish data on the Web and to interlink data between different data sources. Linked Data can be accessed using Semantic Web browsers, just as traditional Web documents are accessed using HTML browsers. However, instead of following document links between HTML pages, Semantic Web browsers enable surfers to navigate between different data sources by following RDF links. RDF links can also be followed by robots or Semantic Web search engines in order to crawl the Semantic Web. See Linked Data – The Story so far and How to publish Linked Data on the Web for more information about Linked Data.
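To make the idea of following RDF links concrete, here is a minimal Python sketch of a crawler. It is an illustration only: the `WEB` dictionary is a hypothetical in-memory stand-in for dereferencing URIs over HTTP, the `example.org` URI and the `crawl` function are invented for this example, and the toy triples are not an excerpt from any real data set.

```python
from collections import deque

# Hypothetical stand-in for the Web: each URI maps to the
# (predicate, object) pairs a client would receive when dereferencing it.
WEB = {
    "http://example.org/people#alice": [
        ("foaf:based_near", "http://dbpedia.org/resource/Berlin"),
    ],
    "http://dbpedia.org/resource/Berlin": [
        ("dbo:country", "http://dbpedia.org/resource/Germany"),
    ],
    "http://dbpedia.org/resource/Germany": [],
}


def crawl(start, web):
    """Breadth-first traversal that follows each RDF link at most once."""
    seen = [start]           # visit order, also used to avoid revisiting
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        for _predicate, target in web.get(uri, []):
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


visited = crawl("http://example.org/people#alice", WEB)
```

Starting from a single profile URI, the crawler reaches resources in a second data source purely by following links, which is the behaviour described above for robots and Semantic Web search engines.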
The DBpedia data set is interlinked with various other data sources (see voiD description). The Linking Open Data cloud diagram by Richard Cyganiak and Anja Jentzsch (http://lod-cloud.net/) gives an overview of some of these data sources; the table below lists them:
Data Set | Description | Number of Links | Example |
Amsterdam Museum | Information about cultural heritage objects related to the city of Amsterdam. | 630 | |
BBC Wildlife Finder | Information about wildlife biota, habitats, adaptations and ecozones. | 450 | |
Book Mashup | Provides information about books. | 9,000 | |
Bricklink | Unofficial Lego marketplace. | 10,000 | |
CORDIS | Information on all EU programmes and projects. | 300 | |
Dailymed | Provides information about drugs. | 900 | Eli Lilly and Company |
DBLP Bibliography | Provides information about scientific publications. | 200 | Tim Berners-Lee |
DBTune | Provides freely available data concerning music. | 840 | |
Diseasome | Provides information about diseases and genes. | 2,300 | Asthma |
Drugbank | Provides information about drugs and genes. | 4,800 | ZNF3 |
EUNIS | Information on species, habitat types and sites. | 11,000 | |
Eurostat (Linked Statistics) | Covers a range of areas, from the economy and demographics to trade and transport. | 250 | |
Eurostat (WBSG) | Provides information about European countries and regions. | 140 | France |
CIA World Factbook | Provides information about countries. | 550 | France |
flickr wrappr | A wrapper around flickr that tries to generate a photo collection for each DBpedia concept. | 4,000,000 | Brandenburg Gate |
Freebase | An open-license database about millions of things from various domains. | 3,900,000 | Tetris |
GADM | Spatial database of the location of the world's administrative areas. | 39,000 | |
GeoNames | Provides information about geographic features. | 425,000 | Cambridge |
GeoSpecies | Information on biological orders, families, species as well as species occurrence records and related data. | 16,000 | |
Global Health Observatory | Provides access to statistical data about health problems. | 200 | |
Project Gutenberg | Provides information about authors and open access to their work. | 2,500 | John Bunyan |
Italian Public Schools | Provides information on public schools in Italy. | 5,800 | |
LinkedGeoData | Spatial knowledge base. | 104,000 | |
LinkedMDB | Provides information on movies. | 14,000 | |
MusicBrainz | Provides information about artists and music. | 23,000 | Portishead |
New York Times | Links between NYT subject headings and DBpedia concepts. | 9,700 | South Korea |
OpenCyc | An open-license version of the Cyc Ontology. | 27,000 | Woody Allen |
OpenEI (Open Energy Info) | Provides energy-related information. | 680 | |
Revyu | Universal reviews. | 6 | |
Sider | Provides information about side effects of drugs. | 2,000 | Claudication |
TCMGeneDIT | Information on traditional Chinese medicine, genes and diseases. | 900 | |
UMBEL | A lightweight, subject concept reference structure derived from Cyc. | 900,000 | Place |
US Census | Provides US Census data. | 12,600 | Los Angeles |
WikiCompany | Provides information on companies. | 8,300 | |
Wikidata | Structured data related to Wikipedia items. | 5,200,000 | |
WordNet | W3C RDF/OWL representation of the WordNet ontology. | 470,000 | Air France |
YAGO | Cross-domain knowledge base. | 2,900,000 instance links, 41,000,000 type statements | |
The W3C Linking Open Data Community Project
DBpedia is part of the W3C Linking Open Data community project, an effort to publish and interlink various open data sources. As of September 2011, this effort had built a Web of interlinked data sources amounting to more than 31 billion RDF triples. Please refer to the project's data sets page for a list of all published data sets.
Linking to DBpedia from Your Dataset
The Silk Link Discovery Framework can be used to generate new links to DBpedia based on user-provided link specifications, which are expressed in the Silk Link Specification Language (Silk-LSL).
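The following Python sketch illustrates the general idea behind link discovery: compare entities from two data sets and emit a link when they look similar enough. It is not Silk and does not use Silk-LSL. The label-similarity rule, the `threshold` value, the function names, and the `example.org` URIs are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def discover_links(local, remote, threshold=0.9):
    """Pair up URIs whose labels are at least `threshold` similar.

    `local` and `remote` map URIs to human-readable labels.
    The threshold is an arbitrary, illustrative choice.
    """
    links = []
    for local_uri, local_label in local.items():
        for remote_uri, remote_label in remote.items():
            score = SequenceMatcher(
                None, local_label.lower(), remote_label.lower()
            ).ratio()
            if score >= threshold:
                links.append((local_uri, remote_uri))
    return links


def to_ntriples(links):
    """Serialize discovered links as owl:sameAs statements in N-Triples."""
    return [f"<{s}> <{OWL_SAME_AS}> <{o}> ." for s, o in links]


# Hypothetical local data set and a tiny hand-written DBpedia label extract.
local = {
    "http://example.org/city/1": "Berlin",
    "http://example.org/city/2": "Cambridge",
}
remote = {
    "http://dbpedia.org/resource/Berlin": "Berlin",
    "http://dbpedia.org/resource/Tetris": "Tetris",
}

links = discover_links(local, remote)
```

Calling `to_ntriples(links)` then yields statements that can be published alongside the local data set. A real link specification would typically combine several properties rather than rely on labels alone, since labels such as "Cambridge" are ambiguous.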
Linking to DBpedia from Your FOAF Profile
As Wikipedia contains articles about many general-purpose concepts, DBpedia can also be seen as a huge ontology that assigns URIs to plenty of concepts and backs these URIs with dereferenceable RDF descriptions.
If you have a FOAF profile and you need terms for describing your interests or your location, you might consider using DBpedia URIs. This will allow RDF browsers like Disco, Tabulator, or the OpenLink Data Web Browser to browse from your FOAF profile into DBpedia. The links also allow clients like the Semantic Web Client Library to answer SPARQL queries over both data sources.
The example below shows an RDF link from Richard Cyganiak's FOAF profile which states that he is based near Berlin.
<http://richard.cyganiak.de/foaf.rdf#cygri> foaf:based_near <http://dbpedia.org/resource/Berlin> .
You can use the Disco browser to follow this link.
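The benefit of such a link for querying can be sketched in Python. The snippet below merges a FOAF profile with a small DBpedia extract and answers a question that neither source can answer alone. It is a toy pattern matcher, not a SPARQL engine and not the Semantic Web Client Library; the `match` and `based_near_labels` helpers and the two-triple DBpedia extract are assumptions made for illustration.

```python
FOAF_BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
CYGRI = "http://richard.cyganiak.de/foaf.rdf#cygri"

# Triples from the FOAF profile.
profile = {
    (CYGRI, FOAF_BASED_NEAR, "http://dbpedia.org/resource/Berlin"),
}

# Hand-written stand-in for data fetched from DBpedia.
dbpedia_extract = {
    ("http://dbpedia.org/resource/Berlin", RDFS_LABEL, "Berlin"),
    ("http://dbpedia.org/resource/Tetris", RDFS_LABEL, "Tetris"),
}


def match(graph, s=None, p=None, o=None):
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in graph:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def based_near_labels(graph, person):
    """Join two patterns: person -> place, then place -> label."""
    labels = []
    for _s, _p, place in match(graph, s=person, p=FOAF_BASED_NEAR):
        for _s2, _p2, label in match(graph, s=place, p=RDFS_LABEL):
            labels.append(label)
    return sorted(labels)


merged = profile | dbpedia_extract
result = based_near_labels(merged, CYGRI)
```

Running the same query against `profile` alone returns an empty list, because the label lives in the other data source. The shared DBpedia URI is what joins the two.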
DBpedia URIs can also be used to express your interests within your FOAF profile. For example:
<http://richard.cyganiak.de/foaf.rdf#cygri> foaf:topic_interest <http://dbpedia.org/resource/Tetris> .
<http://richard.cyganiak.de/foaf.rdf#cygri> foaf:topic_interest <http://dbpedia.org/resource/Semantic_Web> .
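If you generate your profile programmatically, statements like these can be produced from a list of Wikipedia article titles. The Python sketch below assumes that a DBpedia resource URI is formed by replacing spaces with underscores and percent-encoding the remaining reserved characters, which matches the URIs shown on this page but may not hold for every DBpedia release. The helper names are hypothetical.

```python
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
FOAF = "http://xmlns.com/foaf/0.1/"


def dbpedia_uri(title):
    """Build a DBpedia resource URI from an article title.

    Assumption: spaces become underscores and other reserved
    characters are percent-encoded.
    """
    return DBPEDIA_RESOURCE + quote(title.replace(" ", "_"), safe="")


def interest_triples(person_uri, titles):
    """Return one foaf:topic_interest statement (N-Triples) per title."""
    return [
        f"<{person_uri}> <{FOAF}topic_interest> <{dbpedia_uri(title)}> ."
        for title in titles
    ]


triples = interest_triples(
    "http://richard.cyganiak.de/foaf.rdf#cygri",
    ["Tetris", "Semantic Web"],
)
```

It is worth checking that each generated URI actually dereferences before publishing it, since article titles change and not every title has a DBpedia resource.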
Another use case for DBpedia URIs could be to categorize or tag blog posts, wiki pages, or other documents. For example:
<http://news.cnn.com/item1143> dc:subject <http://dbpedia.org/resource/Iraq_War> ;
    foaf:primaryTopic <http://dbpedia.org/resource/Iraq_War> ;
    foaf:topic <http://dbpedia.org/resource/Middle_East> .
An interesting project that allows you to review anything that has a URI is the Revyu project, run by Tom Heath. A Revyu review of a film in DBpedia could look like this:
@prefix rev: <http://purl.org/stuff/rev#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<> a rev:Review ;
    rdfs:label "Review of Cold Mountain, by Alice" ;
    foaf:primaryTopic <http://dbpedia.org/resource/Cold_Mountain_%28film%29> ;
    rev:text "This movie sucks. Miss it." ;
    rev:rating 1 ;
    rev:minRating 1 ;
    rev:maxRating 5 ;
    rev:reviewer <http://example.com/alice/foaf.rdf#me> .
Inlinks to DBpedia
DBpedia is being linked to from a variety of datasets. The sum of inlinks is 39,007,478.