Linked data and Semantics

By Sergio Garcia, March 3, 2010 11:03 am

Author: Sergio García, Telefónica I+D

For decades, the World Wide Web has evolved as a network of documents connected through hypertext links. These documents have usually been written to be read by humans. During the last decade, the Semantic Web initiative emerged to develop a set of languages and tools that let computers understand web content. Building on the Semantic Web standards, the Linked Data initiative consists of a set of best practices for publishing structured data on the Web, establishing a Web of Data.

Standards and Principles

Linked Data promotes publishing data in RDF (Resource Description Framework), together with other languages used to specify complex vocabularies or ontologies, such as RDF Schema (RDFS) or OWL (Web Ontology Language). The way this data is published and interlinked is based on four principles defined by Tim Berners-Lee in his Linked Data design issues note:

  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
  4. Include links to other URIs so that they can discover more things.

Following these principles, it is possible to publish and link pieces of RDF data and browse them in the same way we browse HTML documents. To achieve that, URIs must be assigned to concepts and things (people, places, etc.). This new type of resource is called a non-information resource, while HTML files, images, etc. are known as information resources.

When a Linked Data consumer asks a web server for an RDF representation of a non-information resource (e.g. http://dbpedia.org/resource/Valladolid), the server cannot return the thing itself. Instead, it redirects the client, typically with an HTTP 303 See Other response, to the URI of an information resource (e.g. http://dbpedia.org/data/Valladolid) that holds the RDF description of the requested resource. This RDF description may contain triples that point to further non-information resources:

<rdf:Description rdf:about="http://dbpedia.org/resource/Valladolid">
  <owl:sameAs xmlns:owl="http://www.w3.org/2002/07/owl#" rdf:resource="http://sws.geonames.org/6362308/"/>
</rdf:Description>
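The lookup-and-follow cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how DBpedia is actually implemented: the function names redirect_for and same_as_links are hypothetical, the /page/ path used for HTML clients is an assumption (the text above only mentions /resource/ and /data/), and the description is wrapped in an rdf:RDF element so that the rdf: prefix is declared and the fragment parses as XML.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"

RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def redirect_for(uri, accept):
    """Decide how a server might answer a lookup of a non-information resource.

    Returns (status, location). Assumption: RDF clients are sent to /data/,
    everyone else to an HTML page under /page/.
    """
    if not uri.startswith(RESOURCE_PREFIX):
        return 404, None
    name = uri[len(RESOURCE_PREFIX):]
    if "application/rdf+xml" in accept:
        return 303, "http://dbpedia.org/data/" + name
    return 303, "http://dbpedia.org/page/" + name


def same_as_links(rdf_xml, subject):
    """Collect the owl:sameAs targets stated about `subject` in an RDF/XML text."""
    root = ET.fromstring(rdf_xml)
    links = []
    for description in root.iter("{%s}Description" % RDF_NS):
        if description.get("{%s}about" % RDF_NS) != subject:
            continue
        for link in description.findall("{%s}sameAs" % OWL_NS):
            links.append(link.get("{%s}resource" % RDF_NS))
    return links


# The description shown above, wrapped so that the rdf: prefix is declared.
DESCRIPTION = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Valladolid">
    <owl:sameAs xmlns:owl="http://www.w3.org/2002/07/owl#"
                rdf:resource="http://sws.geonames.org/6362308/"/>
  </rdf:Description>
</rdf:RDF>"""

thing = "http://dbpedia.org/resource/Valladolid"
status, location = redirect_for(thing, "application/rdf+xml")
print(status, location)
print(same_as_links(DESCRIPTION, thing))
```

A real consumer would repeat the cycle for each URI returned by same_as_links, which is how a client moves from one data set to the next.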

The Linked Data Cloud

The Linked Open Data project started in 2007. One of the first attempts to build a cloud of interlinked RDF data was the DBpedia project, which publishes structured data extracted from Wikipedia following the Linked Data principles. Many other initiatives have since contributed more data and tools to help in this process. The following picture shows the state of this cloud as of March 2009.

Publishing Linked Data

There are a number of free and commercial tools that can be used to publish Linked Data from different data sources. For instance, D2R Server and OpenLink Virtuoso are well-known tools for publishing RDF data from relational databases. It is also possible to publish raw RDF files, or to implement other types of RDFizers and wrappers around well-known Web 2.0 APIs. Furthermore, there are a number of tools for browsing Linked Data, such as Tabulator or Disco. This document describes the best practices to publish Linked Data on the web.
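To give a feel for what an RDFizer over a relational database does, here is a minimal sketch that turns table rows into N-Triples. It does not reflect how D2R Server or OpenLink Virtuoso work internally (those tools use their own mapping languages). The base URI, the vocabulary URI, the city table, and the function names are all hypothetical, and every value is emitted as a plain literal for simplicity.

```python
import sqlite3

BASE = "http://example.org/resource/"   # hypothetical namespace for things
VOCAB = "http://example.org/vocab#"     # hypothetical namespace for properties


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_ntriples(table, key, row):
    """Map one row (a dict) to N-Triples lines: one triple per non-key column."""
    subject = "<%s%s/%s>" % (BASE, table, row[key])
    lines = []
    for column, value in row.items():
        if column == key or value is None:
            continue  # the key is already in the subject URI; NULLs give no triple
        predicate = "<%s%s_%s>" % (VOCAB, table, column)
        literal = '"%s"' % escape_literal(str(value))
        lines.append("%s %s %s ." % (subject, predicate, literal))
    return lines


def publish_table(conn, table, key):
    """Dump every row of `table` as N-Triples. The table name is trusted here."""
    conn.row_factory = sqlite3.Row
    triples = []
    for row in conn.execute("SELECT * FROM %s" % table):
        triples.extend(row_to_ntriples(table, key, dict(row)))
    return triples


# Usage with an in-memory database and a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.execute("INSERT INTO city VALUES (1, 'Valladolid', 'Spain')")
for line in publish_table(conn, "city", "id"):
    print(line)
```

Each row becomes a resource with its own HTTP URI, which is the first step towards the principles listed earlier. Linking those URIs to other data sets would still have to be added on top.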

Applications

As the Linked Data cloud grows, a number of interesting applications are emerging. To mention some of them: the BBC Music service retrieves and combines information from DBpedia and MusicBrainz; Faviki is a social bookmarking tool that takes advantage of DBpedia for tagging; and the LinkedGeoData project provides location-based Linked Data on top of OpenStreetMap maps.
