Bebo White Reviews the Linked Data Book for Journal of Web Engineering

I recently received an email giving advance notice that a review of the Linked Data Book (aka “Linked Data: Evolving the Web into a Global Data Space”) would appear in Volume 11(2) of the Journal of Web Engineering, published by Rinton Press (ISSN: 1540-9589). As some people won’t have easy access to the journal, the review is republished here, with permission. It’s by Bebo White of Stanford University and beyond. Thank you, Bebo, for the thoughtful review, and thanks to Rinton Press for allowing it to be republished here.

Web Engineering has been described as encompassing those “technologies, methodologies, tools, and techniques used to develop and maintain Web-based applications leading to better systems, [thus to] enabling and improving the dissemination and use of content and services through the Web.” (Source: International Conference on Web Engineering)

An especially interesting aspect of this description is “dissemination and use of content.” Semantic Web technologies and particularly the Linked Data paradigm have evolved as powerful enablers for the transition of the current document-oriented Web into a Web of interlinked data/content and, ultimately, into the Semantic Web.

To facilitate this transition, many aspects of distributed data and information management need to be adapted, advanced and integrated. Of particular importance are approaches for (1) extracting semantics from unstructured, semi-structured and existing structured sources, (2) management of large volumes of RDF data, (3) techniques for efficient automatic and semi-automatic data linking, (4) algorithms, tools, and inference techniques for repairing and enriching Linked Data with conceptual knowledge, (5) the collaborative authoring and creation of data on the Web, (6) the establishment of trust by preserving provenance and tracing lineage, and (7) user-friendly means for browsing, exploration and search of large, federated Linked Data spaces. Particularly promising might be the synergistic combination of approaches and techniques touching upon several of these aspects at once.

For Web Engineering practitioners interested in being a part of this Web transition, Linked Data – Evolving the Web into a Global Data Space by Heath and Bizer will provide a valuable resource. The authors have done an excellent job of addressing the subject in a logical sequence of well-written chapters reflecting technical fundamentals, coverage of existing applications and tools, and the challenges for future development and research. The seven important approaches mentioned earlier are described in a consistent way and illustrated by means of a hypothetical scenario that evolves over the course of the book. The size of this book (122 pages) is deceiving in that it does not reflect the quality and density of its content. The authors have succeeded in presenting a complex topic both succinctly and clearly. It is not a “quick read,” but rather a volume to be used for references, definitions, and meaningful and instructive code examples.

This book is available in digital format (PDF). It is the first in a planned series of books/lectures. The quality of this book should make the reader/practitioner look forward to the upcoming series volumes that promise to further explain the exciting future of this topic.

The Linked Data Book: Draft Table of Contents

Update 2011-02-25: the book is now published and available for download and in hard copy.

Original Post

Chris Bizer and I have been working over the last few months on a book capturing the state of the art in Linked Data. The book will be published shortly as an e-book and in hard copy by Morgan & Claypool, as part of the series Synthesis Lectures in Web Engineering, edited by Jim Hendler and Frank van Harmelen. There will also be an HTML version available free of charge on the Web.

I’ve been asked about the contents, so I thought I’d reproduce the table of contents here. This is the structure as we sent it to the publisher; the final structure may vary a little, but changes will likely be superficial. Register at Amazon to receive an update when the book is released.

  • Overview
  • Contents
  • List of Figures
  • Acknowledgements
  • Introduction
    • The Data Deluge
    • The Rationale for Linked Data
      • Structure Enables Sophisticated Processing
      • Hyperlinks Connect Distributed Data
    • From Data Islands to a Global Data Space
    • Structure of this Book
    • Intended Audience
    • Introducing Big Lynx Productions
  • Principles of Linked Data
    • The Principles in a Nutshell
    • Naming Things with URIs
    • Making URIs Dereferenceable
      • 303 URIs
      • Hash URIs
      • Hash versus 303
    • Providing Useful RDF Information
      • The RDF Data Model
        • Benefits of using the RDF Data Model in the Linked Data Context
        • RDF Features Best Avoided in the Linked Data Context
      • RDF Serialization Formats
        • RDF/XML
        • RDFa
        • Turtle
        • N-Triples
        • RDF/JSON
    • Including Links to other Things
      • Relationship Links
      • Identity Links
      • Vocabulary Links
    • Conclusions
  • The Web of Data
    • Bootstrapping the Web of Data
    • Topology of the Web of Data
      • Cross-Domain Data
      • Geographic Data
      • Media
      • Government Data
      • Libraries and Education
      • Life Sciences
      • Retail and Commerce
      • User Generated Content and Social Media
    • Conclusions
  • Linked Data Design Considerations
    • Using URIs as Names for Things
      • Minting HTTP URIs
      • Guidelines for Creating Cool URIs
        • Keep out of namespaces you do not control
        • Abstract away from implementation details
        • Use Natural Keys within URIs
      • Example URIs
    • Describing Things with RDF
      • Literal Triples and Outgoing Links
      • Incoming Links
      • Triples that Describe Related Resources
      • Triples that Describe the Description
    • Publishing Data about Data
      • Describing a Data Set
        • Semantic Sitemaps
        • voiD Descriptions
      • Provenance Metadata
      • Licenses, Waivers and Norms for Data
        • Licenses vs. Waivers
        • Applying Licenses to Copyrightable Material
        • Non-copyrightable Material
    • Choosing and Using Vocabularies
      • SKOS, RDFS and OWL
      • RDFS Basics
        • Annotations in RDFS
        • Relating Classes and Properties
      • A Little OWL
      • Reusing Existing Terms
      • Selecting Vocabularies
      • Defining Terms
    • Making Links with RDF
      • Making Links within a Data Set
        • Publishing Incoming and Outgoing Links
      • Making Links with External Data Sources
        • Choosing External Linking Targets
        • Choosing Predicates for Linking
      • Setting RDF Links Manually
      • Auto-generating RDF Links
        • Key-based Approaches
        • Similarity-based Approaches
  • Recipes for Publishing Linked Data
    • Linked Data Publishing Patterns
      • Patterns in a Nutshell
        • From Queryable Structured Data to Linked Data
        • From Static Structured Data to Linked Data
        • From Text Documents to Linked Data
      • Additional Considerations
        • Data Volume: How much data needs to be served?
        • Data Dynamism: How often does the data change?
    • The Recipes
      • Serving Linked Data as Static RDF/XML Files
        • Hosting and Naming Static RDF Files
        • Server-Side Configuration: MIME Types
        • Making RDF Discoverable from HTML
      • Serving Linked Data as RDF Embedded in HTML Files
      • Serving RDF and HTML with Custom Server-Side Scripts
      • Serving Linked Data from Relational Databases
      • Serving Linked Data from RDF Triple Stores
      • Serving RDF by Wrapping Existing Application or Web APIs
    • Additional Approaches to Publishing Linked Data
    • Testing and Debugging Linked Data
    • Linked Data Publishing Checklist
  • Consuming Linked Data
    • Deployed Linked Data Applications
      • Generic Applications
        • Linked Data Browsers
        • Linked Data Search Engines
      • Domain-specific Applications
    • Developing a Linked Data Mashup
      • Software Requirements
      • Accessing Linked Data URIs
      • Representing Data Locally using Named Graphs
      • Querying Local Data with SPARQL
    • Architecture of Linked Data Applications
      • Accessing the Web of Data
      • Vocabulary Mapping
      • Identity Resolution
      • Provenance Tracking
      • Data Quality Assessment
      • Caching Web Data Locally
      • Using Web Data in the Application Context
    • Effort Distribution between Publishers, Consumers and Third Parties
  • Summary and Outlook
  • Bibliography