Validating RDF data

Foreword by Phil Archer

“Anyone can say anything about anything,” says the mantra for the Semantic Web. More formally, the Semantic Web adopts the Open World Assumption: just because your data encodes a set of facts, that doesn’t mean there aren’t other facts stated elsewhere about the same thing. All of which is fine and part of the design of RDF which supports the creation of a graph at Web scale, but in a lot of practical applications you just need to know whether the triples you’ve ingested match what you were expecting; you need validation. You might think of it as a defined subset of the whole graph, or maybe a profile, providing a huge boost to interoperability between disparate systems. If you can validate the data you’ve received then you can process it with confidence, using more terse code, perhaps with more performant queries.

I don’t accept that RDF is hard, certainly no harder than any other Web technology; what is hard is thinking in graphs. Keeping in your head that this node supports these properties and has relationships with those other nodes becomes complex for anything other than trivial datasets. The validation techniques set out in this book provide a means to tame that complexity, to set out for humans and machines exactly what the structure of the data is or should be. That’s got to be helpful and, incidentally, ties in with new work now under way at W3C on dataset exchange.

In my role at W3C I watched as the SHACL and ShEx camps tried hard to converge on a single method: they couldn’t, hence the two different approaches. Both are described in detail here with copious examples, which is just what you need to get started. How can you choose between the two methods? Chapter 7 gives a detailed comparison and allows you to make your own choice. Whichever you choose, this is the book you need to make sense of RDF validation.

Phil Archer, Former W3C Data Strategist
July 2017
