Shape Expressions (ShEx) is a schema language for describing RDF graph structures. ShEx was originally developed in late 2013 to provide a human-readable syntax for OSLC Resource Shapes. It added disjunctions, which made it more expressive than Resource Shapes. Tokens in the language were adopted from Turtle [80] and SPARQL [44], with tokens for grouping, repetition and wildcards taken from regular expressions and RelaxNG Compact Syntax [100]. The language was described in a paper [80] and codified in a June 2014 W3C member submission [92] which included a primer and a semantics specification. This was later deemed “ShEx 1.0”.
The W3C Data Shapes Working Group started in September 2014 and quickly coalesced into two groups: the ShEx camp and the SHACL camp. In 2016, the ShEx camp split from the Data Shapes Working Group to form a ShEx Community Group (CG). In April 2017, the ShEx CG released ShEx 2 with a primer, a semantics specification and a test suite with implementation reports.
As of publication, the ShEx Community Group was starting work on ShEx 2.1 to add features like value comparison and unique keys. See the ShEx Homepage http://shex.io/ for the state of the art in ShEx. A collection of ShEx schemas has also been started at https://github.com/shexSpec/schemas.
Strictly speaking, a ShEx schema defines a set of graphs. This can be used for many purposes, including communicating data structures associated with some process or interface, generating or validating data, or driving user interface generation and navigation. At the core of all of these use cases is the notion of conformance with a schema. Even if one is using ShEx to create forms, the goal is to accept and present data which is valid with respect to a schema.
ShEx has several serialization formats:
- ShExC, the compact syntax.
- ShExJ, a JSON-based syntax.
- ShExR, an RDF representation.
These are all isomorphic and most implementations can map from one to another.
Tools that derive schemas by inspection or translate them from other schema languages typically generate ShExJ. Interactions with users, e.g., in specifications, are almost always in the compact syntax ShExC. As a practical example, in HL7 FHIR, ShExJ schemas are automatically generated from other formats, and presented to the end user using compact syntax. See Section 6.2.3 for more details.
ShExR makes it possible to use RDF tools to manage schemas, e.g., running a SPARQL query to find out whether an organization is using dc:creator with a string, a foaf:Person, or even whether an organization is consistent about it.
Example 26 below contains a very simple ShEx schema. It defines a shape labeled :User. Nodes with that shape must satisfy the following constraints on their properties.
- schema:name, which must be an xsd:string.
- schema:birthDate (optional) with type xsd:date.
- schema:gender, whose value is schema:Male or schema:Female or some string.
- schema:knows (zero or more), whose value must be an IRI and conform to the :User shape.

    PREFIX : <http://example.org/>
    PREFIX schema: <http://schema.org/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

    :User {
      schema:name xsd:string ;
      schema:birthDate xsd:date? ;
      schema:gender [ schema:Male schema:Female ] OR xsd:string ;
      schema:knows IRI @:User*
    }
All the nodes in the following RDF graph conform to the :User shape.

    :alice schema:name "Alice" ;             # Passes as a :User
           schema:gender schema:Female ;
           schema:knows :bob .

    :bob   schema:gender schema:Male ;       # Passes as a :User
           schema:name "Robert" ;
           schema:birthDate "1980-03-10"^^xsd:date .

    :carol schema:name "Carol" ;             # Passes as a :User
           schema:gender "unspecified" ;
           foaf:name "Carol" .
The nodes :alice, :bob and :carol have shape :User.
- :alice conforms because it contains schema:name and schema:gender with their corresponding values. It does not contain the property schema:birthDate but that property is optional, as indicated by ‘?’. It also has the property schema:knows with the value :bob, which has :User shape.
- :bob conforms because it contains the properties and values of the :User shape. Note that the order in which triples are expressed in the example does not matter. These are parsed into an RDF graph and RDF graphs are unordered collections of triples.
- :carol conforms because it has property schema:name with an xsd:string value, schema:gender with another xsd:string value and an extra property foaf:name.
Notice that :carol conforms even though it has other properties apart from those mentioned by the :User shape definition (in this case foaf:name). ShEx shapes are open by default, which means that they constrain neither the existence nor the value of the properties not mentioned in the shape. This behavior can be modified using the CLOSED qualifier as we will explain in Section 4.6.8.
Given the following RDF graph:

    :dave   schema:name "Dave" ;               # Fails as a :User
            schema:gender "XYY" ;
            schema:birthDate 1980 .            # 1980 is not an xsd:date

    :emily  schema:name "Emily", "Emilee" ;    # Fails as a :User
            schema:gender schema:Female .      # too many schema:names

    :frank  foaf:name "Frank" ;                # Fails as a :User
            schema:gender schema:Male .        # missing schema:name

    :grace  schema:name "Grace" ;              # Fails as a :User
            schema:gender schema:Male ;
            schema:knows _:x .                 # _:x is not an IRI

    :harold schema:name "Harold" ;             # Fails as a :User
            schema:gender schema:Male ;
            schema:knows :grace .              # :grace does not conform to :User
If we try to validate the nodes in this graph against the shape :User, the validator would fail for all of the nodes:
- :dave fails because the value of schema:birthDate is 1980 (an integer) which is not an xsd:date.
- :emily fails because it has two values for property schema:name. Unless otherwise specified, the default cardinality is “exactly one” (which can also be written as “{1}” or “{1,1}”).
- :frank fails because it does not have the property schema:name.
- :grace fails because the value of schema:knows is a blank node and there is a node constraint saying that it must be an IRI.
- :harold fails because the value of schema:knows is :grace and :grace does not conform to the :User shape.
At the time of this writing, we are aware of the following implementations of ShEx.
There are also several online demos and tools that can be used to experiment with ShEx.
The ShEx compact syntax (ShExC) was designed to be read and edited by humans. It follows some conventions which are similar to Turtle or SPARQL.
- PREFIX and BASE declarations follow the same convention as in Turtle. In the rest of this chapter we will omit prefix declarations for brevity.
- Comments start with # and continue until the end of line.
- The keyword a identifies the rdf:type property.
- IRIs are enclosed in < > and prefixed names (a shorter way to write out IRIs) are written with a prefix followed by a colon “:”.
- Blank nodes are identified using the _:label notation.
- Literals can be enclosed by the same quotation conventions (', ", ''', """) as in Turtle.
- Keywords (apart from a) are not case sensitive, which means that MinInclusive is the same as MININCLUSIVE.
A ShExC document declares a ShEx schema. A ShEx schema is a set of labeled shape expressions which are composed of node constraints and shapes. These constrain the permissible values or graph structure around a node in an RDF graph. When we are considering a specific node, we call that node the focus node. The triples which have the focus node as a subject are called outgoing arcs; those with the focus node as an object are called incoming arcs. (Typical RDF idioms call for constraints on outgoing arcs much more frequently than on incoming arcs.) Together, the incoming and outgoing arcs are called the neighborhood of that node.
Shape expression labels can be IRIs or blank nodes but only IRI labels can be referenced from outside the schema. In the previous Example 26, :User is an IRI label.
Node constraints declare the shape of a focus node without looking at the arcs. They can declare the kind of node (IRI, blank node or literal), the datatype in case of literals, describe it with XML Schema facets (e.g., min and max numeric values, string lengths, number of digits), or enumerate a value set. Figure 4.4.1 highlights the node constraints that appear in Example 26, which are: xsd:string and xsd:date (datatype constraints), [ schema:Male schema:Female ] (a value set), IRI (a node kind declaration) and @:User (a value shape). Node constraints will be described in more detail in Section 4.5.
Triple constraints define the triples that appear in the neighborhood of a focus node. They usually contain a property (or inverse property), a node constraint, and a cardinality declaration which is one by default. For example, schema:name xsd:string is a triple constraint. The :User shape from Example 26 was formed by four triple constraints. Triple constraints will be described later in Section 4.6.1.
Triple constraints can be grouped using the semicolon operator ; to form triple expressions.
Shapes are enclosed by curly braces { } and contain triple expressions. Shapes are the basic form of shape expressions, although more complex shape expressions can be formed by combining the logical operators AND, OR and NOT which will be later described in Section 4.6.
Shape expressions are identified by shape expression labels. Figure 4.4.1 shows a compound shape expression formed by combining the shape reference @:User with a shape that contains a single triple constraint :teaches @:Course using the AND operator.
The full ShEx BNF grammar is specified at http://shex.io/shex-semantics/#shexc.
In Example 26, we tested several RDF nodes (:alice, :bob, ... :harold) against the shape :User.
ShEx validation takes as input a schema, an RDF graph, and a shape map, and returns another shape map. The input shape map (called fixed shape map) contains a list of nodeSelector@shapeLabel associations separated by commas, where nodeSelector is an RDF node and shapeLabel is a shape label. Both use N-Triples notation. A fixed map would look like:

    <http://data.example/#alice>@<http://schema.example/#User>,
    <http://data.example/#bob>@<http://schema.example/#User>

Although shape maps use absolute IRIs for RDF nodes and shape labels, we will use prefixes to abbreviate them in our listings:

    :alice@:User, :bob@:User
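As a rough illustration of the nodeSelector@shapeLabel structure, the following Python sketch splits a fixed shape map into pairs. The function name parse_fixed_shape_map is hypothetical and the parsing is deliberately naive: it assumes that no node or label contains a comma, which a real shape map parser cannot assume.

```python
def parse_fixed_shape_map(text):
    """Split 'node@shape, node@shape' into a list of (node, shape) pairs.

    Naive sketch: assumes commas only separate entries.
    """
    pairs = []
    for entry in text.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last '@' so the shape label is always the final part.
        node, shape = entry.rsplit("@", 1)
        pairs.append((node.strip(), shape.strip()))
    return pairs


fixed_map = parse_fixed_shape_map(
    "<http://data.example/#alice>@<http://schema.example/#User>, "
    "<http://data.example/#bob>@<http://schema.example/#User>"
)
abbreviated_map = parse_fixed_shape_map(":alice@:User, :bob@:User")
```

Each resulting pair names one node and the shape it should be tested against.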
Note that during evaluation, the processor may need to check the conformance of other nodes against other shapes.
If we define the following schema:
    :User {
      schema:name xsd:string ;
      schema:knows @:User*
    }

and the RDF graph:

    :alice schema:name "Alice" ;
           schema:knows :carol .
    :bob   schema:name "Robert" .
    :carol schema:name "Carol" .

when we invoke a ShEx processor with the fixed shape map:

    :alice@:User, :bob@:User

the result shape map is:

    :alice@:User, :bob@:User, :carol@:User
The reason is that in order to check that :alice conforms to :User, the processor must check that :carol also conforms to :User and hence, it adds the association :carol@:User to the result shape map.
Figure 4.5 depicts the validation process.
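The way a result shape map can grow beyond the fixed shape map can be sketched in Python. This is only a toy model of the :User example with schema:name and schema:knows, not how any particular ShEx processor is implemented: the graph is a hand-written dictionary, Python strings stand in for xsd:string literals, and cyclic references are cut by provisionally assuming conformance, which is a simplification.

```python
# Toy graph: node -> {property: [values]}.
GRAPH = {
    ":alice": {"schema:name": ["Alice"], "schema:knows": [":carol"]},
    ":bob": {"schema:name": ["Robert"]},
    ":carol": {"schema:name": ["Carol"]},
}


def conforms_to_user(node, results):
    """Check node against :User { schema:name xsd:string ; schema:knows @:User* }."""
    key = (node, ":User")
    if key in results:
        # Pair already tested (or currently being tested): reuse the answer.
        return results[key]
    results[key] = True  # provisional, so that cyclic references terminate
    arcs = GRAPH.get(node, {})
    names = arcs.get("schema:name", [])
    ok = len(names) == 1 and isinstance(names[0], str)
    for friend in arcs.get("schema:knows", []):
        # Checking the reference adds (friend, :User) to the results.
        ok = conforms_to_user(friend, results) and ok
    results[key] = ok
    return ok


def validate(fixed_map_nodes):
    results = {}
    for node in fixed_map_nodes:
        conforms_to_user(node, results)
    return results


result_map = validate([":alice", ":bob"])
```

Although only :alice and :bob are requested, the result also contains an entry for :carol, because it was reached through schema:knows.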
There are many use case-dependent ways to compose a fixed shape map.
ShEx defines a common one called query shape map which uses triple patterns to select nodes.
Triple patterns use curly braces and three values that represent the subject, predicate and object of a triple.
They can contain the value FOCUS to identify the node we want to select and _ to indicate that we do not constrain some value. The following query map selects all subjects of schema:name, all objects of schema:knows and nodes that have rdf:type with value schema:Person.

    {FOCUS schema:name _}@:User,
    {_ schema:knows FOCUS}@:User,
    {FOCUS rdf:type schema:Person}@:User
Section 4.9 describes fixed shape maps and query shape maps in greater detail.
In the previous example, validating :alice as a :User entailed validating :carol as a :User. Unless the validation engine has some sort of state persistence, it would be more efficient to validate once with a shape map like:

    :alice@:User,:carol@:User

than to validate :alice and :carol separately.
Validating a shape map with multiple node/shape pairs allows the engine to leverage any pairs that it has already tested.
In Section 4.4.1, we described shape expressions as being composed of node constraints and shapes.
These can also be combined with the logical operators And, Or and Not. And and Or expressions in turn contain two or more shape expressions.
When we refer to a shape expression, we mean one of the following.
- And of two or more shape expressions (called ShapeAnd).
- Or of two or more shape expressions (called ShapeOr).
- Not of one shape expression (called ShapeNot).
This recursive structure forms a tree which has node constraints and shapes as leaves. Figure 4.6 represents the ShEx data model.
Node constraints and shapes are described in the following sections while the logical operators are discussed in Section 4.8 and external shapes in Section 4.7.3.
The shape expression might be selected by label or it might default to a special shape called the start shape. A schema can have one more shape expression called the start expression. This serves as “start here” advice from the schema author and is useful when describing a graph with a single purpose. For instance, the medical data protocol FHIR (see Section 6.2) has specific schemas for resources like Patient. Consider the following code:

    start = @<Patient>
    <Patient> { ... }
    ...
In the compact syntax, the directive start = @<Patient> declares that the shape expression <Patient> will be used by default if a shape is not explicitly provided in the shape map. In shape maps, it is possible to declare that a node must be validated against the start shape expression by using the keyword START. For example, the following shape map:

    :alice@START, :bob@<Doctor>

would validate :alice against the start shape expression (in the previous example, it would be <Patient>) and :bob against <Doctor>.
Node constraints describe the allowed values of a node. These include specification of RDF node kind, literal datatype, string and numeric facets, and value sets.
Node constraints can appear as a labeled shape expression or as part of triple constraints.
Any place where no node constraint is wanted can be marked with a period ("."). This is analogous to the period which matches any character in regular expressions. The following example lists the properties that a :User must have but it does not specify any constraint on their values:

    :User {
      schema:name . ;
      schema:alternateName . * ;
      schema:birthDate . ?
    }

Given the following RDF graph:

    :alice schema:name 23 .                                # Passes as a :User

    :bob   schema:name "Robert" ;                          # Passes as a :User
           schema:alternateName "Bob", "Bobby", <Bob> ;
           schema:birthDate "Unknown" .

If we provide the shape map :alice@:User,:bob@:User the ShEx processor would return that they both conform.
Node constraints usually appear as part of value expressions in triple constraints.
The following example declares that nodes with shape :User must have a property schema:url whose value must be an IRI.

    :User {
      schema:url IRI
    }
Node constraints can also appear as top level shapes.
The following code defines two shapes, :HomePage and :CanVoteAge, which are defined as node constraints. The first one declares that nodes must be IRIs and the second one that they must be xsd:integer values greater than or equal to 18.

    :HomePage IRI
    :CanVoteAge xsd:integer MinInclusive 18

If we provide a ShEx processor with the shape map

    <http://example.org/alice>@:HomePage,
    23@:CanVoteAge,
    45@:HomePage,
    14@:CanVoteAge

the result would be that the first two nodes are conformant while the last two nodes are non-conformant.
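The outcome for the four-entry shape map above can be reproduced with a small Python sketch. The IRI class and the two checking functions are hypothetical stand-ins for the :HomePage and :CanVoteAge node constraints; Python integers are assumed to model xsd:integer literals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    """Minimal stand-in for an RDF IRI node."""
    value: str


def is_home_page(node):
    """:HomePage IRI -- the node kind must be IRI."""
    return isinstance(node, IRI)


def can_vote_age(node):
    """:CanVoteAge xsd:integer MinInclusive 18."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(node, int) and not isinstance(node, bool) and node >= 18


shape_map = [
    (IRI("http://example.org/alice"), is_home_page),
    (23, can_vote_age),
    (45, is_home_page),
    (14, can_vote_age),
]
results = [check(node) for node, check in shape_map]
```

The integer 45 fails :HomePage because it is a literal rather than an IRI, and 14 fails :CanVoteAge because it is below the MinInclusive bound.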
It is also possible to combine top-level node constraints with more complex shapes.
The following declaration of shape :User says that nodes conforming to shape :User must be IRIs and have a property schema:name with an xsd:string value.

    :User IRI AND {
      schema:name xsd:string
    }

In this case, the external AND can be omitted, so the previous shape is equivalent to:

    :User IRI {
      schema:name xsd:string
    }
Table 4.1 gives an overview of the main types of node constraints with some examples and a short description.
| Name | Description | Examples |
|---|---|---|
| Anything | The value can be anything | . |
| Datatype | The value must be an element of that datatype | xsd:string, xsd:date, cdt:distance, … |
| Node kind | The value must have that kind | IRI, BNode, Literal, NonLiteral |
| Value set | The value must be an element of that set | [ :Male :Female ] |
| Shape reference | The value must conform to <User> | @:User |
Node kinds describe the kind that a value must have. There are four node kinds in ShEx: Literal, IRI, BNode, and NonLiteral which follow the rules defined in RDF 1.1 for such terms.

| Value | Description | Examples |
|---|---|---|
| Literal | Any RDF literal | "Alice", "Spain"@en, 42, true |
| IRI | Any RDF IRI | <http://example.org/Alice>, ex:alice, :bob |
| BNode | Any blank node | _:x, [] |
| NonLiteral | Any IRI or blank node | <http://example.org/alice>, _:x |
The following example declares that the value of property schema:name must be a literal and the value of schema:follows must be an IRI.

    :User {
      schema:name Literal ;
      schema:follows IRI
    }

    :alice schema:name "Alice" ;     # Passes as a :User
           schema:follows :bob .

    :bob   schema:name :Bob ;        # Fails as a :User
           schema:follows _:x .      # :Bob is not a literal and _:x is not an IRI
Like most schema languages, ShEx includes datatype constraints which declare that a focus node must be a literal with some specific datatype. ShEx has special support for XML Schema datatypes [9] for which it checks that the lexical form also conforms to the expected datatype.
The following example declares the datatypes that the values of the schema:name, foaf:age and schema:birthDate properties must have.

    :User {
      schema:name xsd:string ;
      foaf:age xsd:integer ;
      schema:birthDate xsd:date ;
    }

    :alice schema:name "Alice" ;                        # Passes as a :User
           foaf:age 36 ;
           schema:birthDate "1981-07-10"^^xsd:date .

    :bob   schema:name "Robert"^^xsd:string ;           # Passes as a :User
           foaf:age "26"^^xsd:integer ;
           schema:birthDate "1981-07-10"^^xsd:date .

    :carol schema:name :Carol ;                         # Fails as a :User
           foaf:age "14" ;                              # :Carol is an IRI
           schema:birthDate "2003-06-10"^^xsd:date .    # and "14" a string

    :dave  schema:name "Dave" ;                         # Fails as a :User
           foaf:age "Unknown"^^xsd:integer ;            # invalid lexical forms
           schema:birthDate "Unknown"^^xsd:date .
As we said, for XML Schema datatypes, ShEx also checks that the lexical form matches the expected datatype. For example, the foaf:age of :dave is "Unknown"^^xsd:integer. Although this declares that "Unknown" is an integer, and some RDF parsers allow those declarations, "Unknown" does not have the integer’s lexical form and the ShEx processor will complain. The same happens for the value "Unknown" of schema:birthDate.
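The lexical-form check can be approximated in a few lines of Python. This is a simplified sketch rather than a faithful implementation of the XML Schema lexical spaces: the function names are hypothetical, the xsd:date pattern assumes a four-digit year and ignores the optional timezone, and the calendar check relies on Python's datetime.date.

```python
import re
from datetime import date

# Simplified lexical patterns (assumptions, see the text above).
INTEGER_RE = re.compile(r"[+-]?[0-9]+")
DATE_RE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def valid_integer_lexical(lexical):
    """True if the string looks like an xsd:integer lexical form."""
    return INTEGER_RE.fullmatch(lexical) is not None


def valid_date_lexical(lexical):
    """True if the string looks like a (timezone-free) xsd:date."""
    match = DATE_RE.fullmatch(lexical)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as 02-30
    except ValueError:
        return False
    return True
```

With these checks, "26" and "1981-07-10" are accepted, while "Unknown" is rejected for both datatypes.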
Although the most common use case is to use XML Schema datatypes,
RDF data can use other datatypes.
In the following example, a picture contains the properties schema:width and schema:height using a hypothetical custom datatype for distances (cdt:distance).

    :Picture {
      schema:name xsd:string ;
      schema:width cdt:distance ;
      schema:height cdt:distance
    }

    :gioconda schema:name "Mona Lisa" ;            # Passes as a :Picture
              schema:width "21 in"^^cdt:distance ;
              schema:height "30 in"^^cdt:distance .

    :other    schema:name "Other picture" ;        # Fails as a :Picture
              schema:width "21 in"^^xsd:string ;   # expected cdt:distance
              schema:height 30 .
The datatype rdf:langString identifies language-tagged literals (see [25, Section 3.3]), i.e., RDF literals that have a language tag.

    :Country {
      schema:name rdf:langString ;
    }

    :italy  schema:name "Italia"@es .   # Passes as a :Country
    :france schema:name "France" .      # Fails as a :Country
XML Schema provides a useful library of string and numeric tests called facets [9]. These facets are listed in Table 4.3 with a sample argument and some passing and failing values.
| Facet and argument | Passing values | Failing values |
|---|---|---|
| MinInclusive 1 | "1"^^xsd:decimal, 1, 2, 98, 99, 100 | "1"^^xsd:string, -1, 0 |
| MinExclusive 1 | 2, 98, 99, 100 | -1, 0, 1 |
| MaxInclusive 99 | 1, 2, 98, 99 | 100 |
| MaxExclusive 99 | 1, 2, 98 | 99, 100 |
| TotalDigits 3 | "1"^^xsd:integer, 9, 999, 0999, 9.99, 99.9, 0.1020 | "1"^^xsd:string, 1000, 01000, 1.1020, .1021, 0.1021 |
| FractionDigits 3 | "1"^^xsd:decimal, 0.1, 0.1020, 1.1020 | "1"^^xsd:integer, 0.1021, 0.10212 |
| Length 3 | "123"^^xsd:string, "123"^^xsd:integer, "abc" | "12"^^xsd:string, "12"^^xsd:integer, "ab", "abcd" |
| MinLength 3 | "abc", "abcd" | "", "ab" |
| MaxLength 3 | "", "ab", "abc" | "abcd", "abcde" |
| /^ab+/ (regex pattern) | "ab", "abb", "abbcd" | "", "a", "acd", "cab", "AB", "ABB", "ABBCD" |
| /^ab+/i (regex pattern with i flag) | "ab", "abb", "abbcd", "AB", "ABB", "ABBCD" | "", "a", "acd" |
    :Product {
      schema:name xsd:string MaxLength 10 ;
      schema:weight xsd:decimal MinInclusive 1 MaxInclusive 200 ;
      schema:sku /^[A-Z0-9]{10,20}$/ ;
    }

    :product1 schema:name "Product 1" ;               # Passes as a :Product
              schema:weight "23.0"^^xsd:decimal ;
              schema:sku "A23456B234CBDF" .

    :product2 schema:name "Product 2" ;               # Fails as a :Product
              schema:weight "245.5"^^xsd:decimal ;    # schema:weight > 200
              schema:sku "ABC" .                      # schema:sku fails regex
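To make the three facets of the :Product shape concrete, here is a hedged Python sketch that applies the same checks to the two products. The function check_product is hypothetical; it takes the lexical values as plain strings, uses Decimal for the numeric comparison, and uses Python's re module as an approximation of the XPath regular expression dialect.

```python
import re
from decimal import Decimal


def check_product(name, weight, sku):
    """Return the list of facet violations for one product (empty if none)."""
    errors = []
    # schema:name xsd:string MaxLength 10
    if len(name) > 10:
        errors.append("schema:name longer than 10 characters")
    # schema:weight xsd:decimal MinInclusive 1 MaxInclusive 200
    value = Decimal(weight)
    if not (Decimal(1) <= value <= Decimal(200)):
        errors.append("schema:weight outside [1, 200]")
    # schema:sku /^[A-Z0-9]{10,20}$/
    if re.search(r"^[A-Z0-9]{10,20}$", sku) is None:
        errors.append("schema:sku fails regex")
    return errors


product1_errors = check_product("Product 1", "23.0", "A23456B234CBDF")
product2_errors = check_product("Product 2", "245.5", "ABC")
```

The first product yields no violations, while the second one is reported for both its weight and its SKU.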
The pattern constraint (‘/regex/’) is based on the XPath regular expression function fn:matches(str, re, flags) which takes as parameters the string to match, the regular expression, and an optional flags parameter to modify the matching behavior.
XPath regular expressions are based on common conventions from other languages like Perl or other Unix tools like grep. A regular expression is a string composed of the characters to match and some characters with special meaning, called meta-characters.
- x matches the 'x' character.
- \u0078 matches the Unicode codepoint U+78 (which is again 'x').
- . matches any character.
- [vxz] declares a character class, and matches any of 'v', 'x', or 'z'.
- \d is a pre-defined character class which matches any digit. It is equivalent to “[0-9]”.
- \s is a pre-defined character class which matches any space character (which also includes tabs and newlines). It is equivalent to “[\u0009\u000d\u000a\u0020]”.
Inside character classes, the symbol “^” means negation and “-” can be used to declare character ranges. For instance, the character class [^a-zA-Z] matches any non-letter.
Cardinality (repetition) operators can be used to specify how many characters are matched. The possibilities are as follows.
- ? represents zero or one values.
- + represents one or more values.
- * represents zero or more values.
- {m,n} represents between m and n values.
Any string of characters must be matched in the order of its characters with the following alterations.
- | declares alternatives, e.g., “abc|def|ghi” matches any of “abc”, “def”, “ghi”.
- ^ matches the beginning of a string.
- $ matches the end of a string.
For example, “^ab(cd|ef){2,}gh” matches “abcdcdcdghij”.
All of the meta-characters above will be treated as literals (i.e., they match themselves) if they are prefixed with a \ (backslash).
Table 4.4 contains several examples of regular expression matches.
| Regular expression | Some values that match | Some values that don’t match |
|---|---|---|
| P\d{2,3} | P12, P234 | A1, P2n, P1, P2233 |
| (pa)*b | b, pab, papab, papapab, … | pa, po |
| [a-z]{2,3} | ab, abc | a, abcd, x45, 23 |
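The rows of Table 4.4 can be checked mechanically. The sketch below uses Python's re module, whose syntax coincides with XPath regular expressions for these simple patterns, and it assumes that each pattern is meant to match the whole value, hence re.fullmatch. The names TABLE, whole_match and mismatches are hypothetical.

```python
import re

# (pattern, values expected to match, values expected not to match)
TABLE = [
    (r"P\d{2,3}", ["P12", "P234"], ["A1", "P2n", "P1", "P2233"]),
    (r"(pa)*b", ["b", "pab", "papab", "papapab"], ["pa", "po"]),
    (r"[a-z]{2,3}", ["ab", "abc"], ["a", "abcd", "x45", "23"]),
]


def whole_match(pattern, value):
    """True if the pattern matches the complete value."""
    return re.fullmatch(pattern, value) is not None


def mismatches(table):
    """Collect every (pattern, value) pair that disagrees with the table."""
    found = []
    for pattern, passing, failing in table:
        found += [(pattern, v) for v in passing if not whole_match(pattern, v)]
        found += [(pattern, v) for v in failing if whole_match(pattern, v)]
    return found


disagreements = mismatches(TABLE)
```

Note that the whole-value assumption matters: an unanchored search for P\d{2,3} would find a match inside P2233.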
The flags string has the following possibilities.
- i: Case-insensitive mode.
- m: Multi-line mode. If present, the ^ character matches the start of any line (not only the start of the string) and the $ matches the end of any line (not only the end of the string).
- s: If present, the dot matches also newlines, otherwise it matches any character except newlines. This mode is called single-line mode in Perl.
- x: Removes white space characters in the regular expression before matching.
- q: All meta-characters are interpreted as literals, i.e., they match themselves in the input string. q is compatible with the i flag. If it’s used with the m, s or x flag, that flag is ignored.
A value set is a node constraint which enumerates the list of possible values that a focus node may have.
In ShExC, value sets are enclosed by square brackets ([ and ]) where each possible value is separated by a space. The following example declares a shape :Product with two properties, schema:color and schema:manufacturer, whose possible values are enumerated.

    :Product {
      schema:color [ "Red" "Green" "Blue" ] ;
      schema:manufacturer [ :OurCompany :AnotherCompany ]
    }

    :x1 schema:color "Red" ;                  # Passes as a :Product
        schema:manufacturer :OurCompany .

    :x2 schema:color "Cyan" ;                 # Fails as a :Product
        schema:manufacturer :OurCompany .

    :x3 schema:color "Green" ;                # Fails as a :Product
        schema:manufacturer :Unknown .
A common pattern is to declare that a node must have a specific value. This can be done by a unit value set, i.e., a value set with a single value.
    :Spanish {
      schema:country [ :Spain ]
    }
    :User {
      a [ schema:Person ]
    }

    :alice schema:country :Spain .     # Passes as a :Spanish

    :bob   schema:country :France .    # Fails as a :Spanish

    :carol a schema:Person ;           # Passes as a :Spanish and :User
           schema:country :Spain .

    :p1    a schema:Product ;          # Fails as a :User
           schema:country :Spain .     # Passes as a :Spanish

    :dave  rdf:type schema:Person ;    # Passes as a :User
           schema:country :Japan .     # Fails as a :Spanish
Note that the :User shape employs the a keyword which stands for rdf:type. There is no inference in ShEx, even for rdf:type, which is treated as any other arc. See Section 3.2 for a discussion of the difference between shapes and classes.
As seen above, value sets contain one or more values.
The examples so far have included IRIs and strings (literals with a datatype of xsd:string). These match precisely the same value in the data. Value sets can also contain language tags, which match any literal with the given language tag.

    :FrenchProduct {
      schema:label [ @fr ]
    }
    :SpanishProduct {
      schema:label [ @es @es-AR @es-ES ]
    }

    :car1 schema:label "Voiture"@fr .    # Passes as a :FrenchProduct
    :car2 schema:label "Auto"@es .       # Passes as a :SpanishProduct
    :car3 schema:label "Carro"@es-AR .   # Passes as a :SpanishProduct
    :car4 schema:label "Coche"@es-ES .   # Passes as a :SpanishProduct
We can see in the example above that it would be convenient to accept literals with any language tag starting with "es". This can be indicated with the postfix operator ‘~’. For example, Argentinian, Chilean, and other regional variants of Spanish could be accepted with ‘schema:label [ @es~ ]’.
The following code declares that Spanish products contain schema:label with a value that must be a language-tagged literal in Spanish or any variant.

    :SpanishProduct {
      schema:label [ @es~ ]
    }

    :car1 schema:label "Auto"@es .       # Passes as a :SpanishProduct
    :car2 schema:label "Carro"@es-AR .   # Passes as a :SpanishProduct
    :car3 schema:label "Coche"@es-ES .   # Passes as a :SpanishProduct
This also works for strings, e.g., ‘"+34"~’ (Spanish telephone numbers) and IRIs, e.g., ‘<http://www.w3.org/ns/>~’ (W3C namespaces).

    :SpanishW3CPeople {
      schema:telephone [ "+34"~ ] ;
      schema:url [ <http://www.W3C.es/Personal>~ ]
    }

    :alice schema:telephone "+34 123 456 789" ;             # Passes as a :SpanishW3CPeople
           schema:url <http://www.W3C.es/Personal/Alice> .

    :bob   schema:telephone "123 456 789" ;                 # Fails as a :SpanishW3CPeople
           schema:url <http://other.org/bob> .              # Bad telephone and url
IRIs represented as prefixed names can also have a postfix ‘~’, e.g., foaf:~ represents the set of all URIs that start with the namespace bound to the prefix foaf:. In the following example, we declare that the status of a product must start with either http://example.codes/good. or http://example.codes/bad. (each ending in a period).

    prefix codes: <http://example.codes/>

    :Product {
      :status [ codes:good.~ codes:bad.~ ]
    }

    prefix codes: <http://example.codes/>
    prefix other: <http://other.codes/>

    :x1 :status codes:good.Shipped .                # Passes as a :Product
    :x2 :status other:done .                        # Fails as a :Product
    :x3 :status <http://example.codes/bad.Lost> .   # Passes as a :Product
It can also be useful to exclude some values from a range. Exclusions are marked by the minus sign (-). For example, codes:~ - codes:unknown represents all values starting with codes: except codes:unknown. Exclusions can themselves be ranges. For example, codes:~ - codes:bad.~ represents all values starting with codes: except those whose prefix is codes:bad. (note the trailing period).
The following code prescribes that the status of products can be anything that starts with codes: except codes:unknown or codes whose prefix is codes:bad. (again with the trailing period).

    prefix codes: <http://example.codes/>

    :Product {
      :status [ codes:~ - codes:unknown - codes:bad.~ ]
    }

    prefix codes: <http://example.codes/>
    prefix other: <http://other.codes/>

    :p1 :status codes:good.Shipped .                # Passes as a :Product
    :p2 :status other:done .                        # Fails as a :Product
    :p3 :status <http://example.codes/bad.Lost> .   # Fails as a :Product
    :p4 :status <http://example.codes/unknown> .    # Fails as a :Product
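The behavior of the value set [ codes:~ - codes:unknown - codes:bad.~ ] amounts to a prefix test followed by two exclusion tests. A minimal Python sketch, assuming IRIs are handled as plain strings and using the hypothetical function name in_status_value_set:

```python
CODES = "http://example.codes/"


def in_status_value_set(iri):
    """Membership test for [ codes:~ - codes:unknown - codes:bad.~ ]."""
    if not iri.startswith(CODES):
        return False  # outside the codes: stem
    if iri == CODES + "unknown":
        return False  # excluded single value
    if iri.startswith(CODES + "bad."):
        return False  # excluded stem
    return True


status_results = {
    "p1": in_status_value_set("http://example.codes/good.Shipped"),
    "p2": in_status_value_set("http://other.codes/done"),
    "p3": in_status_value_set("http://example.codes/bad.Lost"),
    "p4": in_status_value_set("http://example.codes/unknown"),
}
```

Only the first status is accepted, in line with the comments in the data above.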
Exclusions must be the same kind (IRI, string or language tag) as the stem type. For instance, ‘[ codes:good.~ - "bad." - @fr~ ]’ would be malformed as it’s an IRI range excluding a string and a language stem.
There is no requirement that value sets be composed of a consistent kind of value (IRI, string or language tag). For instance, the status of a product can be the IRIs (:Accepted or :Rejected) or a string, e.g., “unknown”.
Sometimes we want to accept user data with any value except some specific values. For this, a wildcard character (‘.’) followed by one or more exclusions can be used (so long as those exclusions are all of the same kind). The kind of the exclusions (IRI, string, or language tag) establishes the type of RDF term that will be matched.
The following code declares that the status of products can be anything except the IRI codes:bad. Given that the exclusion is an IRI, the status must be an IRI.

    prefix codes: <http://example.codes/>

    :Product {
      :status [ . - codes:bad ]
    }

    prefix codes: <http://example.codes/>
    prefix other: <http://other.codes/>

    :p1 :status codes:good .   # Passes as a :Product
    :p2 :status other:bad .    # Passes as a :Product
    :p3 :status codes:bad .    # Fails as a :Product
    :p4 :status "good" .       # Fails as a :Product: "good" must be an IRI
Value sets are mostly a shorthand syntax for complex Boolean combinations of node constraints. ShEx includes them because they are much more concise and, given their ubiquity in other schema languages, they are fundamental to how people model and understand data.
The following shape:

    :User {
      schema:gender [ schema:Male schema:Female ]
    }

can be defined without value sets using the OR operator that will be presented in Section 4.6.

    :User { schema:gender [ schema:Male ] }
       OR { schema:gender [ schema:Female ] }
In the previous section we explored node constraints and how they declare a set of permissible RDF terms. Most of the examples used node constraints in triple constraints, limiting the permissible values for triples in the input graph.
In the following example, we describe a shape :User

    :User {
      schema:name xsd:string
    }

and we will try to validate the nodes :alice and :bob represented in the following data:

    :alice schema:name "Alice" ;    # Passes as a :User
           schema:knows :bob .

    :bob   schema:name 34 ;         # Fails as a :User
           schema:knows :alice .    # wrong schema:name
To solidify our intuition of validating shapes, we need to think of this as a series of steps to validate a focus node against a shape expression.
1. Check whether :alice conforms to the shape expression :User.
2. :User is a shape, so check if the neighborhood of :alice matches the triple expression in the shape :User. This step means that one needs to find a way to distribute the triples in the neighborhood to satisfy the triple expression.
3. The triple expression is a single triple constraint on schema:name, and the only triple in the neighborhood with that property is :alice schema:name "Alice".
4. Take the object of that triple, "Alice", as the focus node and test it against the node constraint (in this case xsd:string). "Alice" matches ‘xsd:string’ so this test succeeds.
5. The cardinality of the triple constraint is {1,1} (the default one) and, as there is only one matching triple, the node conforms to the shape expression.
When the same steps are performed to check :bob, the node constraint test will have 34 as the focus node. This test fails, so :bob fails to conform to :User.
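The validation of :alice and :bob can be mimicked in a few lines of Python. This is only a minimal sketch for the single shape :User { schema:name xsd:string }, not an implementation of the ShEx validation algorithm; the Literal class, the GRAPH list, and the function conforms_to_user are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A hypothetical stand-in for an RDF literal: lexical form plus datatype."""
    lexical: str
    datatype: str


# Illustrative in-memory graph as (subject, predicate, object) triples.
GRAPH = [
    (":alice", "schema:name", Literal("Alice", "xsd:string")),
    (":alice", "schema:knows", ":bob"),
    (":bob", "schema:name", Literal("34", "xsd:integer")),
    (":bob", "schema:knows", ":alice"),
]


def conforms_to_user(node, graph):
    """Check the shape `:User { schema:name xsd:string }` for one focus node."""
    # Collect the objects of the triples in the neighborhood that use the
    # constrained property.
    values = [o for (s, p, o) in graph if s == node and p == "schema:name"]
    # Every such object must satisfy the node constraint xsd:string.
    all_strings = all(
        isinstance(o, Literal) and o.datatype == "xsd:string" for o in values
    )
    # The default cardinality {1,1} requires exactly one matching triple.
    return all_strings and len(values) == 1


print(conforms_to_user(":alice", GRAPH))  # True
print(conforms_to_user(":bob", GRAPH))    # False
```

The schema:knows triples are ignored because the shape does not mention that property, which mirrors the open behavior of shapes discussed later in this chapter.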
A shape is a container for a triple expression along with some properties stating how to treat triples not matching the triple expression. We will describe these properties after introducing triple expressions (Section 4.6.8). Since triple expressions are combinations of triple constraints, we start with them.
The basic building block of a triple expression is a triple constraint. It is composed of a property, a node constraint, and a cardinality.
A triple constraint expresses a constraint on the values of triples with the given property and the number of values expressed by the cardinality. Cardinalities will be described in more detail in Section 4.6.3.
:Product {
  schema:productId xsd:string {1,2}
}

The meaning is that nodes conforming to :Product must satisfy:

- They must have the property schema:productId.
- The values of schema:productId must satisfy the node constraint xsd:string.
- As the cardinality is {1,2}, there can be between 1 and 2 values of schema:productId.

:p1 schema:productId "P1" .              # Passes as a :Product
:p2 schema:productId "P2", "C2" .        # Passes as a :Product
:p3 schema:productId "P3", "C3", "X3" .  # Fails as a :Product
                                         # Cardinality exceeded
:p4 schema:name "No Id" .                # Fails as a :Product
                                         # No schema:productId
:p5 schema:productId 5 .                 # Fails as a :Product
                                         # xsd:string not satisfied
:p6 schema:productId "P6", 5 .           # Fails as a :Product
                                         # xsd:string not satisfied
Triple constraints have an implicit meaning of closing the possible values of a property. In the previous example, the declaration schema:productId xsd:string requires all values of schema:productId to satisfy xsd:string. That’s why :p6 failed to conform: although it had one string value, the other value was not a string. This behavior can be modified with the directives EXTRA and CLOSED that will be shown in Section 4.6.8.
The EachOf operator combines two or more triple expressions. All the sub-expressions must be satisfied by triples in the neighborhood of the focus node. EachOf is indicated by a semicolon (;) in the compact syntax.
In the following example, :User is defined by an EachOf expression that combines three triple constraints. A node satisfies the :User type if all three triple constraints are satisfied.

:User {
  schema:name  xsd:string ;
  foaf:age     xsd:integer ;
  schema:email xsd:string
}
Cardinalities indicate the required number of triples satisfying the given constraint. They are most often used on triple constraints although they can also be applied to more complex expressions. Table 4.5 gives an overview of the different representations of cardinalities in ShExC.
Value    Description
*        0 or more
+        1 or more
?        0 or 1
{m}      Exactly m repetitions
{m,n}    Between m and n repetitions
{m,}     m or more repetitions

If the cardinality is not specified, the default value is {1} (exactly one).
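As an illustration of how these markers translate into numeric bounds, the following Python sketch maps each notation in the table to a (min, max) pair, with None standing for "unbounded". The function names parse_cardinality and allows are assumptions made for this example; the sketch covers only the forms listed in the table and is not taken from any ShEx implementation.

```python
import re


def parse_cardinality(text=""):
    """Map a cardinality marker to (min, max); max is None when unbounded."""
    text = text.strip()
    if text == "":
        return (1, 1)  # no marker: the default, exactly one
    if text == "*":
        return (0, None)
    if text == "+":
        return (1, None)
    if text == "?":
        return (0, 1)
    match = re.fullmatch(r"\{\s*(\d+)\s*(,\s*(\d*)\s*)?\}", text)
    if match is None:
        raise ValueError(f"unsupported cardinality: {text!r}")
    minimum = int(match.group(1))
    if match.group(2) is None:
        return (minimum, minimum)  # {m}
    upper = match.group(3)
    return (minimum, int(upper) if upper else None)  # {m,n} or {m,}


def allows(cardinality, count):
    """Return True if `count` matching triples respect the bounds."""
    minimum, maximum = cardinality
    return count >= minimum and (maximum is None or count <= maximum)


print(parse_cardinality("{1,2}"))             # (1, 2)
print(allows(parse_cardinality("{1,2}"), 3))  # False
```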
The following :User shape declares that nodes must have exactly one value for schema:name (default cardinality), an optional value for schema:worksFor and zero or more values for schema:follows. The :Company shape uses the explicit {m,n} syntax to assert that a matching node must have between 1 and 100 employees and an optional schema:founder value.

:User {
  schema:name     xsd:string ;
  schema:worksFor IRI ? ;
  schema:follows  IRI *
}

:Company {
  schema:founder  IRI ? ;
  schema:employee IRI {1,100}
}

:alice schema:name "Alice" ;            # Passes as a :User
       schema:follows :bob ;
       schema:worksFor :OurCompany .
:bob schema:name "Robert" ;             # Passes as a :User
     schema:worksFor :OurCompany .
:carol schema:name "Carol" ;            # Passes as a :User
       schema:follows :alice .
:dave schema:name "Dave" .              # Passes as a :User
:emily schema:name "Emily" ;            # Fails as a :User
       schema:worksFor :OurCompany,     # more than one schema:worksFor
                       :OtherCompany .
:OurCompany schema:founder :dave ;
            schema:employee :alice, :bob .  # Passes as a :Company
:OtherCompany schema:founder :alice .       # Fails as a :Company
                                            # 0 employees
A cardinality can also be used on more general expressions indicating that the neighborhood of a node must contain several groups of triples, each of them satisfying the expression.
The following shape declares that nodes must have exactly one value for schema:name and that they can contain the combination of schema:givenName and schema:familyName with optional cardinality (either they contain the group of both properties or none of them).

:User {
  schema:name xsd:string ;
  ( schema:givenName  xsd:string ;
    schema:familyName xsd:string
  ) ?
}

:alice schema:name "Alice" .         # Passes as a :User
:bob schema:name "Robert" ;          # Passes as a :User
     schema:givenName "Robert" ;
     schema:familyName "Smith" .
:carol schema:name "Carol" ;         # Fails as a :User
       schema:givenName "Carol" .
The pipe or choice operator | can be used to compose complex triple expressions, with the meaning that one of the branches must be satisfied. The following shape declares that nodes must have either schema:name or foaf:name, but not both.

:User {
  schema:name xsd:string
  | foaf:name xsd:string
}

:alice schema:name "Alice" .         # Passes as a :User
:bob foaf:name "Bob" ;               # Passes as a :User
     schema:identifier "P234" .
:carol schema:name "Carol" ;         # Fails as a :User
       foaf:name "Carol" .           # More than one
:dave schema:identifier "P123" .     # Fails as a :User
                                     # None provided
A typical pattern consists of combining OneOf (| operator) with EachOf (;) to form more complex expressions. The following shape declares that nodes must have either one schema:name or a combination of one or more schema:givenName and one schema:familyName.

:User {
  schema:name xsd:string
  | ( schema:givenName  xsd:string + ;
      schema:familyName xsd:string
    )
}

:alice schema:name "Alice" .         # Passes as a :User
:bob schema:givenName "Bob" ;        # Passes as a :User
     schema:givenName "Bobby" ;
     schema:familyName "Smith" .
:carol schema:name "Carol" ;         # Fails as a :User
       schema:familyName "King" .    # Can't have both
:dave schema:name 23 .               # Fails as a :User
                                     # schema:name must be xsd:string
Another typical pattern is to add some cardinality to an expression formed by the OneOf (|) operator. The following shape declares that nodes must have exactly one value for schema:productId and that they can contain between zero and two occurrences of either schema:isRelatedTo or schema:isSimilarTo.

:Product {
  schema:productId xsd:string ;
  ( schema:isRelatedTo @:Product
  | schema:isSimilarTo @:Product
  ) {0,2}
}

:p1 schema:productId "P1" ;            # Passes as a :Product
    schema:isRelatedTo :p2, :p3 .
:p2 schema:productId "P2" .            # Passes as a :Product
:p3 schema:productId "P3" ;            # Passes as a :Product
    schema:isRelatedTo :p1 ;
    schema:isSimilarTo :p2 .
:p4 schema:productId "P4" ;            # Fails as a :Product
    schema:isRelatedTo :p1, :p2, :p3 .
It is possible to avoid defining two shapes when one of them is just an auxiliary shape that is not needed elsewhere.
The following schema declares that nodes conforming with :User must have a property schema:name with xsd:string and another property schema:worksFor whose value must conform with an anonymous shape _:1 which must have rdf:type with the value :Company.

:User {
  schema:name xsd:string ;
  schema:worksFor @_:1
}

_:1 {
  a [ :Company ]
}

It can be rewritten as:

:User {
  schema:name xsd:string ;
  schema:worksFor {
    a [ :Company ]
  }
}

:alice schema:name "Alice" ;              # Passes as a :User
       schema:worksFor :OurCompany .
:bob schema:name "Robert" ;               # Passes as a :User
     schema:worksFor [ a :Company ] .
:carol schema:name "Carol" ;              # Fails as a :User
       schema:worksFor [                  # The value of schema:worksFor
         schema:name "AnotherCompany"     # does not have rdf:type :Company
       ] .
:OurCompany a :Company .                  # Passes as an anonymous shape
Nested shapes can be used to emulate simple SPARQL property paths.
The ^ operator reverses the order of the triple constraint. Instead of constraining the focus node’s outgoing arcs, it constrains incoming arcs.
The following code declares that nodes conforming to shape :Company must have rdf:type schema:Company and must be the objects of one or more triples with predicate schema:worksFor and a subject conforming to shape :User.

:User {
  schema:name xsd:string
}

:Company {
  a [ schema:Company ] ;
  ^schema:worksFor @:User +
}

With the following data, node :Company1 conforms to :Company because there are two nodes, :alice and :bob, that work for it. However, node :Company2 does not conform because no node points to it by the property schema:worksFor, and node :Company3 also fails because the node that works for it does not conform to shape :User.

:alice schema:name "Alice" ;         # Passes as a :User
       schema:worksFor :Company1 .
:bob schema:name "Bob" ;             # Passes as a :User
     schema:worksFor :Company1 .
:carol schema:worksFor :Company3 .   # Fails as a :User
                                     # No schema:name
:Company1 a schema:Company .         # Passes as a :Company
:Company2 a schema:Company .         # Fails as a :Company
                                     # No one works for it
:Company3 a schema:Company .         # Fails as a :Company
                                     # Carol works for it
                                     # but does not conform to User
The EachOf operator is different from a conjunction operator.
This is best illustrated when a shape uses the same property several times; we call this a repeated property.
In Example 60, the :User shape is an EachOf with three triple constraints, two of which have the same property schema:parent. A node conforms to this shape if it has two arcs for the schema:parent property, each of which contributes to satisfying one of the two triple constraints.

:User {
  schema:name xsd:string ;
  schema:parent { schema:gender [ schema:Male ] } ;
  schema:parent { schema:gender [ schema:Female ] } ;
}

:alice schema:name "Alice" ;             # Passes as a :User
       schema:parent :bob, :carol .
:bob schema:gender schema:Male .
:carol schema:gender schema:Female .
:dave schema:name "Dave" ;               # Fails as a :User
      schema:parent :carol, :emily .     # both parents are Female
:emily schema:gender schema:Female .
:frank schema:name "Frank" ;             # Fails as a :User
       schema:parent :x .                # only one parent
:x schema:gender schema:Female, schema:Male .

Remember that ShEx distributes the triples to triple constraints in a triple expression (see Section 4.6). This means the same triple cannot contribute to satisfying two different triple constraints, even if its object satisfies the node constraints for both. That is why the node :frank does not conform to the :User shape even if its parent satisfies both conditions.
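The distribution of triples among repeated properties can be pictured as an assignment problem. The following Python sketch is a brute-force illustration for the two schema:parent constraints only, assuming cardinality {1,1} on each; the SATISFIES table, the CONSTRAINTS list, and can_distribute are hypothetical names, and the table simply records the premise of the text that :x satisfies both nested shapes.

```python
from itertools import permutations

# Hypothetical summary of which nested shapes each parent node satisfies,
# following the premise in the text that :x satisfies both.
SATISFIES = {
    ":bob": {"male"},
    ":carol": {"female"},
    ":emily": {"female"},
    ":x": {"male", "female"},
}

# The two repeated schema:parent constraints, each with cardinality {1,1}.
CONSTRAINTS = ["male", "female"]


def can_distribute(parents, constraints=CONSTRAINTS):
    """Return True if each triple can be assigned to its own constraint."""
    # With cardinality {1,1} on every constraint there must be exactly one
    # triple per constraint; a single triple cannot be counted twice.
    if len(parents) != len(constraints):
        return False
    # Try every way of pairing the triples with the constraints.
    return any(
        all(c in SATISFIES[p] for c, p in zip(constraints, ordering))
        for ordering in permutations(parents)
    )


print(can_distribute([":bob", ":carol"]))    # True  (:alice)
print(can_distribute([":carol", ":emily"]))  # False (:dave)
print(can_distribute([":x"]))                # False (:frank)
```

Real validators need a more general matching procedure, since cardinalities other than {1,1} and nested groups change how many triples each constraint may consume.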
When defining RDF-based services using ShEx schemas, there are several possibilities that have to be taken into account. Some services backed by an RDF triple store may simply accept and store any triples not described in the schema; in such a case, the role of the schema is mainly to identify and constrain the triples that the service understands and manipulates, allowing any extra triples for unforeseen applications. This open model is more popular in the semantic web community.
At the other extreme, some services or databases may accept or emit some fixed structure, disallowing any triples that are not mentioned in the schema. In this case, the role of ShEx schemas is to validate and verify the content before it is processed or published. This closed model has been traditionally employed in contexts where data quality and security play a significant part.
ShEx manages these use cases with two granularities:
As we described in Section 4.6.1, triple constraints close properties by default. Sometimes, it is useful to open a property to permit instances of it which are not included in the schema. The EXTRA qualifier can be used to allow additional values for the listed properties. A shape of the form

<Shape> EXTRA <property> {
  <property> <NodeConstraint>
}

is equivalent to:

<Shape> {
  <property> <NodeConstraint> ;
  <property> (Not <NodeConstraint>)*
}

which means that it allows zero or more values of <property> that do not satisfy <NodeConstraint>. Note that there is a hidden negation in any shape that includes an EXTRA qualifier.
The following example declares that nodes that conform to :FollowSpaniards must follow one or more nodes whose nationality is :Spain, but can also follow other nodes.

:FollowSpaniards EXTRA schema:follows {
  schema:follows { schema:nationality [ :Spain ] } +
}

:alice schema:follows :david .           # Passes as a :FollowSpaniards
:bob schema:follows :david, :emily .     # Passes as a :FollowSpaniards
:carol schema:follows :emily .           # Fails as a :FollowSpaniards
:david schema:nationality :Spain .
:emily schema:nationality :France .
Notice that :bob passes although it follows :emily, who is not a Spaniard. If we removed the EXTRA declaration, it would fail.
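The effect of EXTRA on the :FollowSpaniards example can be sketched in Python as follows. This is an illustrative simplification under stated assumptions: the NATIONALITY table and the functions is_spaniard and check_follows are hypothetical, and the nested shape is reduced to a lookup of the followed node's nationality.

```python
# Hypothetical lookup replacing the nested shape
# { schema:nationality [ :Spain ] }.
NATIONALITY = {":david": ":Spain", ":emily": ":France"}


def is_spaniard(node):
    return NATIONALITY.get(node) == ":Spain"


def check_follows(followed, extra):
    """Check `schema:follows { ... } +`, with or without EXTRA schema:follows."""
    matching = [n for n in followed if is_spaniard(n)]
    others = [n for n in followed if not is_spaniard(n)]
    # Without EXTRA, a value that does not satisfy the constraint is an error.
    if others and not extra:
        return False
    # The cardinality + requires at least one matching value.
    return len(matching) >= 1


print(check_follows([":david"], extra=True))             # True  (:alice)
print(check_follows([":david", ":emily"], extra=True))   # True  (:bob)
print(check_follows([":david", ":emily"], extra=False))  # False (:bob, no EXTRA)
print(check_follows([":emily"], extra=True))             # False (:carol)
```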
A typical pattern using EXTRA declarations is to constrain the set of required values of a node but to allow other values. The following example declares the shapes for companies which must have two values for the rdf:type predicate: schema:Organization and org:Organization. Shape :Company1 does not allow any extra rdf:type arc, while shape :Company2 allows extra values.

:Company1 {
  a [ schema:Organization ] ;
  a [ org:Organization ]
}

:Company2 EXTRA a {          # Allows extra values of rdf:type
  a [ schema:Organization ] ;
  a [ org:Organization ]
}

:OurCompany a org:Organization,               # Passes as a :Company1 and :Company2
              schema:Organization .
:OurUniversity a org:Organization,            # Fails as a :Company1
                 schema:CollegeOrUniversity,  # unexpected rdf:type
                 schema:Organization .        # Passes as a :Company2
A shape can be declared to have only the triples matching a given set of triple constraints and no others using the keyword CLOSED.

:User1 {
  schema:name xsd:string ;
  schema:knows IRI *
}

:User2 CLOSED {
  schema:name xsd:string ;
  schema:knows IRI *
}

:alice schema:name "Alice" ;     # Passes as a :User1 and :User2
       schema:knows :bob .
:bob schema:name "Bob" ;         # Passes as a :User1
     schema:knows :alice ;       # Fails as a :User2
     schema:age 23 .             # unexpected schema:age
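The additional check that CLOSED introduces can be illustrated with a short Python sketch. It covers only the closedness test (which predicates in the neighborhood are not mentioned by the shape), not the triple constraints themselves; TRIPLES, SHAPE_PREDICATES, and unexpected_predicates are hypothetical names chosen for this example.

```python
# Illustrative data mirroring the example above.
TRIPLES = [
    (":alice", "schema:name", "Alice"),
    (":alice", "schema:knows", ":bob"),
    (":bob", "schema:name", "Bob"),
    (":bob", "schema:knows", ":alice"),
    (":bob", "schema:age", 23),
]

# Predicates mentioned in the triple expression of :User2.
SHAPE_PREDICATES = {"schema:name", "schema:knows"}


def unexpected_predicates(node, triples=TRIPLES, allowed=SHAPE_PREDICATES):
    """List outgoing predicates of `node` that a CLOSED shape would reject."""
    return sorted({p for (s, p, _) in triples if s == node and p not in allowed})


print(unexpected_predicates(":alice"))  # []
print(unexpected_predicates(":bob"))    # ['schema:age']
```

An open shape such as :User1 skips this test, which is why :bob passes as a :User1 but fails as a :User2.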
A common pattern is to combine CLOSED and EXTRA, as in the following shape :KnowsW3CPeople.

:KnowsW3CPeople CLOSED EXTRA schema:knows {
  schema:name xsd:string ;
  schema:affiliation IRI ? ;
  schema:knows { schema:affiliation [ :W3C ] } +
}

:alice schema:name "Alice" ;             # Passes as a :KnowsW3CPeople
       schema:affiliation :ACompany ;
       schema:knows :bob .
:bob schema:name "Bob" ;                 # Fails as a :KnowsW3CPeople
     schema:affiliation :W3C ;
     schema:knows :carol .               # :carol's affiliation is not :W3C
:carol schema:name "Carol" ;             # Passes as a :KnowsW3CPeople
       schema:affiliation :ACompany ;
       schema:knows :alice, :bob .
:dave schema:name "Dave" ;               # Fails as a :KnowsW3CPeople
      schema:knows :alice, :bob ;
      schema:age 23 .                    # schema:age not allowed
A node constraint can be a shape reference, which has the form @label where label is the identifier of another shape expression in the schema. Shape expression reference would be a more precise name but is long enough to be awkward.

:User {
  schema:worksFor @:Company ;
}

:Company {
  schema:name xsd:string
}

:alice a :User ;                 # Passes as a :User
       schema:worksFor :a .
:bob a :User ;                   # Fails as a :User because :x fails as :Company
     schema:worksFor :x .
:a schema:name "CompanyA" .      # Passes as a :Company
:x schema:name 23 .              # Fails as a :Company
It is possible to define data models with cyclic references, i.e., shapes that recursively refer to themselves either directly or indirectly. ShEx supports these kinds of data models which appear frequently.
The model depicted in Figure 66 can be specified in ShEx as:
:User {
  schema:worksFor @:Company ;
}

:Company {
  schema:name xsd:string ;
  schema:employee @:User *
}

:alice schema:worksFor :OurCompany .             # Passes as a :User
:bob schema:name "Robert" ;                      # Passes as a :User
     schema:worksFor :OurCompany .
:carol schema:worksFor :AnotherCompany .         # Passes as a :User
:OurCompany schema:name "OurCompany" ;           # Passes as a :Company
            schema:employee :alice, :bob .
:AnotherCompany schema:name "AnotherCompany" .   # Passes as a :Company
As an exercise, we present a more complex cyclic data model in Figure 67. Although the model has several cycles, it can be easily represented in ShEx as:
:University {
  schema:name xsd:string ;
  schema:employee @:Teacher + ;
  schema:course @:Course +
}

:Teacher {
  a [ schema:Person ] ;
  schema:name xsd:string ;
  :teaches @:Course *
}

:Course {
  schema:name xsd:string ;
  :university @:University ;
  :hasStudent @:Student +
}

:Student {
  a [ schema:Person ] ;
  schema:name xsd:string ;
  schema:mbox IRI ;
  :hasFriend @:Student * ;
  :isEnroledIn @:Course *
}

Notice the separation between the types and shapes of nodes. Both :Teacher and :Student must have rdf:type with value schema:Person, but their properties are different.
As can be seen, ShEx can model any kind of cyclic or recursive model in a natural way. The only restriction is when combining recursion with negation, as we will explain in Section 4.8.3 where the negation operator NOT is introduced.
External shapes are an extension mechanism to externally define shapes. This is useful when we want to describe functional shapes or very large value sets. As a practical example, in medical schemas, value sets can be dynamically derived and include hundreds of thousands of terms. In the FHIR use case (see Section 6.2), these are resolved using an emerging REST API for ShEx.
The following code declares an external shape for products where the value of schema:category is defined as an external shape. In this case, an annotation declares the property :service that points to the URL where the shape can be retrieved.

:Product {
  schema:productId xsd:string ;
  schema:category EXTERNAL
    // :service <http://categories.org/>
}

Although at the time of this writing the ShEx specification does not define a mechanism like the :service above, it is expected that future mechanisms like that will be developed.
Much as shape references (Section 4.7.1) are allowed wherever a shape expression may appear, any triple expression can be labeled so it can later be referenced. The target triple expression must be labeled with $label and references are made with &label.
For instance, if we want to share a name expression between :User and :Employee shapes, we could include the expression in one and reference it from the other.

:User {
  $:name ( schema:name .
         | schema:givenName . ; schema:familyName .
         ) ;
  schema:email IRI
}

:Employee {
  &:name ;
  :employeeId .
}

:alice schema:name "Alice" ;                       # Passes as a :User
       schema:email <mailto:alice@example.org> .
:bob schema:givenName "Robert" ;                   # Passes as a :Employee
     schema:familyName "Smith" ;
     :employeeId 1234567 .

The “&:name” directive can be considered to insert the value of :name into its place. Logically, :Employee is equivalent to this:

:Employee {
  ( schema:name .
  | schema:givenName . ; schema:familyName .
  ) ;
  :employeeId .
}
ShEx allows annotations, which are lists of pairs (predicate, object) where predicate is an IRI and object is any RDF node. Annotations provide additional information about the elements to which they are applied, which can be triple constraints, EachOf, OneOf, or shapes. The compact syntax for annotations uses two slashes (//) followed by a predicate and an object.
The following code declares a shape :Person which must have a schema:name with an xsd:string value, and a schema:birthDate with an xsd:date. Each triple constraint has its corresponding rdfs:label and rdfs:comment annotations.

:Person {
  schema:name xsd:string
    // rdfs:label "Name"
    // rdfs:comment "Name of person" ;
  schema:birthDate xsd:date
    // rdfs:label "birthDate"
    // rdfs:comment "Birth of date" ;
}
In this case, each triple constraint has its specific annotations which are internally represented as triples.
At the time of this writing ShEx does not have any built-in annotation vocabulary. It is expected that some specific annotations could be used for future uses like user interface generation or any other use case.
The logical operators AND, OR, and NOT can be used to form complex shape expressions. Their meaning follows the conventional logical meaning of conjunction, disjunction, and negation. The precedence of the operators is the usual one: NOT binds most tightly, followed by AND and then OR.

Operation  Description
AND        S1 AND S2 is satisfied if and only if both are satisfied
OR         S1 OR S2 is satisfied if and only if S1 or S2 (or both) are satisfied
NOT        NOT S is satisfied if and only if S is not satisfied
The AND operator forms a new shape expression from two shape expressions with the meaning that a node conforms to S1 AND S2 if it conforms to both S1 and S2.
The following example expresses that :User nodes must satisfy two shape expressions at the same time. Notice that the appearance of the repeated property schema:owns means that both expressions must be satisfied, i.e., that the value of schema:owns must be an IRI and must have shape :Product, which must have a property schema:productId whose value is an xsd:string between 5 and 10 characters.

:User {
  schema:name xsd:string ;
  schema:owns IRI
} AND {
  schema:owns @:Product
}

:Product {
  schema:productId xsd:string AND MINLENGTH 5 AND MAXLENGTH 10
}

:alice schema:name "Alice" ;                 # Passes as a :User
       schema:owns :product1 .
:bob schema:name "Robert" ;                  # Fails as a :User
     schema:owns :product2, :product3 .
:carol schema:name "Carol" ;                 # Fails as a :User
       schema:owns _:x .
:product1 schema:productId "Product1" .      # Passes as a :Product
:product2 schema:productId "Product2" .      # Passes as a :Product
:product3 schema:productId "Product3" .      # Passes as a :Product
:product4 schema:productId "P4" .            # Fails as a :Product
_:x schema:productId "ProductX" .            # Passes as a :Product
If the left-hand side of the conjunction is a node constraint, the AND keyword can be omitted. In the following schema, :User1 is equivalent to :User2, and :Product1 is equivalent to :Product2:

:User1 IRI AND { schema:name xsd:string }
:User2 IRI { schema:name xsd:string }

:Product1 {
  schema:productId xsd:string AND MINLENGTH 5 AND MAXLENGTH 10
}

:Product2 {
  schema:productId xsd:string MINLENGTH 5 MAXLENGTH 10
}
A common situation is to declare a set of constraints that we want to repeat. In the following example, we reuse :CompanyConstraints in two places (for schema:worksFor and for schema:affiliation).

:CompanyConstraints IRI /^http:\/\/example.org\/id[0-9]+/ @:CompanyShape

:User {
  schema:name xsd:string ;
  schema:worksFor @:CompanyConstraints ;
  schema:affiliation @:CompanyConstraints
}

:CompanyShape {
  schema:founder xsd:string ;
}

:alice schema:name "Alice" ;         # Passes as a :User
       schema:worksFor :id1 ;
       schema:affiliation :id2 .
:id1 schema:founder "Robert" .
:id2 schema:founder "Carol" .
Another example of shape reuse is to extend a shape with more constraints emulating a kind of inheritance as in Object-Oriented languages.
The following example declares a top-level shape :Person whose nodes must have rdf:type with value schema:Person and schema:name. The shape :User extends :Person adding a new constraint on the existing property schema:name and declaring the need of another property schema:email. Finally, the shape :Student extends :User adding a new property :course.

:Person {
  a [ schema:Person ] ;
  schema:name xsd:string ;
}

:User @:Person AND {
  schema:name MaxLength 20 ;
  schema:email IRI
}

:Student @:User AND {
  :course IRI * ;
}

:alice a schema:Person ;                       # Passes as a :Person
       schema:name "Alice" .
:bob schema:name "Robert" ;                    # Fails as a :User
     schema:email <bob@example.org> .          # lacks rdf:type :Person
:carol a schema:Person ;                       # Passes as a :Person and :User
       schema:name "Carol" ;
       schema:email <carol@example.org> .
:dave a schema:Person ;                        # Passes as a :Person, :User and Student
      schema:name "Carol" ;
      schema:email <carol@example.org> ;
      :course :algebra .
Notice that this kind of reuse requires the shapes extended to be compatible with the new ones. Otherwise, there will be no nodes satisfying them.
For example, we may want to declare a :Teacher shape extending :User but adding the constraint that teachers have no email.

:Teacher @:User AND {
  schema:email . {0,0} ;
}

However, there will be no nodes satisfying it, because shape :User prescribes that they must have exactly one schema:email, while the extended shape :Teacher prescribes that they must have no schema:email. In order to obtain the desired model, it is necessary that the shapes to be extended are general enough to be compatible with the new shapes. In this case, for example, it would be better to declare that the cardinality of schema:email in :User was optional.
The OR operator combines two shape expressions with an inclusive disjunction, i.e., either one side or the other, or both must be satisfied.
The following example declares that nodes of shape :User must have either a schema:name with xsd:string value or a combination of schema:givenName and schema:familyName with xsd:string values, or both.

:User {
  schema:name xsd:string
} OR {
  schema:givenName xsd:string ;
  schema:familyName xsd:string
}

:alice schema:name "Alice" .            # Passes as a :User
:bob schema:givenName "Robert" ;        # Passes as a :User
     schema:familyName "Smith" .
:carol schema:name "Carol King" ;       # Passes as a :User
       schema:givenName "Carol" ;
       schema:familyName "King" .
OR and |
There is a difference between the OR and the choice (|) operator. The former defines an inclusive-or between shape expressions, while the latter specifies, in this case, an exclusive-or between the branches of a triple expression (only one of the branches must be satisfied, but not both).

:User1 {
  schema:name xsd:string
} OR {
  schema:givenName xsd:string ;
  schema:familyName xsd:string
}

:User2 {
  schema:name xsd:string
  | schema:givenName xsd:string ;
    schema:familyName xsd:string
}

:alice schema:name "Alice" .            # Passes as a :User1 and :User2
:bob schema:givenName "Robert" ;        # Passes as a :User1 and :User2
     schema:familyName "Smith" .
:carol schema:name "Carol King" ;       # Passes as a :User1
       schema:givenName "Carol" ;       # Fails as a :User2
       schema:familyName "King" .
:dave schema:name "Dave" ;              # Passes as a :User1
      schema:givenName "Dave" .         # Fails as a :User2
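The difference between the two operators on this data can be reproduced with a small Python sketch. It is a simplification that only counts how many values each node has per property (all values are assumed to be strings); the DATA table and the functions user1 and user2 are hypothetical names written for this example.

```python
# Number of values per property for each node in the example data.
DATA = {
    ":alice": {"schema:name": 1},
    ":bob": {"schema:givenName": 1, "schema:familyName": 1},
    ":carol": {"schema:name": 1, "schema:givenName": 1, "schema:familyName": 1},
    ":dave": {"schema:name": 1, "schema:givenName": 1},
}


def user1(counts):
    """Shape-level OR: each side is an independent shape; either may hold."""
    name_ok = counts.get("schema:name", 0) == 1
    names_ok = (
        counts.get("schema:givenName", 0) == 1
        and counts.get("schema:familyName", 0) == 1
    )
    return name_ok or names_ok


def user2(counts):
    """Choice (|): one branch must account for all triples of these properties."""
    observed = (
        counts.get("schema:name", 0),
        counts.get("schema:givenName", 0),
        counts.get("schema:familyName", 0),
    )
    return observed in {(1, 0, 0), (0, 1, 1)}


for node, counts in DATA.items():
    print(node, user1(counts), user2(counts))
```

In user1 the properties not mentioned by the satisfied side are simply ignored, whereas in user2 all three properties belong to the same triple expression, so leftover values make the node fail.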
A common use case is to declare that the value of some property is the disjunction of several datatypes or value sets.
The following example declares that products must have an rdfs:label with a string value or a language-tagged literal (remember that those literals have type rdf:langString), and a schema:releaseDate whose values must be either xsd:date, xsd:gYear or one of the values "unknown-past" or "unknown-future".

:Product {
  rdfs:label xsd:string OR rdf:langString ;
  schema:releaseDate xsd:date OR xsd:gYear OR [ "unknown-past" "unknown-future" ]
}

:p1 a :Product ;                                # Passes as a :Product
    rdfs:label "Laptop" ;
    schema:releaseDate "1990"^^xsd:gYear .
:p2 a :Product ;                                # Passes as a :Product
    rdfs:label "Car"@en ;
    schema:releaseDate "unknown-future" .
:p3 a :Product ;                                # Fails as a :Product
    rdfs:label :House ;
    schema:releaseDate "2020"^^xsd:integer .
SPARQL property paths are a very expressive feature that can define complex expressions. ShEx does not support property paths in order to have a more controlled way to define shapes. However, using nested shapes (see Example 58), recursion and logical operators, it is possible to emulate their behavior.
In SHACL, instances are declared by the expression rdf:type/rdfs:subClassOf*, which defines an rdf:type arc followed by zero or more rdfs:subClassOf arcs (see Section 5.7.2). The following example declares that nodes conforming to shape :Person must be SHACL instances of schema:Person.

:Person {
  a @:PersonShape
}

:PersonShape [ schema:Person ] OR {
  rdfs:subClassOf @:PersonShape
}

:alice a schema:Person .                    # Passes as a :Person
:bob a :Teacher .                           # Passes as a :Person
:carol a :Assistant .                       # Passes as a :Person
:Teacher rdfs:subClassOf schema:Person .
:Assistant rdfs:subClassOf :Teacher .
NOT s creates a new shape expression from a shape s. Nodes conform to NOT s when they do not conform to s.
A common use case for NOT is to check other shapes. Defining a shape :NotS as NOT :S, all nodes in an RDF graph can be valid: some of them will conform to :S while the others will conform to :NotS. In this way, a continuous integration system can define the shape map that all nodes must satisfy (either positively or negatively) and check whether they satisfy it or not.
The following code declares a shape :User and its complementary :NoUser.

:User {
  schema:name xsd:string ;
  schema:birthDate xsd:date ? ;
}

:NoUser Not @:User

Both nodes :alice and :bob conform to one of the shapes, :alice to :User and :bob to :NoUser.

:alice schema:name "Alice" ;                          # Passes as a :User
       schema:birthDate "1980-03-10"^^xsd:date .
:bob schema:name 23 ;                                 # Passes as a :NoUser
     schema:birthDate "Unknown" .
The operator NOT checks that a node fails to conform to a whole shape expression. Sometimes, the intended meaning is not to negate a whole shape expression but to declare that some properties cannot appear. This behavior is better described by setting the maximum cardinality to 0.
Shape :NoName1 prohibits the appearance of property schema:name by establishing its maximum cardinality to 0. Shape :NoName2 looks like it does the same thing using the negation. However, notice that :NoName2 will be satisfied by any node that does not conform to schema:name xsd:string.

:NoName1 {
  schema:name xsd:string {0}
}

:NoName2 Not {
  schema:name xsd:string
}

The behavior differs for node :bob, which conforms to :NoName2. The reason is that it fails to have a string value for schema:name, so it fails to conform to the shape { schema:name xsd:string } and thus conforms to :NoName2.

:alice schema:name "Alice" .    # Fails as a :NoName1 and :NoName2
:bob schema:name 23 .           # Fails as a :NoName1, passes as a :NoName2
:carol foaf:age 34 .            # Passes as a :NoName1 and :NoName2
A common pattern is the IF-THEN construct: if some condition holds, then a given shape expression must be satisfied. This pattern can be modeled using the logical operators OR and NOT. Remember that IF x THEN y is equivalent to (NOT x) OR y.
The following example specifies that all products must have a schema:productID and, if a product has type schema:Vehicle, then it must have the properties schema:vehicleEngine and schema:fuelType. The parentheses are needed because AND binds more tightly than OR.

:Product {
  schema:productID .
} AND (
  NOT { a [ schema:Vehicle ] }
  OR { schema:vehicleEngine . ;
       schema:fuelType . }
)

:kitt schema:productID "C21" ;        # Passes as a :Product
      a schema:Vehicle ;
      schema:vehicleEngine :x42 ;
      schema:fuelType :electric .
:bad schema:productID "C22" ;         # Fails as a :Product
     a schema:Vehicle ;
     schema:fuelType :electric .
:c23 schema:productID "C23" ;         # Passes as a :Product
     a schema:Computer .
The IF-THEN-ELSE construct can be defined in a similar way. In this case:

IF X THEN Y ELSE Z ≡ ((NOT X) OR Y) AND (X OR Z)
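The equivalence can be checked exhaustively over the eight combinations of truth values. The following Python sketch does so with plain Booleans standing in for "the node conforms to X, Y, or Z"; the function names if_then_else and equivalence_holds are illustrative assumptions.

```python
from itertools import product


def if_then_else(x, y, z):
    """The encoding ((NOT X) OR Y) AND (X OR Z) over plain Booleans."""
    return ((not x) or y) and (x or z)


def equivalence_holds():
    """Compare the encoding with the usual conditional on all 8 inputs."""
    return all(
        if_then_else(x, y, z) == (y if x else z)
        for x, y, z in product([False, True], repeat=3)
    )


print(equivalence_holds())  # True
```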
The following shape declares that if a product has type schema:Vehicle, then it must have the properties schema:vehicleEngine and schema:fuelType; otherwise, it must have the property schema:category with an xsd:string value.

:Product (
  NOT { a [ schema:Vehicle ] }
  OR { schema:vehicleEngine . ;
       schema:fuelType . }
) AND (
  { a [ schema:Vehicle ] }
  OR { schema:category xsd:string }
)

With the following data, nodes :kitt and :c23 conform to :Product, each one passing one of the branches, while :bad1 and :bad2 do not conform.

:kitt a schema:Vehicle ;              # Passes as a :Product
      schema:vehicleEngine :x42 ;
      schema:fuelType :electric .
:c23 a schema:Computer ;              # Passes as a :Product
     schema:category "Laptop" .
:bad1 a schema:Vehicle ;              # Fails as a :Product
      schema:fuelType :electric .
:bad2 a schema:Computer .             # Fails as a :Product
One problem of combining recursion with negation freely is the possibility of defining paradoxical shapes.
The following shape declares a :Barber as someone who shaves a person but does not shave a barber.

:Barber {               # Violates the negation requirement
  :shaves @:Person
} AND NOT {
  :shaves @:Barber
}

:Person {
  schema:name xsd:string
}

Given the following data:

:albert :shaves :dave .          # Passes as a :Barber
:bob schema:name "Robert" ;      # Passes as a :Person
     :shaves :bob .              # Passes :Barber or not?
:dave schema:name "Dave" .       # Passes as a :Person
It is easy to check that :bob conforms to :Person (he has schema:name with an xsd:string value), so he shaves a person. But does :bob conform to :Barber?
If we assume he does, then he should not shave another barber; but as he shaves himself, and we assumed he conforms to :Barber, he fails the constraint of not shaving barbers, which means that he should not conform.
On the other hand, if we assume he does not conform to :Barber, then he satisfies both constraints, and he should conform to :Barber.
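The contradiction can be reproduced by trying both possible answers for :bob and checking whether either is stable. The following Python sketch is a deliberately simplified model, not a ShEx evaluator: it only looks at the :shaves arcs of :bob, takes his conformance to :Person as given, and uses hypothetical helper names.

```python
# Simplified model of the example data: only the arcs relevant to :bob.
shaves = {":bob": {":bob"}}
persons = {":bob"}  # :bob conforms to :Person (he has a schema:name)


def barber_definition(node, assumed_barbers):
    """Evaluate 'shaves a person AND NOT shaves a barber' under an assumption."""
    targets = shaves.get(node, set())
    shaves_a_person = any(t in persons for t in targets)
    shaves_a_barber = any(t in assumed_barbers for t in targets)
    return shaves_a_person and not shaves_a_barber


def consistent_assignments():
    """Return every assumption about :bob that the definition confirms."""
    stable = []
    for assumed in (set(), {":bob"}):
        derived = {":bob"} if barber_definition(":bob", assumed) else set()
        if derived == assumed:
            stable.append(assumed)
    return stable
```

Under these assumptions no assignment is stable: assuming :bob is not a barber derives that he is one, and assuming he is a barber derives that he is not.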
These kinds of problems, which arise when combining negation and recursion, have been studied by the logic programming and database communities. Several approaches have been proposed, such as negation-as-failure, stratified negation and well-founded semantics [1].
ShEx imposes a constraint to avoid ill-formed data models: whenever a shape refers to itself, either directly or indirectly, the chain of references cannot traverse an occurrence of the negation operation NOT.
The previous shape :Barber violates the negation requirement, as it has a self-reference that appears under a negation.
More formally, we say that there is a dependency from :ShapeA to :ShapeB if the definition of :ShapeA contains a reference @:ShapeB.
We say that a dependency from :ShapeA to :ShapeB is a negative dependency if at least one of the following holds:
the reference @:ShapeB in the definition of :ShapeA appears under an occurrence of the negation operator NOT; or
a triple constraint :prop @:ShapeB appears in the definition of :ShapeA and the property :prop is declared as EXTRA in the corresponding triple expression.
In the latter case, the negation operator NOT does not appear explicitly, but we still need to verify that :ShapeB is not satisfied in some neighbor nodes.
This was called hidden negation in Section 4.6.8.
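The negation requirement can be read as a condition on the dependency graph: no cycle may contain a negative dependency. The following Python sketch checks that condition for a list of dependencies that is assumed to have been extracted from a schema beforehand (with hidden negations through EXTRA already marked as negative). The function name and the tuple layout are assumptions made for illustration, not part of any ShEx implementation.

```python
from collections import defaultdict


def violates_negation_requirement(dependencies):
    """dependencies: list of (source, target, is_negative) tuples."""
    graph = defaultdict(list)
    for src, dst, _negative in dependencies:
        graph[src].append(dst)

    def reachable(start):
        """Shapes reachable from `start` by following one or more dependencies."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in graph.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    # A negative dependency src -> dst lies on a cycle
    # exactly when src can be reached back from dst.
    return any(
        negative and (src == dst or src in reachable(dst))
        for src, dst, negative in dependencies
    )


barber = [(":Barber", ":Person", False), (":Barber", ":Barber", True)]
user = [(":User", ":User", False)]
```

With these inputs, the :Barber schema is reported as a violation, while the recursive but negation-free :User schema is not.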
The ShEx 2 specification is focused on the semantics of the validation language and separates the invocation mechanisms to a different specification called Shape Maps [77]. They were already introduced in Section 4.4.2 and are node/shape associations that are used as input to the validation process and are also the result of it.
In ShEx, the construction of shape maps is orthogonal to their use in validation. Decoupling these processes enables ShEx to address a wide range of use cases. Just as XML Schema could not have predicted its use in WSDL (a protocol that was developed years later), it is impossible to predict the many and varied ways in which shape maps may be constructed in the future.
The current ShapeMap specification defines three kinds of shape map.
Each of these consists of a comma-separated list of node/shape associations with at least two components.
nodeSelector - identifies a set of RDF nodes.
shapeLabel - selects a shape expression from the schema.
The simplest kind of shape map is a fixed shape map.
ShEx validation takes as input a set of nodeSelector/shapeLabel pairs called a fixed shape map.
The shapeLabel is either the label for a shape expression in the schema or the case-insensitive keyword START to identify the start shape (see Section 4.4.4).
For the fixed shape map, the nodeSelector is one of:
Note that because the shapeLabel can identify a shape expression with only node constraints, one can use ShEx to validate RDF terms that do not appear in the graph. This can be useful for testing membership in a value set or verifying the form of a URL.
Fixed shape maps have a compact syntax in which shape associations are separated by commas and node selectors are separated from shape labels by @:
:alice@:User, :alice@:Employee, :bob@:User
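To make the structure of this syntax concrete, the following Python sketch splits a fixed shape map into node/shape pairs. It is a minimal illustration that assumes both sides are prefixed names as in the example above; it does not handle literals, language tags or IRIs in angle brackets, which a real ShapeMap parser must support. The function name is hypothetical.

```python
def parse_fixed_shape_map(text):
    """Parse 'node@shape, node@shape' into a list of unique (node, shape) pairs."""
    pairs = []
    for association in text.split(","):
        association = association.strip()
        if not association:
            continue
        # The first '@' separates the node selector from the shape label.
        node, sep, shape = association.partition("@")
        if not sep or not node.strip() or not shape.strip():
            raise ValueError(f"malformed association: {association!r}")
        pair = (node.strip(), shape.strip())
        if pair not in pairs:  # a shape map has no duplicate pairs
            pairs.append(pair)
    return pairs


associations = parse_fixed_shape_map(":alice@:User, :alice@:Employee, :bob@:User")
```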
The query shape map extends the fixed shape map to enable simple pattern matching to select focus nodes from the data graph.
This is done by permitting the node selectors to be either an RDF node as in a fixed map or a triple pattern.
A triple pattern can have a focus keyword to represent the nodes that will be validated and a node or wildcard (represented by the underscore character _).
The shape map:
{ FOCUS schema:worksFor _ }@:User,
{ FOCUS rdf:type schema:Person }@:User,
{ _ schema:worksFor FOCUS }@:Company
associates all subjects of property schema:worksFor and all nodes of type schema:Person with :User, and all objects of property schema:worksFor with shape :Company.
Any node in the data graph which is both of type schema:Person and the subject of a schema:worksFor triple would be selected by both triple patterns and associated with :User in the fixed map.
Such duplicates are eliminated in accordance with the rule that a shape map can have no duplicate pairs of node selector and shape label.
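The resolution of a query shape map into a fixed shape map can be sketched in a few lines of Python. The sketch below assumes that triples and patterns are plain tuples of strings and that the data (including the node :acme) is invented for illustration; it is not the algorithm of any particular implementation. Using a set for the result removes duplicate pairs automatically.

```python
FOCUS = "FOCUS"
WILDCARD = "_"


def resolve(query_map, triples):
    """Turn a query shape map into a fixed shape map.

    query_map: list of (selector, shape), where selector is either a node
    (a string) or a triple pattern (subject, predicate, object).
    Returns a set of (node, shape) pairs.
    """
    fixed = set()
    for selector, shape in query_map:
        if isinstance(selector, str):  # plain RDF node, as in a fixed map
            fixed.add((selector, shape))
            continue
        s, p, o = selector
        for ts, tp, to in triples:
            if tp != p:
                continue
            if s == FOCUS and o in (WILDCARD, to):
                fixed.add((ts, shape))
            elif o == FOCUS and s in (WILDCARD, ts):
                fixed.add((to, shape))
    return fixed


# Hypothetical data graph.
triples = [
    (":alice", "schema:worksFor", ":acme"),
    (":alice", "rdf:type", "schema:Person"),
    (":bob", "rdf:type", "schema:Person"),
]
query_map = [
    ((FOCUS, "schema:worksFor", WILDCARD), ":User"),
    ((FOCUS, "rdf:type", "schema:Person"), ":User"),
    ((WILDCARD, "schema:worksFor", FOCUS), ":Company"),
]
fixed_map = resolve(query_map, triples)
```

Here :alice is selected by the first two patterns but appears only once with :User in the result.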
While the node selector may be a triple pattern, it may also be an RDF node as we would see in a fixed shape map. Common idioms of query maps can do the following.
sh:targetNode (see Section 5.7).
sh:targetSubjectsOf and sh:targetObjectsOf.
rdf:type.
In fact, the SHACL directive sh:targetClass offers a similar selection mechanism for the rdf:type predicate (the difference is that SHACL uses the notion of SHACL instance; see Section 5.7.2).
As with the above selectors, this one is very use-case specific: one may not want to say that everything with an rdf:type property should be validated against a :Person, but it may be reasonable to select everything with type :Employee.
While it is not currently part of the shape map specification, the Wikidata use of shape maps extends the nodeSelector to contain a SPARQL query, enabling another common use case.
Query shape maps are not the only way to select focus nodes. For instance, it would make sense to associate a shape with a service endpoint. The Linked Data Platform [93] defines a notion of container which handles requests to get, create, modify and delete objects with a given structure. While it does not specify a mechanism to publish that structure or validate incoming data against it, earlier work at OSLC used Resource Shapes for that purpose. It is reasonable to assume that protocols like the Linked Data Platform will exploit shapes technology, perhaps with the added precision of using HTTP Link headers to specify a node of interest, which would be associated with the shape related to that interface.
The product of validation is a result shape map which is annotated with errors encountered while testing the conformance of each node/shape pair. The result shape map is again an extension of the fixed map. Each nodeSelector/shapeLabel association in the result shape map may include any of these three additional components:
status: either conformant or nonconformant;
reason: a human-readable report, usually to explain a nonconformant status; or
appInfo: a machine-readable structure.
Engines vary in how they report errors, and they may add extra information to the resulting shape map. Some implementations extend this to include machine-readable failure messages in case of errors or recursive proof of conformance in case of success.
Given the following ShEx schema:
:User {
  schema:name xsd:string ;
  schema:knows @:User*
}
and the RDF data:
:alice schema:name "Alice";
       schema:knows :carol .
:bob   schema:name "Robert";
       schema:knows :carol .
:carol schema:name "Carol" .
If we have the query shape map:
{ FOCUS schema:knows _ }@:User
A shape map resolver would generate the fixed shape map:
:alice@:User, :bob@:User
After applying the validation process, the result shape map obtained would be:
:alice@:User, :bob@:User, :carol@:User
Figure 87 depicts a whole validation process with the different shape maps involved.
The fixed shape map from Figure 87 can be represented as:
[
  { "node": ":alice", "shape": ":User" },
  { "node": ":bob", "shape": ":User" }
]
The output shape map would be:
[
  { "node": ":alice", "shape": ":User", "status": "conformant" },
  { "node": ":bob", "shape": ":User", "status": "conformant" },
  { "node": ":carol", "shape": ":User", "status": "conformant" }
]
Because the input and output of the validation process are both shape maps, long-running workflows can use the result shape map as a starting state for further validation. This is useful when shapes have inter-dependencies, i.e., when validating one node/shape pair requires validating others. Let’s look at a simplified subset of that schema and data.
Given the following schema:
:User {
  schema:name xsd:string ;
  schema:knows @:User*
}
and RDF graph:
:alice schema:name "Alice";
       schema:knows :bob .
:bob   schema:name "Robert" .
If we were to individually validate :alice and :bob, we would validate :bob twice, once while validating :alice’s schema:knows arc and once for the explicit call to validate :bob.
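Feeding the result shape map of one run into the next avoids that repeated work. The following Python sketch illustrates the idea for the :User shape only. It is a toy model under stated assumptions: the graph is a dictionary of property lists, the datatype of schema:name is not checked, and conformance of the recursive references is computed by starting from the assumption that all pending nodes conform and discarding failures until nothing changes. The function name and data layout are hypothetical.

```python
USER = ":User"


def validate_users(nodes, graph, known=None):
    """Return (result shape map, nodes actually checked), reusing `known` results."""
    results = dict(known or {})

    # Collect the nodes that still need checking, following schema:knows arcs.
    pending, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if (node, USER) in results or node in pending:
            continue
        pending.add(node)
        stack.extend(graph.get(node, {}).get("schema:knows", []))

    conforming = set(pending)

    def friend_ok(friend):
        return friend in conforming or results.get((friend, USER)) == "conformant"

    # Discard failing nodes until the set of conforming nodes is stable.
    changed = True
    while changed:
        changed = False
        for node in list(conforming):
            props = graph.get(node, {})
            has_name = len(props.get("schema:name", [])) == 1
            if not (has_name and all(friend_ok(f) for f in props.get("schema:knows", []))):
                conforming.discard(node)
                changed = True

    for node in pending:
        results[(node, USER)] = "conformant" if node in conforming else "nonconformant"
    return results, pending


graph = {
    ":alice": {"schema:name": ["Alice"], "schema:knows": [":bob"]},
    ":bob": {"schema:name": ["Robert"]},
}
first, checked_first = validate_users([":alice"], graph)
second, checked_second = validate_users([":bob"], graph, known=first)
```

The first call checks both :alice and :bob; the second call finds :bob already present in the supplied shape map and checks nothing.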
Semantic actions serve as an extension point for Shape Expressions. They can be used to signal a failure or perform some operations during the validation process.
A semantic action contains a label that indicates the language in which the action is written and a string with its contents.
When the ShEx validator finds a semantic action, it checks if it has a processor for that language and calls it with the action contents.
The result of the processor is cast to a Boolean value; if the result is false, the corresponding shape fails.
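The dispatch step described above can be sketched as a registry that maps language identifiers to processors. The Python code below is only an illustration: the registry, the toy language IRI and its single command are invented, and treating an action in an unknown language as passing is an assumption of this sketch rather than a statement about any particular validator.

```python
processors = {}  # language IRI -> callable(code, context)


def register(language, processor):
    """Associate a semantic action language with a processor function."""
    processors[language] = processor


def run_semantic_action(language, code, context):
    """Run one semantic action and cast the processor's result to a Boolean."""
    processor = processors.get(language)
    if processor is None:
        return True  # assumption: actions in unknown languages are skipped
    return bool(processor(code, context))


def toy_processor(code, context):
    """Hypothetical language: the action 'fail' fails, anything else passes."""
    return code.strip() != "fail"


register("http://example.org/toy", toy_processor)
```

A validator built this way would fail the enclosing shape whenever run_semantic_action returns False.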
The following example uses a hypothetical JavaScript semantic actions processor to capture the start and end events in a conference and to check that the start date is before the end date.
prefix js: <http://shex.io/extensions/javascript>

:Event {
  schema:startDate xsd:dateTime %js:{ let start = o %} ;
  schema:endDate xsd:dateTime   %js:{ let end = o %} ;
}
The following example checks that the declared area of a rectangle is actually its width times its height.
prefix js: <http://shex.io/extensions/javascript>

:Rectangle {
  :height xsd:float %js:{ let height = o %} ;
  :width xsd:float  %js:{ let width = o %} ;
  :area xsd:float   %js:{ o == height * width %}
}
Semantic actions have been employed to transform RDF files to other formats like XML or JSON [80], or even other ShEx schemas as performed by the Map extension.
The test suite defines a single extension language called Test that can fail a validation and/or return a message.
ShEx was designed as an RDF validation language which is independent of reasoners or inference systems.
A ShEx processor takes as input an RDF graph and checks if its nodes conform to the shapes defined in a ShEx schema.
The shapes describe the topology of the RDF graph taking into account the possible values of nodes as well as the incoming and outgoing arcs.
In ShEx, a triple whose predicate is rdf:type is treated as any other triple, and in fact there is no special treatment for nodes that are also RDF classes.
ShEx separates RDF classes and types following the guidelines described in Section 3.2.
This independence between ShEx and reasoners makes it possible to apply a ShEx processor to a plain RDF graph before inference, to validate the resulting graph after applying a reasoner, or even to validate the intermediate graphs during the reasoning phase, checking the reasoner’s behavior.
The following shapes can be used to check an RDF graph before and after RDF Schema inference.
Shape :TeacherBefore describes nodes that may have rdf:type :Teacher, and that must have a property schema:name with an xsd:string value and zero or more properties :teaches whose nodes must conform to :Course.
Shape :TeacherAfter describes the shape that teachers must have after inference.
For example, they must have rdf:type :Teacher and :Person, and the values of property :teaches must have rdf:type :Course.
:TeacherBefore EXTRA a {
  a [:Teacher]? ;
  schema:name xsd:string ;
  :teaches @:Course*
}
:TeacherAfter EXTRA a {
  a [:Teacher];
  a [:Person];
  schema:name xsd:string ;
  :teaches { a [:Course] } @:Course
}
:Course {
  a [:Course]?
}
If we validate the following RDF data before applying inference, nodes :bob and :carol do not conform to shape :TeacherAfter:
:alice a :Teacher, :Person;      # Passes as a :TeacherBefore
       schema:name "Alice" ;     # Passes as a :TeacherAfter
       :teaches :algebra .
:bob   schema:name "Robert" ;    # Passes as a :TeacherBefore
       :teaches :logic .         # Fails as a :TeacherAfter
:carol a :Teacher ;              # Passes as a :TeacherBefore
       schema:name "Carol" .     # Fails as a :TeacherAfter
:algebra a :Course .
:teaches rdfs:domain :Teacher .
:teaches rdfs:range :Course .
:Teacher rdfs:subClassOf :Person .
On the other hand, if we validate the previous RDF graph after applying RDF Schema inference, both :bob and :carol should conform to :TeacherAfter.
This combination of shapes before and after inference can be used to check the behavior of a reasoner.
For example, if in the previous case, a faulty RDFS reasoner does not infer that :logic must have rdf:type :Course, :bob would not conform to :TeacherAfter and the bug could be detected.
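To see which triples the reasoner is expected to add, the following Python sketch applies three RDFS rules (domain, range and propagation of types along rdfs:subClassOf) to the example data until no new triple appears. It is a small illustration, not a complete RDFS reasoner: triples are tuples of strings, and the function names are invented for this example.

```python
TYPE, DOMAIN, RANGE, SUBCLASS = "rdf:type", "rdfs:domain", "rdfs:range", "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply the domain, range and subclass rules until a fixpoint is reached."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))  # subject gets the domain class
                elif p2 == RANGE and s2 == p:
                    new.add((o, TYPE, o2))  # object gets the range class
                elif p2 == SUBCLASS and p == TYPE and o == s2:
                    new.add((s, TYPE, o2))  # instance of subclass is instance of superclass
        if new <= inferred:
            return inferred
        inferred |= new


def types_of(node, triples):
    return {o for s, p, o in triples if s == node and p == TYPE}


data = {
    (":alice", TYPE, ":Teacher"), (":alice", TYPE, ":Person"),
    (":alice", "schema:name", "Alice"), (":alice", ":teaches", ":algebra"),
    (":bob", "schema:name", "Robert"), (":bob", ":teaches", ":logic"),
    (":carol", TYPE, ":Teacher"), (":carol", "schema:name", "Carol"),
    (":algebra", TYPE, ":Course"),
    (":teaches", DOMAIN, ":Teacher"), (":teaches", RANGE, ":Course"),
    (":Teacher", SUBCLASS, ":Person"),
}
after = rdfs_closure(data)
```

Before inference :bob has no rdf:type at all; after inference he is both a :Teacher and a :Person, and :logic has type :Course, which is what :TeacherAfter relies on.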
ShEx has an import keyword that specifies the IRI of another schema to be imported.
The ShEx processor puts the labeled shapes and triple expressions of the imported schema in scope for resolution of references in the importing document.
If the imported schema imports other schemas, they are also imported.
For example, if there is a schema located at http://example.org/Person.shex with the content:
:Person {
  $:name ( schema:name . |
           schema:givenName . ; schema:familyName .
         ) ;
  schema:email .
}
and we define a new schema as:
import <http://example.org/Person.shex>

:Employee {
  &:name ;
  schema:worksFor @:Company
}
:Company {
  schema:employee @:Employee ;
  schema:founder @:Person ;
}
:alice schema:name "Alice";             # Passes as a :Employee
       schema:worksFor :OurCompany .
:OurCompany schema:employee :alice ;
            schema:founder :bob .
:bob   schema:name "Robert" ;
       schema:email <mailto:bob@example.com> .
The ShEx processor imports each schema exactly once, so cyclic imports are allowed. For instance, a schema may import itself or it may import some schema which directly or indirectly imports it.
However, it is an error to import a schema which attempts to re-define a shape expression or triple expression.
For instance, if http://example.org/Person.shex defined either :Employee or :Company, or if the importing schema defined :name, the import would fail and processing would stop.
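The two rules above (each schema is loaded once, and no label may be defined twice) can be sketched as follows. The Python code is a simplified model: a schema is a dictionary with an "imports" list and a "shapes" dictionary covering both shape and triple expression labels, `fetch` stands in for whatever retrieves and parses a schema, and the IRI http://example.org/Employee.shex is invented for the example.

```python
def load_with_imports(root_iri, fetch):
    """Collect all labeled definitions reachable from `root_iri` through imports."""
    loaded, labels = set(), {}
    pending = [root_iri]
    while pending:
        iri = pending.pop()
        if iri in loaded:
            continue  # each schema is imported exactly once, so cycles terminate
        loaded.add(iri)
        schema = fetch(iri)
        for label, definition in schema["shapes"].items():
            if label in labels:
                raise ValueError(f"{label} is defined more than once")
            labels[label] = definition
        pending.extend(schema.get("imports", []))
    return labels


store = {
    "http://example.org/Person.shex": {
        "imports": [],
        "shapes": {":Person": "...", ":name": "..."},
    },
    "http://example.org/Employee.shex": {
        # Imports Person.shex and, harmlessly, itself.
        "imports": ["http://example.org/Person.shex", "http://example.org/Employee.shex"],
        "shapes": {":Employee": "...", ":Company": "..."},
    },
}
in_scope = load_with_imports("http://example.org/Employee.shex", store.__getitem__)
```

If Person.shex also defined :Employee, the call would raise an error instead of returning the merged set of labels.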
The ShEx language is defined in terms of a JSON-LD syntax, called “ShExJ”, which separates the compact syntax details from the language specification. This serves as an abstract syntax in that it has constructs to capture all of the logic of ShEx. Having an abstract syntax provides a clear definition of the language, makes it easier to write language processors and encourages the definition of other concrete syntax formats. The fact that it is JSON-LD means that the RDF representation of ShEx, called “ShExR”, is simply the JSON-LD interpretation of ShExJ.
The following ShEx schema
PREFIX : <http://example.org/>
PREFIX schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

:User IRI {
  schema:name xsd:string ;
  schema:knows @:User*
}
can be represented in ShExR as:
PREFIX sx: <http://shex.io/ns/shex#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
prefix : <http://example.org/>
prefix schema: <http://schema.org/>

<> a sx:Schema ;
   sx:shapes :User .

:User a sx:ShapeAnd ;
  sx:shapeExprs (
    [ a sx:NodeConstraint ; sx:nodeKind sx:iri ]
    [ a sx:Shape ;
      sx:expression [
        a sx:EachOf ;
        sx:expressions (
          [ a sx:TripleConstraint ;
            sx:predicate schema:name ;
            sx:valueExpr [ a sx:NodeConstraint ; sx:datatype xsd:string ]
          ]
          [ a sx:TripleConstraint ;
            sx:predicate schema:knows ;
            sx:valueExpr :User ;
            sx:min 0 ;
            sx:max -1
          ]
        )
      ]
    ]
  ) .
It can also be represented in JSON-LD as:
{
  "@context": "https://shexspec.github.io/context.jsonld",
  "type": "Schema",
  "shapes": [
    {
      "type": "ShapeAnd",
      "shapeExprs": [
        { "type": "NodeConstraint", "nodeKind": "iri" },
        {
          "type": "Shape",
          "expression": {
            "type": "EachOf",
            "expressions": [
              {
                "type": "TripleConstraint",
                "predicate": "http://schema.org/name",
                "valueExpr": {
                  "type": "NodeConstraint",
                  "datatype": "http://www.w3.org/2001/XMLSchema#string"
                }
              },
              {
                "type": "TripleConstraint",
                "predicate": "http://schema.org/knows",
                "valueExpr": "http://example.org/User",
                "min": 0,
                "max": -1
              }
            ]
          }
        }
      ],
      "id": "http://example.org/User"
    }
  ]
}
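Because ShExJ is plain JSON, generic tools can process a schema without a ShExC parser. As a small illustration, the following Python sketch walks a ShExJ fragment and lists each triple constraint with its cardinality. The function name is hypothetical, the embedded JSON is an abbreviated copy of the :User shape, and the sketch assumes that a missing min or max means 1 and that a max of -1 means unbounded.

```python
import json


def triple_constraints(expr):
    """Yield (predicate, min, max) for every TripleConstraint in a ShExJ fragment."""
    if isinstance(expr, dict):
        if expr.get("type") == "TripleConstraint":
            maximum = expr.get("max", 1)
            yield (
                expr["predicate"],
                expr.get("min", 1),
                "unbounded" if maximum == -1 else maximum,
            )
        for value in expr.values():
            yield from triple_constraints(value)
    elif isinstance(expr, list):
        for item in expr:
            yield from triple_constraints(item)


shexj = json.loads("""
{ "type": "Shape",
  "expression": {
    "type": "EachOf",
    "expressions": [
      { "type": "TripleConstraint",
        "predicate": "http://schema.org/name",
        "valueExpr": { "type": "NodeConstraint",
                       "datatype": "http://www.w3.org/2001/XMLSchema#string" } },
      { "type": "TripleConstraint",
        "predicate": "http://schema.org/knows",
        "valueExpr": "http://example.org/User",
        "min": 0, "max": -1 }
    ] } }
""")
constraints = list(triple_constraints(shexj))
```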
In this chapter we learned about the ShEx language.
We collected the following selection of references about Shape Expressions.
In the ShExR and ShExJ representations above, -1 in max means unbounded.