|Shapes Constraint Language|
|Status|Published, W3C Recommendation|
|First published|October 8, 2015|
|Committee|RDF Data Shapes Working Group|
Shapes Constraint Language (SHACL) is a World Wide Web Consortium (W3C) standard language for describing Resource Description Framework (RDF) graphs. SHACL has been designed to enhance the semantic and technical interoperability layers of ontologies expressed as RDF graphs.
The growing adoption of SHACL may influence the future of linked data.
SHACL lets its users describe shapes of data, targeting where a specific shape applies.
A property shape describes characteristics of graph nodes that can be reached via a specific path. A path can be a single predicate (property) or a chain of predicates. A property shape must always specify a path. This is done by using the sh:path predicate.
One can think of property shapes that use simple paths as describing values of certain properties, e.g., values of an age property or values of a works for property.
Complex paths can specify a combination of different predicates in a chain, including the inverse direction, alternative predicates and transitive chains.
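The path kinds described above can be illustrated with a small, self-contained Python sketch. It is a toy model over an in-memory set of triples, not a SHACL engine: the tuple-based path encoding, the `eval_path` helper and all `ex:` names are assumptions made for illustration only.

```python
# Toy data graph: a set of (subject, predicate, object) triples.
DATA = {
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:berlin"),
    ("ex:berlin", "ex:locatedIn", "ex:germany"),
    ("ex:alice", "ex:age", 34),
}


def eval_path(graph, node, path):
    """Return the set of value nodes reachable from `node` via `path`.

    Hypothetical path encoding (not SHACL syntax):
      "ex:p"               -> a single predicate
      ("seq", p1, p2, ...) -> a chain of paths
      ("alt", p1, p2, ...) -> alternative paths
      ("inv", p)           -> the inverse direction of a path
      ("plus", p)          -> one or more repetitions (transitive chain)
    """
    if isinstance(path, str):
        return {o for s, p, o in graph if s == node and p == path}
    kind, *parts = path
    if kind == "seq":
        current = {node}
        for part in parts:
            current = {v for n in current for v in eval_path(graph, n, part)}
        return current
    if kind == "alt":
        return {v for part in parts for v in eval_path(graph, node, part)}
    if kind == "inv":
        # Brute force: keep every node from which `node` is reachable.
        candidates = {s for s, _, _ in graph} | {o for _, _, o in graph}
        return {c for c in candidates if node in eval_path(graph, c, parts[0])}
    if kind == "plus":
        seen, frontier = set(), {node}
        while frontier:
            step = {v for n in frontier for v in eval_path(graph, n, parts[0])}
            frontier = step - seen
            seen |= step
        return seen
    raise ValueError(f"unknown path kind: {kind}")
```

For instance, `eval_path(DATA, "ex:acme", ("inv", "ex:worksFor"))` collects the nodes that point to `ex:acme` through `ex:worksFor`, which is the "inverse direction" case mentioned above.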
Property shapes can be defined as part of a node shape. In this case, a node shape points to property shapes using the sh:property predicate. Property shapes can also be "stand-alone", i.e., completely independent from any node shapes.
A node shape describes characteristics of specific graph nodes irrespective of how you get to them. It can, for example, be said that certain graph nodes must be literals or URIs, etc. It is common to include property shapes in a node shape, effectively defining values of many different properties of a node.
For example, a node shape for an employee may incorporate property shapes for age and works for properties.
A constraint is a way to describe different characteristics of values. A shape will contain one or more constraint declarations. SHACL provides many pre-built constraint types. For example,
sh:datatype is used to describe the type of literal values e.g., if they are strings or integers or dates.
sh:minCount is used to describe the minimum required number of values.
sh:minLength and sh:maxLength are used to describe the minimum and maximum number of characters for a value.
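As a rough illustration of how such constraints behave, the following Python sketch checks a list of property values against three constraint kinds. It is only a toy model: the `check_values` function and its keyword names are hypothetical, loosely mirroring sh:datatype, sh:minCount and sh:maxLength, and Python types stand in for XSD datatypes.

```python
def check_values(values, datatype=None, min_count=None, max_length=None):
    """Return a list of human-readable problems for a list of property values.

    The keyword names loosely mirror sh:datatype, sh:minCount and sh:maxLength.
    Python types stand in for XSD datatypes (an assumption of this sketch).
    """
    problems = []
    # Cardinality is checked once for the whole value set.
    if min_count is not None and len(values) < min_count:
        problems.append(
            f"expected at least {min_count} value(s), found {len(values)}"
        )
    # Datatype and length are checked per value.
    for value in values:
        if datatype is not None and type(value) is not datatype:
            problems.append(f"{value!r} is not of type {datatype.__name__}")
        if max_length is not None and len(str(value)) > max_length:
            problems.append(f"{value!r} is longer than {max_length} characters")
    return problems
```

An empty result list corresponds to values that satisfy every declared constraint; each entry otherwise describes one failed check.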
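A target connects a shape with the data it describes. The simplest way to specify a target is to say that a node shape is also a class. This means that its definition is applicable to all members (instances) of the class. Other ways to define the target of a shape are sh:targetClass (instances of a given class), sh:targetNode (specific nodes), and sh:targetSubjectsOf or sh:targetObjectsOf (subjects or objects of a given predicate).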
Target declarations can be included in a node shape or in a property shape. However, when a property shape is a part of a node shape, its own targets are ignored.
SHACL uses rdfs:subClassOf statements to identify targets. A shape targeting members of a class also targets members of all its subclasses. In other words, all SHACL definitions for a class are inherited by subclasses.
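The subclass inheritance of class-based targets can be sketched in a few lines of Python. This is a toy model over a set of triples, assuming prefixed names as plain strings; the helper names `subclasses_of` and `class_targets` and the `ex:` data are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Toy data graph: a small class hierarchy plus three typed instances.
DATA = {
    ("ex:Manager", SUBCLASS_OF, "ex:Employee"),
    ("ex:Employee", SUBCLASS_OF, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:Manager"),
    ("ex:bob", RDF_TYPE, "ex:Employee"),
    ("ex:carol", RDF_TYPE, "ex:Person"),
}


def subclasses_of(graph, cls):
    """Return `cls` plus every class that reaches it via rdfs:subClassOf."""
    found, frontier = {cls}, {cls}
    while frontier:
        frontier = {
            s for s, p, o in graph if p == SUBCLASS_OF and o in frontier
        } - found
        found |= frontier
    return found


def class_targets(graph, cls):
    """Focus nodes of a class-based target: instances of `cls` or a subclass."""
    classes = subclasses_of(graph, cls)
    return {s for s, p, o in graph if p == RDF_TYPE and o in classes}
```

Here a shape targeting `ex:Employee` would also apply to `ex:alice`, who is only typed as `ex:Manager`, while `ex:carol` is left out because `ex:Person` is a superclass rather than a subclass.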
SHACL enables validation of graphs. A SHACL validation engine takes as input a graph to be validated (called the data graph) and a graph containing SHACL shape declarations (called the shapes graph) and produces a validation report, also expressed as a graph. All these graphs can be represented in any RDF serialization format, including JSON-LD or Turtle.
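The input/output contract of validation can be sketched in plain Python. The toy `validate` function below is not a SHACL engine: a real engine takes RDF graphs and returns the report as an RDF graph, whereas this sketch uses a set of triples, hypothetical dict-based shapes, and a dict report whose field names are assumptions loosely modeled on the report vocabulary.

```python
def validate(data_graph, shapes):
    """Check every shape against its target nodes and build a report dict.

    `data_graph` is a set of (subject, predicate, object) triples. Each shape
    is a hypothetical dict with "targets" (focus nodes), "path" (a predicate)
    and "min_count"; only a minimum-count check is modeled here.
    """
    results = []
    for shape in shapes:
        for focus in sorted(shape["targets"]):
            values = [
                o for s, p, o in data_graph
                if s == focus and p == shape["path"]
            ]
            if len(values) < shape["min_count"]:
                results.append({
                    "focus_node": focus,
                    "path": shape["path"],
                    "message": f"fewer than {shape['min_count']} value(s)",
                })
    # The data conforms only if no shape produced a result.
    return {"conforms": not results, "results": results}


DATA_GRAPH = {("ex:alice", "ex:age", 34), ("ex:bob", "ex:worksFor", "ex:acme")}
SHAPES = [{"targets": {"ex:alice", "ex:bob"}, "path": "ex:age", "min_count": 1}]
report = validate(DATA_GRAPH, SHAPES)
```

In this example `ex:bob` has no `ex:age` value, so the report does not conform and carries one result naming `ex:bob` as the focus node.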
SHACL is unusual in that it builds in not only the ability to specify a severity level for validation results, but also the ability to return suggestions on how data may be fixed if a validation result is raised. Built-in levels are Violation, Warning and Info, defaulting to Violation if no sh:severity has been specified for a shape. Users of SHACL can add other, custom levels of severity. Validation results may also have values for other properties, as described in the specification. For example, the property sh:resultMessage is designed to communicate additional textual details to users, including recommendations on how data may be fixed to address the validation result. In cases where a constraint does not have any values for sh:message in the shapes graph, the SHACL processor may automatically generate other values for sh:resultMessage. Some SHACL processors (e.g., the one implemented by TopQuadrant) have made these suggestions actionable in software, automating their application at the user's request.
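The severity default and the message fallback described above can be summarized in a short Python sketch. The dict-based shape and result layout and the `make_result` helper are hypothetical; only the behavior (default to Violation, prefer the shape's own message) follows the description in the text.

```python
DEFAULT_SEVERITY = "sh:Violation"


def make_result(shape, focus_node, generated_message):
    """Build one validation result for a failed constraint of `shape`.

    Severity falls back to sh:Violation when the shape declares none. The
    result message uses the shape's own message if present, otherwise the
    text generated by the (hypothetical) processor.
    """
    return {
        "focusNode": focus_node,
        "resultSeverity": shape.get("severity", DEFAULT_SEVERITY),
        "resultMessage": shape.get("message", generated_message),
    }


# A shape with no severity or message: both values fall back to defaults.
plain = make_result({}, "ex:bob", "Missing value for ex:age")

# A shape with an explicit severity and a message suggesting a fix.
custom = make_result(
    {"severity": "sh:Warning", "message": "Add the employee's age"},
    "ex:bob",
    "Missing value for ex:age",
)
```

A custom severity level would simply be another value in the `severity` slot, for example a hypothetical `ex:Critical`.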
The World Wide Web Consortium has published the following SHACL specifications:
The SHACL Test Suite and Implementation Report, linked to from the SHACL W3C specification, lists some open source tools that could be used for SHACL validation as of June 2019. By the end of 2019, many commercial RDF database and framework vendors had announced support for at least SHACL Core.
Some of the open source tools listed in the report are:
Eclipse RDF4J is an open source Java framework by the Eclipse Foundation for processing RDF data, which supports SHACL validation.
SHACL is supported by most RDF graph technology vendors, including Cambridge Semantics (Anzo, coming in Q1 2022), Franz (AllegroGraph), Metaphacts, Ontotext (GraphDB), Stardog and TopQuadrant. There is even support in commercial products that use the property graph data model, such as Neo4j.
Levels of implementation may vary. At minimum, vendors support SHACL Core. Some also support SHACL SPARQL for higher expressivity, while others may support SHACL Advanced Features which include rules and functions.