OCL Conversion to Schematron (xslt2 query binding)
Overview
This page documents how OCL expressions are translated to Schematron, based upon the xslt2 query binding, which has been standardized by ISO 19757-3:2016. More specifically, it documents the translation of the OCL language constructs that are supported by ShapeChange.
NOTE: This translation is available since ShapeChange v2.9.0.
ShapeChange configuration items relevant for the generation of Schematron schemas are documented as well.
Relevant ShapeChange configuration items
XmlSchema target conversion rules:
- rule-xsd-pkg-schematron: Must be part of the encoding rule of a model element in order for OCL constraints defined on that element to be translated to Schematron rules.
- rule-xsd-cls-codelist-constraints-codeAbsenceInModelAllowed: If OCL constraints refer to code list codes that are not explicitly modelled in the respective code list, then this rule must be added to the XML Schema encoding rule, in order for the constraints to be parsed correctly.
- rule-xsd-cls-codelist-constraints2: May be used for creating a Schematron schema that checks the content of code list typed properties, more specifically that the code list exists, is correctly referenced, and also that the code is defined by the code list. This conversion rule is added here only for the sake of completeness, since the Schematron generated by this rule does not translate any OCL constraints. It is purely generated based upon class and property information available in the model.
The XmlSchema target parameter schematronQueryBinding must be set to “xslt2” in order for ShapeChange to use the Schematron translation documented on this page.
Translation of OCL to Schematron
Conversion from OCL to Schematron is performed on the basis of a ShapeChange-internal syntax representation of OCL expressions. The representation is close to the Concrete Syntax structure described in the OCL 2.2 standard.
Naturally, the syntax representation of OCL is recursive. Therefore the principles of translation from OCL to another language can best be described using a recursive notation. Below, we describe how some particular constructs such as the application of the select() iterator
x->select(t|pred(t))
translate to XPath 2.0, in the xslt2 Schematron query binding, where the translation results of the constituent parts (such as x and pred(t)) are presumed.
For a valid OCL expression x, τ(x) denotes the equivalent XPath 2.0 expression (in the xslt2 query binding). The expression x may contain free variables (explicit or implicit), which need to be treated when computing τ(x). One typical variable is self, which translates to current(). So, τ(self)=current().
NOTE: Do not use variables called BYREFVAR, METAREFVAR, COUNT1, COUNT2, ISUVAR1, ISUVAR2, or that matches the regular expression ‘VAR\d+’ within OCL expressions. The translation of property calls, isUnique(..), and calls of operation propertyMetadata() may generate and use such variables, which can result in naming conflicts if the OCL expression contained variables with these names.
Variable access self
OCL syntax
self
In words
The current object in the context of which the expression shall hold.
Schematron translation
current()
Iterator variable access
OCL syntax
t defined in an iterator used in x(t)
In words
t has to be assigned a value from the path that leads to x(t).
Schematron translation
$t
NOTE: Do not use variables called BYREFVAR, METAREFVAR, COUNT1, COUNT2, ISUVAR1, ISUVAR2, or that matches the regular expression ‘VAR\d+’ within OCL expressions. The translation of property calls, isUnique(..), and calls of operation propertyMetadata() may generate and use such variables, which can result in naming conflicts if the OCL expression contained variables with these names.
Let variable access
OCL syntax
t defined in a let construct used in x(t)
In words
t necessarily has a value from the initialize expression.
Schematron translation
If t is defined in the outer (current()) context: $id, where id is some unique <let> variable. The let initializer is translated in the current() context and initializes a Schemtron <let> element.
Other: The let initializer of the variable t is translated in the current context and substitutes t.
NOTE: Do not use variables called BYREFVAR, METAREFVAR, COUNT1, COUNT2, ISUVAR1, ISUVAR2, or that matches the regular expression ‘VAR\d+’ within OCL expressions. The translation of property calls, isUnique(..), and calls of operation propertyMetadata() may generate and use such variables, which can result in naming conflicts if the OCL expression contained variables with these names.
Let expression
OCL syntax
let x=y in z(x)
NOTE: Multiple variables in the let expression are separated using commas: let a=w, b=x, c=y in z(a,b,c)
In words
Assignment of expression y to variable x. Result is z(x).
Schematron translation
If x and y are defined in the outer (current()) context:τ(z(τ(x)))
Additionally, a Schematron <let> is created.
Other: τ(z(τ(y)))
This means we are substituting the initializers.
Integer or real constants
OCL syntax
123 or 3.1415
Schematron translation
same, i.e.: 123 or 3.1415
Boolean constants
OCL syntax
true or false
Schematron translation
true() or false()
String constants
OCL syntax
‘xxxxx’
Schematron translation
same, i.e.: ‘xxxxx’
Enumeration constants
OCL syntax
Type::value
Schematron translation
‘value’
Codelist constants
OCL syntax
Type::value
Schematron translation
GML 3.3 rules:
The constant is translated to an external codelist reference according to a pattern in tagged values in the codelist class.
Other GML version rules: ‘value’
If-then-else-endif expression
OCL syntax
if x then y else z endif
In words
If x evaluates to true then the value of the expression is y, otherwise z.
Schematron translation
if τ(x) then τ(y) else τ(z)
Property call
OCL syntax
x . property
NOTE: This is the shorthand notation for the OCL “Collect” operation.
In words
Collection of property values reached from the instance or collection represented by x by applying property property.
Schematron translation
- If simple-typed:
- value type is NOT a code list:
- property encoding is GML 3.2 or GML 3.3: τ(x)/property
- property encoding is ISO 19139: τ(x)/property/*
- value type IS a code list:
- A so-called code list value pattern is used. The default pattern is {value}. If property is not GML 3.3 encoded, and the code list has tagged value ‘codeList’, then the default changes to {codeList}/{value}. The default can be overridden by 1) tagged value ‘codeListValuePattern’ on the code list, or 2) the XmlSchema target parameter defaultCodeListValuePattern. The pattern should always contain the keyword {value} and may also contain the keyword {codeList}. When creating the XPath expression, these keywords will be replaced. If the pattern is not just {value} then it is translated as a concat( ) operation. In general, the keywords are mapped to XML items as follows:
- GML 3.2:
- codeList -> @codeSpace
- value -> text() (i.e. the text of the XML element that represents property)
- ISO 19139:
- codeList -> @codeList
- value -> @codeListValue
- If prop is GML 3.3 encoded, then the code list value pattern is ignored. The @xlink:href provides the code value.
- GML 3.2:
-
property encoding is GML 3.2:
- code list IS encoded as dictionary: τ(x)/property/ + result of code list value pattern (see above)
- code list is NOT encoded as dictionary: τ(x)/property
-
property encoding is GML 3.3:
- code list IS encoded as dictionary: τ(x)/property/@xlink:href
- code list is NOT encoded as dictionary: τ(x)/property
- property encoding is ISO 19139: τ(x)/property/*/ + result of code list value pattern (see above)
- A so-called code list value pattern is used. The default pattern is {value}. If property is not GML 3.3 encoded, and the code list has tagged value ‘codeList’, then the default changes to {codeList}/{value}. The default can be overridden by 1) tagged value ‘codeListValuePattern’ on the code list, or 2) the XmlSchema target parameter defaultCodeListValuePattern. The pattern should always contain the keyword {value} and may also contain the keyword {codeList}. When creating the XPath expression, these keywords will be replaced. If the pattern is not just {value} then it is translated as a concat( ) operation. In general, the keywords are mapped to XML items as follows:
- value type is NOT a code list:
- If the value type is a complex type without identity – typically a data type or union: τ(x)/property/*
- If the value type is a complex type with identity:
- This requires careful consideration of how the value can be encoded: inline, by reference, or both. Tagged value inlineOrByReference on property can be used to control this encoding (default is both, i.e. inline or by reference).
- If the value is encoded by reference, then it must be looked up in the XML document that is being validated, using an expression as the following: (for $BYREFVAR in variable/property/@xlink:href return key(‘idKey’,identifierExpression)) – where:
- variable is either current(), the variable defined by a surrounding iterator, or a variable that is generated as part of a ‘for’ expression while translating the OCL expression of a property path.
- identifierExpression is an XPath expression to extract from an xlink:href value the @id or @gml:id of the referenced object. That ID is then used to look up the referenced object in the key-construct ‘idKey’. The ‘key’ provides an efficient lookup mechanism for all objects in the dataset, based upon their @id or @gml:id.
- The identifierExpression is different, depending upon whether the xlink:href reference contains a prefix α and/or a postfix β:
- α can be configured via the XmlSchema target parameter schematronXlinkHrefPrefix (default is #).
- β can be configured via the XmlSchema target parameter schematronXlinkHrefPostfix (default is the empty string).
- If both α and β have a value, the identifierExpression is: substring-before(substring-after($BYREFVAR, α), β)
- If only α has a value, the identifierExpression is: substring-after($BYREFVAR, α)
- If only β has a value, the identifierExpression is: substring-before($BYREFVAR, β)
- If neither α nor β have a value, the identifierExpression boils down to: $BYREFVAR
- Inline encoded properties are translated to property/*.
- If a property can (also) be encoded by reference, then a ‘for’ expression is constructed. The expression merges the sequences produced by the expressions for the inline and by reference cases.
- NOTE: Cases where a property value is given both inline and by reference are currently not handled. Such cases will lead to incorrect results.
- If the OCL expression is a path of multiple property names (e.g. property1.property2.property3) then a nesting of ‘for’ expressions may be created. Such a complex expression is necessary, in order to correctly set the context for computing referenced objects in all cases.
- If the value is encoded by reference, then it must be looked up in the XML document that is being validated, using an expression as the following: (for $BYREFVAR in variable/property/@xlink:href return key(‘idKey’,identifierExpression)) – where:
- This requires careful consideration of how the value can be encoded: inline, by reference, or both. Tagged value inlineOrByReference on property can be used to control this encoding (default is both, i.e. inline or by reference).
- If the metadata of the property shall be accessed (see operation call propertyMetadata()): τ(x)/property
Property call according to nilReason implementation pattern
OCL syntax
x . property. value
x . property. reason
In words
Set of instances reached by property, respectively by property/@nilReason
Schematron translation
- Case x . property. value: τ(x.property)
- Compilation as above – ‘x.property’ is assumed to have the type of ‘value’.
- Case x . property. reason
- 19136 encoding: τ(x.property) [@xsi.nil=’true’]/@nilReason
- 19139 encoding: τ(x.property)[not(*)]/@gco:nilReason
Operation call propertyMetadata()
OCL syntax
x. propertyMetadata()
In words
Access the metadata associated with the property, instead of the property value.
Schematron translation
- Case x is a property: This is translated like a property call for the case of a property whose value is a complex type with identity, given by reference – just using @metadata instead of @xlink:href: for $METAREFVAR in variable/property/@metadata return key(‘idKey’,identifierExpression)
- Case x is a variable: Cannot be translated, because the variable may be bound to different kinds of expressions, not just a property call, and even if it was bound to a property call, then the variable may be used multiple times: not only for accessing the metadata of the property, but also its value. That would require different translations of the property call. However, a variable can only be bound to a single expression.
Operation call allInstances()
OCL syntax
x . allInstances()
In words
Collection of all object instances of type x.
x represents a type-valued expression.
Schematron translation
If x is a type constant:
An expression that selects all elements based upon a predicate which uses a combination of local-names and namespace-uris of the type and its (direct and indirect) subtypes (if they are instantiable, i.e. not abstract and not suppressed). We differentiate three cases:
- Case multiple instantiable classes: //*[(local-name()=’localName_type_1′ and namespace-uri()=’namespace_type_1′) or … or (local-name()=’localName_type_n’ and namespace-uri()=’namespace_type_n’)]
- Case single instantiable class: //*[local-name()=’localName_type’ and namespace-uri()=’namespace_type’]
- Case no instantiable class: //*[false()]
NOTE: ShapeChange will store the expression in a new let variable, since it makes sense to only compute the collection of instances of a certain type once. An example where this would immediately be useful is Type.allInstances().inlineOrByReferenceProperty… – because the let variable can then be used both for the translation of the inline and the byReference case of the property access.
If x is a type expression:
Cannot be translated because required schema information is not available at run-time.
Operation call oclIsKindOf()
OCL syntax
x . oclIsKindOf(y)
In words
The single object instance x is checked for complying with type y.
Schematron translation
If y is a type constant:
boolean(τ(x)[(local-name()=’localName_T1‘ and namespace-uri()=’namespace_T1‘) or … or (local-name()=’localName_Ti‘ and namespace-uri()=’namespace_Ti‘)]), where Tk is one of the concrete derivations of y, including y.
NOTE: boolean(…) may be omitted if the argument is known to be used by operands, which do an implicit conversion to Boolean.
If y is a type expression: Cannot be translated because required schema information is not available at run-time.
Operation call oclIsTypeOf()
OCL syntax
x . oclIsTypeOf(y)
In words
The single object instance x is checked for being of type y.
Schematron translation
If y is a type constant: boolean(τ(x) [local-name()=’localName_y’ and namespace-uri()=’namespace_y’]).
boolean(…) may be omitted if the argument is known to be used by operands, which do an implicit conversion to Boolean.
If x is a type expression:
Cannot be translated because required schema information is not available at run-time.
Operation call oclAsType()
OCL syntax
x . oclAsType(y)
In words
The single object instance x is downcast to type y. The value is ‘undefined’ if this is not possible.
NOTE: This operation is typically used in situations where ISO 19139 encoding applies, to cast an attribute with value type CharacterString to a code list type, so that comparison with a code/enum is possible.
Schematron translation
If y is a type constant:
τ(x) [(local-name()=’localName_T1‘ and namespace-uri()=’namespace_T1‘) or … or (local-name()=’localName_Ti‘ and namespace-uri()=’namespace_Ti‘)], where Tk is one of the concrete derivations of y, including y.
If y is a type expression: Cannot be translated because required schema information is not available at run-time.
Casting CharacterString to a code list: We are making an exception to the strict rules with simple data elements. We permit CharacterString being cast to a code list type.
Operation call +,-,*,/
OCL syntax
x + y, etc.
In words
Value of x + y, etc.
Schematron translation
τ(x) + τ(y)
τ(x) – τ(y)
τ(x) * τ(y)
τ(x) div τ(y)
Operation call =, <>
OCL syntax
x = y
x <> y
In words
Value of x=y, x<>y
Schematron translation
If x and y are simple types:
- τ(x) = τ(y)
- τ(x) != τ(y)
If x and y are complex types:
- generate-id(τ(x)) = generate-id(τ(y))
- generate-id(τ(x)) != generate-id(τ(y))
Operation call <, >, <=, >=
OCL syntax
x < y, etc.
In words
Value of x < y, etc.
Schematron translation
- τ(x) < τ(y)
- τ(x) > τ(y)
- τ(x) <= τ(y)
- τ(x) >= τ(y)
Operation call size()
OCL syntax
x . size()
In words
Number of characters in the string instance x.
Schematron translation
string-length(τ(x))
Operation call concat()
OCL syntax
x . concat(y)
In words
String concatenation of x and y.
Schematron translation
concat(τ(x),τ(y))
A series of concats may be joined to a multi-argument concat invocation.
Operation call substring()
OCL syntax
x . substring(y,z)
In words
Substring of x running from position y to position z.
More specifically: the sub-string of x starting at the character number y, up to and including character number z. Character numbers run from 1 to x.size(). The following condition should be met: 1 <= y <= z <= x.size().
Schematron translation
substring(τ(x), τ(y), τ(z)-τ(y)+1)
Operation call and, or, xor, implies
OCL syntax
- x and y
- x or y
- x xor y
- x implies y
In words
Logical combination as indicated
Schematron translation
- τ(x) and τ(y)
- τ(x) or τ(y)
- boolean(τ(x)) != boolean(τ(y))
- NOTE: The boolean(..) is omitted if the result type of T(x)/T(y) is known to be boolean.
- not(τ(x)) or τ(y)
Collection operation call size()
OCL syntax
x -> size()
In words
Number of objects in the collection x.
Schematron translation
count(τ(x))
Collection operation call isEmpty()
OCL syntax
x->isEmpty()
In words
Predicate: Is the collection represented by x empty?
Schematron translation
not(τ(x))
Collection operation call notEmpty()
OCL syntax
x->notEmpty()
In words
Predicate: Is the collection represented by x not empty?
Schematron translation
boolean(τ(x))
NOTE: boolean may be omitted, if τ(x) is known to be boolean or is used by operands, which do an implicit conversion to boolean.
Collection operation call exists()
OCL syntax
x -> exists(t|b(t))
In words
Predicate: Does the collection x contain a value t for which the boolean expression b(t) holds?
Schematron translation
some $t in τ(x) satisfies τ(b(t))
Collection operation call forAll()
OCL syntax
x -> forAll(t|b(t))
In words
Predicate: Does the collection x only contain values t for which the boolean expression b(t) holds?
Schematron translation
every $t in τ(x) satisfies τ(b(t))
Collection operation call isUnique()
OCL syntax
x -> isUnique(t|y(t))
In words
Determine if y(t) evaluates to a different, possibly null, value for each element in the source collection x.
If y(t) equals t, then uniqueness of the values in the collection x is checked.
Schematron translation
This can only be translated in a few cases:
- If y is a constant, y(t)=const: count(τ(x))<=1
- If y is identity (x->isUnique(t|t)):
- if value t has a simple type:
- type other than Boolean: count(T(x)) = count(distinct-values(T(x))
- type is Boolean: count(T(x)) = count(distinct-values(T(x)/xs:boolean(.)))
- if value is a complex type with identity: count(T(x)) = count(distinct-values(T(x)/@*:id)
- else (value is a complex type without identity – for example a data type or union): for $COUNT1 in count(T(X)), $COUNT2 in sum(for $ISUVAR1 in T(X), $ISUVAR2 in T(X) return (if(empty($ISUVAR1) and empty($ISUVAR2)) then 0 else if ((empty($ISUVAR1) and not(empty($ISUVAR2))) or (not(empty($ISUVAR1)) and empty($ISUVAR2))) then 1 else if (generate-id($ISUVAR1) = generate-id($ISUVAR2)) then 0 else if (deep-equal($ISUVAR1,$ISUVAR2)) then 0 else 1)) return $COUNT1 * ($COUNT1 – 1) = $COUNT2
- if value t has a simple type:
- If y(t) is a property call that is based on variable t and:
- if the property call only involves properties with max cardinality = 1:
- if the value resulting from the property call has a simple type:
- type other than Boolean: count(T(x)) = count(distinct-values(for $t in T(x) return (if (empty(T(y(t)))) then ‘SC_EMPTY_ISU_BODY’ else T(y(t)) )))
- type is Boolean: count(T(x)) = count(distinct-values(for $t in T(x) return (if (empty(T(y(t)))) then ‘SC_EMPTY_ISU_BODY’ else T(y(t))/xs:boolean(.) )))
- NOTE: An empty value for T(y(t)) is mapped to the string ‘SC_EMPTY_ISU_BODY’ in order to correctly handle null values in the evaluation of the iterator isUnique(). Without this mapping, empty/null values would be ignored by the function distinct-values().
- if the value resulting from the property call has a complex type with identity: count(T(x)) = count(distinct-values(for $t in T(x) return (if (empty(T(y(t)))) then ‘SC_EMPTY_ISU_BODY’ else T(y(t))/@*:id )))
- else (value resulting from the property call is a complex type without identity – for example a data type or union): for $COUNT1 in count(T(X)), $COUNT2 in sum(for $t in T(x), $ISUVAR1 in T(y(t)), $ISUVAR2 in T(y(t)) return (if(empty($ISUVAR1) and empty($ISUVAR2)) then 0 else if ((empty($ISUVAR1) and not(empty($ISUVAR2))) or (not(empty($ISUVAR1)) and empty($ISUVAR2))) then 1 else if (generate-id($ISUVAR1) = generate-id($ISUVAR2)) then 0 else if (deep-equal($ISUVAR1,$ISUVAR2)) then 0 else 1)) return $COUNT1 * ($COUNT1 – 1) = $COUNT2
- if the value resulting from the property call has a simple type:
- if the property call involves a property with max cardinality > 1: cannot be translated; each evaluation of y(t) must result in at most one value. ShapeChange will log an error if this condition is not fulfilled.
- if the property call only involves properties with max cardinality = 1:
- Any other, particularly arbitrary expressions: Cannot be translated because no way to express this in XPath 2.0 has been found. isUnique bodies must not contain expressions other than constants, identity, or a property call. ShapeChange will log an error if this condition is not fulfilled.
Collection operation call select()
OCL syntax
x -> select(t|b(t))
In words
Compute the collection of those objects t in x, for which the boolean expression b(t) holds.
Schematron translation
τ(x) [for $t in . return τ(b(t))]
NOTE: Using a ‘for’-expression might seem odd here, but it allows us to bind the variable $t used in the select statement to the context defined by T(x). The variable can then be used in T(b(t)).
Pattern matching function on Strings
OCL syntax
x . matches( pattern )
Note: This operation call is an extension. It is not part of the OCL standard.
In words
Boolean function which yields true if the pattern of type String matches the String argument.
Schematron translation
matches( τ(x), τ(pattern) )