OCL Conversion to Schematron (xslt2 query binding)

This page documents how OCL expressions are translated to Schematron, based upon the xslt2 query binding, which has been standardized by ISO 19757-3:2016. More specifically, it documents the translation of the OCL language constructs that are supported by ShapeChange.

NOTE: This translation is available since ShapeChange v2.9.0.

ShapeChange configuration items relevant for the generation of Schematron schemas are documented as well.

Relevant ShapeChange configuration items

XmlSchema target conversion rules:

rule-xsd-pkg-schematron: Must be part of the encoding rule of a model element in order for OCL constraints defined on that element to be translated to Schematron rules.
rule-xsd-cls-codelist-constraints-codeAbsenceInModelAllowed: If OCL constraints refer to code list codes that are not explicitly modelled in the respective code list, then this rule must be added to the XML Schema encoding rule, in order for the constraints to be parsed correctly.
rule-xsd-cls-codelist-constraints2: May be used for creating a Schematron schema that checks the content of code list typed properties, more specifically that the code list exists, is correctly referenced, and also that the code is defined by the code list. This conversion rule is added here only for the sake of completeness, since the Schematron generated by this rule does not translate any OCL constraints. It is purely generated based upon class and property information available in the model.

The XmlSchema target parameter schematronQueryBinding must be set to “xslt2” in order for ShapeChange to use the Schematron translation documented on this page.

Translation of OCL to Schematron

Conversion from OCL to Schematron is performed on the basis of a ShapeChange-internal syntax representation of OCL expressions. The representation is close to the Concrete Syntax structure described in the OCL 2.2 standard.

Naturally, the syntax representation of OCL is recursive. Therefore the principles of translation from OCL to another language can best be described using a recursive notation. Below, we describe how some particular constructs such as the application of the select() iterator

x->select(t|pred(t))

translate to XPath 2.0, in the xslt2 Schematron query binding, where the translation results of the constituent parts (such as x and pred(t)) are presumed.

For a valid OCL expression x, τ(x) denotes the equivalent XPath 2.0 expression (in the xslt2 query binding). The expression x may contain free variables (explicit or implicit), which need to be treated when computing τ(x). One typical variable is self, which translates to current(). So, τ(self)=current().

NOTE: Do not use variables called BYREFVAR, METAREFVAR, COUNT1, COUNT2, ISUVAR1, ISUVAR2, or that matches the regular expression ‘VAR\d+’ within OCL expressions. The translation of property calls, isUnique(..), and calls of operation propertyMetadata() may generate and use such variables, which can result in naming conflicts if the OCL expression contained variables with these names.

Variable access self

OCL syntax

self

In words

The current object in the context of which the expression shall hold.

Schematron translation

current()

Iterator variable access

OCL syntax

t defined in an iterator used in x(t)

In words

t has to be assigned a value from the path that leads to x(t).

Schematron translation

Let variable access

OCL syntax

t defined in a let construct used in x(t)

In words

t necessarily has a value from the initialize expression.

Schematron translation

If t is defined in the outer (current()) context: $id, where id is some unique <let> variable. The let initializer is translated in the current() context and initializes a Schemtron <let> element.

Other: The let initializer of the variable t is translated in the current context and substitutes t.

Let expression

OCL syntax

let x=y in z(x)

NOTE: Multiple variables in the let expression are separated using commas: let a=w, b=x, c=y in z(a,b,c)

In words

Assignment of expression y to variable x. Result is z(x).

Schematron translation

Schematron translation

GML 3.3 rules:
The constant is translated to an external codelist reference according to a pattern in tagged values in the codelist class.

Other GML version rules: ‘value’

If-then-else-endif expression

OCL syntax

if x then y else z endif

In words

If x evaluates to true then the value of the expression is y, otherwise z.

Schematron translation

if τ(x) then τ(y) else τ(z)

Property call

OCL syntax

x . property

NOTE: This is the shorthand notation for the OCL “Collect” operation.

In words

Collection of property values reached from the instance or collection represented by x by applying property property.

Schematron translation

If simple-typed:
- value type is NOT a code list:
  - property encoding is GML 3.2 or GML 3.3: τ(x)/property
  - property encoding is ISO 19139: τ(x)/property/*
- value type IS a code list:
  - A so-called code list value pattern is used. The default pattern is {value}. If property is not GML 3.3 encoded, and the code list has tagged value ‘codeList’, then the default changes to {codeList}/{value}. The default can be overridden by 1) tagged value ‘codeListValuePattern’ on the code list, or 2) the XmlSchema target parameter defaultCodeListValuePattern. The pattern should always contain the keyword {value} and may also contain the keyword {codeList}. When creating the XPath expression, these keywords will be replaced. If the pattern is not just {value} then it is translated as a concat( ) operation. In general, the keywords are mapped to XML items as follows:
    - GML 3.2:
      - codeList -> @codeSpace
      - value -> text() (i.e. the text of the XML element that represents property)
    - ISO 19139:
      - codeList -> @codeList
      - value -> @codeListValue
    - If prop is GML 3.3 encoded, then the code list value pattern is ignored. The @xlink:href provides the code value.
  - property encoding is GML 3.2:
    - code list IS encoded as dictionary: τ(x)/property/ + result of code list value pattern (see above)
    - code list is NOT encoded as dictionary: τ(x)/property
  - property encoding is GML 3.3:
    - code list IS encoded as dictionary: τ(x)/property/@xlink:href
    - code list is NOT encoded as dictionary: τ(x)/property
  - property encoding is ISO 19139: τ(x)/property/*/ + result of code list value pattern (see above)
If the value type is a complex type without identity – typically a data type or union: τ(x)/property/*
If the value type is a complex type with identity:
- This requires careful consideration of how the value can be encoded: inline, by reference, or both. Tagged value inlineOrByReference on property can be used to control this encoding (default is both, i.e. inline or by reference).
  - If the value is encoded by reference, then it must be looked up in the XML document that is being validated, using an expression as the following: (for $BYREFVAR in variable/property/@xlink:href return key(‘idKey’,identifierExpression)) – where:
    - variable is either current(), the variable defined by a surrounding iterator, or a variable that is generated as part of a ‘for’ expression while translating the OCL expression of a property path.
    - identifierExpression is an XPath expression to extract from an xlink:href value the @id or @gml:id of the referenced object. That ID is then used to look up the referenced object in the key-construct ‘idKey’. The ‘key’ provides an efficient lookup mechanism for all objects in the dataset, based upon their @id or @gml:id.
    - The identifierExpression is different, depending upon whether the xlink:href reference contains a prefix α and/or a postfix β:
      - α can be configured via the XmlSchema target parameter schematronXlinkHrefPrefix (default is #).
      - β can be configured via the XmlSchema target parameter schematronXlinkHrefPostfix (default is the empty string).
      - If both α and β have a value, the identifierExpression is: substring-before(substring-after($BYREFVAR, α), β)
      - If only α has a value, the identifierExpression is: substring-after($BYREFVAR, α)
      - If only β has a value, the identifierExpression is: substring-before($BYREFVAR, β)
      - If neither α nor β have a value, the identifierExpression boils down to: $BYREFVAR
  - Inline encoded properties are translated to property/*.
  - If a property can (also) be encoded by reference, then a ‘for’ expression is constructed. The expression merges the sequences produced by the expressions for the inline and by reference cases.
    - NOTE: Cases where a property value is given both inline and by reference are currently not handled. Such cases will lead to incorrect results.
  - If the OCL expression is a path of multiple property names (e.g. property1.property2.property3) then a nesting of ‘for’ expressions may be created. Such a complex expression is necessary, in order to correctly set the context for computing referenced objects in all cases.
If the metadata of the property shall be accessed (see operation call propertyMetadata()): τ(x)/property

Property call according to nilReason implementation pattern

OCL syntax

x . property. value
x . property. reason

In words

Set of instances reached by property, respectively by property/@nilReason

Schematron translation

Case x . property. value: τ(x.property)
- Compilation as above – ‘x.property’ is assumed to have the type of ‘value’.
Case x . property. reason
- 19136 encoding: τ(x.property) [@xsi.nil=’true’]/@nilReason
- 19139 encoding: τ(x.property)[not(*)]/@gco:nilReason

Operation call propertyMetadata()

OCL syntax

x. propertyMetadata()

In words

Access the metadata associated with the property, instead of the property value.

Schematron translation

Case x is a property: This is translated like a property call for the case of a property whose value is a complex type with identity, given by reference – just using @metadata instead of @xlink:href: for $METAREFVAR in variable/property/@metadata return key(‘idKey’,identifierExpression)
Case x is a variable: Cannot be translated, because the variable may be bound to different kinds of expressions, not just a property call, and even if it was bound to a property call, then the variable may be used multiple times: not only for accessing the metadata of the property, but also its value. That would require different translations of the property call. However, a variable can only be bound to a single expression.

Operation call allInstances()

OCL syntax

x . allInstances()

In words

Collection of all object instances of type x.

x represents a type-valued expression.

Schematron translation

If x is a type constant:
An expression that selects all elements based upon a predicate which uses a combination of local-names and namespace-uris of the type and its (direct and indirect) subtypes (if they are instantiable, i.e. not abstract and not suppressed). We differentiate three cases:

Case multiple instantiable classes: //*[(local-name()=’localName_type_1′ and namespace-uri()=’namespace_type_1′) or … or (local-name()=’localName_type_n’ and namespace-uri()=’namespace_type_n’)]
Case single instantiable class: //*[local-name()=’localName_type’ and namespace-uri()=’namespace_type’]
Case no instantiable class: //*[false()]

NOTE: ShapeChange will store the expression in a new let variable, since it makes sense to only compute the collection of instances of a certain type once. An example where this would immediately be useful is Type.allInstances().inlineOrByReferenceProperty… – because the let variable can then be used both for the translation of the inline and the byReference case of the property access.

If x is a type expression:
Cannot be translated because required schema information is not available at run-time.

Operation call oclIsKindOf()

OCL syntax

x . oclIsKindOf(y)

In words

The single object instance x is checked for complying with type y.

Schematron translation

If y is a type constant:
boolean(τ(x)[(local-name()=’localName_T1‘ and namespace-uri()=’namespace_T1‘) or … or (local-name()=’localName_Ti‘ and namespace-uri()=’namespace_Ti‘)]), where Tk is one of the concrete derivations of y, including y.

NOTE: boolean(…) may be omitted if the argument is known to be used by operands, which do an implicit conversion to Boolean.

If y is a type expression: Cannot be translated because required schema information is not available at run-time.

Operation call oclIsTypeOf()

OCL syntax

x . oclIsTypeOf(y)

In words

The single object instance x is checked for being of type y.

Schematron translation

If y is a type constant: boolean(τ(x) [local-name()=’localName_y’ and namespace-uri()=’namespace_y’]).

boolean(…) may be omitted if the argument is known to be used by operands, which do an implicit conversion to Boolean.

If x is a type expression:
Cannot be translated because required schema information is not available at run-time.

Operation call oclAsType()

OCL syntax

x . oclAsType(y)

In words

The single object instance x is downcast to type y. The value is ‘undefined’ if this is not possible.

NOTE: This operation is typically used in situations where ISO 19139 encoding applies, to cast an attribute with value type CharacterString to a code list type, so that comparison with a code/enum is possible.

Schematron translation

If y is a type constant:
τ(x) [(local-name()=’localName_T1‘ and namespace-uri()=’namespace_T1‘) or … or (local-name()=’localName_Ti‘ and namespace-uri()=’namespace_Ti‘)], where Tk is one of the concrete derivations of y, including y.

If y is a type expression: Cannot be translated because required schema information is not available at run-time.

Casting CharacterString to a code list: We are making an exception to the strict rules with simple data elements. We permit CharacterString being cast to a code list type.

Operation call +,-,*,/

OCL syntax

x + y, etc.

In words

Value of x + y, etc.

Schematron translation

τ(x) + τ(y)
τ(x) – τ(y)
τ(x) * τ(y)
τ(x) div τ(y)

Operation call =, <>

OCL syntax

x = y
x <> y

In words

Value of x=y, x<>y

Schematron translation

If x and y are simple types:

τ(x) = τ(y)
τ(x) != τ(y)

If x and y are complex types:

generate-id(τ(x)) = generate-id(τ(y))
generate-id(τ(x)) != generate-id(τ(y))

Operation call <, >, <=, >=

OCL syntax

x < y, etc.

In words

Value of x < y, etc.

Schematron translation

τ(x) < τ(y)
τ(x) > τ(y)
τ(x) <= τ(y)
τ(x) >= τ(y)

Operation call size()

OCL syntax

x . size()

In words

Number of characters in the string instance x.

Schematron translation

string-length(τ(x))

Operation call concat()

OCL syntax

x . concat(y)

In words

String concatenation of x and y.

Schematron translation

concat(τ(x),τ(y))
A series of concats may be joined to a multi-argument concat invocation.

Operation call substring()

OCL syntax

x . substring(y,z)

In words

Substring of x running from position y to position z.

More specifically: the sub-string of x starting at the character number y, up to and including character number z. Character numbers run from 1 to x.size(). The following condition should be met: 1 <= y <= z <= x.size().

Schematron translation

substring(τ(x), τ(y), τ(z)-τ(y)+1)

Operation call and, or, xor, implies

OCL syntax

x and y
x or y
x xor y
x implies y

In words

Logical combination as indicated

Schematron translation

τ(x) and τ(y)
τ(x) or τ(y)
boolean(τ(x)) != boolean(τ(y))
- NOTE: The boolean(..) is omitted if the result type of T(x)/T(y) is known to be boolean.
not(τ(x)) or τ(y)

Schematron translation

This can only be translated in a few cases:

If y is a constant, y(t)=const: count(τ(x))<=1
If y is identity (x->isUnique(t|t)):
- if value t has a simple type:
  - type other than Boolean: count(T(x)) = count(distinct-values(T(x))
  - type is Boolean: count(T(x)) = count(distinct-values(T(x)/xs:boolean(.)))
- if value is a complex type with identity: count(T(x)) = count(distinct-values(T(x)/@*:id)
- else (value is a complex type without identity – for example a data type or union): for $COUNT1 in count(T(X)), $COUNT2 in sum(for $ISUVAR1 in T(X), $ISUVAR2 in T(X) return (if(empty($ISUVAR1) and empty($ISUVAR2)) then 0 else if ((empty($ISUVAR1) and not(empty($ISUVAR2))) or (not(empty($ISUVAR1)) and empty($ISUVAR2))) then 1 else if (generate-id($ISUVAR1) = generate-id($ISUVAR2)) then 0 else if (deep-equal($ISUVAR1,$ISUVAR2)) then 0 else 1)) return $COUNT1 * ($COUNT1 – 1) = $COUNT2
If y(t) is a property call that is based on variable t and:
- if the property call only involves properties with max cardinality = 1:
  - if the value resulting from the property call has a simple type:
    - type other than Boolean: count(T(x)) = count(distinct-values(for $t in T(x) return (if (empty(T(y(t)))) then ‘SC_EMPTY_ISU_BODY’ else T(y(t)) )))
    - type is Boolean: count(T(x)) = count(distinct-values(for $t in T(x) return (if (empty(T(y(t)))) then ‘SC_EMPTY_ISU_BODY’ else T(y(t))/xs:boolean(.) )))
    - NOTE: An empty value for T(y(t)) is mapped to the string ‘SC_EMPTY_ISU_BODY’ in order to correctly handle null values in the evaluation of the iterator isUnique(). Without this mapping, empty/null values would be ignored by the function distinct-values().
  - if the value resulting from the property call has a complex type with identity: count(T(x)) = count(distinct-values(for $t in T(x) return (if (empty(T(y(t)))) then ‘SC_EMPTY_ISU_BODY’ else T(y(t))/@*:id )))
  - else (value resulting from the property call is a complex type without identity – for example a data type or union): for $COUNT1 in count(T(X)), $COUNT2 in sum(for $t in T(x), $ISUVAR1 in T(y(t)), $ISUVAR2 in T(y(t)) return (if(empty($ISUVAR1) and empty($ISUVAR2)) then 0 else if ((empty($ISUVAR1) and not(empty($ISUVAR2))) or (not(empty($ISUVAR1)) and empty($ISUVAR2))) then 1 else if (generate-id($ISUVAR1) = generate-id($ISUVAR2)) then 0 else if (deep-equal($ISUVAR1,$ISUVAR2)) then 0 else 1)) return $COUNT1 * ($COUNT1 – 1) = $COUNT2
- if the property call involves a property with max cardinality > 1: cannot be translated; each evaluation of y(t) must result in at most one value. ShapeChange will log an error if this condition is not fulfilled.
Any other, particularly arbitrary expressions: Cannot be translated because no way to express this in XPath 2.0 has been found. isUnique bodies must not contain expressions other than constants, identity, or a property call. ShapeChange will log an error if this condition is not fulfilled.

Collection operation call select()

OCL syntax

x -> select(t|b(t))

In words

Compute the collection of those objects t in x, for which the boolean expression b(t) holds.

Schematron translation

τ(x) [for $t in . return τ(b(t))]

NOTE: Using a ‘for’-expression might seem odd here, but it allows us to bind the variable $t used in the select statement to the context defined by T(x). The variable can then be used in T(b(t)).

Pattern matching function on Strings

OCL syntax

x . matches( pattern )

Note: This operation call is an extension. It is not part of the OCL standard.

In words

Boolean function which yields true if the pattern of type String matches the String argument.

Schematron translation

matches( τ(x), τ(pattern) )