Understanding semantic constraints in railML
by Gulruh Farmonova (railML.org)
What is a Semantic Constraint and why do we need them?
A semantic constraint may be defined as a rule or condition applied to data, with the purpose of validating that the information is not only syntactically correct, but also logically consistent and/or meaningful within a specific context or domain.
Semantic constraints complement syntactic constraints. XML Schema Definitions (XSDs) define syntactic constraints on the structure of the data, as well as the allowed type and range of each data value. For instance, in a railway context, XSDs can define that a train must have a train number, what the legal format of that train number is, that the train must have an itinerary, that this itinerary may have one or more stops, and that it is optional to specify the type of rolling stock for the train. Additionally, XSDs can express choices, such as specifying that each point on the itinerary of a train must indicate either a stop or a passthrough.
Furthermore, XSDs can restrict references between elements, requiring that references given by a certain attribute point to an ID of a certain type of element. Beyond this, however, XSDs are unable to express rules about the relationship between different parts of the data: an XSD cannot constrain the value of one data entry depending on the value of other data entries. As an example, it cannot express that the end of a period must not be before the start of the same period. This is instead a semantic constraint, which must be expressed separately from any XSDs. Although such semantic constraints cannot be included in the XSDs, they can still be expressed and verified programmatically in other languages.
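As a minimal sketch of what such a programmatic check might look like, the Python snippet below verifies the period rule on a small XML fragment. The element and attribute names (`periods`, `period`, `startDate`, `endDate`) and the function name are assumptions made for illustration only; they are not taken from the railML schema.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical fragment: element and attribute names are illustrative,
# not taken from the railML schema.
SAMPLE = """
<periods>
  <period id="p1" startDate="2024-01-01" endDate="2024-06-30"/>
  <period id="p2" startDate="2024-09-01" endDate="2024-03-31"/>
</periods>
"""


def find_inverted_periods(xml_text):
    """Return the ids of periods whose end date lies before their start date."""
    root = ET.fromstring(xml_text)
    violations = []
    for period in root.iter("period"):
        start = date.fromisoformat(period.get("startDate"))
        end = date.fromisoformat(period.get("endDate"))
        # Each date on its own is syntactically valid; only the
        # comparison of the two values reveals the contradiction.
        if end < start:
            violations.append(period.get("id"))
    return violations


print(find_inverted_periods(SAMPLE))  # prints ['p2']
```

Both periods would pass a purely syntactic check, since every attribute holds a well-formed date. Only the comparison across attributes identifies `p2` as inconsistent.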
Semantic Constraints in railML
In the context of railML, semantic constraints serve a similar function. The data exchanged in the railway system must be both structured correctly and logically consistent with the operations of the railway network. These constraints validate that data about trains, tracks, stations, schedules and other elements does not contradict itself or other data given in the same file. It is important to note that railML allows for the expression of data that intentionally contains operational contradictions, for example two trains stopping at the same track at the same time and location. Such a situation can easily be expressed in railML, for instance to provide input to a system that detects these contradictions and visualizes them so that they can be fixed. The purpose of semantic constraints in railML is rather to make sure that the governing concepts of the railML model are maintained where syntactic validation alone is insufficient.
To avoid potential confusion, it is important to understand what a semantic constraint is not. When we want to constrain the possible values that a property can take, we use syntactic constraints. When we want to describe the semantics of what a property represents, we use the schema documentation. The purpose of semantic constraints in railML is to avoid inconsistencies or outright contradictions between different properties in the same railML file.
For instance, a semantic constraint in railML could ensure that:
- A train schedule cannot include a departure time before its arrival time at a station.
- A train passing through a station without stopping should not have an arrival time.
- If station A is a substation of station B, station B cannot be a substation of station A.
These constraints go beyond simply explaining the correct format of data: they enforce that the data follows logical, operational rules relevant to railway systems, and they help to prevent errors that might occur when software reads the file. If the semantic rules are not followed, the software could misinterpret the file or behave unexpectedly. Their implementation is therefore also checked during certification.
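The three example rules listed above can be sketched as simple checks in Python. This is only an illustration under stated assumptions: the dictionary keys (`station`, `kind`, `arrival`, `departure`), the `(child, parent)` pair representation and the function names are hypothetical and do not correspond to railML element names or to any official validator.

```python
from datetime import datetime


def check_itinerary(points):
    """Check hypothetical itinerary points.

    Each point is a dict with the keys 'station', 'kind' ('stop' or 'pass'),
    'arrival' and 'departure' (datetime objects or None).
    """
    problems = []
    for p in points:
        arrival, departure = p.get("arrival"), p.get("departure")
        # Rule: a passthrough should not carry an arrival time.
        if p["kind"] == "pass" and arrival is not None:
            problems.append(f"{p['station']}: passthrough has an arrival time")
        # Rule: departure must not be earlier than arrival at the same station.
        if arrival is not None and departure is not None and departure < arrival:
            problems.append(f"{p['station']}: departure before arrival")
    return problems


def find_mutual_substations(substation_of):
    """Return station pairs that are declared substations of each other.

    `substation_of` is a set of (child, parent) pairs.
    """
    return sorted(
        {tuple(sorted(pair)) for pair in substation_of
         if (pair[1], pair[0]) in substation_of}
    )


points = [
    {"station": "X", "kind": "stop",
     "arrival": datetime(2024, 1, 1, 10, 5),
     "departure": datetime(2024, 1, 1, 10, 2)},
    {"station": "Y", "kind": "pass",
     "arrival": datetime(2024, 1, 1, 10, 20),
     "departure": datetime(2024, 1, 1, 10, 20)},
]
for problem in check_itinerary(points):
    print(problem)

print(find_mutual_substations({("A", "B"), ("B", "A"), ("C", "B")}))
```

Full timestamps are used instead of bare times of day so that a stop spanning midnight is not flagged by mistake. The substation check covers only the direct mutual case named in the list, not longer chains.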
Establishing Semantic Constraints in railML
The process of establishing semantic constraints in railML generally involves collaboration between the railML community and the railML coordinators. When a rule cannot be expressed by XML Schema Definitions, it is proposed as a semantic constraint. Semantic constraints may be proposed by members of the railML® working groups or by any individual through a forum post.
Following the proposal, the community discusses the proposed semantic constraint to identify the logical relationships and operational rules that must be applied to the data. If the working group reaches a consensus to introduce the constraint, the proposal is posted in the forum and added to the element documentation using a dedicated template.
The proposal is initially marked as "proposed" and added to a list for review. After giving the community reasonable time to raise any objections, the railML coordinators decide whether the proposed semantic constraint is approved. When a decision has been made, the railML documentation is updated accordingly. Only approved semantic constraints affect certifications. In this way, semantic constraints are established through community-driven input before they become official within railML®.
To view the existing approved and proposed semantic constraints and to learn more about the process of their introduction, you can visit the dedicated semantic constraints section on the railML2 and railML3 wiki pages or visit the railML forum, where you will find them under relevant subschema sections. Should you require further information on our semantic constraints, please contact our railML coordinators.