railML 3.x: Data Modelling Patterns [message #1797] Fri, 18 May 2018 09:01
christian.rahmig
Dear all,

the upcoming major version railML 3.x will contain a lot of modifications
and changes. In order to keep a unified layout and model structure, some
elementary "Data Modelling Patterns" have been defined. These patterns
represent a set of rules for modellers (and users) on how certain
things are structured in the model. You can find a first version
of these "Data Modelling Patterns" in [1].

I would like to get your feedback on this approach: Do you agree with
the overall concept of having Data Modelling Patterns? What do you think
about the different patterns in particular?

Any reaction is highly appreciated...

[1] file replaced by newer version on August 17th, 2018

Best regards
Christian

--
Christian Rahmig - Infrastructure scheme coordinator
railML.org (Registry of Associations: VR 5750)
Phone Coordinator: +49 173 2714509; railML.org: +49 351 47582911
Altplauen 19h; 01187 Dresden; Germany www.railml.org

[Updated on: Fri, 17 August 2018 17:02] by Moderator


Re: railML 3.x: Data Modelling Patterns [message #1915 is a reply to message #1797] Fri, 17 August 2018 16:51
christian.rahmig
Dear all,

there is a new draft version of the "Data Modelling Patterns" document
available in [2]. The following small changes have been incorporated:
* adding pattern "Names of elements and attributes"
* clarification on Element vs attribute
* clarification on preferred solution for pattern "Boolean information"

Any reaction is highly appreciated...

[2]
http://forum.railml.org/userfiles/2018-08-17_railml_railml3-modelling-patterns.pdf

Best regards
Christian

PS: To avoid confusion, the old file [1] has been deleted.



On 18.05.2018 at 09:01, Christian Rahmig wrote:
> Dear all,
>
> upcoming new major version railML 3.x will contain lot of modifications
> and changes. In order to keep a unified layout and model structure, some
> elementary "Data Modelling Patterns" have been defined. These patterns
> shall represent a set of rules for modellers (and users) how certain
> things are being structured in the model. You can find a first version
> of these "Data Modelling Patterns" in [1].
>
> I would like to get your feedback on this approach: Do you agree with
> the overall concept of having Data Modelling Patterns? What do you think
> about the different patterns in particular?
>
> Any reaction is highly appreciated...
>
> [1]
> http://forum.railml.org/userfiles/2018-05-02_railml_railml3-modelling-patterns.pdf
>
>
> Best regards
> Christian
>
Re: railML 3.x: Data Modelling Patterns [message #2011 is a reply to message #1915] Wed, 14 November 2018 16:05
Christian Rößiger
Dear all,

In anticipation of the next TT developer meeting I have gone through the
above document and would like to note the points relevant to me
below.
Christian, please don't take this as negative criticism ;-). I agree with
all points not mentioned here, and I generally welcome that modeling
principles are being defined.

I see a general problem: The TT data cannot be easily classified into
microscopic or macroscopic. This is probably due to the fact that the TT
objects are not hierarchically composed, but rather networked on the
same level (m:n relations). Example: I have the option to specify a
departure time to the minute or to the second or to omit it at
unimportant stations, but from my point of view it is rather a question
of vagueness than of macroscopic / microscopic. In all cases, the same
data type is used (xs:time), which is always processed the same way by
the reading system.
The only obvious differentiation according to microscopic/macroscopic
results from the dependencies to the railML3 infrastructure.


Comments on the individual chapters:

Model hierarchy and inheritance (slides 3-6)

Do the requirements for container elements also apply if the container
is a subelement (part) of another object or only at the top level?
Otherwise I see the requirements already fulfilled in railML2.x-TT:
- There is a general base element tElementWithNameAndId for all TT objects
- Top-level containers contain only elements of type


Element vs. Attributes (Slide 14)

The mere presence of several (countable) attributes is not a criterion
for choosing element vs. attribute. With this reason one could also
model a coordinate (x, y) as a countable attribute with the help of two
elements. More important seems to be the existence of further
information about the respective attribute (validForDirecton="both" in
the example).


Unknown information and Default-Values (slide 15/16)

I understand the intended rule to mean that the absence of an attribute
should always be interpreted as an "unknown value" in the future. I
think this rule is too strict and has several disadvantages.
Currently, a missing attribute can have the following meanings:

1) Value unknown
2) Value intentionally absent. Example: arrival / departure (time) for
beginning / ending trains at first / last station
3) Does not apply in context. Example: <stopDescription>.onOff only
makes sense for commercial stops; otherwise the meaning is "does not
apply". However, this value does not currently exist for "onOff".
4) Static default value is to be used. Example:
<ocpTT>.trainReverse="false" applies in 99% of cases and can therefore
be regarded as the default value.
5) Context-dependent default value must be used. Example:
<stopDescription>.stopOnRequest is implicitly always true for
non-commercial stops and therefore does not have to be explicitly specified.

Possible consequences:

1) is still allowed
2) and 3) could only be eliminated by changing the modelling. I see
there primarily the replacement of attributes by elements, since a
missing element can still represent both "value unknown" and "value
intentionally not present". It is questionable whether this makes the
schema more understandable.
4) and 5) These actual default values can certainly always be specified
explicitly in the future, but in my opinion this would lead to an insane
enlargement of the files. At least the (more frequent) static default
values (4.) should be preserved.
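
The difference between cases 1) and 4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not railML tooling: the helper `read_bool`, the table `STATIC_DEFAULTS`, the `ocpRef` value and the attribute `someOtherFlag` are hypothetical, and the default for `trainReverse` is taken from the example above rather than from the schema.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical table of static defaults (case 4); not taken from the railML schema.
STATIC_DEFAULTS = {("ocpTT", "trainReverse"): False}


def read_bool(element: ET.Element, name: str) -> Optional[bool]:
    """Return the attribute value, its static default, or None for "unknown"."""
    raw = element.get(name)
    if raw is not None:
        # xs:boolean allows the lexical values true, false, 1 and 0.
        return raw in ("true", "1")
    # Attribute absent: use a static default if one is defined (case 4),
    # otherwise report the value as unknown (case 1).
    return STATIC_DEFAULTS.get((element.tag, name))


ocp = ET.fromstring('<ocpTT ocpRef="ocp_1"/>')
print(read_bool(ocp, "trainReverse"))   # static default applies
print(read_bool(ocp, "someOtherFlag"))  # no default defined, so unknown
```

With a rule of "absent always means unknown", the first call would also have to return `None`, and every file would need to state `trainReverse="false"` explicitly.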


Boolean information (slide 17)

It seems to me that the case "Mandatory Boolean with true|false as valid
values" is not intended. What is the reason for this limitation?
In general, I would look at boolean values as general attributes and not
define them as a special case. Option 2 therefore contradicts the
statement "Mandatory elements can never be unknown" (slide 15).
In addition, the question arises in which cases a boolean value will be
used in the future and when an enumeration with 2 elements will be used.
In my opinion, this decision has so far been made more on
semantic grounds.


References (slide 21)

As I said before, I still don't see how to divide the TT world
into macroscopic / microscopic, but so far option 1
(macroscopic refers to microscopic) is used more frequently, which I still think
makes sense. Example: A <train> references its operating points
(<ocpTT>) and not vice versa.


Multiplicity (slide 24)

I generally agree with the statements, although in special cases there
may be container elements with a minimum of 2 elements, e.g. there is
currently a <Path> element with minOccurs="2" in railML2.x.


What is missing from my point of view

Definition of container elements in XML/UML.

Currently most containers are defined in railML using xs:sequence. Is
there a "ban" on other structuring elements such as "xs:choice" or
"xs:all" or their equivalents in UML by railML3-IS?


Use of polymorphism

Inheritance of elements is currently already used in the schema, but
rather by defining "template elements" with certain general attributes.
However, there are hardly any cases in TT where the schema requires an
(abstract) base class, and alternatively different classes derived from
it can be used. This would be conceivable, for example, when mapping
stop information with an abstract element <ocpTT> with some common
attributes and derived subclasses <ocpTTPass>, <ocpTTCommercialStop> or
<ocpTTOperationalStop> with additional individual attributes. Is there
any rejection or encouragement on the part of railML3-IS?
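
As a rough illustration of the idea, the base class and its derived classes could look like this in Python. The class names follow the elements named above; the attributes `ocp_ref`, `on_off` and `stop_on_request` and their placement are assumptions made only for this sketch and are not taken from any railML schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OcpTT:
    """Base class holding what every kind of operating-point entry shares."""
    ocp_ref: str  # hypothetical reference to the operating point

    def __post_init__(self) -> None:
        # Treat the base class as abstract: only derived classes may be created.
        if type(self) is OcpTT:
            raise TypeError("OcpTT is abstract; use one of its subclasses")


@dataclass
class OcpTTPass(OcpTT):
    """Train passes without stopping; no stop-related attributes exist here."""


@dataclass
class OcpTTCommercialStop(OcpTT):
    """Only a commercial stop carries boarding/alighting information."""
    on_off: Optional[str] = None
    stop_on_request: Optional[bool] = None


@dataclass
class OcpTTOperationalStop(OcpTT):
    """Stop for operational reasons; attributes omitted in this sketch."""


entries = [OcpTTPass("ocp_1"), OcpTTCommercialStop("ocp_2", on_off="both")]
for entry in entries:
    print(type(entry).__name__, entry.ocp_ref)
```

In such a layout an attribute like `on_off` simply does not exist on a pass, so the "does not apply in context" case no longer has to be expressed through a missing attribute.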

Best regards
Christian Rößiger

--
iRFP e. K. · Institut für Regional- und Fernverkehrsplanung
Hochschulstr. 45, 01069 Dresden
Tel. +49 351 4706819 · Fax. +49 351 4768190 · www.irfp.de
Registergericht: Amtsgericht Dresden, HRA 9347
Re: railML 3.x: Data Modelling Patterns [message #2014 is a reply to message #1915] Mon, 19 November 2018 05:25
Joerg von Lingen
Dear all,

Christian's proposal for the modelling patterns is quite useful to normalise the sub-schemas within railML. Basically I
agree with the set of patterns. However, it should be used as a guideline and not as a strict corset. The railway world is
diverse and requires flexibility.

I see currently the following issues where the strict adherence to the patterns is not useful:
1) Hierarchy
RS: The view would be formation/vehicle, which are at the same time containers. If
Engine/Wagon/Maintenance/Classification/... are seen as objects, their parts definitely have several levels. One could argue for
using the components as containers within the "super"-container vehicle, but this would bring a lot of overhead in referencing.
IL: According to the pattern we again have "super"-containers without a real view level:
AssetsForIL/Controller/SignalBox/GenericIM. The objects in these containers sometimes need more than one part level.

2) Layer
IL: Elements like Controller or SignalBox cannot live without the remaining AssetsForIL/GenericIM because they need them
as basis.

3) Extension points
RS: In version 2.4 not many extension points were requested. At least most of the enumerations can be extended.
IL: A good portion of the elements are derived from EntityIL, which provides the possibility of any-elements. However,
extensions with any-elements were not requested. The majority of enumerations cannot be extended, as this is mostly not
sensible.

4) Booleans
RS/IL: I would prefer option 1 and make the attributes in question optional.

5) Naming
IL: The use of verbs in the names enhances legibility, as it already points to the related function of the element. In
addition, it would cause conflicts or confusion if a name like "overlap" appeared at several locations due to the
referencing.

Thanks to Christian for his work.

Jörg v. Lingen - Rollingstock/Interlocking coordinator


Re: railML 3.x: Data Modelling Patterns [message #2015 is a reply to message #2014] Mon, 19 November 2018 12:49
Cédric Lavanchy
Dear all,

I find it a good idea to have some rules on how elements have to be modelled. These are rules, but rules are made to be broken. Not in all cases, but I can imagine that in some places they could be counterproductive. My advice is to tolerate some deviation from the rules, but it must be justified and known to the community; otherwise it opens the door to any kind of abuse.

Here some specific points:

1) "Inheritance": I have nothing against inheritance except that it is not always used correctly. See the Liskov substitution principle (Square should not inherit from Rectangle).

2) "Common domain": take care to put in there only the required elements and not everything, otherwise it will become impossible to manage

3) "Names of elements and attributes": I have often heard complaints about the size of the files; in Timetable, for instance, efforts are being made to reduce the amount of data in order to reduce the file size. On the other hand, the names can become longer. We have to take care not to undo these efforts.

4) "Boolean information": option 1 (optional version)

5) "References": option 1 (if they are not sub-elements already)
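
The Square/Rectangle case mentioned in point 1 can be sketched as follows. This is the standard textbook illustration of the Liskov substitution principle, not railML code; the function `widen` is a hypothetical caller written only against `Rectangle`.

```python
class Rectangle:
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def set_width(self, width: float) -> None:
        self.width = width

    def area(self) -> float:
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side: float) -> None:
        super().__init__(side, side)

    def set_width(self, width: float) -> None:
        # A square must keep both sides equal, so the height changes too.
        self.width = width
        self.height = width


def widen(shape: Rectangle) -> float:
    """Written against Rectangle: assumes set_width leaves the height untouched."""
    old_height = shape.height
    shape.set_width(10)
    return shape.area() / old_height  # 10 for any well-behaved Rectangle


print(widen(Rectangle(2, 3)))  # 10.0
print(widen(Square(3)))        # not 10: the Rectangle contract is broken
```

The same question applies to schema inheritance: a derived type should be usable wherever the base type is expected without surprising the reading system.
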
Re: railML 3.x: Data Modelling Patterns [message #2019 is a reply to message #1915] Thu, 22 November 2018 11:28
Dirk Bräuer
Dear Christian and all,

there have already been many replies on this topic, so I will limit myself (this time ;-) ) to a short summary of two issues which are very important to me:

1) Hierarchy

I think a rather flat hierarchy has no advantage. I prefer a good (possibly deep) hierarchy, especially in a very 'technical' context. I already often have the problem of having to 'jump' around in railML files (when reading them manually) to resolve references.

Additionally, when I suggested a possible generic model for a future <TT> (with a very flat hierarchy), it was widely rejected because of too little structure. So I am probably (obviously) not the only one with this opinion.

2) Default values

I want to support what Christian Rößiger wrote. We cannot avoid default values. It is highly unreasonable to expect everyone to write "operationalStopOrdered=..." each time when there is no operational stop anywhere on the horizon. The same applies e.g. to "guaranteedPass=...", which does not make sense when there is no pass but a stop. It would simply be more confusing than helpful for someone who reads the railML file.

We have plenty of such examples in <TT>.

Best regards,
Dirk.
Re: railML 3.x: Data Modelling Patterns [message #2026 is a reply to message #1915] Tue, 27 November 2018 16:08
Joachim.Rubröder
Dear all,
I generally agree with the design patterns, as long as we really try to respect them within all railML 3.x domains in order to create a common look and feel for railML.

1) "Hierarchy:" Ok, Container/Object/Parts (or Components) would correspond with trains/train/trainParts. Views could be "commercial trains", "operational trains", "ordered slots", ...

2) "Naming": Ok, so there will be lots of boolean attributes like "isThis" or "hasThat"

3) "Boolean information": Option 1, a dedicated "unknown" should be part of an enumeration together with "assumed", "irrelevant", ...

4) "References": Option 1, because the linking might not be as straightforward as intended. A train could belong to several different kinds of train categories.

Best regards,
Joachim

--
SMA und Partner AG
Gubelstrasse 28, CH-8050 Zürich
www.sma-partner.com