Data Types - Abstract Specification

Chair/Editor Gunther Schadow
Regenstrief Institute for Health Care
Editor Paul Biron
Kaiser Permanente, Southern California
Editor Doug Pratt
Siemens

1 Preface

This document specifies the HL7 Version 3 Data Types on an abstract layer, independent of representation. By "independent of representation" we mean independent of both abstract syntax and implementation in any particular implementation technology.

This document is accompanied by Implementation Technology Specifications (ITS). The ITS documents can serve as a quick compendium to the data types that is more practically oriented toward the representation in that particular implementation technology.

Vocabulary tables within this specification list the current contents of vocabulary domains for ease of reference by the reader. However, at any given time the normative source for these domains is the vocabulary tables in the RIM database. For some large domains, only a sample of possible values is shown. The complete domains can be referenced in the vocabulary tables by looking up the domain name associated with the table in the RIM vocabulary tables.

1.1 A note to all readers who participated in previous ballots

In the previous ballots of the v3 data types documents there were two ITS-independent specifications, called "Part-1" and "Part-2". Since a completely representation-independent data type specification is abstract, Part-1 was supposed to provide an easier read. However, it was also more shallow and at times not correct. Part-1 gave the wrong impression that HL7 version 3 data types were defined as an abstract syntax, which is an incorrect assumption. For that reason the specification has again been restructured.

The ITS documents now assume the function of a "practical" exposition of this material that is fairly concise and easy to read for those readers who know the respective implementation technology. The ITS documents quote the abstract specification on its concise definitions and possibly on tutorial material (that has been merged back from Part-1 into this abstract specification.)

During the last two ballots, the editorial process of these documents has been largely automated to minimize duplication of text and to formalize the specification such that technical changes to the material can be implemented much more quickly and consistently. The work on document restructuring, both regarding document production (i.e., moving from word processor documents to XML, which was initiated by the HL7 publication committee several ballot cycles ago) and the various attempts at restructuring this material, imposed a lot of editorial burden on the editors. We believe that the result to date is a great improvement, as future technical ballot comments can be more safely, consistently and easily accommodated.

On the other hand, this editorial work dominated much of the content work for the last two ballot preparations. The reader may remember that during the previous ballot the abstract data type specification was added without any changes (neither editorial nor content changes) because all the available resources had been spent on the ITS for XML. During this ballot preparation the abstract specification has been updated regarding the editorial process improvements, but only very few essential content modifications have been applied.

The editors therefore want to apologize to all those readers who had previously submitted technical ballot comments that had been agreed to during the reconciliation, as many of those suggestions and agreements are still not reflected in this ballot draft. We would ask the readers to please resubmit their understanding of the prior resolutions to their comments -- however informal this may be -- into their ballot. We understand that a formal resubmission of comments places an extra burden, which is why we will greatly appreciate notes about previous reconciliation agreements submitted as informal material (this would even include handwritten notes that can be submitted by FAX to +1-317-630-6962 attn: Gunther Schadow/V3DT.)

2 Acknowledgements

This standard is the result of several years of intense work through e-mail, telephone conferences and meeting discussions. Gunther Schadow (Regenstrief Institute for Health Care) chaired this task force, and is the main author of this document. Major contributions are from Mark Tucker (Regenstrief Institute), Paul V. Biron (Kaiser Permanente), Lloyd McKenzie (IBM), George Beeler, and Stan Huff (Intermountain Health Care), as well as Mike Henderson (Kaiser Permanente), Anthony Julian (Mayo), Joann Larson (Kaiser Permanente), Mark Shafarman (Oacis Healthcare Systems), Wes Rishel (Gartner Group), and Robin Zimmerman (Kaiser Permanente). Acknowledgements for their critical review and infusion of ideas go to Bob Dolin (Kaiser Permanente), Clem McDonald (Regenstrief Institute), Kai Heitmann (HL7 Germany), Rob Seliger (Sentillion), and Harold Solbrig (Mayo Clinic). Vital support came from the members of the task force: Laticia Fitzpatrick (Kaiser Permanente), Matt Huges, Randy Marbach (Kaiser Permanente), Larry Reis (Wizdom Systems), Carlos Sanroman (Kaiser Permanente), and Greg Thomas (Kaiser Permanente). Thanks to James Case (University of California, Davis), Norman Daoust (Partners HealthCare Systems), Irma Jongeneel (HL7 The Netherlands), Michio Kimura (HL7 Japan), John Molina (SMS), Richard Ohlmann (McKessonHBOC), David Rowed (HL7 Australia), and Klaus Veil (Macquarie Health Corp., HL7 Australia), for sharing their expertise in critical questions. This work was made possible by the Regenstrief Institute for Health Care.


Table of contents

1 Introduction
    1.1 What is a Data Type?
    1.2 Representation of Data Values
    1.3 Properties of Data Values
    1.4 Need for the Abstraction
    1.5 Need for an HL7 Data Type Standard
    1.6 Forms of Data Type Definitions
        1.6.1 Formal Data Type Definition Language
        1.6.2 Tables of Properties
        1.6.3 Unified Modeling Language (UML) Diagrams
    1.7 Overview of Data Types
    1.8 Introduction to the Formal Data Type Definition Language
        1.8.1 Declaration
        1.8.2 Invariant Statements
        1.8.3 Type Conversion
        1.8.4 Literal Form
        1.8.5 Generic Data Types
    1.9 DataType (type)
        1.9.1 Properties of DataType (type)
    1.10 DataValue (ANY)
        1.10.1 Properties of DataValue (ANY)
2 Basic Types
    2.1 Boolean (BL)
        2.1.1 Properties of Boolean (BL)
    2.2 Encapsulated Data (ED)
        2.2.1 Binary Data (BIN)
        2.2.2 Properties of Encapsulated Data (ED)
    2.3 Character String (ST)
        2.3.1 Properties of Character String (ST)
    2.4 Concept Descriptor (CD)
        2.4.1 Concept Role (CR)
        2.4.2 Properties of Concept Descriptor (CD)
        2.4.3 Coded Simple Value (CS) restricts CD
        2.4.4 Coded Value (CV) restricts CD
        2.4.5 Coded With Equivalents (CE)
    2.5 Instance Identifier (II)
        2.5.1 Unique Identifier String (UID)
        2.5.2 ISO Object Identifier (OID) extends UID
        2.5.3 DCE Universal Unique Identifier (UUID) extends UID
        2.5.4 HL7 Reserved Identifier Scheme (RUID) extends UID
        2.5.5 Properties of Instance Identifier (II)
    2.6 Telecommunication Address (TEL) extends URL
        2.6.1 Universal Resource Locator (URL)
        2.6.2 Properties of Telecommunication Address (TEL)
    2.7 Postal Address (AD) extends
        2.7.1 Address Part (ADXP) extends ST
        2.7.2 Properties of Postal Address (AD)
    2.8 Entity Name (EN)
        2.8.1 Entity Name Part (ENXP)
        2.8.2 Properties of Entity Name (EN)
        2.8.3 Trivial Name (TN) restricts EN
        2.8.4 Person Name (PN)
        2.8.5 Person Name Part (PNXP) restricts ENXP
        2.8.6 Organization Name (ON)
        2.8.7 Organization Name Part (ONXP) restricts ENXP
    2.9 Abstract Type Quantity (QTY)
        2.9.1 Properties of Abstract Type Quantity (QTY)
    2.10 Integer Number (INT)
        2.10.1 Properties of Integer Number (INT)
    2.11 Real Number (REAL)
        2.11.1 Properties of Real Number (REAL)
    2.12 Ratio (RTO)
        2.12.1 Properties of Ratio (RTO)
    2.13 Physical Quantity (PQ)
        2.13.1 Physical Quantity Representation (PQR) extends CV
        2.13.2 Properties of Physical Quantity (PQ)
    2.14 Monetary Amount (MO)
        2.14.1 Properties of Monetary Amount (MO)
    2.15 Point in Time (TS)
        2.15.1 Properties of Point in Time (TS)
        2.15.2 Calendar (CAL)
        2.15.3 Calendar Cycle (CLCY)
3 Generic Collections
    3.1 Set (SET)
        3.1.1 Properties of Set (SET)
    3.2 Sequence (LIST)
        3.2.1 Properties of Sequence (LIST)
        3.2.2 GeneratedSequence (GLIST) restricts LIST
        3.2.3 SampledSequence (SLIST) restricts LIST
    3.3 Bag (BAG)
        3.3.1 Properties of Bag (BAG)
    3.4 Interval (IVL)
        3.4.1 Properties of Interval (IVL)
        3.4.2 Interval of Physical Quantities (IVL<PQ>)
        3.4.3 Interval of Point in Time (IVL<TS>)
4 Generic Type Extensions
    4.1 History Item (HXIT)
        4.1.1 Properties of History Item (HXIT)
        4.1.2 History (HIST)
    4.2 Uncertain Value - Probabilistic (UVP)
        4.2.1 Properties of Uncertain Value - Probabilistic (UVP)
        4.2.2 Non-Parametric Probability Distribution (NPPD)
    4.3 Parametric Probability Distribution (PPD)
        4.3.1 Properties of Parametric Probability Distribution (PPD)
        4.3.2 Probability Distribution over Real Numbers (PPD_REAL)
        4.3.3 Parametric Probability Distributions over Physical Quantities (PPD_PQ)
        4.3.4 Probability Distribution over Time Points (PPD_TS)
5 Timing Specification
    5.1 Periodic Interval of Time (PIVL)
        5.1.1 Properties of Periodic Interval of Time (PIVL)
        5.1.2 Periodic Intervals as Sets
    5.2 Event-Related Periodic Interval of Time (EIVL)
        5.2.1 Properties of Event-Related Periodic Interval of Time (EIVL)
        5.2.2 Resolving the Event-Relatedness
    5.3 General Timing Specification (GTS)
        5.3.1 Convex Hull
        5.3.2 GTS as a Sequence of Occurrence Intervals
        5.3.3 Interleaving Schedules and Periodic Hull
        5.3.4 Literal Form

Appendices


1 Introduction

1.1 What is a Data Type?

Every data element has a data type. Data types define the meaning (semantics) of data values that can be assigned to a data element. Meaningful exchange of data requires that we know the definition of values so exchanged. This is true for complex "values" such as business messages as well as for simpler values such as character strings or integer numbers.

According to ISO 11404, a data type is "a set of distinct values, characterized by properties of those values and by operations on those values." A data type has intension and extension. Intensionally, the data type defines the properties exposed by every data value of that type. Extensionally, data types have a set of data values that are of that type (the type's "value set").

Semantic properties of data types are what ISO 11404 calls "properties of those values and [...] operations on those values." A semantic property of a data type is referred to by a name and has a value for each data value. The value of a data value's property must itself be a value defined by a data type - no data value exists that would not be defined by a data type.

Data types are thus the basic building blocks used to construct any higher order meaning: messages, computerized patient record documents, or business objects and their transactions. What, then, is the difference between a data type and a message, document, or business object? Data type values stand for themselves: the value is all that counts, and neither identity nor state nor change of state is defined for a data value. Conversely, in business objects we track state and identity; the properties of an identical object might change between now and later. Not so with data values: a data value and its properties are constant. For example, number 5 is always number 5; there is no difference between this number 5 and that number 5 (no identity distinguished from value), and number 5 never changes to number 6 (no change of state). One can think of data values as immutable objects where identity does not matter (identity and equality are the same.)1
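The contrast between data values and business objects can be sketched in one particular implementation technology. The following Python fragment is an illustration only, not part of the abstract specification; the class names IntegerValue and PatientRecord and their fields are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class IntegerValue:
    """Hypothetical wrapper for an integer data value (illustration only)."""
    magnitude: int


@dataclass
class PatientRecord:
    """Hypothetical business object: it has identity and mutable state."""
    record_id: str
    weight_kg: float


five_a = IntegerValue(5)
five_b = IntegerValue(5)

# Two occurrences of the value 5 are indistinguishable by value.
values_equal = five_a == five_b

# A data value never changes state; an attempt to mutate it is rejected.
try:
    five_a.magnitude = 6
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True

# A business object keeps its identity while its state changes over time.
record = PatientRecord("rec-1", 70.0)
same_object = record
record.weight_kg = 72.5
still_same_record = same_object is record
```

In this sketch the frozen dataclass plays the role of the data value (equality by value, no change of state), while the ordinary dataclass plays the role of a business object whose properties change while it remains the same object.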

1.2 Representation of Data Values

Data values can be represented through various symbols but the data value's meaning is not bound to any particular representation.

For example, cardinal numbers (non-negative integers) are defined - intensionally - as a data type where each value has a successor value, and where zero is the successor of no other cardinal value. Based on this definition we can define addition, multiplication, and other mathematical operations. Whatever representation reflects the rules we stated in the intensional definition of the cardinal data type is a valid representation of cardinal numbers. Examples of valid cardinal number representations are decimal digit strings, bags of glass marbles, or scratches on a wall. The number five is represented by the word "five", by the Arabic numeral "5", or by the Roman numeral "V". The representation does not matter as long as it conforms to the semantic definition of the data type.
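As an informal illustration of this point, the following Python sketch decodes three different representations into the same cardinal value and defines addition purely in terms of the successor rule. All function names are hypothetical, the native Python int is assumed as a stand-in for the abstract cardinal value, and the Roman numeral decoder handles only the symbols I, V and X.

```python
ROMAN = {"I": 1, "V": 5, "X": 10}


def from_decimal(text: str) -> int:
    """Decode a decimal digit string."""
    return int(text)


def from_tally(marks: str) -> int:
    """Decode scratches on a wall: one stroke per unit."""
    return marks.count("|")


def from_roman(text: str) -> int:
    """Decode a small Roman numeral built from I, V and X."""
    total = 0
    for i, symbol in enumerate(text):
        value = ROMAN[symbol]
        # A smaller symbol placed before a larger one is subtracted.
        if i + 1 < len(text) and ROMAN[text[i + 1]] > value:
            total -= value
        else:
            total += value
    return total


def successor(n: int) -> int:
    """Every cardinal value has a successor."""
    return n + 1


def add(a: int, b: int) -> int:
    """Addition expressed only through repeated application of successor."""
    result = a
    for _ in range(b):
        result = successor(result)
    return result
```

Under these assumptions, from_decimal("5"), from_tally("|||||") and from_roman("V") all denote the same value, and add gives the same result regardless of which representation its arguments were decoded from.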

As another example, the Boolean data type is defined by its extension, the two distinct values true and false, and by the rules of negation and of combining these values in conjunction and disjunction. The representation of Boolean values can be the words "true" and "false," "yes" and "no," the numbers 0 and 1, or any two signs that are distinct from each other. The representation of data types does not matter as long as it conforms to the semantic definition of the data type.

This specification defines the semantics, the meaning of the HL7 data types. This specification is about semantics only, independent from representational and operational concerns or specific implementation technologies. Additional standards for representing the data values defined here are being defined for various technological approaches. These standards are called "Implementable Technology Specification" (ITS.) Those ITS define how values are represented so that they conform to the semantic definitions of this specification; this may include syntaxes for character or binary representations, and computer procedures to act on the representation of data values. The meaning of these ITS representations, as communicated, generated, and processed in computer programs, is defined based on this standard, the semantic data type specification.

1.3 Properties of Data Values

Data values have properties defined by their data type. The "fields" of "composite data types" are the most common example of such properties. However, more generally one should think of a data value's properties as logical predicates or as mathematical functions; in simpler but still correct terms, properties are questions one can ask about a data value to receive another data value as an answer.

A property is referred to by its name. For example, the data type integer may have a property named "sign." A property has a domain, which is the set of possible "answer" values. The set of possible "answer" values is defined by the property's data type, but the domain of a property may be a subset of the data type's value set.

A property may also have arguments, additional information one must supply with a question to get an answer. For example, an important property of an integer number is that one integer plus another integer results in another integer, so the plus property of one integer needs an argument: the other integer.

Whether semantic properties have arguments is not a fundamentally relevant distinction. A data type's semantic property without arguments is not necessarily a "field" of a "composite" data type. For example, for integer values, we can define the property is-zero that has the Boolean value true when the number is zero and false when the number is not zero. This does not mean that is-zero must be an explicit component of any integer representation.
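The notion of properties as "questions" can be sketched as plain functions in Python. This is an illustration under the assumption that the native Python int stands in for the integer data value; the function names sign, is_zero and plus are chosen to mirror the examples in the text and are not defined by this specification.

```python
def sign(n: int) -> int:
    """Property without arguments. Its type is integer, but its domain
    is only the subset {-1, 0, 1} of the integer value set."""
    return (n > 0) - (n < 0)


def is_zero(n: int) -> bool:
    """Property without arguments that is computed, not stored as a field."""
    return n == 0


def plus(n: int, other: int) -> int:
    """Property with one argument: the other integer."""
    return n + other
```

None of the three properties is a stored component of the integer representation; each answer is derived from the value itself, which is all the semantic definition requires.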

A data type's semantic property with arguments has no specific operational notions such as "procedure call," "passing arguments," "return values," "throwing exceptions," etc. These are all concepts of computer systems implementation of data types - but these operational notions are irrelevant for the semantics of data types.

This specification is about semantics of data types only. Neither is it about value representation syntax (not even an abstract syntax), nor is it about an operational interface to the data values.

1.4 Need for the Abstraction

Why does this specification make such a big issue about its being abstract from representation syntax as well as operational implementation?

HL7 needs this kind of abstract semantic data type specification for a very practical purpose. One important design feature of HL7 version 3 is its openness towards representation and implementation technologies. All HL7 version 3 specifications are supposed to be done in a form independent from specific representation and implementation technologies. HL7 acknowledges that, while at times some representation and implementation technologies may be more popular than others, technology is going to change - and with changing technology, representations of data values will change. HL7 standards are primarily targeted to healthcare domain information, independent from the technology supporting this information. HL7 expects that specifications defined independent from today's technology will continue to be useful, even after the next technological "paradigm shift".

The issue of data types is closer to implementation technology than most other HL7 information standards - and therein lies a certain danger that we define data types too dependent on current implementation technologies.

The majority of HL7 standards are about complex business objects. Complex business objects with many informational attributes can be specified as abstract syntax, where components are eventually defined in terms of data types. Conversely, defining data types in terms of abstract syntax is of little use because the components of such abstract syntax constructs would still have to have data types.2

Why doesn't this specification define a set of primitive data types based on which composite data types could be defined simply as abstract syntax?

Any concrete implementation of the HL7 standards must ultimately use the built-in data types of its implementation technology. Therefore, we need a very flexible mapping between HL7 abstract data types and those data types built into any specific implementation technology. With a semantic specification, an Implementable Technology Specification (ITS) can conform simply by stating a mapping between the constructs of its technology and the HL7 version 3 data type semantics. Whether a data type is primitive or composite is irrelevant from a semantic perspective, and the answer may be different for different implementation technologies.

For example, this standard specifies a character string as a data type with many properties (e.g., charset, language, etc.) However, in many Implementation Technologies, character strings are primitive first class data types. We encourage that these native data types be used rather than a structure that slavishly represents all the semantic properties as "components." This specification only requires that the properties defined for data values can somehow be inferred from whatever representation is chosen; it does not matter how these values are represented. Whether "primitive" or "composite", with few or many "components", as "fields" or "methods" - this is all irrelevant.

For another example, a decimal representation, a floating-point register and a scaled integer are all possible native representations of real numbers for different implementation technologies. Some of these representations have properties that others do not have. Scaled integers, for instance, have a fixed precision and a relatively small range. Floating-point values have variable precision and a large range, but floating-point values lose any information about precision. Decimal representations are of variable precision and maintain the precision information (yet are slow to process.) The data type semantics must be independent from all these accidental properties of the various representations, and must define the essential properties that any technology should be able to represent.
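The differences among these three native representations can be observed in a short Python sketch. The measurement text "1.50" and the scale factor of 100 are hypothetical values chosen for illustration; the sketch uses only the standard decimal module and the built-in float and int types.

```python
from decimal import Decimal

reading_text = "1.50"  # hypothetical measurement with three significant digits
SCALE = 100            # hypothetical fixed scale: values stored as hundredths

as_decimal = Decimal(reading_text)    # decimal representation
as_float = float(reading_text)        # floating-point representation
as_scaled = int(as_decimal * SCALE)   # scaled-integer representation

# The decimal representation keeps the trailing zero, so the number of
# significant digits can still be inferred from it.
decimal_digits = len(as_decimal.as_tuple().digits)

# The floating-point representation has no record of the trailing zero.
float_text = repr(as_float)

# All three representations still denote the same real number.
same_number = (as_decimal == Decimal(as_float)) and (as_scaled / SCALE == as_float)
```

Here only the decimal form retains the information that the value was stated to three significant digits, which is the kind of accidental difference between representations that the semantic definition has to stay independent of.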

1.5 Need for an HL7 Data Type Standard

Why does HL7 need its own data type standard? Why can't HL7 simply adopt a standard defined by some other body?

As noted in the previous section, all HL7 implementation technologies have some data type system, but there are differences among the data type systems between implementation technologies. In addition, many implementation technologies' data type systems are not powerful enough to express the concepts that matter for the HL7 application layer.

For example, few implementation technologies provide the concepts of physical quantities, precision, ranges, missing information, and uncertainty that are so relevant in scientific and health care computing.

On the other hand, implementation technologies do make distinctions that are not relevant from the abstract semantics viewpoint, e.g., fixed point vs. floating-point real numbers; 8, 16, 32, or 64-bit integers; date vs. timestamp.

A number of data type systems have been used as input to this specification. These include the type systems of many major programming languages, including BASIC, Pascal, MODULA-2, C, C++, JAVA, ADA, LISP and SCHEME. This also includes type systems of language-independent implementation technologies, such as Abstract Syntax Notation One (ASN.1), Object Management Group's (OMG) Interface Definition Language (IDL) and Object Constraint Language (OCL), SQL 92 and SQL 99, the ISO 11404 language independent data types, and XML Schema Part 2 data types. Health care standards related data types have been considered as well, among these HL7 version 2.x, types used by CEN TC 251 messages and Electronic Health Record Architecture (EHCRA) and DICOM.

1.6 Forms of Data Type Definitions

This specification defines data types in several forms, using textual description, UML diagrams, tables, and a formal definition.

1.6.1 Formal Data Type Definition Language

A formal definition of data types is used in order to clarify the semantics of the proposed types as unambiguously as possible. This data type definition language is described in detail in Section 1.8. Formal languages make crisp, essential statements and are therefore amenable to formal argument of proof or rebuttal. However, the terseness of such formal statements may also make them difficult for humans to understand. Therefore, all the important inferences from the formal statements are also included as plain English statements.

1.6.2 Tables of Properties

For a quick overview, this specification contains, at the beginning of many data types, tables listing what are called "primary" properties. "Primary" properties are a somewhat fuzzy notion of those properties that are more likely to be thought of as "fields" if the data type were implemented as a record ("composite data type"). These tables only exist to facilitate an overview of the content and purpose of data types. While their content is part of the normative specification, the fact that a property is or is not listed in these tables has no significance. There is no requirement that the properties listed in these tables be represented as fields, and these tables are not abstract syntax definitions.

Property tables are not shown for all data types. Again, this does not mean that those data types have no properties. It also does not mean that those data types are "primitive" data types as per this specification. The property tables are used as a helpful summary only, and are not used when they would confuse more than they would help.

Each row of the property tables describes one property with the following columns:

  1. Name - the name of the property as in the formal definition. For some data types, the name field of the first property may be empty. This may happen in those data types that are defined as an extension of other data types and when it is not useful for the summary of the child to show any properties of the parent.

  2. Type - the data type of that property.

  3. Definition - a short text describing the meaning of the property.

1.6.3 Unified Modeling Language (UML) Diagrams

The Unified Modeling Language (UML) is used for a graphical presentation of how data types relate. Data types are shown as UML classes. The name compartment contains the long name of the data type followed by a colon and the standard abbreviation. Properties of types without are all shown in the UML operations compartment. No instance attributes are shown, in accordance with the fact that this abstract specification is not about implementation or concrete representation. Generalization links indicate extension and restriction relationships. Aggregations are an additional representation of properties, when the relation between data types through that property is important. Generic types are shown as UML parameterized classes, with UML realization links relating their instantiations.

1.7 Overview of Data Types

Figure 1: UML Overview of Data Types

Table 1: Overview of HL7 version 3 data types
Name | Symbol | Description
DataValue | ANY | Defines the basic properties of every data value. This is an abstract type, meaning that no value can be just a data value without belonging to any concrete type. Every concrete type is a specialization of this general abstract DataValue type.
Boolean | BL | The Boolean type stands for the values of two-valued logic. A Boolean value can be either true or false, or, as any other value, may be NULL.
Encapsulated Data | ED | Data that is primarily intended for human interpretation or for further machine processing outside the scope of HL7. This includes unformatted or formatted written language, multimedia data, or structured information as defined by a different standard (e.g., XML-signatures.) Instead of the data itself, an ED may contain only a reference (see .) Note that the data type is a specialization of the data type when the media type is text/plain.
Character String | ST | The character string data type stands for text data, primarily intended for machine processing (e.g., sorting, querying, indexing, etc.) Used for names, symbols, and formal expressions.
Concept Descriptor | CD | A concept descriptor represents any kind of concept usually by giving a code defined in a code system. A concept descriptor can contain the original text or phrase that served as the basis of the coding and one or more translations into different coding systems. A concept descriptor can also contain qualifiers to describe, e.g., the concept of a "left foot" as a postcoordinated term built from the primary code "FOOT" and the qualifier "LEFT". In exceptional cases, the concept descriptor need not contain a code but only the original text describing that concept.
Coded Simple Value | CS | Coded data in its simplest form, where only the code and display name are not predetermined. The code system and code system version are fixed by the context in which the CS value occurs. CS is used for coded attributes that have a single HL7-defined value set.
Coded With Equivalents | CE | Coded data that consists of a coded value (CV) and, optionally, coded value(s) from other coding systems that identify the same concept. Used when alternative codes may exist.
Instance Identifier | II | An identifier that uniquely identifies a thing or object. Examples are object identifiers for HL7 RIM objects, medical record number, order id, service catalog item id, Vehicle Identification Number (VIN), etc. Instance identifiers are defined based on ISO object identifiers.
Telecommunication Address | TEL | A telephone number (voice or fax), e-mail address, or other locator for a resource mediated by telecommunication equipment. The address is specified as a Universal Resource Locator (URL) qualified by time specification and use codes that help in deciding which address to use for a given time and purpose.
Postal Address | AD | Mailing and home or office addresses. A sequence of address parts, such as street or post office Box, city, postal code, country, etc.
Entity Name | EN | A name for a person, organization, place or thing. A sequence of name parts, such as first name or family name, prefix, suffix, etc. Examples for entity name values are "Jim Bob Walton, Jr.", "Health Level Seven, Inc.", "Lake Tahoe", etc. An entity name may be as simple as a character string or may consist of several entity name parts, such as, "Jim", "Bob", "Walton", and "Jr.", "Health Level Seven" and "Inc.", "Lake" and "Tahoe".
Trivial Name | TN | A restriction of entity name that is effectively a simple string used for a simple name for things and places.
Person Name | PN | A name for a person. A sequence of name parts, such as first name or family name, prefix, suffix, etc.
Organization Name | ON | A name for an organization. A sequence of name parts.
Integer Number | INT | Integer numbers (-1,0,1,2, 100, 3398129, etc.) are precise numbers that are results of counting and enumerating. Integer numbers are discrete, the set of integers is infinite but countable. No arbitrary limit is imposed on the range of integer numbers. Two NULL flavors are defined for the positive and negative infinity.
Real Number | REAL | Fractional numbers. Typically used whenever quantities are measured, estimated, or computed from other real numbers. The typical representation is decimal, where the number of significant decimal digits is known as the precision.
Ratio | RTO | A quantity constructed as the quotient of a numerator quantity divided by a denominator quantity. Common factors in the numerator and denominator are not automatically cancelled out. The data type supports titers (e.g., "1:128") and other quantities produced by laboratories that truly represent ratios. Ratios are not simply "structured numerics", particularly blood pressure measurements (e.g. "120/60") are not ratios. In many cases the should be used instead of the .
Physical Quantity | PQ | A dimensioned quantity expressing the result of measuring.
Monetary Amount | MO | A monetary amount is a quantity expressing the amount of money in some currency. Currencies are the units in which monetary amounts are denominated in different economic regions. While the monetary amount is a single kind of quantity (money) the exchange rates between the different units are variable. This is the principal difference between physical quantity and monetary amounts, and the reason why currency units are not physical units.
Point in Time | TS | A quantity specifying a point on the axis of natural time. A point in time is most often represented as a calendar expression.
Set | SET | A value that contains other distinct values in no particular order.
Sequence | LIST | A value that contains other discrete values in a defined sequence.
Bag | BAG | An unordered collection of values, where each value can be contained more than once in the bag.
Interval | IVL | A set of consecutive values of an ordered base data type.
History | HIST | A set of data values that conform to the history item (HXIT) type, (i.e., that have a valid-time property). The history information is not limited to the past; expected future values can also appear.
Uncertain Value - Probabilistic | UVP | A generic data type extension used to specify a probability expressing the information producer's belief that the given value holds.
Parametric Probability Distribution | PPD | A generic data type extension specifying uncertainty of quantitative data using a distribution function and its parameters. Aside from the specific parameters of the distribution, a mean (expected value) and standard deviation are always given to help maintain a minimum layer of interoperability if receiving applications cannot deal with a certain probability distribution.
General Timing Specification | GTS | A set of points in time, specifying the timing of events and actions and the cyclical validity-patterns that may exist for certain kinds of information, such as phone numbers (evening, daytime), addresses (so called "snowbirds," residing in the south during winter and north during summer) and office hours.

1.8

Introduction to the Formal Data Type Definition Language

NOTE: This is not an API specification. While this formal language might resemble some programming language or interface definition language, it is not intended to define the details of programs and other means of implementation. The formal definitions are a normative part of this specification, but this particular language need not be implemented or used in conformant systems; nor need all the semantic properties be implemented or used by conformant systems. The internal workings of systems, their way of implementing data types, and their functionality and services are entirely out of scope of this specification. The formal definition only specifies the meaning of the data values by making statements about how one would theoretically expect these values to relate and behave.

This formal data type definition language3 specifies:

Definition of a data type occurs in two steps. First, the data type is declared. The declaration claims a name for a new data type with a list of names, types, and signatures of the new type's semantic properties. This declares, but does not define, the type. Second, the type is defined through logic statements about what is always true of this type's values and their properties (invariant statements.)

1.8.1

Declaration

Every data type is declared in a form that begins with the keyword type. For example, the following is the header of a declaration for the data type Boolean that has the short name alias BL and extends (specializes) the data type ANY.4

Definition 1:
type Boolean alias BL extends ANY
    values(true, false)
{
    BL      not;
    BL      and(BL x);
};
      

The Boolean data type declaration also contains a values-clause that declares the Boolean's complete set of values (its extension) as named entities. These named values are also valid character string literals. None of the other data types defined in this specification has a finite value set, which is why the values-clause is unique to the Boolean. In the marked-up formal language, value names use Italics font.

The block in curly braces following the header contains declarations of the semantic properties that hold for every value of the data type. A semicolon terminates each property declaration; and another semicolon after the closing curly brace terminates the data type declaration.

A property declaration mentions from left to right: (1) the data type of the property's value domain, (2) the property name, and (3) an optional argument list. The argument list of a property is enclosed in parentheses containing a sequence of argument declarations. Each argument is declared by the data type name and argument name. Semantic properties without arguments do not use an empty argument list.5

The extends-clause has the usual meaning of a specialization relationship known from the object-oriented method.6 Specialization means (a) inheritance of properties from the genus to the species, and (b) substitutability of values of the species type for variables of the genus type. In addition, however, this data type definition language specifies two variants of specialization: extension (extends) and restriction (restricts). Extension indicates that additional properties are being defined for the specialized type. Restriction indicates that the inherited properties are being constrained.

An example for inheritance is: when ANY has the property isNull and BL extends ANY then BL also has this property isNull even though isNull is not listed explicitly in the property declaration of BL. An example for substitutability is: when a property is declared as of a data type ANY and BL extends ANY then a value of such property may be of type BL. In other words, substitutability is the same as subsumption of all values of type BL being also values of type ANY.7

The type declaration may be qualified by the keywords abstract or protected. An abstract type is a type where no value can be just of this type without belonging to a concrete specialization of the abstract type. A protected type is a type that is used inside this specification, but no property outside this specification should be declared of a protected type.8 (We also use the qualifier private at one point. Private types are only specified for the sake of formal definition of other types and are not used in any form outside this specification.)
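
As an informal illustration only (the formal language is not an API, and nothing here is normative), inheritance, substitutability, and the abstract qualifier can be mimicked in Python. The class names DataValue and Boolean mirror the declarations above; the attribute names, the constructor arguments, and the helper describe are assumptions of this sketch.

```python
from abc import ABC, abstractmethod


class DataValue(ABC):
    """Hypothetical stand-in for the abstract type ANY."""

    def __init__(self, null_flavor=None):
        self.null_flavor = null_flavor

    @property
    def is_null(self):
        # Inherited by every specialization without being redeclared.
        return self.null_flavor is not None

    @abstractmethod
    def type_name(self):
        """Only concrete specializations can name themselves."""


class Boolean(DataValue):
    """Hypothetical stand-in for BL, which extends ANY."""

    def __init__(self, value=None, null_flavor=None):
        super().__init__(null_flavor)
        self.value = value

    def type_name(self):
        return "BL"


def describe(x: DataValue) -> str:
    # Declared for the genus type; a value of any species type may be substituted.
    return f"{x.type_name()} isNull={x.is_null}"
```

Calling describe with a Boolean works because BL extends ANY, while attempting to create a value that is "just" a DataValue fails, which is the sense of abstract used above.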

1.8.2

Invariant Statements

The declaration of semantic properties, their names, data types, and arguments provides only clues as to what the new data type might be about. The true definition lies in the invariant statements. Invariant statements are logical statements that are true at all times.

Throughout this specification, invariant statements are provided in a formal syntax but are also written in plain English. The advantage of the formal syntax is that it can be interpreted unambiguously, and that it is strongly typed. The advantage of plain English statements is that they are more understandable, especially to those untrained in reading formal languages.

The formal syntax helps to sharpen the decisiveness of this specification. In some cases, however, the full semantics of a type are beyond what can be expressed in such invariant statements. The combination of plain and formal language helps to make this specification clearer.

Invariant statements are formed using the invariant keyword that declares one or more variables in the same form as an argument list of a property. The invariant statement can contain a where clause that constrains the arguments for the entire invariant body. The invariant body is enclosed in curly braces. It contains a list of assertions that must all be true.

Definition 2:
invariant(BL x) where x.nonNull {
    x.and(true).equals(x);
};
      

The semantics of the invariant statement is a logic predicate with a universal quantifier ("for all").

The above invariant statement can be read in English as "For all Boolean values x, where x is non-NULL it holds that x AND true equals x." All properties should be named such that one can read the assertions like English sentences.9

The argument list of an invariant statement need not be specified if no such argument is needed.

Definition 3:
invariant {
    true.not.equals(false);
    false.not.equals(true);
};
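
Because the Boolean type has a finite value set, the universal reading of Definitions 2 and 3 can be checked exhaustively. The following Python sketch does so under the assumption that Python's True and False stand in for the non-NULL BL values; the helper name holds_for_all is hypothetical.

```python
BL_VALUES = (True, False)  # the non-NULL values named in the values-clause


def holds_for_all(values, assertion, where=lambda x: True):
    """Read 'invariant(BL x) where ... { assertion }' as a universal quantifier."""
    return all(assertion(x) for x in values if where(x))


# Definition 2: for all non-NULL x, x.and(true).equals(x)
definition_2 = holds_for_all(BL_VALUES, lambda x: (x and True) == x)

# Definition 3 has no argument list: both assertions must simply be true.
definition_3 = ((not True) == False) and ((not False) == True)
```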
      
1.8.2.1
Assertion Expressions

Assertions in invariant statements are expressions built with the semantic properties of defined data types. Assertion expressions must have a Boolean value (true or false.)10 No primitive data types, or operations, pre-exist the definition of any data type. The only preexisting features of the assertion expression language are:11

1.8.2.2
Nested Quantifier Expressions

Within assertion expressions, nested quantifier statements can be formed similar to invariant statements. In fact, the universal quantifier built using the forall keyword is the same as the invariant statement. The universal quantifier can be used in a nested expression when the complexity of the problem requires it, such as in the following example:

Definition 4:
invariant(SET<T> x, y) where x.nonNull {
  x.subset(y).equals(
      forall(T element) where x.contains(element) {
        y.contains(element);
      });
};
        

The existence quantifier has its usual meaning from predicate logic. For example, the following invariant means: "SET values x and y intersect if and only if there exists an element e that is contained in both sets x and y."

Definition 5:
invariant(SET<T> x, y) where x.nonNull {
  x.intersects(y).equals(
      exists(T e) {
        x.contains(e);
        y.contains(e);
      });
};
        

The existence quantifier may have a where-clause; however, there is no difference whether an assertion is made as a where-clause or in the body of the existence quantifier. Conversely, for universal quantifiers, the where-clause weakens the assertion since the body now only applies for values that meet the criterion in the where-clause.
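
The different roles of the where-clause can be sketched in Python over a finite domain. This is an illustration under assumptions: the function names forall, exists, subset, and intersects are chosen to mirror the keywords and properties above, and Python sets with an explicit finite domain stand in for SET<T> and the type T.

```python
def forall(domain, body, where=lambda e: True):
    """Universal quantifier: the where-clause narrows what the body must cover."""
    return all(body(e) for e in domain if where(e))


def exists(domain, body, where=lambda e: True):
    """Existence quantifier: the where-clause is just one more conjunct."""
    return any(where(e) and body(e) for e in domain)


def subset(x, y, domain):
    # Definition 4: forall(T element) where x.contains(element) { y.contains(element) }
    return forall(domain, lambda e: e in y, where=lambda e: e in x)


def intersects(x, y, domain):
    # Definition 5: exists(T e) { x.contains(e); y.contains(e); }
    return exists(domain, lambda e: e in x and e in y)
```

Moving x.contains(e) from the body of exists into its where-clause gives the same result, whereas removing the where-clause from forall would turn the subset test into the much stronger claim that every element of the domain is in y.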

1.8.3

Type Conversion

This specification defines certain allowable conversions between data types. For example, there is a pair of conversions between the Character String (ST) and Encapsulated Data (ED) types. This means that if one expects an ED value but actually has an ST value instead, one can turn the ST value into an ED.12

Three kinds of type conversions are defined: promotion, demotion, and character string literals. Type conversions can be implicit or explicit. Implicit type conversion occurs when a certain type is expected (e.g. as an argument to a statement) but a different type is actually provided. If the type provided has a conversion to the type expected the conversion should be done implicitly.

NOTE: an Implementation Technology Specification will have to specify how implicit type conversions are supported. Some technologies support them directly, others do not; in any case, processing rules can be set that specify how these conversions are realized.

An explicit conversion can be specified in an assertion expression using the converted-to type name in parentheses before the converted value. For example, the following is an explicit type conversion in the where clause of an invariant statement.

Definition 6:
invariant(ED x) where ((ST)x).nonNull { ... };
      

The type conversion has lower priority than the property resolution period. Thus "(T)a.b " converts the value of the property b of variable a to data type T while "((T)a).b " converts the value of variable a to T and then references property b of that converted value.

Implicit type conversions in the assertion expressions are performed where possible. If a property's formal argument is declared of data type T, the expression used as an actual argument is of type U, U does not extend T, and U defines a conversion to T, then that conversion from U to T takes effect.

1.8.3.1
Demotion

A demotion is a conversion with a net loss of information. Generally, this means that a more complex type is converted into a simple type.

An example for a demotion is the conversion from Interval (IVL) to a simple Quantity (QTY), e.g. the center of the interval. In the data type definition language, a demotion is declared using the keyword demotion and the data type name to which to demote:

Definition 7:
type Interval alias IVL {
  ...
  demotion  QTY;
  ...
};
        

The specification of demotions shall indicate what information is lost and what the major consequences of losing this information are.

1.8.3.2
Promotion

A promotion is a conversion where new information is generated. Generally, this means that a simpler type is converted into a more complex type.

For example, we allow any Quantity (QTY) to be converted to an Interval (IVL). However, IVL has more semantic properties than QTY, namely the low and high boundaries. Thus, the conversion of QTY to IVL is a promotion. The additional properties of IVL not present in QTY must assume new values, default values, or computed values. The specification of the promotion must indicate what these values are or how they can be generated.

A promoting conversion from type QTY to type IVL is defined as a semantic property of data type QTY using the keyword promotion and the data type name to which to promote:

Definition 8:
type Quantity alias QTY {
  ...
  promotion   IVL;
  ...
};
        

Typically, a promotion is defined from a simple type to a more complex type. Also typically, the simple type is declared earlier in this document than a more complex type. Declaring all promotions to complex types in the simple type would thus involve forward references and would be confusing to the reader. Therefore, an alternative syntax allows promotions to be defined in the more complex type. This is indicated by naming the type from which to promote in an argument list behind the type to which to promote.

Definition 9:
type Interval alias IVL {
  ...
  promotion   IVL (QTY x);
  ...
};
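
The asymmetry between demotion and promotion can be sketched in Python. Plain floats stand in for QTY, and the method names demote and promote are hypothetical. Taking the center for the demotion follows the example given above; generating the degenerate interval [x, x] for the promotion is purely an assumption of this sketch, since each promotion must itself specify how the new values are generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    low: float
    high: float

    def demote(self) -> float:
        """Demotion IVL -> QTY: keep the center, lose the width."""
        return (self.low + self.high) / 2

    @classmethod
    def promote(cls, x: float) -> "Interval":
        """Promotion QTY -> IVL: low and high must be generated (here both equal x)."""
        return cls(low=x, high=x)
```

Demoting Interval(2.0, 6.0) yields 4.0, and promoting 4.0 again yields Interval(4.0, 4.0), not the original interval: the width was the information lost in the demotion.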
        

1.8.4

Literal Form

A literal is a character string representation of a data value. Literals are defined for many types. A literal is a type conversion from and to a Character String (ST) with a specially defined syntax.

Not every conversion from and to an ST is a literal conversion, however. A literal for a data type should be able to represent the entire value set of a data type whereas any other conversion to and from ST may only map a smaller subset of the converted data type.

The purpose of having literals is so that one can write down values in a short human readable form. For example, literals for the types integer number (INT) and real number (REAL) are strings of sign, digits, possibly a decimal point, etc. The more important interval types (IVL<REAL>, IVL<PQ>, IVL<TS>) have literal representations that allow one to use, e.g., "<5" to mean "less than 5", which is much more readable than a fully structured form of the interval. For some of the more advanced data types such as intervals, general timing specification, and parametric probability distribution we expect that the literal form may be the only form seen for representing these values until users have become used to the underlying conceptualizations.

Each literal conversion has its own syntax (grammar,) often aligned with what people find intuitive. This syntax may therefore not be completely straightforward from a computer's perspective.13

NOTE: Character string based Implementation Technology Specifications (ITS) of these abstract data types may or may not choose the literals defined here as their representations for these data types. We expect that the XML ITS will use some but not all of the literals defined here.
1.8.4.1
Declaration

In the data type definition language we declare a literal form as a property of a data type using the keyword literal followed by the data type name ST, since the literal is a conversion to and from the ST data type.

Definition 10:
type IntegerNumber alias INT {
  ...
  literal   ST;
  ...
};
        
1.8.4.2
Definition

The actual definition of the literal form occurs outside the data type declaration body using an attribute grammar. An attribute grammar is a grammar that specifies both syntax and semantics of language structures. The syntax is defined in essentially the Backus-Naur-Form (BNF).14

For example, consider the following simple definition of a data type for cardinal numbers (non-negative integers.) This type definition depends only on the Boolean data type (BL) and has a character string literal declared:

Definition 11:
type CardinalNumber alias CARD {
  BL  isZero;
  BL  equals(CARD x);
  CARD  successor;
  CARD  plus(CARD x);
  CARD  timesTen;
  literal   ST;
};
        

The literal syntax and semantics are first shown in full and then described in detail.

Definition 12:
CARD.literal ST {
  CARD
  : CARD digit  { $.equals($1.timesTen.plus($2)); }
  | digit   { $.equals($1); };

  CARD digit
  : "0"   { $.isZero; }
  | "1"     { $.equals(0.successor); }
  | "2"     { $.equals(1.successor); }
  ...
  | "8"   { $.equals(7.successor); }
  | "9"     { $.equals(8.successor); }
};
        

Every syntactic rule consists of the name of a symbol, a colon and the definition (so called production) of the symbol. A production is a sequence of symbols. These other symbols are also defined in the grammar, or they are terminal symbols. Terminal symbols are character strings written in double quotes or string patterns (called regular expressions.) Thus the form:

Definition 13:
CARD : CARD digit | digit;
        

means that any cardinal number symbol is a cardinal number symbol followed by a digit or just a digit. The vertical bar stands for a disjunction (logical OR.) A syntactic rule ends with a semicolon.

Every symbol has exactly one value of a defined data type. The data type of the symbol's value is declared where the symbol is defined:

Definition 14:
CARD digit : "0" | "1" | "2" | ... | "8" | "9";
        

means that the symbol digit has a value of type CARD. The start-symbol is the data type itself and does not need a separate name.

The semantics of the literal expression is specified in semantic rules enclosed in curly braces for each of the defined productions of a symbol:

symbol : production1 { rule1 } | production2 { rule2 } | ... | productionN { ruleN };

A semantic rule is simply a semicolon-separated list of Boolean assertion expressions of the same kind as those used in invariant statements. However, there are special variables defined in the semantic rule that all begin with a dollar character (e.g., $, $1, $2, $3, ...) The simple $ stands for the value of the currently defined symbol; while $1, $2, $3, etc. stand for the values of the parts of the semantic rule's associated production. For example, in

Definition 15:
CARD
: CARD digit  { $.equals($1.timesTen.plus($2)); }
| digit   { $.equals($1); };
        

the first production "CARD digit" has a semantic rule that says: the value $ of the defined symbol equals the value $1 of the first symbol CARD times ten plus the value $2 of the second symbol digit.15
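
To see how the semantic rules assign a value to a literal, the two productions can be followed by hand in a few lines of Python. This is only a sketch: Python integers stand in for CARD values, and the function name parse_card is hypothetical.

```python
DIGITS = "0123456789"


def parse_card(literal: str) -> int:
    """Evaluate a CARD literal by following the two productions."""
    if not literal or any(c not in DIGITS for c in literal):
        raise ValueError(f"not a CARD literal: {literal!r}")
    digit = DIGITS.index(literal[-1])      # CARD digit : "0" | "1" | ... | "9"
    if len(literal) == 1:
        return digit                       # CARD : digit       { $.equals($1); }
    prefix = parse_card(literal[:-1])      # CARD : CARD digit
    return prefix * 10 + digit             # { $.equals($1.timesTen.plus($2)); }
```

For the literal "128" the recursion descends to "1", then computes 1 times ten plus 2, and finally 12 times ten plus 8.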

A terminal symbol can be specified as a string pattern, a so-called regular expression. The regular expression syntax used here is the classic syntax invented by Aho and used in AWK, LEX, GREP, and PERL. Regular expressions appear between two slashes /.../. In a regular expression pattern every character except [ ] ^ $ . / : ( ) \ | ? * + { } matches itself. The other characters that are actually used in this specification are defined in Table 2.

Table 2: Special Characters for Regular Expressions
Pattern Definition
[ ... ] Specifies a character class. For example, /[A-Za-z]/ matches the characters of the upper and lower case English alphabet.
[^ ...] Specifies a character class negatively. For example, /[^BCD]/ matches any character except B, C, and D.
...? The preceding pattern is optional. For example, /ab?c/ matches "ac" and "abc".
...* The preceding pattern may occur zero or many times. For example, /ab*c/ matches "ac", "abc", "abbc", "abbbc", etc.
...+ The preceding pattern may occur one or more times. For example, /ab+c/ matches "abc", "abbc", "abbbc", but not "ac".
... {n,m} The preceding pattern may occur n to m times, where n and m are cardinal numbers with 0 ≤ n ≤ m. For example, /ab{2,4}c/ matches "abbc", "abbbc", and "abbbbc".
... | ... The pattern on either side of the bar may match. For example, /a(b|c)d/ matches "abd" and "acd" but not "abcd".
( ... ) The pattern in parentheses is used as one pattern for the above operators. For example, /a(bc)*/ matches "a", "abc", "abcbc", "abcbcbc", etc.
... : ... The left pattern matches if followed by the right pattern, but the right pattern is not consumed by a match. For example, /ab:c/ matches "abc" but not "ab", however, the value of a symbol thus matched is "ab" and the "c" is left over for the next symbol. The colon is a slight deviation from the conventional slash / but the slash is also conventionally used to enclose the entire pattern and may occur as a character to match - three meanings is one too many.
... \ ... Matches the following character literally, i.e. escapes from any special meaning of that character. For example, /a\+b/ matches "a+b".
... \/ ... Matches the slash as a character. For example, /a\/bc/ matches "a/bc".
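
Most of the operators in the table can be tried out with Python's re module, which follows the same conventions for them. The colon (trailing context) operator has no direct counterpart there; approximating it with a lookahead assertion is an assumption of this sketch, as is the helper name matches.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


optional = [matches(r"ab?c", s) for s in ("ac", "abc", "abbc")]
one_or_more = [matches(r"ab+c", s) for s in ("ac", "abc", "abbc")]
counted = [matches(r"ab{2,4}c", s) for s in ("abc", "abbc", "abbbbc")]
grouped = [matches(r"a(bc)*", s) for s in ("a", "abcbc", "ab")]
escaped = matches(r"a\+b", "a+b")

# The specification's /ab:c/ is approximated by a lookahead: "ab" is consumed
# only if "c" follows, and the "c" is left over for the next symbol.
trailing = re.match(r"ab(?=c)", "abc")
consumed = trailing.group(0)  # the matched value, without the trailing "c"
```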

1.8.5

Generic Data Types

Generic data types are incomplete type definitions. This incompleteness is signified by one or more parameters to the type definition. Usually parameters stand for other types. Using parameters, a generic type might declare semantic properties of other not fully specified data types. For example, the generic data type Interval is declared with a parameter T that can stand for any Quantity data type (QTY). The components low and high are declared as being of type T.

Definition 16:
template<QTY T>
type Interval<T> alias IVL<T> {
    T   low;
    T   high;
};
      

Instantiating a generic type means completing its definition. For example, to instantiate an Interval, one must specify of what base data type the interval should be. This is done by binding the parameter T. To instantiate an Interval of Integer numbers, one would bind the parameter T to the type Integer. Thus, the incomplete data type Interval is completed to the data type Interval of Integer.

For example the following type definition for MyType declares a property named "multiplicity" that is an interval of the cardinal number data type used in the above examples.

Definition 17:
type MyType alias MT {
    IVL<CARD> multiplicity;
};
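
Binding a type parameter has a close analogue in Python's typing module. The following sketch assumes that int and float stand in for Quantity types and that the interval includes both boundaries; the contains method is hypothetical and not part of Definition 16.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T", int, float)  # stand-in for "any Quantity data type"


@dataclass(frozen=True)
class Interval(Generic[T]):
    """Incomplete until the parameter T is bound."""
    low: T
    high: T

    def contains(self, x: T) -> bool:
        # Assumes closed boundaries for the purpose of illustration.
        return self.low <= x <= self.high


@dataclass(frozen=True)
class MyType:
    multiplicity: Interval[int]  # the parameter T is bound to int
```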
      
1.8.5.1
Generic Collections

Generic data types for collections are used throughout this specification. The most important of them are:

Set (SET<T>.) A set contains elements in no particular order and without duplicate elements. The SET<T> data type requires all elements of a set to be of the same data type.

Sequence (LIST<T>.) A sequence is a collection of values in an arbitrary but particular order. A sequence has a head and a tail, where the head is an element and the tail is the sequence without its head.

Interval (IVL<T>.) An interval is a continuous subset of an ordered type.

These and other generic types are fully defined later in this specification. These generic data types and their properties are used in this specification early on. For the best understanding of this specification, knowledge of the set, sequence, and interval types is important, and the reader is advised to refer to their full definitions when coming across a generic type being used to define another type.

1.8.5.2
Generic Type Extensions

Generic data type extensions are generic types with one parameter type that the generic type extends. In the formal data type definition language, generic type extensions follow the pattern:

Definition 18:
template<ANY T> type GenericTypeExtensionName extends T { ... };
        

These generic type extensions inherit properties of their base type and add some specific feature to it. The generic type extension is a specialization of the base type, thus a value of the extension data type can be used instead of its base data type.16

NOTE: values of extended types can be substituted for their base type. However, an ITS may make some constraints as to what extensions to accommodate. Particularly, extensions need not be defined for those components carrying the values of data value properties. Thus, while any data value can be annotated outside the data type specification, an ITS may not provide for a way to annotate the value of a data value property.

Figure 2: Fundamental data types

1.9

DataType (type)

Definition:      A meta-type declared in order to allow the formal definitions to speak about the data type of a value. Any data type defined in this specification is a value of the type DataType.

Definition 19:
private type DataType extends DataValue {
    CE  name;
};
      

1.9.1

Properties of DataType (type)

1.9.1.1
Name (name : CE)

Definition:      A CE specifying the identifier of the data type. The short alias name, if defined, is the main code value, in which case the long name is an equivalent translation in the CE value.

1.10

DataValue (ANY)

Definition:      Defines the basic properties of every data value. This is an abstract type, meaning that no value can be just a data value without belonging to any concrete type. Every concrete type is a specialization of this general abstract DataValue type.

Definition 20:
abstract type DataValue alias ANY {
    DataType  dataType;
    BL  nonNull;
    CS  nullFlavor;
    BL  isNull;
    BL  notApplicable;
    BL  unknown;
    BL  other;
    BL  equals(ANY x);
};
      

1.10.1

Properties of DataValue (ANY)

1.10.1.1
Data Type (dataType : type)

Definition:      Represents the fact that every data value implicitly carries information about its own data type. Thus, given a data value one can inquire about its data type.

Definition 21:
invariant(ANY x) {
  x.dataType.nonNull;
};
        
1.10.1.2
Proper Value (nonNull : BL)

Definition:      Indicates that a value is a non-exceptional value of the data type.

Definition 22:
invariant(ANY x) {
  x.isNull.equals(x.nonNull.not);
};
        

When a property, RIM attribute, or message field is called mandatory, this means that any non-NULL value of the type to which the property belongs has a non-NULL value for that property. In other words, the field may not be NULL, provided that its container (object, segment, etc.) is to have a non-NULL value.

1.10.1.3
Exceptional Value (isNull : BL)

Definition:      Indicates that a value is an exceptional value, or a NULL-value. A null value means that the information does not exist, is not available, or cannot be expressed in the data type's normal value set.

Every data element has either a proper value or it is considered NULL. If (and only if) it is NULL, the nullFlavor property provides more detail as to in what way or why no proper value is supplied.

1.10.1.4
Exceptional Value Detail (nullFlavor : CS)

Definition:      If a value is an exceptional value (NULL-value), this specifies in what way and why proper information is missing.

Definition 23:
invariant(ANY x) {
  x.nonNull.equals(x.nullFlavor.isNull);
};
        
Table 3: Domain NullFlavor:
code name definition
NI NoInformation No information whatsoever can be inferred from this exceptional value. This is the most general exceptional value. It is also the default exceptional value.
  NA not applicable No proper value is applicable in this context (e.g., last menstrual period for a male.)
  UNK unknown A proper value is applicable, but not known.
    NASK not asked This information has not been sought (e.g., patient was not asked)
    ASKU asked but unknown Information was sought but not found (e.g., patient was asked but didn't know)
      NAV temporarily unavailable Information is not available at this time but it is expected that it will be available later.
  OTH other The actual value is not an element in the value domain of a variable. (e.g., concept not provided by required code system.)
    PINF positive infinity Positive infinity of numbers.
    NINF negative infinity Negative infinity of numbers.
NP not present Value is not present in a message. This is only defined in messages, never in application data! All values not present in the message must be replaced by the applicable default, or no-information (NI) as the default of all defaults.

The null flavors are a general domain extension of all normal data types. Note the distinction between value domain of any data type and the vocabulary domain of coded data types. A vocabulary domain is a value domain for coded values, but not all value domains are vocabulary domains.

The null flavor "other" is used whenever the actual value is not in the required value domain. This may occur, for example, when the value exceeds constraints that are defined too restrictively (e.g., age less than 100 years.)

NOTE: NULL-flavors are applicable to any property of a data value or a higher-level object attribute. Where the difference of null flavors is not significant, ITS are not required to represent them. If nothing else is noted in this specification, ITS need not represent general NULL-flavors for data-value properties.

Some of these null flavors are defined as named properties that can be used as simple predicates for all data values. This is done to simplify the formulation of invariants in the remainder of this specification.

Remember the difference between semantic properties and representational "components" of data values. An ITS must only represent those components that are needed to infer the semantic properties. The null-flavor predicates ANY.nonNull, ANY.isNull, ANY.notApplicable, ANY.unknown, and ANY.other can all be inferred from the nullFlavor property.

1.10.1.5
Inapplicable Proper Value (notApplicable : BL)

Definition:      A predicate indicating that this exceptional value is of ANY.nullFlavor not-applicable (NA), i.e., that a proper value is not meaningful in the given context.

Definition 24:
invariant(ANY x) {
  x.notApplicable.equals(x.nullFlavor.implies(NA));
};
        
1.10.1.6
Unknown (unknown : BL)

Definition:      A predicate indicating that this exceptional value is of ANY.nullFlavor unknown (UNK).

Definition 25:
invariant(ANY x) {
  x.unknown.equals(x.nullFlavor.implies(UNK));
};
        
1.10.1.7
Value Domain Exception (other : BL)

Definition:      A predicate indicating that this exceptional value is of ANY.nullFlavor other (OTH), i.e., that the required value domain does not contain the appropriate value.

Definition 26:
invariant(ANY x) {
  x.other.equals(x.nullFlavor.implies(OTH));
};
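
That the predicates can be inferred from the nullFlavor alone can be sketched in Python. Two assumptions are made here and are not stated by this specification: the parent of each code is read off the indentation of the NullFlavor table, and implies on a coded value is interpreted as "equals the given code or is a specialization of it". The function and variable names are hypothetical, and None stands for the absence of a null flavor (a proper value).

```python
# Parent of each code, as read off the indentation of the NullFlavor table.
PARENT = {
    "NI": None, "NA": "NI", "UNK": "NI", "OTH": "NI",
    "NASK": "UNK", "ASKU": "UNK", "NAV": "ASKU",
    "PINF": "OTH", "NINF": "OTH", "NP": None,
}


def implies(code, ancestor):
    """True if code equals ancestor or is a specialization of it."""
    while code is not None:
        if code == ancestor:
            return True
        code = PARENT[code]
    return False


def is_null(null_flavor):
    return null_flavor is not None


def not_applicable(null_flavor):
    return implies(null_flavor, "NA")


def unknown(null_flavor):
    return implies(null_flavor, "UNK")


def other(null_flavor):
    return implies(null_flavor, "OTH")
```

Under this reading a value with null flavor ASKU also satisfies the unknown predicate, while a proper value satisfies none of the predicates.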
        
1.10.1.8
Equality (equals : BL)

Definition:      Equality is a reflexive, symmetric, and transitive relation between any two data values. Only proper values can be equal; null values are never equal (even if they have the same null flavor.)

Definition 27:
invariant(ANY x, y, z)
  where x.nonNull.and(y.nonNull).and(z.nonNull)
{
  x.equals(x);                                        /* reflexivity */
  x.equals(y).equals(y.equals(x));                    /* symmetry */
  x.equals(y).and(y.equals(z)).implies(x.equals(z));  /* transitivity */
  x.equals(y).implies(x.dataType.equals(y.dataType));
};
        

How equality is determined must be defined for each data type. If nothing else is specified, two data values are equal if they are indistinguishable, that is, if they differ in none of their semantic properties. A data type can "override" this general definition of equality, by specifying its own equals relationship. This overriding of the equality relation can be used to exclude semantic properties from the equality test. If a data type excludes semantic properties from its definition of equality, this implies that certain properties (or aspects of properties) that are not part of the equality test are not essential to the meaning of the value.

For example the physical quantity has the two semantic properties (1) a real number and (2) a coded unit of measure. The equality test, however, must account for the fact that, e.g., 1 meter equals 100 centimeters; independent equality of the two semantic properties is too strong a criterion for the equality test. Therefore, physical quantity must override the equality definition.

NOTE: with data values, no distinction exists between equality and identity. Equality is a static property between two values, and values never change.
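
Overriding equality for the physical quantity can be sketched in Python as follows. The conversion table TO_METERS and the class layout are hypothetical and cover only a few length units for illustration; exact fractions are used so that the comparison is not disturbed by rounding.

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical conversion factors to a common base unit (meters).
TO_METERS = {"m": Fraction(1), "cm": Fraction(1, 100), "mm": Fraction(1, 1000)}


@dataclass(frozen=True, eq=False)
class PhysicalQuantity:
    value: Fraction
    unit: str

    def _canonical(self) -> Fraction:
        return self.value * TO_METERS[self.unit]

    def __eq__(self, other):
        # Overrides property-by-property comparison: compare in the base unit.
        if not isinstance(other, PhysicalQuantity):
            return NotImplemented
        return self._canonical() == other._canonical()

    def __hash__(self):
        return hash(self._canonical())
```

With this definition 1 m equals 100 cm even though neither the number nor the unit is the same, which is exactly the case that independent comparison of the two properties would get wrong.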

2

Basic Types

2.1

Boolean (BL)

Definition:      The Boolean type stands for the values of two-valued logic. A Boolean value can be either true or false, or, like any other value, may be NULL.

Definition 28:
type Boolean alias BL extends ANY
    values(true, false)
{
            BL  and(BL x);
            BL  not;
  literal   ST;
            BL  or(BL x);
            BL  eor(BL x);
            BL  implies(BL x);
};
    

With any data value potentially being NULL, the two-valued logic is effectively extended to a three-valued logic as shown in the following truth tables:

Table 4: Truth tables for Boolean logic with NULL values
x      NOT x
true   false
false  true
NULL   NULL

AND    true   false  NULL
true   true   false  NULL
false  false  false  false
NULL   NULL   false  NULL

OR     true   false  NULL
true   true   true   true
false  true   false  NULL
NULL   true   NULL   NULL
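
The three truth tables can be written down compactly in Python. This is a sketch in which None stands for NULL; the function names not3, and3, and or3 are hypothetical.

```python
def not3(x):
    """NOT: NULL stays NULL."""
    return None if x is None else (not x)


def and3(x, y):
    """AND: false dominates, even against NULL."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def or3(x, y):
    """OR: true dominates, even against NULL."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False
```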

2.1.1

Properties of Boolean (BL)

2.1.1.1
Negation (not : BL)

Definition:      Negation of a Boolean turns true into false and false into true and is NULL for NULL values.

Definition 29:
invariant(BL x) {
  true.not.equals(false);
  false.not.equals(true);
  x.isNull.equals(x.not.isNull);
};
      
2.1.1.2
Conjunction (and : BL)

Definition:      Conjunction (AND) is associative and commutative, with true as a neutral element. False AND any Boolean value is false. These rules hold even if one or both of the operands are NULL. If both operands for AND are NULL, the result is NULL.

Definition 30:
invariant(BL x, y) {
  x.and(true).equals(x);
  x.and(false).equals(false);
  x.isNull.and(y.isNull).implies(x.and(y).isNull);
};
      
2.1.1.3
Disjunction (or : BL)

Definition:      The disjunction x OR y is false if and only if x is false and y is false.

Definition 31:
invariant(BL x, y) {
  x.or(y).equals(x.not.and(y.not).not);
};
      
2.1.1.4
Exclusive Disjunction (eor : BL)

Definition:      The exclusive-OR constrains OR such that the two operands may not both be true.

Definition 32:
invariant(BL x, y) {
  x.eor(y).equals(x.or(y).and(x.and(y).not));
};
      
2.1.1.5
Implication (implies : BL)

Definition:      The logical implication is important to make invariant statements. An implication is a rule of the form IF condition THEN conclusion. Logically the implication is defined as the disjunction of the negated condition and the conclusion, meaning that when the condition is true the conclusion must be true to make the overall statement true.

Definition 33:
invariant(BL condition, conclusion) {
  condition.implies(conclusion).equals(condition.not.or(conclusion));
};
      

The implication is not reversible and does not specify what is true when the condition is false (ex falso quodlibet, Latin for “from false follows anything”).
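As an illustration, the invariants of Definitions 31 to 33 can be read as executable definitions. The sketch below is a Python rendering under stated assumptions: None stands in for NULL, and the function names are hypothetical. It derives or, eor, and implies from not and and.

```python
def bl_not(x):
    # NULL (None) stays NULL under negation.
    return None if x is None else (not x)


def bl_and(x, y):
    # false AND anything is false, even if the other operand is NULL.
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def bl_or(x, y):
    # Definition 31: x OR y equals NOT(NOT x AND NOT y).
    return bl_not(bl_and(bl_not(x), bl_not(y)))


def bl_eor(x, y):
    # Definition 32: (x OR y) AND NOT(x AND y).
    return bl_and(bl_or(x, y), bl_not(bl_and(x, y)))


def bl_implies(condition, conclusion):
    # Definition 33: NOT condition OR conclusion.
    return bl_or(bl_not(condition), conclusion)
```

Here bl_implies(False, None) yields True, because a false condition makes the implication true regardless of the conclusion, while bl_implies(True, None) yields None (NULL).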

2.1.1.6
Literal Form

The literal form of the Boolean is determined by the named values specified in the values clause, i.e., true and false.

Figure 3: Overview of Text and Multimedia Data Types

2.2

Encapsulated Data (ED)

Definition:      Data that is primarily intended for human interpretation or for further machine processing outside the scope of HL7. This includes unformatted or formatted written language, multimedia data, or structured information as defined by a different standard (e.g., XML signatures). Instead of the data itself, an ED may contain only a reference (see TEL). Note that the ST data type is a specialization of the ED data type when the ED media type is text/plain.

Definition 34:
type EncapsulatedData alias ED extends BIN {
  CS   mediaType;
  CS   charset;
  CS   language;
  CS   compression;
  TEL  reference;
  BIN  integrityCheck;
  CS   integrityCheckAlgorithm;
  ED   thumbnail;
  BL   equals(ED x);
};
    

Encapsulated data can be present in two forms, inline or by reference. Inline data is communicated or moved as part of the encapsulated data value, whereas by-reference data may reside at a different (remote) location. The data is the same whether it is located inline or remote.

2.2.1

Binary Data (BIN)

Definition:      Binary data is a raw block of bits. Binary data is a protected type that should not be declared outside the data type specification.

A bit is semantically identical with a non-null Boolean value. Thus, all binary data is — semantically — a sequence of non-null Boolean values.

Definition 35:
protected type BinaryData alias BIN extends LIST<BL>;
    
NOTE: the representation of arbitrary binary data is the responsibility of an ITS. How the ITS accomplishes this depends on the underlying Implementation Technology (whether it is character-based or binary) and on the data being represented. Semantically, character data is represented as binary data; however, a character-based ITS should not convert character data into arbitrary binary data and then represent that binary data in a character encoding. Ultimately even character-based implementation technology will communicate binary data.

An empty sequence is not considered binary data but counts as a NULL-value. In other words, non-NULL binary data contains at least one bit. No bit in a non-NULL binary data value can be NULL.

Definition 36:
invariant(BIN x) where x.nonNull {
  x.nonEmpty;
  x.length.greaterThan(0);
  x.head.nonNull;
};
      

2.2.2

Properties of Encapsulated Data (ED)

2.2.2.1
Media Type (mediaType : CS, default text/plain)

Definition:      Identifies the encoding of the encapsulated data and identifies a method to interpret or render the data.

The mediaType is a mandatory property, i.e., every non-NULL instance of encapsulated data must have a defined mediaType property.

Definition 37:
invariant(ED x) where x.nonNull {
  x.mediaType.nonNull;
};
        

The IANA defined domain of media types is established by the Internet standard RFC 2046 [http://www.isi.edu/in-notes/rfc2046.txt]. RFC 2046 defines the media type to consist of two parts:

  1. top level media type, and
  2. media subtype.

However, this specification treats the entire media type as one atomic code symbol in the form defined by IANA, i.e., top level type followed by a slash "/" followed by media subtype. Currently defined media types are registered in a database [http://www.isi.edu/in-notes/iana/assignments/media-types] maintained by IANA. Currently more than 160 different MIME media types are defined, with the list growing rapidly. In general, all those types defined by the IANA may be used.

To promote interoperability, this specification prefers certain media types to others. The aim is to define a greatest common denominator that not only makes interoperability possible but is also powerful enough to support even advanced multimedia communication needs.

Table 6 below assigns a status to certain MIME media types, where the status means one of the following:

Table 6: Domain MediaType:
code   name   status   definition
text/plain   Plain Text   required   For any plain text. This is the default and is equivalent to a character string (ST) data type.  
text/x-hl7-ft   HL7 Text   recommended   For compatibility, this represents the HL7 v2.x FT data type. Its use is recommended only for backward compatibility with HL7 v2.x systems.  
text/html   HTML Text   recommended   For marked-up text according to the Hypertext Mark-up Language. HTML markup is sufficient for typographically marking-up most written-text documents. HTML is platform independent and widely deployed.  
application/pdf   PDF   recommended   The Portable Document Format is recommended for written text that is completely laid out and read-only. PDF is a platform independent, widely deployed, and open specification with freely available creation and rendering tools.  
text/xml   XML Text   indifferent   For structured character based data. There is a risk that general SGML/XML is too powerful to allow a sharing of general SGML/XML documents between different applications.  
text/rtf   RTF Text   indifferent   The Rich Text Format is widely used to share word-processor documents. However, RTF does have compatibility problems, as it is quite dependent on the word processor. May be useful if editable word-processor text should be shared.
application/msword   MSWORD   deprecated   This format is very prone to compatibility problems. If sharing of editable text is required, text/plain, text/html or text/rtf should be used instead.
audio/basic   Basic Audio   required   This is a format for single-channel audio, encoded using 8-bit ISDN mu-law [PCM] at a sample rate of 8000 Hz. This format is standardized by: CCITT, Fascicle III.4, Recommendation G.711. Pulse Code Modulation (PCM) of Voice Frequencies. Geneva, 1972.
audio/mpeg   MPEG audio layer 3   required   MPEG-1 Audio layer-3 is an audio compression algorithm and file format defined in ISO 11172-3 and ISO 13818-3. MP3 has an adjustable sampling frequency for highly compressed telephone to CD quality audio.  
audio/k32adpcm   K32ADPCM Audio   indifferent   ADPCM allows compressing audio data. It is defined in the Internet specification RFC 2421 [ftp://ftp.isi.edu/in-notes/rfc2421.txt]. Its implementation base is unclear.  
image/png   PNG Image   required   Portable Network Graphics (PNG) [http://www.cdrom.com/pub/png] is a widely supported lossless image compression standard with open source code available.  
image/gif   GIF Image   indifferent   GIF is a popular format that is universally well supported. However GIF is patent encumbered and should therefore be used with caution.  
image/jpeg   JPEG Image   required   This format is required for high compression of high-color photographs. It is a "lossy" compression, but the difference from lossless compression is almost unnoticeable to human vision.
image/g3fax   G3Fax Image   recommended   This is recommended only for fax applications.  
image/tiff   TIFF Image   indifferent   Although TIFF (Tag Image File Format) is an international standard, it has many interoperability problems in practice. There are too many different versions, and not all software handles them alike.
video/mpeg   MPEG Video   required   MPEG is an international standard, widely deployed, highly efficient for high color video; open source code exists; highly interoperable.  
video/x-avi   X-AVI Video   deprecated   The AVI file format is just a wrapper for many different codecs; it is a source of many interoperability problems.  
model/vrml   VRML Model   recommended   This is an openly standardized format for 3D models that can be useful for virtual reality applications such as anatomy or biochemical research (visualization of the steric structure of macromolecules).

The set of required media types is very small so that no undue requirements are forced on HL7 applications, especially legacy systems. In general, no HL7 application is forced to support any given kind of media other than written text. For example, many systems do not want to receive audio data, because they can only show written text to their users. It is a matter of application conformance statements to say: "I will not handle audio". Only if a system claims to handle audio media must it support the required media types for audio.

2.2.2.2
Charset (charset : CS)

Definition:      For character-based encoding types, this property specifies the character set and character encoding used. The charset is defined according to Internet RFC 2278, [http://www.isi.edu/in-notes/rfc2278.txt].

The charset domain is maintained by the Internet Assigned Numbers Authority (IANA) [http://www.isi.edu/in-notes/iana/assignments/character-sets]. The IANA source specifies names and multiple aliases for most character sets. For HL7's purposes, use of multiple alias names is not allowed. The standard name for HL7 is the one marked by IANA as "preferred for MIME." If IANA has not marked one of the aliases as "preferred for MIME", the main name shall be the one used for HL7.

Table 7 lists a few of the IANA defined character sets that are of interest to current HL7 members.

Table 7: Domain Charset:
code   name   definition
EBCDIC   EBCDIC   HL7 is indifferent to the use of this Charset.
ISO-10646-UCS-2   ISO-10646-UCS-2   Deprecated for HL7 use.
ISO-10646-UCS-4   ISO-10646-UCS-4   Deprecated for HL7 use.
ISO-8859-1   ISO-8859-1   HL7 is indifferent to the use of this Charset.
ISO-8859-2   ISO-8859-2   HL7 is indifferent to the use of this Charset.
ISO-8859-5   ISO-8859-5   HL7 is indifferent to the use of this Charset.
JIS-2022-JP   JIS-2022-JP   HL7 is indifferent to the use of this Charset.
US-ASCII   US-ASCII   Required for HL7 use.
UTF-7   UTF-7   HL7 is indifferent to the use of this Charset.
UTF-8   UTF-8   Required for Unicode support.

NOTE: The above list is neither complete nor exclusive. In particular, international HL7 affiliates may make special recommendations about charsets to be used in their realm. These recommendations may add charsets and may reassign the recommendation status of a listed charset.

2.2.2.3
Language (language : CS)

Definition:      For character based information the language property specifies the human language of the text.

The need for a language code for text data values is documented in RFC 2277, IETF Policy on Character Sets and Languages [http://www.isi.edu/in-notes/rfc2277.txt]. Further background information can be found in Using International Characters in Internet Mail [http://www.imc.org/mail-i18n.html], a memo by the Internet Mail Consortium.

The principles of the code domain of this attribute are specified by the Internet standard RFC 1766. It is a set of pre-coordinated pairs of one 2-letter ISO 639 language code and one 2-letter ISO 3166 country code.[17]

Language tags do not modify the meaning of the characters found in the text; they only advise on whether and how to present or communicate the text.[18]

NOTE: The representation of language tags for text is highly dependent on the ITS. An ITS should use the native way of language tagging provided by its target implementation technology. Some may have language information in a separate component, e.g., XML has the xml:lang attribute for strings. Others may rely on language tags as part of the binary character string representation, e.g., ISO 10646 (Unicode) and its "plane-14" language tags.

The language tag should not be mandatory if it is not mandatory in the implementation technology. Semantically, language tagging of strings follows a default-logic. If nothing else is specified the local language is assumed. If a language is set for an entire message or document, that language is the default. If any information element or value that is superior in the syntax hierarchy specifies a language, that language is the default for all subordinate text values.
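The default logic described above can be sketched as a simple lookup from the innermost element outward. The function name effective_language and its parameters are hypothetical, and the language codes in the usage note are placeholders; how the hierarchy is actually represented depends on the ITS.

```python
from typing import Optional, Sequence


def effective_language(
    ancestors: Sequence[Optional[str]],
    own: Optional[str],
    local_default: str,
) -> str:
    """Resolve the language of a text value.

    `ancestors` lists the languages declared by superior elements, ordered
    from outermost (message or document) to innermost; None means that the
    element declares no language.
    """
    if own is not None:
        return own                      # an explicit tag on the value wins
    for lang in reversed(ancestors):    # nearest superior element first
        if lang is not None:
            return lang
    return local_default                # nothing specified: local language
```

For example, effective_language(["en", None, "de"], None, "fr") resolves to "de", the language of the nearest superior element that declares one.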

If language tags are present in the beginning of the encoded binary text (e.g., through Unicode's plane-14 tags) this is the source of the language property of the encapsulated data value.

2.2.2.4
Compression (compression : CS, default NULL)

Definition:      Indicates whether the raw byte data is compressed, and what compression algorithm was used.

Table 8: Domain CompressionAlgorithm:
code   name   definition
DF   deflate   The deflate compressed data format as specified in RFC 1951 [ftp://ftp.isi.edu/in-notes/rfc1951.txt].
GZ   gzip   A compressed data format that is compatible with the widely used GZIP utility as specified in RFC 1952 [ftp://ftp.isi.edu/in-notes/rfc1952.txt] (uses the deflate algorithm).
ZL   zlib   A compressed data format that also uses the deflate algorithm. Specified as RFC 1950 [ftp://ftp.isi.edu/in-notes/rfc1950.txt].
Z   compress   Original UNIX compress algorithm and file format using the LZC algorithm (a variant of LZW). Patent encumbered and less efficient than deflate.

ST may never be compressed.
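Three of the codes in Table 8 map onto routines in the Python standard library, which the following sketch uses to show how the formats differ. The function names compress and decompress are hypothetical; the code Z (UNIX compress) is deliberately left unsupported because the standard library offers no LZC implementation.

```python
import gzip
import zlib


def compress(data: bytes, code: str) -> bytes:
    """Compress `data` according to a CompressionAlgorithm code."""
    if code == "DF":                       # raw deflate stream (RFC 1951)
        c = zlib.compressobj(wbits=-15)    # negative wbits: no header or trailer
        return c.compress(data) + c.flush()
    if code == "ZL":                       # zlib format (RFC 1950)
        return zlib.compress(data)
    if code == "GZ":                       # gzip format (RFC 1952)
        return gzip.compress(data)
    raise ValueError(f"unsupported compression code: {code}")


def decompress(data: bytes, code: str) -> bytes:
    """Invert compress() for the same code."""
    if code == "DF":
        return zlib.decompress(data, wbits=-15)
    if code == "ZL":
        return zlib.decompress(data)
    if code == "GZ":
        return gzip.decompress(data)
    raise ValueError(f"unsupported compression code: {code}")
```

All three supported codes use the same deflate algorithm; they differ only in the header and trailer that wrap the compressed stream.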

2.2.2.5
Reference (reference : TEL)

Definition:      A telecommunication address (TEL), such as a URL for HTTP or FTP, which will resolve to precisely the same binary data that could as well have been provided as inline data.

The semantic value of an encapsulated data value is the same, regardless of whether the data is present inline or only by reference. However, an encapsulated data value without inline data behaves differently, since any attempt to examine the data requires the data to be downloaded from the reference.

An encapsulated data value may have both inline data and a reference. The reference must point to the same data as provided inline.

By-reference encapsulated data may not be allowed depending on the attribute or component that is declared encapsulated data. ST must always be inline.

2.2.2.6
Integrity Check (integrityCheck : BIN)

Definition:      The integrity check is a short binary value representing a cryptographically strong checksum that is calculated over the binary data. The purpose of this property, when communicated with a reference, is to let anyone validate later whether the reference still resolves to the same data that it resolved to when the encapsulated data value with the reference was created.

The integrity check is calculated according to the ED.integrityCheckAlgorithm. By default, the Secure Hash Algorithm-1 (SHA-1) shall be used. The integrity check is binary encoded according to the rules of the integrity check algorithm.

The integrity check is calculated over the raw binary data that is contained in the data component, or that is accessible through the reference. No transformations are made before the integrity check is calculated. If the data is compressed, the Integrity Check is calculated over the compressed data.
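A minimal sketch of the calculation, assuming Python's hashlib and zlib modules: the function name integrity_check and the HASH_NAMES mapping are hypothetical, and the sample data is invented for the illustration. The point shown is that the digest is taken over the bytes as carried, i.e., over the compressed data when compression is used.

```python
import hashlib
import zlib

# Mapping from IntegrityCheckAlgorithm codes to hashlib algorithm names.
HASH_NAMES = {"SHA-1": "sha1", "SHA-256": "sha256"}


def integrity_check(raw: bytes, algorithm: str = "SHA-1") -> bytes:
    """Return the binary digest of `raw`; SHA-1 is the default algorithm."""
    return hashlib.new(HASH_NAMES[algorithm], raw).digest()


original = b"example report text " * 50
transmitted = zlib.compress(original)   # the data as carried in the ED value
check = integrity_check(transmitted)    # computed over the compressed bytes
```

A receiver that later dereferences the data would recompute the digest over the bytes it obtains, before decompressing them, and compare the result with the communicated value.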

2.2.2.7
Integrity Check Algorithm (integrityCheckAlgorithm : CS, default SHA-1)

Definition:      Specifies the algorithm used to compute the integrityCheck value.[19]

Table 9: Domain IntegrityCheckAlgorithm:
code   name   definition
SHA-1   secure hash algorithm - 1   This algorithm is defined in FIPS PUB 180-1: Secure Hash Standard. As of April 17, 1995.
SHA-256   secure hash algorithm - 256   This algorithm is defined in FIPS PUB 180-2: Secure Hash Standard.

2.2.2.8
Thumbnail (thumbnail : ED, default NULL)

Definition:      A thumbnail is an abbreviated rendition of the full data.[20] A thumbnail requires significantly fewer resources than the full data, while still maintaining some distinctive similarity with the full data. A thumbnail is typically used with by-reference encapsulated data. It allows a user to select data more efficiently before actually downloading through the reference.

Thumbnails may not be allowed, depending on the attribute or component that is declared as encapsulated data. ST values never have thumbnails, and a thumbnail may not itself contain a thumbnail.

Definition 38:
invariant(ED x) where x.thumbnail.nonNull {
  x.thumbnail.thumbnail.isNull;
};
          
NOTE: The ITS should consider the case where the thumbnail and the original both have the same properties of type, charset and compression. In this case, these properties need not be represented explicitly for the thumbnail but might be "inherited" from the main encapsulated data value to its thumbnail.
2.2.2.9
Equality (equals : BL, inherited from ANY)

Two values of type Encapsulated Data are equal if and only if their type and referenced data are equal. For those ED values with compressed data or remote data, only the de-referenced and uncompressed data counts for the equality test. The compression and reference properties themselves are excluded from the equality test, as are the thumbnail and language properties. If the ED.mediaType is character based and the charset properties are not equal, the difference must be resolved by mapping the data between the different character sets.

The integrity check algorithm and integrity check are excluded from the equality test. However, since equality of the integrity check values is a strong indication of equality of the data, the equality test can in practice be based on the integrity check, given equal integrity check algorithm properties.
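The equality rule can be illustrated with a simplified Python sketch. It rests on several assumptions: the class and function names are hypothetical, only inline data and the ZL (zlib) compression code are handled, dereferencing of remote data is omitted, and character set mapping is approximated by decoding text with Python's built-in codecs.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class ED:
    media_type: str
    data: bytes                          # inline data, possibly compressed
    charset: Optional[str] = None
    compression: Optional[str] = None    # this sketch handles only "ZL"
    language: Optional[str] = None       # excluded from the equality test

    def content(self):
        # Only the uncompressed data counts for the comparison.
        raw = zlib.decompress(self.data) if self.compression == "ZL" else self.data
        # For character-based media, map the data to characters so that
        # differing charsets do not affect the result.
        if self.media_type.startswith("text/") and self.charset:
            return raw.decode(self.charset)
        return raw


def ed_equals(a: ED, b: ED) -> bool:
    # Compression, charset, and language are not compared as properties.
    return a.media_type == b.media_type and a.content() == b.content()
```

Under these assumptions, an uncompressed UTF-8 value and a zlib-compressed ISO-8859-1 value holding the same text compare as equal.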

2.3

Character String (ST)

Definition:      The character string data type stands for text data, primarily intended for machine processing (e.g., sorting, querying, indexing, etc.) Used for names, symbols, and formal expressions.

The character string is a restricted encapsulated data type (ED), whose type property is fixed to text/plain, and whose data must be inlined and not compressed. Thus, the properties compression, reference, integrity check, integrity check algorithm, and thumbnail are not applicable. The character string data type is used when the appearance of text does not bear meaning, which is true for formalized text and all kinds of names.

Table 10: Property Summary of Character String
Name   Type   Description
mediaType   CS   Identifies the encoding of the encapsulated data and identifies a method to interpret or render the data.
charset   CS   For character-based encoding types, this property specifies the character set and character encoding used. The charset is defined according to Internet RFC 2278, [http://www.isi.edu/in-notes/rfc2278.txt].
language   CS   For character based information the language property specifies the human language of the text.

The character string (ST) data type interprets the encapsulated data as character data (as opposed to bits), depending on the charset property of the encapsulated data type.

Definition 39:
type CharacterString alias ST restricts ED {
    INT   length;
    ST    head;
    ST    tail;
};

invariant(ST x) where x.nonNull {
  x.mediaType.equals("text/plain");
  x.compression.notApplicable;
  x.reference.notApplicable;
  x.integrityCheck.notApplicable;
  x.integrityCheckAlgorithm.notApplicable;
  x.thumbnail.notApplicable;
};
    
NOTE: Because many of the properties of the encapsulated data are bound to a default value, an ITS need not represent these properties at all. In fact, if the character encoding is also fixed, the ITS only represents the encoded character data.

The character string inherits the properties head, tail, and length from BIN (via ED). These properties are redefined so that the character string appears as a sequence of entities each of which uniquely identifies one character from the joint set of all characters known by any language of the world.[21] The properties head, tail, and length therefore refer to a character, a string, and a character count respectively, rather than bits and bit counts.

The head of a string is a string of only one character. A character string must at least have one character or else it is NULL. The length of a character string is the number of characters in the string. A zero-length string is an exceptional value (NULL), not a proper character string value.

Definition 40:
invariant(ST x) where x.nonNull {
  x.head.nonEmpty;
  x.head.tail.isEmpty;
  x.tail.isEmpty.implies(x.length.equals(1));
  x.tail.nonEmpty.implies(x.length.equals(x.tail.length.successor));
};
    

The length of a string is the number of characters, not the number of encoded bytes. Byte encoding is an ITS issue and is not relevant on the application layer.
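The distinction between character count and byte count, and the redefined head and tail, can be sketched as follows. The function names are hypothetical, None stands in for NULL, and Python's str type (which counts Unicode code points) is used as an approximation of a sequence of characters.

```python
from typing import Optional


def st(text: str) -> Optional[str]:
    """Model an ST value: a zero-length string is NULL (None)."""
    return text if text else None


def st_length(x: str) -> int:
    # Number of characters, independent of any byte encoding.
    return len(x)


def st_head(x: str) -> str:
    # The head is a string of exactly one character.
    return x[0]


def st_tail(x: str) -> Optional[str]:
    # The tail of a one-character string is empty, hence NULL.
    return st(x[1:])


city = "Köln"
chars = st_length(city)              # 4 characters
octets = len(city.encode("utf-8"))   # 5 bytes in UTF-8
```

The byte count depends on the chosen encoding, which is an ITS matter, whereas the character count does not.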

2.3.1

Properties of Character String (ST)

2.3.1.1
Media Type (mediaType : CS, default text/plain, inherited from ED)

Fixed to be "text/plain".

2.3.1.2
Charset (charset : CS, inherited from ED)
2.3.1.3
Language (language : CS, inherited from ED)
2.3.1.4
Literal Form

Two variations of character string literals are defined, a token form and a quoted string.[22] The token form consists only of the lower case and upper case English alphabet, the ten decimal digits and the underscore. The quoted string can contain any character between double-quotes. The double quotes prevent a character string from being interpreted as some other literal. The token form allows keywords and names to be parsed from the data type specification language.

Definition 41:
ST.literal ST {
  ST : /"[^]*"/ { $.equals($1); }   /* quoted string */
     | /[a-zA-Z0-9_]+/  { $.equals($1); };    /* token form */
};
      
NOTE: Since character string literals are so fundamental to implementation technology, most ITS will specify some modified character string literal form. However, ITS designers must be aware of the interaction between the character string literal form and the literal forms defined for other data types. This is particularly critical if the other data type's literal form is structured with major components separated by break-characters (e.g., real number, physical quantity, set, and list literals, etc.)
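The two literal forms can be recognized with ordinary regular expressions, as the following Python sketch shows. It is an interpretation rather than a normative parser: the function name parse_st_literal is hypothetical, and the sketch assumes that the value of a quoted literal is the text between the outermost double quotes, with no escape mechanism.

```python
import re

TOKEN = re.compile(r"[a-zA-Z0-9_]+")       # token form
QUOTED = re.compile(r'"(.*)"', re.DOTALL)  # quoted string: any characters inside


def parse_st_literal(literal: str) -> str:
    """Return the character string denoted by an ST literal."""
    quoted = QUOTED.fullmatch(literal)
    if quoted:
        # Assumption: the value is everything between the outer quotes.
        return quoted.group(1)
    if TOKEN.fullmatch(literal):
        return literal
    raise ValueError(f"not a valid ST literal: {literal!r}")
```

A literal such as hello world is rejected because the space is not allowed in the token form; it would have to be written as a quoted string.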

2.4

Concept Descriptor (CD)

Definition:      A concept descriptor represents any kind of concept usually by giving a code defined in a code system. A concept descriptor can contain the original text or phrase that served as the basis of the coding and one or more translations into different coding systems. A concept descriptor can also contain qualifiers to describe, e.g., the concept of a "left foot" as a postcoordinated term built from the primary code "FOOT" and the qualifier "LEFT". In exceptional cases, the concept descriptor need not contain a code but only the original text describing that concept.

Table 11: Property Summary of Concept Descriptor
Name   Type   Description
code   ST   The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
codeSystem   UID   Specifies the code system that defines the code.
codeSystemName   ST   A common name of the coding system.
codeSystemVersion   ST   If applicable, a version descriptor defined specifically for the given code system.
displayName   ST   A name or title for the code, under which the sending system shows the code value to its users.
originalText   ED   The text or phrase used as the basis for the coding.
translation   SET<CD>   A set of other concept descriptors that translate this concept descriptor into other code systems.
qualifier   LIST<CR>   Specifies additional codes that increase the specificity of the primary code.

Definition 42:
type ConceptDescriptor alias CD extends ANY {
            ST    code;
            ST    displayName;
            UID   codeSystem;
            ST    codeSystemName;
            ST    codeSystemVersion;
            ED    originalText;
            LIST<CR>  qualifier;
            SET<CD>   translation;
            BL  equals(CD x);
            BL  implies(CD x);
  demotion  ED;
};
    

The concept descriptor is mostly used in one of its restricted or “profiled” forms, CS, CE, CV.

Figure 4: The Concept Descriptor information model.

2.4.1

Concept Role (CR)

Definition:      A concept qualifier code with optionally named role. Both qualifier role and value codes must be defined by the coding system. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg".

Table 12: Property Summary of Concept Role
Name   Type   Description
name   CV   Specifies the manner in which the concept role value contributes to the meaning of a code phrase. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example "has-laterality" is the CR.name.
value   CD   The concept that modifies the primary code of a code phrase through the role relation. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example "left" is the CR.value.
inverted   BL   Indicates if the sense of the role name is inverted. This can be used in cases where the underlying code system defines inversion but does not provide reciprocal pairs of role names. By default, inverted is false.

The use of qualifiers is strictly governed by the code system used. The CD data type does not permit using code qualifiers with code systems that do not provide for qualifiers (e.g. pre-coordinated systems, such as LOINC, ICD-10 PCS.)

Definition 43:
protected type ConceptRole alias CR extends ANY {
  CV  name;
  BL  inverted;
  CD  value;
};
    
2.4.1.1
Name (name : CV, default NULL)

Definition:      Specifies the manner in which the concept role value contributes to the meaning of a code phrase. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example "has-laterality" is the CR.name.

If a coding system allows postcoordination but no role names (e.g. SNOMED) the name attribute can be NULL.

Definition 44:
invariant(CR x) where x.nonNull {
  x.name.qualifier.isNull;
};
      
2.4.1.2
Value (value : CD, default NULL)

Definition:      The concept that modifies the primary code of a code phrase through the role relation. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example "left" is the CR.value.

This property is of type concept descriptor and thus can in turn have qualifiers. This allows qualifiers to nest. Qualifiers can only be used as far as the underlying code system defines them. It is not allowed to use any kind of qualifiers for code systems that do not explicitly allow and regulate such use of qualifiers.

Definition 45:
invariant(CR x) where x.nonNull {
  x.value.nonNull;
};
      
2.4.1.3
Inversion Indicator (inverted : BL, default false)

Definition:      Indicates if the sense of the role name is inverted. This can be used in cases where the underlying code system defines inversion but does not provide reciprocal pairs of role names. By default, inverted is false.

For example, a code system may define the role relation "causes" besides the concepts "Streptococcus pneumoniae" and "Pneumonia". If that code system allows its roles to be inverted, one can construct the post-coordinated concept "Pneumococcus pneumonia" through "Pneumonia - causes, inverted - Streptococcus pneumoniae."

Roles may only be inverted if the underlying coding system allows such inversion. Notably, if a coding system defines roles in inverse pairs or intentionally does not define certain inversions, the appropriate role code (e.g. "caused-by") must be used rather than inversion. It must be known whether the inverted property is true or false; if it is NULL, the role cannot be interpreted.

Definition 46:
invariant(CR x) where x.nonNull {
  x.inverted.nonNull;
};
      
NOTE: the property "inverted" should be conveyed in an indicator attribute, whose default value is false. That way the inverted indicator does not have to be sent when the role is not inverted.
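The structure of a qualified concept descriptor can be sketched with two small Python classes. Everything here is illustrative: the class fields are a reduced subset of the CD and CR properties, the code system identifier "1.2.3.4.5" and the code symbols are hypothetical placeholders rather than real SNOMED content, and describe is a helper invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CD:
    code: Optional[str] = None
    code_system: Optional[str] = None
    display_name: Optional[str] = None
    qualifier: List["CR"] = field(default_factory=list)


@dataclass
class CR:
    value: CD                      # must be non-NULL (Definition 45)
    name: Optional[CD] = None      # may be NULL if the system has no role names
    inverted: bool = False         # default false, never NULL (Definition 46)


def describe(cd: CD) -> str:
    """Render a code phrase, including nested qualifiers, as text."""
    text = cd.display_name or cd.code or "NULL"
    for q in cd.qualifier:
        role = q.name.code if q.name else "(unnamed)"
        if q.inverted:
            role += ", inverted"
        text += f" - {role} - {describe(q.value)}"
    return text


SYS = "1.2.3.4.5"  # hypothetical code system identifier
left_leg = CD("LEG", SYS, "leg",
              [CR(value=CD("LEFT", SYS, "left"), name=CD("has-laterality", SYS))])
pneumonia = CD("PNEUMONIA", SYS, "Pneumonia",
               [CR(value=CD("STREP_PNEUMONIAE", SYS, "Streptococcus pneumoniae"),
                   name=CD("causes", SYS), inverted=True)])
```

Because CR.value is itself a CD, qualifiers can nest, which is why describe recurses into the value of each qualifier.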

2.4.2

Properties of Concept Descriptor (CD)

2.4.2.1
Code (code : ST, default NULL)

Definition:      The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.

A non-exceptional CD value has a non-NULL code property whose value is a character string that is a symbol defined by the coding system identified by the codeSystem property. Conversely, a CD value without a value for the code property, or with a value that is not from the cited coding system is an exceptional value (NULL of flavor other).

Definition 47:
invariant(CD x) where x.nonNull {
  x.code.nonNull;
};
      
2.4.2.2
Code System (codeSystem : UID)

Definition:      Specifies the code system that defines the code.

Code systems shall be referred to by Unique Identifier (UID). The UID allows unambiguous reference to standard HL7 codes, other standard code systems, as well as local codes. HL7 shall assign a UID to each of its code tables as well as to external standard coding systems that are being used with HL7. Local sites must use their ISO Object Identifier (OID) to construct a globally unique local coding system identifier.

Under HL7's branch, 2.16.840.1.113883, the sub-branches 5 and 6 contain HL7 standard and external code system identifiers respectively. The HL7 Vocabulary Technical Committee maintains these two branches.

A non-exceptional CD value (i.e. a CD value that has a non-null code property) has a non-NULL code system specifying the system of concepts that defines the code. In other words whenever there is a code there is also a code system.

NOTE: although every non-NULL CD value has a defined code system, in some circumstances the external representation of the CD value need not explicitly mention the code system. For example, when the context mandates one and only one code system to be used, specifying the code system explicitly would be redundant. However, in that case the code system property assumes that context-specific default value and is not NULL.

Definition 48:
invariant(CD x) where x.code.nonNull {
  x.codeSystem.nonNull;
};
      

An exceptional CD of NULL-flavor "other" indicates that a concept could not be coded in the coding system specified. Thus, for these coding exceptions, the code system that did not contain the appropriate concept must be provided in the code system property.

Some code domains are qualified such that they include the portion of any pertinent local coding system that does not simply paraphrase the standard coding system (coded with extensibility, CWE.) If a CWE qualified field actually contains such a local code, the coding system must specify the local coding system from which the local code was taken. However, for CWE domains the local code is a valid member of the domain, so that local codes in CWE domains constitute neither an error nor an exceptional (NULL/other) value in the sense of this specification.

Definition 49:
invariant(CD x) where x.other {
  x.code.isNull;
  x.codeSystem.nonNull;
};
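
The three invariants above (Definitions 47 to 49) can be read as a small validation routine. The following Python sketch is illustrative only and is not part of this specification: the class name, the field names, the use of None for NULL, and the "OTH" spelling of the NULL-flavor "other" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CD:
    """Illustrative concept descriptor; None stands in for NULL."""
    code: Optional[str] = None
    code_system: Optional[str] = None
    # Assumed spelling: "OTH" marks the NULL-flavor "other"; None means non-NULL.
    null_flavor: Optional[str] = None


def invariant_violations(x: CD) -> List[str]:
    """Return a message for each of Definitions 47-49 that x violates."""
    problems = []
    if x.null_flavor is None and x.code is None:
        problems.append("Definition 47: a non-NULL CD requires a code")
    if x.code is not None and x.code_system is None:
        problems.append("Definition 48: a code requires a code system")
    if x.null_flavor == "OTH":
        if x.code is not None:
            problems.append("Definition 49: an 'other' CD carries no code")
        if x.code_system is None:
            problems.append("Definition 49: an 'other' CD requires a code system")
    return problems


# Hypothetical code system identifier, used only for illustration.
EXAMPLE_SYSTEM = "2.16.840.1.113883.6.99999"
print(invariant_violations(CD(code="784.0", code_system=EXAMPLE_SYSTEM)))
print(invariant_violations(CD(null_flavor="OTH")))
```

A coding exception that names the code system in which the concept could not be found, such as CD(code_system=EXAMPLE_SYSTEM, null_flavor="OTH"), passes all three checks.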
      
2.4.2.3
Code System Name (codeSystemName : ST, default NULL)

Definition:      A common name of the coding system.

The code system name is optional and has no function in communication. The purpose of a code system name is to assist an unaided human interpreter of a code value to interpret the code system UID. It is suggested — though not absolutely required — that ITS provide for code system name fields in order to annotate the UID for human comprehension.

HL7 systems must not functionally rely on the code system name. The code system name can never modify the meaning of the code system UID value and cannot exist without the UID value.

Definition 50:
invariant(CD x) {
  x.codeSystemName.nonNull.implies(x.codeSystem.nonNull);
};
      
2.4.2.4
Code System Version (codeSystemVersion : ST, default NULL)

Definition:      If applicable, a version descriptor defined specifically for the given code system.

HL7 shall specify how these version strings are formed for each external code system. If HL7 has not specified how version strings are formed for a particular coding system, version designations have no defined meaning for such coding system.

Different versions of one code system must be compatible. Whenever a code system changes in an incompatible way, it will constitute a new code system, not simply a different version, regardless of what the vocabulary publisher calls it.

For example, the publisher of ICD-9 and ICD-10 calls these code systems, "revision 9" and "revision 10" respectively. However, ICD-10 is a complete redesign of the ICD code, not a backward compatible version. Therefore, for the purpose of this data type specification, ICD-9 and ICD-10 are different code systems, not just different versions. By contrast, when LOINC updates from revision "1.0j" to "1.0k", HL7 would consider this to be just another version of LOINC, since LOINC revisions are backwards compatible.

Definition 51:
invariant(CD x) {
  x.codeSystemVersion.nonNull.implies(x.codeSystem.nonNull);
};
      
2.4.2.5
Display Name (displayName : ST, default NULL)

Definition:      A name or title for the code, under which the sending system shows the code value to its users.

The display name is included both as a courtesy to an unaided human interpreter of a code value and as a documentation of the name used to display the concept to the user. The display name has no functional meaning; it can never exist without a code; and it can never modify the meaning of the code.

NOTE: display names may not alter the meaning of the code value. Therefore, display names should not be presented to the user on a receiving application system without ascertaining that the display name adequately represents the concept referred to by the code value. Communication must not simply rely on the display name. The display name's main purpose is to support debugging of HL7 protocol data units (e.g., messages.)
Definition 52:
invariant(CD x) {
  x.displayName.nonNull.implies(x.code.nonNull);
};
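
Definitions 50 to 52 share one pattern: an annotating property may only be present if the property it annotates is present. A minimal Python sketch of that pattern follows; the dictionary representation of a CD value and the function name are assumptions made for illustration, not part of this specification.

```python
from typing import List, Mapping, Optional

# (annotating property, property it depends on), following Definitions 50-52.
DEPENDENCIES = [
    ("codeSystemName", "codeSystem"),
    ("codeSystemVersion", "codeSystem"),
    ("displayName", "code"),
]


def dangling_annotations(cd: Mapping[str, Optional[str]]) -> List[str]:
    """List annotating properties that are set although their base property is NULL."""
    return [
        annotation
        for annotation, base in DEPENDENCIES
        if cd.get(annotation) is not None and cd.get(base) is None
    ]


# A display name without a code violates Definition 52.
print(dangling_annotations({"displayName": "Headache"}))
```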
      
2.4.2.6
Original Text (originalText : ED, default NULL)

Definition:      The text or phrase used as the basis for the coding.

The original text exists in a scenario where an originator of the information does not assign a code, but where the code is assigned later by a coder (post-coding.) In the production of a concept descriptor, original text may thus exist without a code.23

Although the concept descriptor's code property is NULL, original text may still exist for the CD value. Any CD value with the code property of NULL signifies a coding exception. In this case, the original text property is a name or description of the concept that was not coded. Such an exceptional CD may contain translations. Such translations directly encode the concept described in the original text property.

Neither display name nor original text is part of the information a receiving system must automatically recognize. An information producer is responsible for the proper coding of all information in the code property, because any information consumer may safely ignore the display name and original text properties.

A concept descriptor can be demoted into a character string (ST) value representing only the original text of the CD value.

Definition 53:
invariant(CD x) where x.originalText.nonNull {
  ((ST)x).equals(x.originalText);
};
      
2.4.2.7
Translation (translation : SET<CD>, default NULL)

Definition:      A set of other concept descriptors that translate this concept descriptor into other code systems.

The translation property is a set of other concept descriptors that each translate the first concept descriptor into a different code system. Each element of the translation set was translated from the first concept descriptor. Each translation may, however, also contain translations. Thus, when a code is translated multiple times, the information about which code served as the input to which translation is preserved.

NOTE: the translations are quasi-synonyms of one real-world concept. Every translation in the set is supposed to express the same meaning "in other words." However, exact synonymy rarely exists between two structurally different coding systems. For this reason, not all of the translations will be equally exact.
2.4.2.8
Qualifier (qualifier : LIST<CR>, default NULL)

Definition:      Specifies additional codes that increase the specificity of the primary code.

The primary code and all the qualifiers together make up one concept. A concept descriptor with qualifiers is also called a code phrase.

Qualifiers constrain the meaning of the primary code, but do not shift or even invert the meaning of the primary code. The meaning of the primary code without qualifiers must not be wrong, although less specific.

Qualifiers can only be used according to well-defined rules of post-coordination. A concept descriptor may only have qualifiers if the code system defines the use of such qualifiers or if there is a third code system that specifies how other code systems may be combined.

For example, SNOMED allows constructing concepts as a combination of multiple codes. SNOMED RT defines a concept "cellulitis (morphologic abnormality)" (M-41650), a role "associated topography" (G-C505), and another concept "left foot (body structure)" (T-D9720). SNOMED RT allows one to combine these codes in a code phrase:

Example 1:
<observation>
  ...
  <value code="M-41650" codeSystem="&SNM;" displayName="cellulitis (morphologic abnormality)">
    <qualifier code="T-D9720" displayName="left foot">
      <name code="G-C505" displayName="associated topography"/>
    </qualifier>
  </value>
  ...
</observation>
       

In this example, one code system, SNOMED RT, defines the primary code, the qualifiers, and how they are used. This is why in our example representation the codeSystem does not need to be mentioned for the qualifier name and value (the codeSystem is inherited from the primary code.)

Another common example is the U.S. Health Care Financing Administration (HCFA) procedure codes. HCFA procedure codes (HCPCS) are based on CPT-4 and add additional qualifiers to it. For example, the patient with the above finding (plus peripheral arterial disease, diabetes mellitus, and a chronic skin lesion at the left great toe) may have an amputation of that toe. The CPT-4 concept is "Amputation, toe metatarsophalangeal joint" (28820), and an HCPCS qualifier needs to be added to indicate "left foot, great toe" (TA). Thus we code:

Example 2:
<procedure>
  ...
  <cd code="28820" codeSystem="&CP4;" displayName="Amputation, toe metatarsophalangeal joint">
    <qualifier code="TA" codeSystem="&HCP;" displayName="left foot, great toe"/>
  </cd>
  ...
</procedure>
       

In this example, the code system of the qualifier (HCPCS) is different from the code system of the primary code (CPT-4.) The qualifier may be used only because there are well-defined rules that define how these codes can be combined. Note also that the role name is optional, and for HCPCS codes there are no distinguished role names.

The order of qualifiers is preserved, particularly for the case where the coding system allows post-coordination but defines no role names. (e.g., some ICD-9CM codes, or the old SNOMED "multiaxial" coding.)

The main use of concept descriptors is for the purpose of indexing, querying and decision-making based on a coded value. A semantically unambiguous specification of coded values therefore requires a clear definition of what equality of concept descriptor values means and how CD values should be compared.

2.4.2.9
Equality (equals : BL, inherited from ANY)

The equality of two concept descriptor values is determined solely based upon the code and coding system. The code system version is excluded from the equality test.24 If qualifiers are present, the qualifiers are included in the equality test. Translations are not included in the equality test.25 Exceptional concept descriptor values are not equal even if they have the same NULL-flavor or the same original text.26

Definition 54:
invariant(CD x, y) where x.nonNull.and(y.nonNull) {
  x.equals(y).equals(x.code.equals(y.code)
                .and(x.codeSystem.equals(y.codeSystem))
                .and(x.qualifier.equals(y.qualifier)));
};
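
The equality rule can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative implementation: the class layout, the field names, the use of None for NULL, and the simplification of each qualifier to a (role, value) pair are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CD:
    """Illustrative concept descriptor; None stands in for NULL."""
    code: Optional[str] = None
    code_system: Optional[str] = None
    code_system_version: Optional[str] = None
    # Ordered (role, value) pairs; a simplified stand-in for LIST<CR>.
    qualifiers: Tuple[Tuple[Optional[str], str], ...] = ()
    translations: Tuple["CD", ...] = ()
    original_text: Optional[str] = None


def cd_equals(x: CD, y: CD) -> bool:
    """Compare code, code system and qualifiers; ignore version and translations."""
    if x.code is None or y.code is None:
        # Exceptional values are never equal, even with the same original text.
        return False
    return (
        x.code == y.code
        and x.code_system == y.code_system
        and x.qualifiers == y.qualifiers
    )


a = CD("28820", "CPT-4", code_system_version="v1")
b = CD("28820", "CPT-4", code_system_version="v2")
print(cd_equals(a, b))  # the version does not take part in the comparison
```

Because the qualifier list is ordered, two values with the same qualifiers in a different order compare as unequal in this sketch.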
      

Some code systems define certain style options to their code values. For example, the U.S. National Drug Code (NDC) has a dash and a non-dash form. An example for the dash form may be 1234-5678-90 when the non-dash form is 01234567890. Another example for this problem is when certain ISO or ANSI code tables define optional alphanumeric and numeric forms of two or three character lengths all in one standard.

In the case where code systems provide for multiple representations, HL7 shall make a ruling about which is the preferred form. HL7 shall document that ruling where that respective external coding system is recognized. HL7 shall decide upon the preferred form based on criteria of practicality and common use. In absence of clear criteria of practicality and common use, the safest, most extensible, and least stylized (the least decorated) form shall be given preference.27
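
To illustrate what reducing a stylized code to a single form involves, the following Python sketch converts a dashed NDC into the 11-digit non-dash form used in the example above. It assumes the three dashed segment layouts commonly used for 10-digit NDCs (4-4-2, 5-3-2 and 5-4-1) and zero-pads each segment to the 5-4-2 layout. Which form HL7 designates as preferred is not stated here, and the function name is hypothetical.

```python
# Segment layouts (labeler, product, package) assumed for dashed 10-digit NDCs.
DASHED_LAYOUTS = {(4, 4, 2), (5, 3, 2), (5, 4, 1)}


def ndc_to_non_dash(dashed: str) -> str:
    """Zero-pad the segments of a dashed NDC to 5-4-2 and drop the dashes."""
    parts = dashed.split("-")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a dashed NDC: {dashed!r}")
    if tuple(len(p) for p in parts) not in DASHED_LAYOUTS:
        raise ValueError(f"unexpected segment lengths in {dashed!r}")
    labeler, product, package = parts
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)


print(ndc_to_non_dash("1234-5678-90"))
```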

2.4.2.10
Implies (implies : BL)

Definition:      Specifies whether this concept descriptor is a specialization of the operand concept descriptor.

Naturally, concepts can be narrowed and widened to include or exclude other concepts. Many coding systems have an explicit notion of concept specialization and generalization. The HL7 vocabulary principles also provide for concept specialization for HL7 defined value sets. The implies-property is a predicate that compares whether one concept is a specialization of another concept, and therefore implies that other concept.

When writing predicates (e.g., conditional statements) that compare two codes, one should usually test for implication not equality of codes.

For example, in Table 19 the "telecommunication use" concepts: work (W), home (H), primary home (HP), and vacation home (HV) are defined, where both HP and HV imply H. When selecting any home phone number, one should test whether the given use-code implies H. Testing whether it equals H would only find unspecified home phone numbers, but not the primary home phone number.

Operationally, implication can be evaluated in one of two ways. The code system literals may be designed such that one single hierarchy is reflected in the code literal itself (e.g., ICD-9.) Apart from such special cases, however, a terminological knowledge base and an appropriate subsumption algorithm will be required to evaluate implication statements. For post-coordinated coding systems, designing such a subsumption algorithm is a non-trivial task.28
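
For a small value set with a single explicit hierarchy, implication can be evaluated by walking up a parent table. The Python sketch below uses the telecommunication use concepts mentioned above; the table layout and function name are assumptions for illustration, and the sketch treats every concept as implying itself, which is what selecting "any home phone number" requires. It does not address post-coordinated code systems.

```python
from typing import Dict, Optional

# Parent of each concept; None marks a concept without a more general one.
PARENT: Dict[str, Optional[str]] = {"W": None, "H": None, "HP": "H", "HV": "H"}


def implies(code: str, other: str) -> bool:
    """True if code equals other or is a (transitive) specialization of it."""
    current: Optional[str] = code
    while current is not None:
        if current == other:
            return True
        current = PARENT.get(current)
    return False


phones = [("HP", "555-0100"), ("W", "555-0101"), ("HV", "555-0102"), ("H", "555-0103")]
home_numbers = [number for use, number in phones if implies(use, "H")]
print(home_numbers)
```

Filtering with `use == "H"` instead would return only the last entry, which is the pitfall described above.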

Specializations of Concept Descriptor (CD)

Use of the full concept descriptor data type is exceptional. It requires a conscious decision and documented rationale. In all other cases, one of the CD restrictions shall be used.29

All CD restrictions constrain certain properties of the CD. Properties may be constrained to the extent that only one value is allowed for that property, in which case mentioning the property becomes redundant. Constraining a property to one value is referred to as suppressing that property. Although a suppressed property is conceptually still applicable, it is safe for an HL7 interface to assume the implicit default value without testing.

2.4.3

Coded Simple Value (CS) restricts CD

Definition:      Coded data in its simplest form, where only the code and display name are not predetermined. The code system and code system version are fixed by the context in which the CS value occurs. CS is used for coded attributes that have a single HL7-defined value set.

Table 13: Property Summary of Coded Simple Value
Name Type Description
code ST The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
codeSystem UID Specifies the code system that defines the code.
displayName ST A name or title for the code, under which the sending system shows the code value to its users.
Definition 55:
type CodedSimpleValue alias CS restricts CD {
  ST    code;
  ST    displayName;
};
    

CS can only be used in either of the following cases:

  1. for a coded attribute which has a single HL7-defined code system, and where code additions to that value set require formal HL7 action (such as harmonization.) Such coded attributes that are designated "structural" codes must be assigned the CS restriction.


  2. for a technical property in this specification that is assigned to a single code system defined either in this specification or defined outside HL7 by a body that has authority over the concept and the maintenance of that code system.


For example, since the ED type subscribes to the MIME design, it trusts IETF to manage the media type. This includes that this specification subscribes to the extension mechanism built into the MIME media type code (e.g., "application/x-myapp").

For CS values, the designation of the domain qualifier will always be CNE (coded, non-extensible) and the context determines unambiguously which HL7 value set applies.30

2.4.3.1
Code (code : ST, default NULL, inherited from CD)
Definition 56:
invariant(CS x) where x.nonNull {
  x.code.nonNull;
};
      
2.4.3.2
Code System (codeSystem : UID, inherited from CD)

Every non-NULL CS value has a defined code system. The external representation of the CS need not explicitly mention the code system, because the context mandates one and only one code system to be used. Specifying the code system explicitly would be redundant. However, the code system property assumes that context-specific default value and is not NULL.

Definition 57:
invariant(CS x) where x.code.nonNull {
  x.codeSystem.nonNull;
};
      
Definition 58:
invariant(CS x) where x.other {
  x.code.isNull;
  x.codeSystem.nonNull;
};
      
2.4.3.3
Code System Name (codeSystemName : ST, default NULL, excluded)
Definition 59:
invariant(CS x) {
  x.codeSystem.equals(CONTEXT.codeSystem);
};
      
2.4.3.4
Code System Version (codeSystemVersion : ST, default NULL, excluded)
Definition 60:
invariant(CS x) {
  x.codeSystemVersion.equals(CONTEXT.codeSystemVersion);
};
      
2.4.3.5
Display Name (displayName : ST, default NULL, inherited from CD)
2.4.3.6
Original Text (originalText : ED, default NULL, excluded)
Definition 61:
invariant(CS x) {
  x.originalText.isNull;
};
      
2.4.3.7
Translation (translation : SET<CD>, default NULL, excluded)
Definition 62:
invariant(CS x) {
  x.translation.isNull;
};
      
2.4.3.8
Qualifier (qualifier : LIST<CR>, default NULL, excluded)
Definition 63:
invariant(CS x) {
  x.qualifier.notApplicable;
};
      

2.4.4

Coded Value (CV) restricts CD

Definition:      Coded data, specifying only a code, code system, and optionally display name and original text. Used only as the data type for other data types' properties.

Table 14: Property Summary of Coded Value
Name Type Description
code ST The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
codeSystem UID Specifies the code system that defines the code.
codeSystemName ST A common name of the coding system.
codeSystemVersion ST If applicable, a version descriptor defined specifically for the given code system
displayName ST A name or title for the code, under which the sending system shows the code value to its users.
originalText ED The text or phrase used as the basis for the coding.
Definition 64:
type CodedValue alias CV restricts CD {
  ST    code;
  OID   codeSystem;
  ST    codeSystemName;
  ST    codeSystemVersion;
  ST    displayName;
  ST    originalText;
};
    

This type is used when any reasonable use case will require only a single code value to be sent. Thus, it should not be used in circumstances where multiple alternative codes for a given value are desired. This type may be used with both the CNE (coded, non-extensible) and the CWE (coded, with extensibility) domain qualifiers.

2.4.4.1
Code (code : ST, default NULL, inherited from CD)
2.4.4.2
Code System (codeSystem : UID, inherited from CD)
2.4.4.3
Code System Name (codeSystemName : ST, default NULL, inherited from CD)
2.4.4.4
Code System Version (codeSystemVersion : ST, default NULL, inherited from CD)
2.4.4.5
Display Name (displayName : ST, default NULL, inherited from CD)
2.4.4.6
Original Text (originalText : ST, default NULL, inherited from CD)
2.4.4.7
Translation (translation : SET<CD>, default NULL, excluded)
Definition 65:
invariant(CV x) {
  x.translation.isNull;
};
      
2.4.4.8
Qualifier (qualifier : LIST<CR>, default NULL, excluded)
Definition 66:
invariant(CV x) {
  x.qualifier.notApplicable;
};
      

2.4.5

Coded With Equivalents (CE)

Definition:      Coded data that consists of a coded value (CV) and, optionally, coded value(s) from other coding systems that identify the same concept. Used when alternative codes may exist.

Table 15: Property Summary of Coded With Equivalents
Name Type Description
code ST The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
codeSystem UID Specifies the code system that defines the code.
codeSystemName ST A common name of the coding system.
codeSystemVersion ST If applicable, a version descriptor defined specifically for the given code system
displayName ST A name or title for the code, under which the sending system shows the code value to its users.
originalText ED The text or phrase used as the basis for the coding.
translation SET<CD> A set of other concept descriptors that translate this concept descriptor into other code systems.
Definition 67:
type CodedWithEquivalents alias CE restricts CD {
  ST        code;
  ST        displayName;
  OID       codeSystem;
  ST        codeSystemName;
  ST        codeSystemVersion;
  ED        originalText;
  SET<CV>   translation;
};
    

The CE type is used when the use case indicates that alternative codes may exist and where it is useful to communicate these. The CE type provides for a primary code value, plus a set of alternative or equivalent representations.

2.4.5.1
Code (code : ST, default NULL, inherited from CD)
2.4.5.2
Code System (codeSystem : UID, inherited from CD)
2.4.5.3
Code System Name (codeSystemName : ST, default NULL, inherited from CD)
2.4.5.4
Code System Version (codeSystemVersion : ST, default NULL, inherited from CD)
2.4.5.5
Display Name (displayName : ST, default NULL, inherited from CD)
2.4.5.6
Original Text (originalText : ED, default NULL, inherited from CD)
2.4.5.7
Translation (translation : SET<CD>, default NULL, inherited from CD)
2.4.5.8
Qualifier (qualifier : LIST<CR>, default NULL, excluded)
Definition 68:
invariant(CE x) {
  x.qualifier.notApplicable;
};
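
The three restrictions differ mainly in which CD properties they exclude. The Python sketch below summarizes the properties that the invariants above require to be NULL or not applicable for CS, CV and CE. The dictionary representation of a coded value and the function name are assumptions for illustration. The code system name and version of a CS are fixed by context rather than nulled, so they are not checked here.

```python
from typing import Dict, FrozenSet, List, Mapping

# Properties that must be absent, per Definitions 61-63, 65-66 and 68.
EXCLUDED: Dict[str, FrozenSet[str]] = {
    "CD": frozenset(),
    "CE": frozenset({"qualifier"}),
    "CV": frozenset({"translation", "qualifier"}),
    "CS": frozenset({"originalText", "translation", "qualifier"}),
}


def excluded_properties_present(type_alias: str, value: Mapping[str, object]) -> List[str]:
    """Return the excluded properties that the value nevertheless carries."""
    return sorted(name for name in EXCLUDED[type_alias] if value.get(name))


# A CE may carry translations, whereas a CV may not.
sample = {"code": "28820", "translation": [{"code": "X"}]}
print(excluded_properties_present("CE", sample))
print(excluded_properties_present("CV", sample))
```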
      

2.5

Instance Identifier (II)

Definition:      An identifier that uniquely identifies a thing or object. Examples are object identifier for HL7 RIM objects, medical record number, order id, service catalog item id, Vehicle Identification Number (VIN), etc. Instance identifiers are defined based on ISO object identifiers.

Figure 5: Instance Identifier data types.

Table 16: Property Summary of Instance Identifier
Name Type Description
root UID A unique identifier that guarantees the global uniqueness of the instance identifier. The root alone may be the entire instance identifier.
extension ST A character string as a unique identifier within the scope of the identifier root.
assigningAuthorityName ST A human readable name or mnemonic for the assigning authority. This name may be provided solely for the convenience of unaided humans interpreting an II value. Note: automated processing must not depend on the assigning authority name being present in any form.
displayable BL Specifies whether the identifier's extension is intended for human display and data entry (displayable = true) as opposed to pure machine interoperation (displayable = false).
validTime IVL<TS> If applicable, specifies during what time the identifier is valid. By default, the identifier is valid indefinitely. Any specific interval may be undefined on either side indicating unknown effective or expiry time. Note: identifiers for information objects in computer systems should not have restricted valid times, but should be globally unique at all times. The identifier valid time is provided mainly for real-world identifiers, whose maintenance policy may include expiry (e.g., credit card numbers.)
Definition 69:
type InstanceIdentifier alias II extends ANY {
  ST      extension;
  UID     root;
  ST      assigningAuthorityName;
  CV      type;
  IVL<TS> validTime;
  BL      equals(II x);
};
    

2.5.1

Unique Identifier String (UID)

Definition:      A unique identifier string is a character string which identifies an object in a globally unique and timeless manner. The allowable formats and values and procedures of this data type are strictly controlled by HL7. At this time, user-assigned identifiers may be certain character representations of ISO Object Identifiers (OID) and DCE Universally Unique Identifiers (UUID). HL7 also reserves the right to assign other forms of UIDs, such as mnemonic identifiers for code systems.

The sole purpose of the UID is to be a globally and timelessly unique identifier. The form of the UID, whether it is an OID, a UUID, or any other form, is entirely irrelevant. As far as HL7 is concerned, the only thing one can do with a UID is denote the object for which it stands. Comparison of UIDs is literal, i.e., if two UIDs are literally identical, they are assumed to denote the same object. If two UIDs are not literally identical, they may not denote the same object (and in general are assumed to denote different objects.)

Definition 70:
type UniqueIdentifierString alias UID extends ST { };
    

No difference in semantics is recognized between the different allowed forms of the UID. The different forms are not distinguished by a component within or aside from the identifier string itself.

Even though this specification recognizes no semantic difference between the different forms of the unique identifier, there are differences in how these identifiers are built and managed, which is the sole reason to define subtypes of the UID for each of the variants.

2.5.2

ISO Object Identifier (OID) extends UID

Definition:      A globally unique string representing an ISO Object Identifier (OID) in a form that consists only of numbers and dots (e.g., "2.16.840.1.113883.3.1"). According to ISO, OIDs are paths in a tree structure, with the left-most number representing the root and the right-most number representing a leaf.

Each branch under the root corresponds to an assigning authority. Each of these assigning authorities may, in turn, designate its own set of assigning authorities that work under its auspices, and so on down the line. Eventually, one of these authorities assigns a unique (to it as an assigning authority) number that corresponds to a leaf node on the tree. The leaf may represent an assigning authority (in which case the OID identifies the authority), or an instance of an object. An assigning authority owns a namespace, consisting of its sub-tree.

OIDs are the preferred scheme for unique identifiers. OIDs should always be used unless one of the inclusion criteria for other schemes applies.

ISO/IEC 8824:1990(E) clause 28 defines the Object Identifier as

28.9 The semantics of an object identifier value are defined by reference to an object identifier tree. An object identifier tree is a tree whose root corresponds to [the ISO/IEC 8824 standard] and whose vertices [i.e. nodes] correspond to administrative authorities responsible for allocating arcs [i.e. branches] from that vertex. Each arc from that tree is labeled by an object identifier component, which is [an integer number]. Each information object to be identified is allocated precisely one vertex (normally a leaf) and no other information object (of the same or a different type) is allocated to that same vertex. Thus an information object is uniquely and unambiguously identified by the sequence of [integer numbers] (object identifier components) labeling the arcs in a path from the root to the vertex allocated to the information object.

28.10 An object identifier value is semantically an ordered list of object identifier component values. Starting with the root of the object identifier tree, each object identifier component value identifies an arc in the object identifier tree. The last object identifier component value identifies an arc leading to a vertex to which an information object has been assigned. It is this information object, which is identified by the object identifier value. [...]

Definition 71:
type ObjectIdentifier alias OID extends UID, LIST<INT> {
  INT   leaf;
  OID   butLeaf;
  OID   value(OID namespace);
  literal ST;
};
    

According to ISO/IEC 8824 an object identifier is a sequence of object identifier component values, which are integer numbers. These component values are ordered such that the root of the object identifier tree is the head of the list followed by all the arcs down to the leaf representing the information object identified by the OID. The fact that OID extends LIST<INT> represents this path of object identifier component values from the root to the leaf.

The leaf and "butLeaf" properties take the opposite view. The leaf is the last object identifier component value in the list, and the "butLeaf" property is all of the OID but the leaf. In a sense, the leaf is the identifier value and all of the OID but the leaf refers to the namespace in which the leaf is unique and meaningful.

However, what part of the OID is considered value and what is namespace may be viewed differently. In general, any OID component sequence to the left can be considered the namespace in which the rest of the sequence to the right is defined as a meaningful and unique identifier value. The value-property with a namespace OID as its argument represents this point of view.31

Definition 72:
invariant(OID x) where x.nonNull {
  x.nonEmpty;
  x.tail.isEmpty.implies(x.leaf.equals(x.head));
  x.tail.nonEmpty.implies(x.leaf.equals(x.tail.leaf));
  x.tail.isEmpty.implies(x.butLeaf.isNull);
  x.tail.nonEmpty.implies(x.butLeaf.head.equals(x.head)
                     .and(x.butLeaf.tail.equals(x.tail.butLeaf)));
  forall(OID v; OID n) where v.equals(x.value(n)) {
    n.isEmpty.implies(v.equals(x));
    n.nonEmpty.implies(v.equals(x.tail.value(n.tail)));
  };
};
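
The leaf, butLeaf and value properties can be pictured with a short Python sketch. It is illustrative only: the class and method names are assumptions, None stands in for NULL, the constructor performs no validation beyond integer conversion, and the sketch additionally assumes that the namespace passed to value must be a proper prefix of the OID.

```python
from typing import Optional, Tuple


class OID:
    """Illustrative OID held as a sequence of integer components."""

    def __init__(self, literal: str) -> None:
        # No validation beyond integer conversion; this is a sketch.
        self.components: Tuple[int, ...] = tuple(int(p) for p in literal.split("."))

    def __str__(self) -> str:
        return ".".join(str(c) for c in self.components)

    @property
    def leaf(self) -> int:
        """The last component value."""
        return self.components[-1]

    @property
    def but_leaf(self) -> Optional["OID"]:
        """Everything but the leaf, or None (NULL) for a single-component OID."""
        if len(self.components) == 1:
            return None
        return OID(".".join(str(c) for c in self.components[:-1]))

    def value(self, namespace: "OID") -> "OID":
        """The components that remain after removing the namespace prefix."""
        n = len(namespace.components)
        if n >= len(self.components) or self.components[:n] != namespace.components:
            raise ValueError(f"{namespace} is not a proper prefix of {self}")
        return OID(".".join(str(c) for c in self.components[n:]))


x = OID("2.16.840.1.113883.3.1")
print(x.leaf, x.but_leaf, x.value(OID("2.16.840.1.113883")))
```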
    

2.5.2.1

HL7-Assigned OIDs

HL7 shall establish an OID registry and assign OIDs in its branch for HL7 users and vendors upon their request. HL7 shall also assign OIDs to public identifier-assigning authorities both U.S. nationally (e.g., the U.S. State driver license bureaus, U.S. Social Security Administration, HIPAA Provider ID registry, etc.) and internationally (e.g., other countries' Social Security Administrations, Citizen ID registries, etc.) The HL7 assigned OIDs must be used for these organizations, regardless of whether these organizations have other OIDs assigned from other sources.

When assigning OIDs to third parties or entities, HL7 shall investigate whether an OID is already assigned for such entities through other sources. If this is the case, HL7 shall record such OID in a catalog, but HL7 shall not assign a duplicate OID in the HL7 branch. If possible, HL7 shall notify a third party when an OID is being assigned for that party in the HL7 branch.

Though HL7 shall exercise diligence before assigning an OID in the HL7 branch to third parties, given the lack of a global OID registry mechanism, one cannot make absolutely certain that there is no preexisting OID assignment for such third-party entity. Also, a duplicate assignment can happen in the future through another source. If such cases of duplicate assignment become known to HL7, HL7 shall make efforts to resolve this situation. For continued interoperability in the meantime, the HL7 assigned OID shall be the preferred OID used.

While most owners of an OID will "design" their namespace sub-tree in some meaningful way, there is no general way to infer any meaning from the parts of an OID. HL7 does not standardize or require any namespace sub-structure. An OID owner, or anyone having knowledge about the logical structure of part of an OID, may still use that knowledge to infer information about the associated object; however, the techniques cannot be generalized.

Figure 6: Example for a tree of ISO object identifiers. HL7's OID is 2.16.840.1.113883.

An HL7 interface must not rely on any knowledge about the substructure of an OID for which it cannot control the assignment policies.

2.5.2.2
Literal Form

The structured definition of the OID is provided mostly to be faithful to the OID specification. Within HL7, OIDs are used as UID strings only, i.e., the literal string value is the only thing that is communicated and is the only thing that a receiver should have to consider when working with UIDs in the scope of the HL7 specification.

Definition 73:
OID.literal ST {
  OID : INT "." OID { $.head.equals($1);
                      $.tail.equals($3); }
      | INT         { $.head.equals($1);
                      $.tail.isEmpty; }
}
      

For compatibility with the DICOM standard, the literal form of the OID should not exceed 64 characters (see DICOM Part 5, Section 9).
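
A receiver that wants to check an OID literal against the grammar and the length recommendation could proceed roughly as in the following Python sketch. It assumes that each component is written as a non-negative decimal number (leading zeros are not rejected here); the names are hypothetical.

```python
import re
from typing import List

# One or more decimal numbers separated by single dots (assumed unsigned).
_OID_LITERAL = re.compile(r"[0-9]+(\.[0-9]+)*")
DICOM_MAX_LENGTH = 64  # recommended limit for DICOM compatibility


def parse_oid_literal(text: str) -> List[int]:
    """Split an OID literal into its integer components, head first."""
    if _OID_LITERAL.fullmatch(text) is None:
        raise ValueError(f"not an OID literal: {text!r}")
    return [int(part) for part in text.split(".")]


def within_dicom_limit(text: str) -> bool:
    """True if the literal is no longer than the recommended 64 characters."""
    return len(text) <= DICOM_MAX_LENGTH


print(parse_oid_literal("2.16.840.1.113883"))
print(within_dicom_limit("2.16.840.1.113883"))
```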

2.5.3

DCE Universal Unique Identifier (UUID) extends UID

Definition:      A globally unique string representing a DCE Universal Unique Identifier (UUID) in the common UUID format that consists of 5 hyphen-separated groups of hexadecimal digits having 8, 4, 4, 4, and 12 places respectively.

Both the UUID and its string representation are defined by the Open Group, DCE 1.1 Remote Procedure Call specification, Appendix A.

UUIDs are assigned based on Ethernet MAC addresses, the point in time of creation and some random component. This mix is believed to generate sufficiently unique identifiers without any organizational policy for identifier assignment (in fact this piggy-backs on the organization of MAC address assignment.)

UUIDs are not the preferred identifier scheme for use as HL7 UIDs. UUIDs may be used when identifiers are issued to objects representing individuals (e.g., entity instance identifiers, act event identifiers, etc.) For objects describing classes of things or events (e.g., catalog items), OIDs are the preferred identifier scheme.

Definition 74:
type UniversalUniqueIdentifier alias UUID extends UID {
  INT timeLow;
  INT timeMid;
  INT timeHighAndVersion;
  INT clockSequence;
  INT node;
}
    
2.5.3.1
Literal Form

The structured definition of the UUID is provided mostly to be faithful to the UUID specification. Within HL7, UUIDs are used as UID strings only, i.e., the literal string value is the only thing that is communicated and is the only thing that a receiver should have to consider when working with UIDs in the scope of the HL7 specification.

The literal form for the UUID is defined according to the original specification of the UUID. However, because the HL7 UIDs are case sensitive, for use with HL7, the hexadecimal digits A-F in UUIDs must be converted to upper case.

Definition 75:
UUID.literal ST {
  UUID : hex8 "-" hex4 "-" hex4 "-" hex4 "-" hex12 {
          $.timeLow.equals($1);
          $.timeMid.equals($3);
          $.timeHighAndVersion.equals($5);
          $.clockSequence.equals($7);
          $.node.equals($9);
  }

  INT hex4 :  hexDigit hexDigit hexDigit hexDigit {
          $.equals($1.times(16).plus($2)
                     .times(16).plus($3)
                     .times(16).plus($4));
  }

  INT hex8 :  hexDigit hexDigit hexDigit hexDigit
              hexDigit hexDigit hexDigit hexDigit {
          $.equals($1.times(16).plus($2)
                     .times(16).plus($3)
                     .times(16).plus($4)
                     .times(16).plus($5)
                     .times(16).plus($6)
                     .times(16).plus($7)
                     .times(16).plus($8));
  }

  INT hex12 : hexDigit hexDigit hexDigit hexDigit
              hexDigit hexDigit hexDigit hexDigit
              hexDigit hexDigit hexDigit hexDigit {
          $.equals($1.times(16).plus($2)
                     .times(16).plus($3)
                     .times(16).plus($4)
                     .times(16).plus($5)
                     .times(16).plus($6)
                     .times(16).plus($7)
                     .times(16).plus($8)
                     .times(16).plus($9)
                     .times(16).plus($10)
                     .times(16).plus($11)
                     .times(16).plus($12));
  }

  INT hexDigit
  : "0" { $.equals(0); }
  | "1" { $.equals(1); }
  | "2" { $.equals(2); }
  | "3" { $.equals(3); }
  | "4" { $.equals(4); }
  | "5" { $.equals(5); }
  | "6" { $.equals(6); }
  | "7" { $.equals(7); }
  | "8" { $.equals(8); }
  | "9" { $.equals(9); }
  | "A" { $.equals(10); }
  | "B" { $.equals(11); }
  | "C" { $.equals(12); }
  | "D" { $.equals(13); }
  | "E" { $.equals(14); }
  | "F" { $.equals(15); }
}
      
NOTE: The output of UUID-related programs and functions may take many forms: upper case, lower case, and with or without the hyphens that group the digits. Such varying output must be postprocessed to conform to the HL7 specification, i.e., the hyphens must be inserted for the 8-4-4-4-12 grouping and all hexadecimal digits must be converted to upper case.
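
The postprocessing described in the note can be sketched in Python as follows. This is an illustration only: the function name `to_hl7_uuid` is hypothetical, and the standard library `uuid.UUID` constructor is more lenient than the HL7 literal form (it also accepts braces and a `urn:uuid:` prefix).

```python
import uuid


def to_hl7_uuid(raw: str) -> str:
    """Return the upper-case 8-4-4-4-12 form of a UUID string.

    uuid.UUID accepts 32 hexadecimal digits in either case, with or
    without hyphens, and raises ValueError for anything else. str()
    renders the lower-case hyphenated form, which is then upper-cased.
    """
    return str(uuid.UUID(raw)).upper()
```

A system would apply such a function once, when the identifier is created or received from a UUID library, so that only the normalized string is ever communicated as an HL7 UID.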

2.5.4

HL7 Reserved Identifier Scheme (RUID) extends UID

Definition:      A globally unique string defined exclusively by HL7. Identifiers in this scheme are only defined by balloted HL7 specifications. Local communities or systems must never use such reserved identifiers based on bilateral negotiations.

HL7 reserved identifiers are strings that consist only of (US-ASCII) letters, digits and hyphens, where the first character must be a letter. HL7 may assign these reserved identifiers as mnemonic identifiers for major concepts of interest to HL7.
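
Since the three UID schemes have disjoint or nearly disjoint literal forms, a receiver can usually tell them apart by syntax alone. The Python sketch below illustrates this under stated assumptions: the function name `uid_scheme` is hypothetical, OID arcs are assumed to carry no leading zeros, and a string that matches both the UUID and the reserved-identifier pattern (a UUID whose first digit is A-F) is treated as a UUID. The sample reserved identifier in the usage note is made up for illustration.

```python
import re
from typing import Optional

# Dot-separated non-negative integers (assumption: no leading zeros).
_OID = re.compile(r"(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*")
# Upper-case hexadecimal digits in the 8-4-4-4-12 grouping.
_UUID = re.compile(
    r"[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}"
)
# US-ASCII letters, digits and hyphens; the first character is a letter.
_RUID = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def uid_scheme(literal: str) -> Optional[str]:
    """Classify a UID literal as "OID", "UUID" or "RUID", else None."""
    if _OID.fullmatch(literal):
        return "OID"
    if _UUID.fullmatch(literal):
        return "UUID"
    if _RUID.fullmatch(literal):
        return "RUID"
    return None
```

For example, `uid_scheme("2.16.840.1.113883")` classifies the HL7 OID as an OID, while a made-up mnemonic such as `"HL7-example"` is classified as a reserved identifier. Syntax alone does not establish that a reserved identifier was actually defined by a balloted HL7 specification.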

2.5.5

Properties of Instance Identifier (II)

2.5.5.1
Root (root : UID)

Definition:      A unique identifier that guarantees the global uniqueness of the instance identifier. The root alone may be the entire instance identifier.

In the presence of a non-null extension, the root is commonly interpreted as the "assigning authority", that is, it is supposed that the root somehow refers to an organization that assigns identifiers sent in the extension. However, the root does not have to be an organizational UID, it can also be a UID specifically registered for an identifier scheme.32

Definition 76:
invariant(II x) where x.nonNull {
  root.nonNull;
};
      
2.5.5.2
Extension (extension : ST, default NULL)

Definition:      A character string as a unique identifier within the scope of the identifier root.

The extension is a character string that is unique in the namespace designated by the root. If a non-NULL extension exists, the root specifies a namespace (sometimes called "assigning authority" or "identifier type".) The extension property may be NULL, in which case the root OID is the complete unique identifier.

It is recommended that systems use the OID scheme for external identifiers of their communicated objects. The extension property is mainly provided to accommodate legacy alphanumeric identifier schemes.

Some identifier schemes define certain style options to their code values. For example, the U.S. Social Security Number (SSN) is normally written with dashes that group the digits into a pattern "123-12-1234". However, the dashes are not meaningful and a SSN can just as well be represented as "123121234" without the dashes.

In the case where identifier schemes provide for multiple representations, HL7 shall make a ruling about which is the preferred form. HL7 shall document that ruling where that respective external identifier scheme is recognized. HL7 shall decide upon the preferred form based on criteria of practicality and common use. In absence of clear criteria of practicality and common use, the safest, most extensible, and least stylized (the least decorated) form shall be given preference.33

HL7 may also decide to map common external identifiers to the value portion of the II.root OID. For example, the U.S. SSN could be represented as 2.16.840.1.113883.4.1.123121234. The criteria of practicality and common use will guide HL7's decision on each individual case.

2.5.5.3
Assigning Authority Name (assigningAuthorityName : ST)

Definition:      A human readable name or mnemonic for the assigning authority. This name may be provided solely for the convenience of unaided humans interpreting an II value. Note: automated processing must not depend on the assigning authority name being present in any form.

The assigning authority name is not the name for the individually identified object, but for the namespace that immediately contains that object identifier. Two cases exist.

  1. If the extension property is non-NULL, the root OID identifies the assigning authority; hence the assigning authority name is a name or mnemonic for the entire root OID.


  2. If the extension is NULL, the assigning authority name is the name or mnemonic of the namespace property of the OID value.


2.5.5.4
Displayable (displayable : BL)

Definition:      Specifies whether the identifier's extension is intended for human display and data entry (displayable = true) as opposed to pure machine interoperation (displayable = false).

2.5.5.5
Valid Time (validTime : IVL<TS>)

Definition:      If applicable, specifies during what time the identifier is valid. By default, the identifier is valid indefinitely. Any specific interval may be undefined on either side indicating unknown effective or expiry time. Note: identifiers for information objects in computer systems should not have restricted valid times, but should be globally unique at all times. The identifier valid time is provided mainly for real-world identifiers, whose maintenance policy may include expiry (e.g., credit card numbers.)

The II type conforms to the history item data type extension (Section 0). This means that the data types HXIT<II> and II are the same.

2.5.5.6
Equality (equals : BL, inherited from ANY)

Two instance identifiers are equal if and only if their root and extension properties are equal.

Definition 77:
invariant(II x, y) where x.nonNull.and(y.nonNull) {
  x.equals(y).equals(x.root.equals(y.root)
                .and(x.extension.equals(y.extension)));
}
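
The equality rule can be illustrated with a small Python value class. This is a sketch under simplifying assumptions, not a normative mapping: the class and field names are hypothetical, NULL is modelled as `None`, the valid time property is omitted, and only the non-null case covered by the invariant is considered.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class InstanceIdentifier:
    """Simplified II value: equality considers root and extension only."""

    root: str
    extension: Optional[str] = None
    # The remaining properties are excluded from equality and hashing.
    assigning_authority_name: Optional[str] = field(default=None, compare=False)
    displayable: Optional[bool] = field(default=None, compare=False)
```

Two values built with the same root and extension compare equal even if one of them carries an assigning authority name and the other does not, which matches the rule that no automated processing depends on that name.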
      

2.6

Telecommunication Address (TEL) extends URL

Definition:      A telephone number (voice or fax), e-mail address, or other locator for a resource mediated by telecommunication equipment. The address is specified as a Universal Resource Locator (URL) qualified by time specification and use codes that help deciding which address to use for a given time and purpose.

Table 17: Property Summary of Telecommunication Address
Name Type Description
validTime GTS Specifies the periods of time during which the telecommunication address can be used. For a telephone number, this can indicate the time of day in which the party can be reached on that telephone. For a web address, it may specify a time range in which the web content is promised to be available under the given address.
use SET<CS> One or more codes advising a system or user which telecommunication address in a set of like addresses to select for a given telecommunication need.

The semantics of a telecommunication address is that a communicating entity (the responder) listens and responds to that address, and therefore can be contacted by another communicating entity (the initiator.)

The responder of a telecommunication address may be an automatic service that can respond with information (e.g., FTP or HTTP services.) In such case a telecommunication address is a reference to that information accessible through that address. A telecommunication address value can thus be resolved to some information (in the form of encapsulated data, ED.)

Definition 78:
type TelecommunicationAddress alias TEL extends URL {
  GTS   validTime;
  SET<CS>   use;
  BL  equals(TEL x);
};
    

The telecommunication address is an extension of the Universal Resource Locator (URL) specified as an Internet standard RFC 1738 [http://www.isi.edu/in-notes/rfc1738.txt]. The URL specifies the protocol and the contact point defined by that protocol for the resource. Notable use cases for the telecommunication address data type are for telephone and fax numbers, e-mail addresses, Hypertext references, FTP references, etc.

2.6.1

Universal Resource Locator (URL)

Definition:      A telecommunications address specified according to Internet standard RFC 1738 [http://www.isi.edu/in-notes/rfc1738.txt]. The URL specifies the protocol and the contact point defined by that protocol for the resource. Notable uses of the telecommunication address data type are for telephone and telefax numbers, e-mail addresses, Hypertext references, FTP references, etc.

The Internet standard RFC 1738 [http://www.isi.edu/in-notes/rfc1738.txt] defines URL as follows:

Just as there are many different methods of access to resources, there are several schemes for describing the location of such resources.

The generic syntax for URLs provides a framework for new schemes to be established using protocols other than those defined in this document.

URLs are used to "locate" resources, by providing an abstract identification of the resource location. Having located a resource, a system may perform a variety of operations on the resource, as might be characterized by such words as "access", "update", "replace", "find attributes". In general, only the "access" method needs to be specified for any URL scheme.

Definition 79:
protected type UniversalResourceLocator alias URL extends ANY {
  CS  scheme;
  ST  address;
  literal ST;
};
    
2.6.1.1
Scheme (scheme : CS)

Definition:      Identifies the protocol used to interpret the address string and to access the resource so addressed.

Some URL schemes are registered by the Internet Assigned Numbers Authority (IANA) [http://www.iana.org], however IANA only registers URL schemes that are defined in Internet RFC documents. In fact there are a number of URL schemes defined outside RFC documents, part of which are registered with the World Wide Web Consortium (W3C).34

Similar to the ED.mediaType, HL7 makes suggestions about URL schemes, classifying them as required, recommended, other, and deprecated. Any scheme not mentioned has status other.

Table 18: Domain URLScheme:
code name definition
tel Telephone A voice telephone number [draft-antti-telephony-url-11.txt]. Required for HL7 use.
fax Fax A telephone number served by a fax device [draft-antti-telephony-url-11.txt]. Required for HL7 use.
mailto Mailto Electronic mail address [RFC 2368]. Required for HL7 use.
http HTTP Hypertext Transfer Protocol [RFC 2068]. Required for HL7 use.
ftp FTP The File Transfer Protocol (FTP) [RFC 1738]. Required for HL7 use.
file File Host-specific local file names [RFC 1738]. Note that the file scheme works only for local files. There is little use for exchanging local file names between systems, since the receiving system likely will not be able to access the file. Deprecated for
nfs NFS Network File System protocol [RFC 2224]. Some sites use NFS servers to share data files.
telnet Telnet Reference to interactive sessions [RFC 1738]. Some sites, (e.g., laboratories) have TTY based remote query sessions that can be accessed through telnet.
modem Modem A telephone number served by a modem device [draft-antti-telephony-url-11.txt].

Note that this specification explicitly limits itself to URLs. Universal Resource Names (URN) are not covered by this specification. URNs are a kind of identifier scheme for resources other than accessible ones. This specification, however, is only concerned with accessible resources, which belong in the URL category.

2.6.1.2
Address (address : ST)

Definition:      The address is a character string whose format is entirely defined by the URL.scheme.

2.6.1.3
Literal Form

While conceptually URL has the properties scheme and address, the common appearance of a URL is as a string literal formed according to the Internet standard. The general syntax of the URL literal is:

Definition 80:
URL.literal ST {
  URL : /[a-z0-9+.-]+/ ":" ST { $.scheme.equals($1);
                                $.address.equals($3); }
};
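
As an informal illustration of this literal form, the following Python sketch splits a URL literal into its scheme and address at the first colon. The function name `split_url` is hypothetical, and the sketch does not validate the address against the rules of the particular scheme.

```python
import re
from typing import Tuple

# Scheme characters as in Definition 80, then ":", then the address.
_URL_LITERAL = re.compile(r"([a-z0-9+.-]+):(.*)", re.DOTALL)


def split_url(literal: str) -> Tuple[str, str]:
    """Return (scheme, address) for a URL literal, or raise ValueError."""
    match = _URL_LITERAL.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a URL literal: {literal!r}")
    return match.group(1), match.group(2)
```

Because the scheme pattern cannot contain a colon, the split always happens at the first colon; any later colons remain part of the address.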
      
Telephone and FAX Numbers

Note that there is no special data type for telephone numbers; telephone numbers are TEL values and are specified as URLs.

The telephone number URL is defined in Internet RFC 2806 [http://www.isi.edu/in-notes/rfc2806.txt]. Its definition is summarized in this subsection. This summary does not override or change any of the Internet specification's rulings.

The voice telephone URLs begin with "tel:" and fax URLs begin with "fax:"

The URL.address is the telephone number in accordance with ITU-T E.123 Telephone Network and ISDN Operation, Numbering, Routing and Mobile Service: Notation for National and International Telephone Numbers (1993). While HL7 does not add to or subtract from the URL specification, the preferred subset of the URL.address syntax is given as follows:

Definition 81:
protected type TelephoneURL restricts URL {
  literal ST {
    URL : /(tel)|(fax)/ ":" address   { $.scheme.equals($1);
                  $.address.equals($3); };
    ST address : "+" phoneDigits;
    ST phoneDigits : digitOrSeparator phoneDigits | digitOrSeparator;
    ST digitOrSeparator : digit | separator;
    ST digit : /[0-9]/;
    ST separator : /[().-]/;
  };
};
      

The global absolute telephone numbers starting with the "+" and country code are preferred. Separator characters serve as decoration but have no bearing on the meaning of the telephone number. For example: "tel:+13176307960" and "tel:+1(317)630-7960" are both the same telephone number; "fax:+49308101724" and "fax:+49(30)8101-724" are both the same fax number.
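
Because separators are only decoration, two telephone URLs can be compared after the separators are removed. The Python sketch below shows one way to do this for the preferred global form; it is an illustration only, the function names are hypothetical, and numbers that do not start with "+" are simply rejected rather than interpreted.

```python
import re

# Preferred subset from Definition 81: "tel" or "fax", then "+" followed
# by digits and the separator characters ( ) . -
_TEL_URL = re.compile(r"(tel|fax):(\+[0-9().-]+)")
_SEPARATORS = re.compile(r"[().-]")


def canonical_phone_url(literal: str) -> str:
    """Return the URL with all separator characters removed."""
    match = _TEL_URL.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a preferred-form telephone URL: {literal!r}")
    scheme, address = match.groups()
    return f"{scheme}:{_SEPARATORS.sub('', address)}"


def same_number(a: str, b: str) -> bool:
    """True if both URLs denote the same scheme and telephone number."""
    return canonical_phone_url(a) == canonical_phone_url(b)
```

Applied to the examples above, `same_number("tel:+13176307960", "tel:+1(317)630-7960")` holds, whereas a voice URL and a fax URL with the same digits are kept distinct because their schemes differ.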

2.6.2

Properties of Telecommunication Address (TEL)

2.6.2.1
Valid Time (validTime : GTS)

Definition:      Specifies the periods of time during which the telecommunication address can be used. For a telephone number, this can indicate the time of day in which the party can be reached on that telephone. For a web address, it may specify a time range in which the web content is promised to be available under the given address.

The TEL data type whose validTime is constrained to a simple interval of time (IVL<TS>) conforms to the history item data type extension (HXIT). Thus, HXIT<TEL> is a simple restriction of TEL.

2.6.2.2
Use Code (use : SET<CS>)

Definition:      One or more codes advising a system or user which telecommunication address in a set of like addresses to select for a given telecommunication need.

Table 19: Domain TelecommunicationAddressUse:
code name definition
H home A communication address at a home, attempted contacts for business purposes might intrude privacy and chances are one will contact family or other household members instead of the person one wishes to call. Typically used with urgent cases, or if no othe
HP primary home The primary home, to reach a person after business hours.
HV vacation home A vacation home, to reach a person while on vacation.
WP work place An office address. First choice for business related contacts during business hours.
AS answering service An automated answering machine used for less urgent cases and if the main purpose of contact is to leave a message or access an automated announcement.
EC emergency contact A contact specifically designated to be used for emergencies. This is the first choice in emergencies, independent of any other use codes.
PG pager A paging device suitable to solicit a callback or to leave a very short message.
MC mobile contact A telecommunication device that moves and stays with its owner. May have characteristics of all other use codes, suitable for urgent matters, not the first choice for routine business.

The telecommunication use code is not a complete classification for equipment types or locations. Its main purpose is to suggest or discourage the use of a particular telecommunication address. There are no easily defined rules that govern the selection of a telecommunication address.

2.6.2.3
Equality (equals : BL, inherited from ANY)

Two telecommunication address values are considered equal if both their URLs are equal. Use code and valid time are excluded from the equality test.

Definition 82:
invariant(TEL x, y) where x.nonNull.and(y.nonNull) {
  x.equals(y).equals(((URL)x).equals((URL)y));
}
      
Figure 7: Data types for Postal Address and Entity Names (Person, Organization, and Trivial Names) are all based on extensions of a character string.

2.7

Postal Address (AD) extends LIST<ADXP>

Definition:      Mailing and home or office addresses. A sequence of address parts, such as street or post office Box, city, postal code, country, etc.

The AD is primarily used to communicate data that will allow printing mail labels or that will allow a person to physically visit that address. The postal address data type is not supposed to be a container for additional information that might be useful for finding geographic locations (e.g., GPS coordinates) or for performing epidemiological studies. Such additional information is captured by other, more appropriate HL7 elements.

Table 20: Property Summary of Postal Address
Name Type Description
use SET<CS> A set of codes advising a system or user which address in a set of like addresses to select for a given purpose.
validTime GTS A General Timing Specification (GTS) specifying the periods of time during which the address can be used. This is used to specify different addresses for different times of the year or to refer to historical addresses.
formatted ST A character string value with the address formatted in lines and with proper spacing. This is only a semantic property to define the function of some of the address part types.

Addresses are conceptualized as text with added logical mark-up. The mark-up may break the address into lines and may describe in detail the role of each address part if it is known. Address parts occur in the address in the order in which they would be printed on a mailing label. The approach is similar to HTML or XML markup of text (but it is not technically limited to XML representations.)

Addresses are essentially sequences of address parts, but add a "use" code and a valid time range for information about if and when the address can be used for a given purpose.

Definition 83:
type PostalAddress alias AD extends LIST<ADXP> {
  SET<CS>   use;
  GTS validTime;
  BL  equals(AD x);
  ST  formatted;
};
    

2.7.1

Address Part (ADXP) extends ST

Definition:      A character string that may have a type-tag signifying its role in the address. Typical parts that exist in about every address are street, house number, or post box, postal code, city, country but other roles may be defined regionally, nationally, or on an enterprise level (e.g. in military addresses). Addresses are usually broken up into lines, which are indicated by special line-breaking delimiter elements (e.g., DEL).

Table 21: Property Summary of Address Part
Name Type Description
partType CS Specifies whether an address part names the street, city, country, postal code, post box, etc. If the type is NULL the address part is unclassified and would simply appear on an address label as is.
Definition 84:
protected type AddressPart alias ADXP extends ST {
  CS  type;
};
    
2.7.1.1
Address Part Type (partType : CS)

Definition:      Specifies whether an address part names the street, city, country, postal code, post box, etc. If the type is NULL the address part is unclassified and would simply appear on an address label as is.

Table 22: Domain AddressPartType:
code name definition
DEL delimiter Delimiters are printed without framing white space. If no value component is provided, the delimiter appears as a line break.
CNT country Country
STA state or province A sub-unit of a country with limited sovereignty in a federally organized country.
CPA County or Parish A sub-unit of a state or province. (49 of the United States of America use the term "county;" Louisiana uses the term "parish").
CTY city City
ZIP postal code A postal code designating a region defined by the postal service.
STR street name Street name or number.
HNR house number The number of a house or lot alongside the street. Also known as "primary street number"; note that it numbers the house, not the street.
SAL Street Address Line A street address line is often used instead of separately distinguishing street name and house number. The street address line can repeat to represent "street address line 1" and "street address line 2".
DIR direction direction (e.g., N, S, W, E)
ADL additional locator This can be a unit designator, such as apartment number, suite number, or floor. There may be several unit designators in an address (e.g., "3rd floor, Appt. 342".) This can also be a designator pointing away from the location, rather than specifying a s
POB post box A numbered box located in a post station.
CEN census tract A sub-unit of country delineated for demographic purposes.

2.7.2

Properties of Postal Address (AD)

2.7.2.1
Use Code (use : SET<CS>)

Definition:      A set of codes advising a system or user which address in a set of like addresses to select for a given purpose.

Table 23: Domain PostalAddressUse:
code name definition
PHYS visit address A physical address, used primarily to visit the addressee.
PST postal address Used to send mail.
TMP temporary address A temporary address, may be good for visit or mailing. Note that an address history can provide more detailed information.
BAD bad address A flag indicating that the address is bad, in fact, useless.
H home A communication address at a home, attempted contacts for business purposes might intrude privacy and chances are one will contact family or other household members instead of the person one wishes to call. Typically used with urgent cases, or if no othe
HP primary home The primary home, to reach a person after business hours.
HV vacation home A vacation home, to reach a person while on vacation.
WP work place An office address. First choice for business related contacts during business hours.
ABC Alphabetic Alphabetic transcription of name (Japanese: romaji)
SYL Syllabic Syllabic transcription of name (e.g., Japanese kana, Korean hangul)
IDE Ideographic Ideographic representation of name (e.g., Japanese kanji, Chinese characters)

An address without specific use code might be a default address useful for any purpose, but an address with a specific use code would be preferred for that respective purpose.

2.7.2.2
Valid Time (validTime : GTS)

Definition:      A General Timing Specification (GTS) specifying the periods of time during which the address can be used. This is used to specify different addresses for different times of the year or to refer to historical addresses.

The AD whose validTime is constrained to a simple interval of time (IVL<TS>) conforms to the history item data type extension (HXIT). Thus, HXIT<AD> is a simple restriction of AD.

2.7.2.3
Equality (equals : BL, inherited from ANY)

Two address values are considered equal if both contain the same address parts, independent of ordering. Use code and valid time are excluded from the equality test.

Definition 85:
invariant(AD x, y) where x.nonNull.and(y.nonNull) {
  x.equals(y).equals((
        forall(ADXP p) where x.contains(p) {
          y.contains(p);
        }).and(
        forall(ADXP p) where y.contains(p) {
          x.contains(p);
        }));
};
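
Read informally, the invariant says that each address must contain every part of the other, so ordering and repetition play no role. A minimal Python sketch of this comparison follows; the representation of an address part as a (partType, text) pair and the function name `same_address` are assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical stand-in for ADXP: (partType code or None, text).
AddressPart = Tuple[Optional[str], str]


def same_address(x: Iterable[AddressPart], y: Iterable[AddressPart]) -> bool:
    """True if each address contains every part of the other.

    Ordering and repeated parts are ignored, which is exactly mutual
    containment; use code and valid time are not part of the inputs.
    """
    return set(x) == set(y)
```

Two lists holding the same parts in a different order therefore compare as equal, while an address with an additional part does not equal the shorter one.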
      
2.7.2.4
Formatting Address (formatted : ST)

Definition:      A character string value with the address formatted in lines and with proper spacing. This is only a semantic property to define the function of some of the address part types.35

The AD data type's main purpose is to capture postal addresses, such that one can visit that address or send mail to it. Humans will look at addresses in printed form, such as on a mailing label. The AD data type defines precise rules of how its data is formatted.36

Addresses are ordered lists of address parts. Each address part is printed in the order of the list from left to right and top to bottom (or in any other language-specific reading direction, the determination of which is outside the scope of this specification.) Every address part value is printed. Most address parts are framed by white space. The following six rules govern the setting of white space.

  1. White space never accumulates, i.e. two subsequent spaces are the same as one. Subsequent line breaks can be reduced to one. White space around a line break is not significant.


  2. Literals may contain explicit white space, subject to the same white space reduction rules. There is no notion of a literal line break within the text of a single address part.


  3. Leading and trailing explicit white space is insignificant in all address parts, except for delimiter (DEL) address parts.


  4. By default, an address part is surrounded by implicit white space.


  5. Delimiter (DEL) address parts are not surrounded by any implicit white space.


  6. Leading and trailing explicit white space is significant in delimiter (DEL) address parts.


This means that all address parts are generally surrounded by white space, but white space never accumulates. Delimiters are never surrounded by implicit white space and every white space contributed by preceding or succeeding address parts is discarded, whether it was implicit or explicit.
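
The six rules can be sketched as a small Python formatter. This is an illustration under stated assumptions rather than a normative algorithm: an address part is modelled as a hypothetical (partType, text) pair, a delimiter with empty text is rendered as a line break (following the definition of DEL in the AddressPartType domain), and white space at the very start and end of the result is dropped.

```python
import re
from typing import Iterable, Optional, Tuple

# Hypothetical stand-in for ADXP: (partType code or None, text).
Part = Tuple[Optional[str], str]


def format_address(parts: Iterable[Part]) -> str:
    """Render address parts into lines following the white space rules."""
    out = ""
    after_delimiter = True  # no implicit space at the start or after DEL
    for part_type, text in parts:
        if part_type == "DEL":
            # Rules 5 and 6: no implicit space, explicit space is kept.
            out += text if text else "\n"
            after_delimiter = True
        else:
            # Rules 2 and 3: reduce inner white space, trim the ends.
            text = " ".join(text.split())
            if not text:
                continue
            # Rule 4: implicit space between neighbouring non-DEL parts.
            out += text if after_delimiter else " " + text
            after_delimiter = False
    # Rule 1: white space around line breaks is dropped, runs collapse.
    out = re.sub(r" *\n[ \n]*", "\n", out)
    out = re.sub(r" {2,}", " ", out)
    return out.strip()
```

Only one implicit space is ever emitted between two neighbouring non-delimiter parts, so the accumulation addressed by rule 1 can arise only from explicit white space in delimiters, which the final two substitutions reduce.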

The following shows examples of addresses in the XML ITS form.

1050 W Wishard Blvd,
RG 5th floor,
Indianapolis, IN 46240.

Can be encoded in any of the following forms:37

The first form would result from a system that only stores addresses as free text or in a list of fields line1, line2, etc.:

Example 3:
<addr use="WP">
  1050 W Wishard Blvd,
  RG 5th floor,
  Indianapolis, IN 46240
</addr>
 

The second form is more specific about the role of the address parts than the first one:

Example 4:
<addr use="WP">
  <streetAddressLine>1050 W Wishard Blvd</streetAddressLine>,
  <streetAddressLine>RG 5th floor</streetAddressLine>,
  <city>Indianapolis</city>, <state>IN</state> <postalCode>46240</postalCode>
</addr>
 

This form is the typical form seen in the U.S., where street address is sometimes separated, and city, state and ZIP code are always separated.

The third is even more specific:

Example 5:
<addr use="WP">
  <houseNumber>1050</houseNumber> <direction>W</direction> <streetName>Wishard Blvd</streetName>,
  <additionalLocator>RG 5th floor</additionalLocator>,
  <city>Indianapolis</city>, <state>IN</state> <postalCode>46240</postalCode>
</addr>
 

The latter form above is not used in the USA. However, it is useful in Germany, where many systems keep house number as a distinct field. For example, the German address:

Windsteiner Weg 54a,
D-14165 Berlin

would most likely be encoded as follows:38

Example 6:
<addr use="HP">
  <streetName>Windsteiner Weg</streetName> <houseNumber>54a</houseNumber>,
  <country>D</country>-<postalCode>14165</postalCode> <city>Berlin</city>
</addr>
 

2.8

Entity Name (EN)

Definition:      A name for a person, organization, place or thing. A sequence of name parts, such as first name or family name, prefix, suffix, etc. Examples for entity name values are "Jim Bob Walton, Jr.", "Health Level Seven, Inc.", "Lake Tahoe", etc. An entity name may be as simple as a character string or may consist of several entity name parts, such as, "Jim", "Bob", "Walton", and "Jr.", "Health Level Seven" and "Inc.", "Lake" and "Tahoe".

Entity names are conceptualized as text with added logical mark-up. Name parts occur in the natural order in which they would be displayed, as opposed to an order determined by name part type. The ordering of the name parts is significant, a feature that replaces the need for a separate "display name" property. Applications may change the ordering of name parts to account for their users' customary ordering of name parts. The approach is similar to HTML or XML markup of text (but it is not technically limited to XML representations.)

Entity names are essentially sequences of entity name parts, but add a "use" code and a valid time range for information about when the name was used and how to choose between multiple aliases that may be valid at the same point in time.

Definition 86:
type EntityName alias EN extends LIST<ENXP> {
  SET<CS> use;
  IVL<TS> validTime;
  BL  equals(EN x);
  ST  formatted;
};
    

2.8.1

Entity Name Part (ENXP)

Definition:      A character string token representing a part of a name. May have a type code signifying the role of the part in the whole entity name, and a qualifier code for more detail about the name part type. Typical name parts for person names are given names, family names, titles, etc.

Table 25: Property Summary of Entity Name Part
Name Type Description
partType CS Indicates whether the name part is a given name, family name, prefix, suffix, etc.
qualifier SET<CS> The qualifier is a set of codes, each of which specifies a certain subcategory of the name part in addition to the main name part type. For example, a given name may be flagged as a nickname, a family name may be a pseudonym or a name of public records.
Definition 87:
protected type EntityNamePart alias ENXP extends ST {
  CS  partType;
  SET<CS>   qualifier;
};
    
2.8.1.1
Name Part Type (partType : CS)

Definition:      Indicates whether the name part is a given name, family name, prefix, suffix, etc.

Table 26: Domain EntityNamePartType:
code name definition
FAM family Family name. This is the name that links to the genealogy. In some cultures (e.g. Eritrea) the family name of a son is the first name of his father.
GIV given Given name (do not call it "first name", since given names do not always come first).
PFX prefix A prefix has a strong association to the immediately following name part. A prefix has no implicit trailing white space (it has implicit leading white space though). Note that prefixes can be inverted.
SFX suffix A suffix has a strong association to the immediately preceding name part. A suffix has no implicit leading white space (it has implicit trailing white space though). Suffixes cannot be inverted.
DEL delimiter A delimiter has no meaning other than being literally printed in this name representation. A delimiter has no implicit leading or trailing white space.

Not every name part must have a type code. If the type code is unknown, not applicable, or simply undefined, this is expressed by a NULL value (partType.isNull). For example, a name may be "Rogan Sulma" and it may not be clear which part is a given name and which is a family name, or whether "Rogan" may be a title.

Entity names are conceptualized as text with added mark-up. The mark-up may describe in detail the role of each name part if it is known. Name parts occur in the order in which they would be printed, for example on a mailing label. The model is similar to HTML or XML markup of text.

Applications are not required to preserve the ordering of the name parts.

2.8.1.2
Qualifier (qualifier : SET<CS>)

Definition:      The qualifier is a set of codes, each of which specifies a certain subcategory of the name part in addition to the main name part type. For example, a given name may be flagged as a nickname, a family name may be a pseudonym or a name of public records.

Table 27: Domain EntityNamePartQualifier:
code name definition
BR birth A name that a person had shortly after being born. Usually for family names but may be used to mark given names at birth that may have changed later.
SP spouse The name assumed from the partner in a marital relationship (hence the "M"). Usually the spouse's family name. Note that no inference about gender can be made from the existence of spouse names.
VV voorvoegsel A Dutch "voorvoegsel" is something like "van" or "de" that might have indicated nobility in the past but no longer does. Similar prefixes exist in other languages such as Spanish, French or Portuguese.
AC academic Indicates that a prefix like "Dr." or a suffix like "M.D." or "Ph.D." is an academic title.
PR professional Primarily in the British Imperial culture people tend to have an abbreviation of their professional organization as part of their credential suffixes.
NB nobility In Europe and Asia, there are still people with nobility titles (aristocrats.) German "von" is generally a nobility title, not a mere voorvoegsel. Others are "Earl of" or "His Majesty King of..." etc. Rarely used nowadays, but some systems do keep track of them.
LS Legal status For organizations a suffix indicating the legal status, e.g., "Inc.", "Co.", "AG", "GmbH", "B.V.", "S.A.", "Ltd." etc.
CL callme A callme name is a name (usually a given name) that is preferred when a person is directly addressed.
IN initial Indicates that a name part is just an initial. Initials do not imply a trailing period since this would not work with non-Latin scripts. Initials may consist of more than one letter, e.g., "Ph." could stand for "Philippe" or "Th." for "Thomas".

2.8.2

Properties of Entity Name (EN)

2.8.2.1
Use Code (use : SET<CS>)

Definition:      A set of codes advising a system or user which name in a set of names to select for a given purpose.

Table 28: Domain EntityNameUse:
code name definition
L Legal known as/conventional/the one you use
A Artist/Stage Includes writer's pseudonym, stage name, etc.
I Indigenous/Tribal e.g. Chief Red Cloud
R Religious e.g. Sister Mary Francis, Brother John
ABC Alphabetic Alphabetic transcription of name (Japanese: romaji)
SYL Syllabic Syllabic transcription of name (e.g., Japanese kana, Korean hangul)
IDE Ideographic Ideographic representation of name (e.g., Japanese kanji, Chinese characters)

A name without a specific use code might be a default name useful for any purpose, but a name with a specific use code would be preferred for that respective purpose.

2.8.2.2
Valid Time (validTime : IVL<TS>)

Definition:      An interval of time specifying the time during which the name is or was used for the entity. This accommodates the fact that the names of people, places and things change over time.

The EN conforms to the history item data type extension (HXIT).

2.8.2.3
Equality (equals : BL, inherited from ANY)

Two name values are considered equal if both contain the same name parts, independent of ordering. Use code and valid time are excluded from the equality test.

Definition 88:
invariant(EN x, y) where x.nonNull.and(y.nonNull) {
  x.equals(y).equals((
        forall(ENXP p) where x.contains(p) {
          y.contains(p);
        }).and(
        forall(ENXP p) where y.contains(p) {
          x.contains(p);
        }));
};
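
The order-independent equality can also be illustrated outside the formal notation. The following Python sketch is not part of this specification; it assumes, purely for illustration, that a name part is modeled as a (partType, text) pair and that qualifiers, use code and valid time are left out. The function name names_equal is hypothetical.

```python
def names_equal(x, y):
    """Return True if both names contain the same parts, in any order."""
    # Mutual containment: every part of x occurs in y
    # and every part of y occurs in x.
    return all(p in y for p in x) and all(p in x for p in y)


doe_western = [("GIV", "John"), ("FAM", "Doe")]
doe_eastern = [("FAM", "Doe"), ("GIV", "John")]
print(names_equal(doe_western, doe_eastern))        # True
print(names_equal(doe_western, [("GIV", "John")]))  # False
```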
      
2.8.2.4
Formatting Entity Names (formatted : ST)

Definition:      A character string value with the entity name formatted with proper spacing. This is only a semantic property to define the function of some of the name part types.39

The EN data type's main purpose is to capture names of people, places, and things (entities), so that one can address and refer to these entities in speech and writing. Humans will look at names in printed form, such as on a mailing label. The EN data type therefore defines precise rules of how its data is formatted.40

Entity names are ordered lists of entity name parts. Each entity name part is printed in the order of the list from left to right (or in any other language-specific reading direction.) Every entity name part (except for those marked "invisible") is printed. Most entity name parts are framed by whitespace. The following six rules govern the setting of whitespace.

  1. White space never accumulates, i.e. two subsequent spaces are the same as one.


  2. Literals may contain explicit white space subject to the same white space reduction rules.


  3. Except for prefix, suffix and delimiter name parts, every name part is surrounded by implicit white space. Leading and trailing explicit whitespace is insignificant in all those name parts.


  4. Delimiter name parts are not surrounded by any implicit white space. Leading and trailing explicit whitespace is significant in delimiter name parts.


  5. Prefix name parts only have implicit leading white space but no implicit trailing white space. Trailing explicit whitespace is significant in prefix name parts.


  6. Suffix name parts only have implicit trailing white space but no implicit leading white space. Leading explicit whitespace is significant in suffix name parts.


This means that all entity name parts are generally surrounded by whitespace, but whitespace never accumulates. Delimiters are never surrounded by implicit white space, prefixes are not followed by implicit white space, and suffixes are not preceded by implicit white space. Any whitespace contributed by preceding or succeeding name parts around those special name parts is discarded, whether it was implicit or explicit.
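
The whitespace rules above can be sketched in a few lines of Python. This is one possible reading of the rules, not a normative algorithm. It assumes that a name is given as a list of (partType, text) pairs with partType None for untyped parts, it ignores parts marked "invisible", and the names format_name, NO_IMPLICIT_LEADING and NO_IMPLICIT_TRAILING are hypothetical.

```python
import re

# Part types without implicit white space on one side (rules 4 to 6).
NO_IMPLICIT_LEADING = {"SFX", "DEL"}
NO_IMPLICIT_TRAILING = {"PFX", "DEL"}


def format_name(parts):
    """Join (part_type, text) pairs; part_type is None for untyped parts."""
    out = []
    prev = None  # (owns_gap_after, has_explicit_trailing_space) of last part
    for part_type, text in parts:
        text = re.sub(r"\s+", " ", text)      # rules 1 and 2: collapse runs
        leading = text.startswith(" ")
        trailing = text.endswith(" ")
        owns_before = part_type in NO_IMPLICIT_LEADING
        if prev is not None:
            owns_after, prev_trailing = prev
            if owns_after or owns_before:
                # A special part decides the gap; only its own explicit
                # white space counts, the neighbour's is discarded.
                space = (owns_after and prev_trailing) or (owns_before and leading)
            else:
                space = True                  # rule 3: implicit white space
            if space:
                out.append(" ")
        out.append(text.strip())
        prev = (part_type in NO_IMPLICIT_TRAILING, trailing)
    return "".join(out)


print(format_name([("PFX", "Dr. "), ("GIV", " John "),
                   ("FAM", "Doe"), ("SFX", ", Jr.")]))   # Dr. John Doe, Jr.
print(format_name([("PFX", "d'"), ("FAM", "Artagnan")]))  # d'Artagnan
```

In the first call the prefix keeps its explicit trailing space and the suffix, which carries no explicit leading space, attaches directly to the family name.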


Specializations of Entity Name (EN)

Three restrictions of Entity Name are defined to allow specific constraints for certain kinds of entities: trivial name (TN), person name (PN), and organization name (ON).

2.8.3

Trivial Name (TN) restricts EN

Definition:      A restriction of entity name that is effectively a simple string used for a simple name for things and places.

The TN is an EN that consists of only one name part without any name part type or qualifier. The TN and its single name part are therefore equivalent to a simple character string. This equivalence is expressed by a defined demotion to ST and promotion from ST.

Definition 89:
type TrivialName alias TN extends LIST<ST> {
  SET<CS> use;
  IVL<TS> validTime;
  BL  equals(ANY x);
  ST  formatted;
  demotion ST;
  promotion  TN  (ST x);
};

invariant(TN x) where x.nonNull {
  x.head.nonNull;
  x.tail.isEmpty;
  x.formatted.equals(x.head);
};

invariant(ST x) {
  ((TN)x).head.equals(x);
};
    

Trivial names are typically used for places and things, such as Lake Erie or Washington National Airport:

Example 7:
<name>Lake Erie</name>
<name>Washington National Airport</name>
     
2.8.3.1
Valid Time (validTime : IVL<TS>, inherited from EN)
2.8.3.2
Use Code (use : CS, inherited from EN)

2.8.4

Person Name (PN)

Definition:      A name for a person. A sequence of name parts, such as first name or family name, prefix, suffix, etc.

Since most of the functionality of entity name is in support of person names, the person name (PN) is only a very minor restriction on the entity name part qualifier.

Definition 90:
type PersonName alias PN extends LIST<PNXP> {
  SET<CS> use;
  IVL<TS> validTime;
  BL  equals(ANY x);
  ST  formatted;
};
    
2.8.4.1
Valid Time (validTime : IVL<TS>, inherited from EN)
2.8.4.2
Use Code (use : CS, inherited from EN)

2.8.4.3

Examples

A very simple encoding of "John W. Doe" would be:

Example 8:
<name>
  <given>John</given> <given>W.</given> <family>Doe</family>
</name>
       

None of the special qualifiers need to be mentioned if they are unknown or irrelevant. The next example shows extensive use of multiple given names, prefixes, and suffixes for academic degrees, nobility titles, voorvoegsels ("van"), and professional designations.

Example 9:
<name>
  <prefix qualifier="AC">Dr. phil. </prefix>
  <given>Regina</given>
  <given>Johanna</given>
  <given>Maria</given>
  <prefix qualifier="NB">Gräfin </prefix>
  <prefix qualifier="VV">von </prefix>
  <family qualifier="BR">Hochheim</family>-<family qualifier="SP">Weilenfels</family>
  <suffix qualifier="PR">NCFSA</suffix>
</name>
       

The next example is an organization name, "Health Level Seven, Inc." in simple string form:

Example 10:
<name>Health Level Seven, Inc.</name>
 

and as a fully parsed name

Example 11:
<name>Health Level Seven, <suffix qualifier="LS">Inc.</suffix></name>
 

The following example shows a Japanese name in the three forms: ideographic (Kanji), syllabic (Hiragana), and alphabetic (Romaji).

Example 12:
<name use="IDE"><family>木村</family> <given>道男</given></name>
<name use="SYL"><family>きむら</family> <given>みちお</given></name>
<name use="ABC"><family>KIMURA</family> <given>MICHIO</given></name>
 

2.8.5

Person Name Part (PNXP) restricts ENXP

Definition:      A restriction of entity name part that only allows those entity name part qualifiers applicable to person names. Since the structure of entity name is mostly determined by the requirements of person name, the restriction is very minor.

Definition 91:
protected type PersonNamePart alias PNXP extends ST {
  CS  partType;
  SET<CS>   qualifier;
};

invariant(PNXP x) where x.nonNull {
  x.qualifier.contains("LS").not;
};
    
2.8.5.1
Name Part Type (partType : CS, inherited from ENXP)
2.8.5.2
Qualifier (qualifier : SET<CS>, inherited from ENXP)

2.8.6

Organization Name (ON)

Definition:      A name for an organization. A sequence of name parts.

A name for an organization, such as "Health Level Seven, Inc." An organization name consists only of untyped name parts, prefixes, suffixes, and delimiters.

Definition 92:
type OrganizationName alias ON extends LIST<ONXP> {
  SET<CS> use;
  IVL<TS> validTime;
  BL  equals(ANY x);
  ST  formatted;
};
    
2.8.6.1
Valid Time (validTime : IVL<TS>, inherited from EN)
2.8.6.2
Use Code (use : CS, inherited from EN)

2.8.6.3

Examples

The following is the organization name, "Health Level Seven, Inc." in a simple string form:

Example 13:
<name>Health Level Seven, Inc.</name>
 

And with the legal status "Inc." as a distinguished name part:

Example 14:
<name>Health Level Seven, <suffix qualifier="LS">Inc.</suffix></name>
 

2.8.7

Organization Name Part (ONXP) restricts ENXP

Definition:      A restriction of entity name part that only allows those entity name part qualifiers applicable to organization names.

Definition 93:
protected type OrganizationNamePart alias ONXP extends ST {
  CS  partType;
  SET<CS>   qualifier;
};

invariant(ONXP x) where x.nonNull {
  x.partType.implies("FAM").not;
  x.partType.implies("GIV").not;
};
    
2.8.7.1
Name Part Type (partType : CS, inherited from ENXP)
2.8.7.2
Qualifier (qualifier : SET<CS>, inherited from ENXP)

2.9

Abstract Type Quantity (QTY)

Definition:      The quantity data type is an abstract generalization for all data types (1) whose value set has an order relation (less-or-equal) and (2) where difference is defined in all of the data type's totally ordered value subsets. The quantity type abstraction is needed in defining certain other types, such as the interval and the probability distribution.

Definition 94:
abstract type Quantity alias QTY extends ANY {
  BL   lessOrEqual(QTY x);
  BL   compares(QTY x);
  type QTY diff;
  diff minus(QTY x);
  QTY  plus(diff x);
  BL   isZero;
  BL   lessThan(QTY x);
  BL   greaterOrEqual(QTY x);
  BL   greaterThan(QTY x);
};
    
Figure 8: Quantity Data Types

2.9.1

Properties of Abstract Type Quantity (QTY)

2.9.1.1
Ordering: less-or-equal (lessOrEqual : BL)

Definition:      A predicate expressing an order relation that is reflexive, antisymmetric and transitive, between this quantity and another quantity.

The relation is defined on any totally ordered partition of the quantity data type. A totally ordered partition is a subset of the data type's defined values where all elements have a defined order (e.g., the integer and real numbers are totally ordered.)

By contrast, a partially ordered set is a set where some, but not all, pairs of elements are comparable through the order relation (e.g., a tree structure or the set of physical quantities is a partially ordered set.) Two data values x and y of an ordered type are comparable (x.compares(y)) if the less-or-equal relation holds in either direction (x ≤ y or y ≤ x).

A partial order relation generates totally ordered subsets whose union is the entire set (e.g., the set of all lengths is a totally ordered subset of the set of all physical quantities.)

For example, a tree structure is partially ordered, where the root is considered less or equal to a leaf, but there may not be an order among the leaves. Also, physical quantities are partially ordered, since an order exists only among quantities of the same dimension (e.g., between two lengths, but not between a length and a time.) A totally ordered subset of a tree is a path that transitively connects a leaf to the root. The physical dimension of time is a totally ordered subset of physical quantities.

Definition 95:
invariant (QTY x, y, z)
    where x.nonNull.and(y.nonNull).and(z.nonNull) {
  x.lessOrEqual(x);                         /* reflexive */
  x.lessOrEqual(y).and(y.lessOrEqual(x))    /* antisymmetric */
     .implies(x.equals(y));
  x.lessOrEqual(y).and(y.lessOrEqual(z))    /* transitive */
     .implies(x.lessOrEqual(z));
};
      
2.9.1.2
Equality (equals : BL, inherited from ANY)
2.9.1.3
Comparability (compares : BL)

Definition:      A predicate indicating if this value and the operand can be compared as to which is greater than the other.

Two quantities are comparable if they are both elements of a common totally ordered partition of their data types' value space. The definition is based on QTY.lessOrEqual.

Definition 96:
invariant (QTY x, y, z)
    where x.nonNull.and(y.nonNull).and(z.nonNull) {
  x.compares(y).equals(x.lessOrEqual(y).or(y.lessOrEqual(x)));
};
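
As an informal illustration of comparability in a partially ordered value set, the following Python sketch models physical quantities that are ordered only within the same dimension. The class Measure and its dimension labels are hypothetical and serve only this example; they are not the PQ data type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    value: float
    dimension: str  # hypothetical label such as "length" or "time"

    def less_or_equal(self, other):
        # Defined only within one totally ordered subset (same dimension).
        return self.dimension == other.dimension and self.value <= other.value

    def compares(self, other):
        # Comparable if less-or-equal holds in either direction.
        return self.less_or_equal(other) or other.less_or_equal(self)


two_metres = Measure(2.0, "length")
five_metres = Measure(5.0, "length")
five_seconds = Measure(5.0, "time")
print(two_metres.compares(five_metres))   # True
print(two_metres.compares(five_seconds))  # False
```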
      
2.9.1.4
Difference (minus : QTY)

Definition:      A quantity expressing the "distance" of this quantity from the operand quantity, that must be comparable. The data type of the difference quantity is related to the operand quantities but need not be the same.

A difference is defined in an ordered set if it is semantically meaningful to state that Δ is the difference between the values x and y. This difference Δ must be meaningful independently from the values x and y. This independence exists if for all values u one can meaningfully derive a value v such that Δ would also be the difference between u and v. The judgment for what is meaningful cannot be defined formally.41

The difference has a data type that can express the distance between two values for which the ordering relation is defined (i.e., two elements of a common totally ordered subset.) For example, the difference data type of integer number is integer number, but the difference type of point in time is a physical quantity in the dimension of time. A difference data type is a totally ordered data type.

The difference between two values x minus y must be defined for all x and y in a common totally ordered subset of the data type's value set. Zero is the difference between a value and itself.

Definition 97:
invariant(QTY x, y) where x.compares(y) {
  x.minus(y).nonNull;
  x.minus(x).isZero;
};
      
2.9.1.5
Addition (plus : QTY)

Definition:      The sum of this quantity and its operand. The operand must be of a data type that can express the difference between two values of this quantity's data type.

Definition 98:
invariant(QTY x, y) where x.compares(y) {
  x.plus(y.minus(x)).equals(y);
};
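
The relationship between a quantity type and its difference type can be illustrated with Python's standard datetime module, where the difference of two points in time is a duration rather than another point in time. This is an analogy only; datetime and timedelta are not the TS and PQ data types of this specification.

```python
from datetime import datetime, timedelta

x = datetime(2002, 3, 1, 8, 0)
y = datetime(2002, 3, 4, 20, 30)

difference = y - x                 # y.minus(x): a duration, not a point in time
print(type(difference).__name__)   # timedelta
print(x + difference == y)         # True, x.plus(y.minus(x)).equals(y)
print((x - x) == timedelta(0))     # True, x.minus(x).isZero
```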
      
2.9.1.6
The Zero-Quantity (isZero : BL)

Definition:      The neutral element in the difference and addition operations, i.e., if a quantity is zero, addition to, or subtraction from any other comparable quantity will result in that other quantity.

Definition 99:
invariant(QTY x, y) where x.compares(y) {
  x.minus(x).isZero;
};
      
2.9.1.7
Ordering: less-than (lessThan : BL)

Definition:      A predicate expressing an order relation that is asymmetric and transitive, between this quantity and another quantity. The ordering is the same as QTY.lessOrEqual, but irreflexive.

Definition 100:
invariant (QTY x, y, z)
    where x.nonNull.and(y.nonNull).and(z.nonNull) {
  x.lessThan(y).equals(x.lessOrEqual(y).and(x.equals(y).not));
};
      
2.9.1.8
Ordering: greater-or-equal (greaterOrEqual : BL)

Definition:      A predicate expressing an order relation that is reflexive, antisymmetric and transitive, between this quantity and another quantity. This is the inverse order of QTY.lessOrEqual.

Definition 101:
invariant (QTY x, y, z)
    where x.nonNull.and(y.nonNull).and(z.nonNull) {
  x.greaterOrEqual(y).equals(y.lessOrEqual(x));
};
      
2.9.1.9
Ordering: greater-than (greaterThan : BL)

Definition:      A predicate expressing an order relation that is asymmetric and transitive, between this quantity and another quantity. This is the inverse of QTY.lessThan.

Definition 102:
invariant (QTY x, y, z)
    where x.nonNull.and(y.nonNull).and(z.nonNull) {
  x.greaterThan(y).equals(y.lessThan(x));
};
      

2.10

Integer Number (INT)

Definition:      Integer numbers (-1, 0, 1, 2, 100, 3398129, etc.) are precise numbers that are results of counting and enumerating. Integer numbers are discrete; the set of integers is infinite but countable. No arbitrary limit is imposed on the range of integer numbers. Two NULL flavors are defined for the positive and negative infinity.

Definition 103:
type IntegerNumber alias INT extends QTY {
          INT   successor;
          INT   plus(diff x);
          INT   times(INT x);
  type    INT   diff;
          diff  minus(INT x);
          INT   predecessor;
          INT   negated;
          BL    isNegative;
          BL    nonNegative;
          INT   dividedBy(INT x);
          INT   remainder(INT y);
  literal ST;
};
    

The difference between two INT values is also an INT value.

Since the integer number data type includes all of the semantics of the mathematical integer number concept, the basic operations plus (addition) and times (multiplication) are defined. These operations are defined here as characterizing operations in the sense of ISO 11404, and because these operations are needed in other parts of this specification, namely the semantics of the literal form.

The traditional recursive definitions of addition and multiplication are due to Grassmann, and use the notion of INT.successor.42

Definition 104:
invariant(INT x, y, o) where x.nonNull.and(y.nonNull).and(o.isZero) {
  x.lessThan(x.successor);
  x.plus(o).equals(x);
  x.plus(y.successor).equals(x.plus(y).successor);
  x.times(o).equals(o);
  x.times(y.successor).equals(x.times(y).plus(x));
};
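
The recursive definitions can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions: it covers only non-negative operands, uses Python's built-in arithmetic solely for the successor and for stepping back to the predecessor (y - 1), and the function names are hypothetical.

```python
def successor(n):
    return n + 1


def plus(x, y):
    """x + y for non-negative y, by recursion on y."""
    if y == 0:
        return x                        # x.plus(o).equals(x)
    return successor(plus(x, y - 1))    # x.plus(y.successor) = x.plus(y).successor


def times(x, y):
    """x * y for non-negative x and y, by recursion on y."""
    if y == 0:
        return 0                        # x.times(o).equals(o)
    return plus(times(x, y - 1), x)     # x.times(y.successor) = x.times(y).plus(x)


print(plus(3, 4))   # 7
print(times(3, 4))  # 12
```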
    

2.10.1

Properties of Integer Number (INT)

2.10.1.1
Successor (successor : INT)

Definition:      The INT value that is greater than this INT value but where no INT value exists between this value and its successor.

Definition 105:
invariant(INT x, y) where x.successor(y) {
  x.lessThan(y).and.not(exists(INT z) {
      x.lessThan(z);
      z.lessThan(y);
    });
};
      
2.10.1.2
Addition (plus : INT, inherited from QTY)
Definition 106:
invariant(INT x, y, o) where x.nonNull.and(y.nonNull).and(o.isZero) {
  x.plus(o).equals(x);
  x.plus(y.successor).equals(x.plus(y).successor);
};
      
2.10.1.3
Multiplication (times : INT)

Definition:      The result of multiplying this integer with the operand, equivalent to repeated additions of this integer.

Definition 107:
invariant(INT x, y, o) where x.nonNull.and(y.nonNull).and(o.isZero) {
  x.times(o).equals(o);
  x.times(y.successor).equals(x.times(y).plus(x));
};
      
2.10.1.4
Predecessor (predecessor : INT)

Definition:      The inverse of INT.successor.

Definition 108:
invariant(INT x, y) where x.successor(y) {
  x.successor.predecessor.equals(x);
};
      
2.10.1.5
Negation (negated : INT)

Definition:      The inverse element of the INT value, another INT value, which, when added to that value yields zero (the neutral element.)

Definition 109:
invariant(INT x) where x.nonNull {
  x.plus(x.negated).isZero;
};
      
2.10.1.6
Non-Negative (nonNegative : BL)

Definition:      A predicate indicating whether the INT zero (neutral element) is less or equal to this INT.

Definition 110:
invariant(INT x, o) where x.nonNull.and(o.isZero) {
  x.nonNegative.equals(o.lessOrEqual(x));
};
      
2.10.1.7
Negative (isNegative : BL)

Definition:      A predicate indicating whether this INT is less than zero (not non-negative.)

Definition 111:
invariant(INT x) where x.nonNull {
  x.isNegative.equals(x.nonNegative.not);
};
      
2.10.1.8
Integer Division (dividedBy : INT)

Definition:      The integer division operation of this integer (dividend) with another integer (divisor) is the integer number of times the divisor fits into the dividend.

Definition 112:
invariant(INT dividend, divisor, o, i) 
    where divisor.isZero.not
     .and(o.isZero) {
  dividend.isZero
    .implies(dividend.dividedBy(divisor).equals(o));
  dividend.isZero.not
    .implies(dividend.dividedBy(divisor).equals(
                  absolute(dividend).minus(absolute(divisor))
       .dividedBy(absolute(divisor))
          .successor.times(sign(dividend).times(sign(divisor)))));
};
      
2.10.1.9
Remainder (remainder : INT)

Definition:      The remainder of the integer division.

Definition 113:
invariant(INT x, y) where x.nonNull.and(y.nonNull) {
  x.remainder(y).equals(x.minus(x.dividedBy(y).times(y)));
};
      

This definition of the remainder matches the C and Java programming languages.
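
The sign convention matters for negative operands. The following Python sketch assumes that dividedBy truncates toward zero, as integer division does in C and Java; the helper names divided_by and remainder are hypothetical. Python's own % operator is shown for contrast because it follows a different (floored) convention.

```python
def divided_by(dividend, divisor):
    """Integer division truncating toward zero, as in C and Java."""
    quotient = abs(dividend) // abs(divisor)
    same_sign = (dividend < 0) == (divisor < 0)
    return quotient if same_sign else -quotient


def remainder(x, y):
    # x.minus(x.dividedBy(y).times(y))
    return x - divided_by(x, y) * y


print(remainder(7, 2))    # 1
print(remainder(-7, 2))   # -1
print(-7 % 2)             # 1
```

With truncating division the remainder takes the sign of the dividend, which is why -7 and 2 yield -1 here and not 1.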

2.10.1.10
Literal Form

The literal form of an integer is a simple decimal number, i.e. a string of decimal digits.

Definition 114:
INT.literal ST {
  INT digit : "0"         { $.isZero; }
            | "1"         { $.equals(0.successor); }
            | "2"         { $.equals(1.successor); }
            | "3"         { $.equals(2.successor); }
            | "4"         { $.equals(3.successor); }
            | "5"         { $.equals(4.successor); }
            | "6"         { $.equals(5.successor); }
            | "7"         { $.equals(6.successor); }
            | "8"         { $.equals(7.successor); }
            | "9"         { $.equals(8.successor); };

  INT uint : digit        { $.equals($1); }
           | uint digit   { $.equals($1.times(9.successor).plus($2)); };

  INT : uint              { $.equals($1); }
      | "+" uint          { $.equals($2); }
      | "-" uint          { $.equals($2.negated); };
};
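
A reader of INT literals could follow the grammar directly, as in the following Python sketch. It is illustrative only; the function name parse_int_literal is hypothetical, and the digit check is written out explicitly so that only the ten decimal digits of the grammar are accepted.

```python
DIGITS = "0123456789"


def parse_int_literal(literal):
    """Parse an optional sign followed by one or more decimal digits."""
    sign = 1
    body = literal
    if literal[:1] in ("+", "-"):
        sign = -1 if literal[0] == "-" else 1
        body = literal[1:]
    if not body or any(ch not in DIGITS for ch in body):
        raise ValueError("not an INT literal: %r" % literal)
    value = 0
    for ch in body:
        # uint : uint digit  =>  $1.times(9.successor).plus($2)
        value = value * 10 + DIGITS.index(ch)
    return sign * value


print(parse_int_literal("3398129"))  # 3398129
print(parse_int_literal("-17"))      # -17
```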
      

2.11

Real Number (REAL)

Definition:      Fractional numbers. Typically used whenever quantities are measured, estimated, or computed from other real numbers. The typical representation is decimal, where the number of significant decimal digits is known as the precision.

The term "Real number" in this specification is used to mean that fractional values are covered without necessarily implying the full set of the mathematical real numbers that would include irrational numbers such as π, Euler's number, etc.43

NOTE: This specification defines the real number data type in the broadest sense possible. However, it does not imply that any conforming ITS or implementation must be able to represent the full range of Real numbers, which would not be possible in any finite implementation. HL7's current use cases for the Real number data type are measured and estimated quantities and monetary amounts. These use cases can be handled with a restricted Real value space, rational numbers, and even just very limited decimals (scaled integers.) However, we declare the representations of the real value space as floating point, rational, scaled integer, or digit string, and their various limitations to be out of the scope of this specification.

This specification offers two choices for a number data type. The choice is made as follows: Any number attribute is a real if it is not known for sure that it is an integer. A number is an integer if it is always counted, typically representing an ordinal number. If there are conceivable use cases where such a number would be estimated or averaged, it is not always an integer and thus should use the Real data type.

Definition 115:
type RealNumber alias REAL extends QTY {
  type      REAL  diff;
            diff  minus(REAL x);
            REAL  plus(diff x);
            REAL  negated;
            REAL  times(REAL x);
            REAL  inverted;
            REAL  power(REAL x);
  literal   ST;
            INT   precision;
  demotion  INT;
  promotion REAL  (INT x);
  promotion PQ;
  promotion RTO;
};
    

The algebraic operations are specified here as characterizing operations in the sense of ISO 11404, and because these operations are needed in other parts of this specification.

Unlike the integer numbers, the semantics of the real numbers are not inductively constructed but only intuitively described by the axioms of their algebraic properties. The completeness axioms are intentionally left out so as to make no statement about irrational numbers.

2.11.1

Properties of Real Number (REAL)

2.11.1.1
Comparability (compares : BL, inherited from QTY)

The value set of REAL is totally ordered.

Definition 116:
invariant(REAL x, y)
    where x.nonNull.and(y.nonNull){
  x.compares(y);
}
      
2.11.1.2
Addition (plus : REAL, inherited from QTY)
Definition 117:
invariant(REAL x, y, z, o)
    where x.nonNull.and(y.nonNull).and(z.nonNull).and(o.isZero) {
  x.plus(o).equals(x);                            /* neutral element */
  x.plus(y).plus(z).equals(x.plus(y.plus(z)));    /* associative */
  x.plus(y).equals(y.plus(x));                    /* commutative */

  o.lessOrEqual(x).and(o.lessOrEqual(y)).implies(o.lessOrEqual(x.plus(y)));
  x.lessOrEqual(y).implies(x.plus(z).lessOrEqual(y.plus(z)));
};
      
2.11.1.3
Negation (Inverse Element of Addition) (negated : REAL)

Definition:      A REAL value, which, when added to another REAL value yields zero (the neutral element of addition.)

Definition 118:
invariant(REAL x) where x.nonNull {
  x.plus(x.negated).isZero;
};
      
2.11.1.4
Neutral Element of Multiplication (isOne : BL)

Definition:      A predicate indicating if this value is the number one, i.e., the neutral element of multiplication. There is exactly one real number that has this property.

Definition 119:
invariant(REAL x, y) where x.nonNull.and(y.nonNull) {
  x.isOne.and(y.isOne).implies(x.equals(y));
  x.isOne.and(y.isZero).implies(x.equals(y).not);
};
      
2.11.1.5
Multiplication (times : REAL)

Definition:      An operation in REAL that forms an abelian group and is related to addition by the law of distribution.

Definition 120:
invariant(REAL x, y, z, i, o)
    where x.nonNull.and(y.nonNull).and(z.nonNull)
     .and(i.isOne).and(o.isZero) {
  x.times(o).equals(o);
  x.times(i).equals(x);                           /* neutral element */
  x.times(y).times(z).equals(x.times(y.times(z)));/* associative */
  x.times(y).equals(y.times(x));                  /* commutative */

  x.times(y.plus(z))                              /* distributive */
    .equals(x.times(y).plus(x.times(z)));

  o.lessOrEqual(x).and(o.lessOrEqual(y)).implies(o.lessOrEqual(x.times(y)));
};
      
2.11.1.6
Inverse Element of Multiplication (inverted : REAL)

Definition:      A REAL value, which, when multiplied with another REAL value yields one (the neutral element of multiplication). Zero (the neutral element of addition) has no inverse element.

Definition 121:
invariant(REAL x, i) where x.isZero.not.and(i.isOne) {
  x.times(x.inverted).equals(i);
};
      
2.11.1.7
Homomorphism of INT into REAL ( : INT)

The INT and REAL data types are related by a homomorphism that maps every value in INT to a value in REAL whereby the algebraic properties of INT are preserved. This means, an integer can be promoted to a real and a real can be demoted to an integer by means of rounding off the fractional part.

Definition 122:
invariant(INT n, m) where n.nonNull.and(m.nonNull) {
  ((REAL)n.plus(m)).equals(((REAL)n).plus((REAL)m));
  ((REAL)n.times(m)).equals(((REAL)n).times((REAL)m));
};
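
The homomorphism can be checked informally with exact rational arithmetic. The following Python sketch uses the standard fractions module as a stand-in for REAL; the function names promote and demote are hypothetical, and demoting by truncation toward zero is an assumption, since the text above only speaks of rounding off the fractional part.

```python
from fractions import Fraction


def promote(n):
    """INT to REAL: exact, no information is lost."""
    return Fraction(n)


def demote(x):
    """REAL to INT: drops the fractional part (truncation is an assumption)."""
    return int(x)


n, m = 3, -4
print(promote(n + m) == promote(n) + promote(m))  # True
print(promote(n * m) == promote(n) * promote(m))  # True
print(demote(Fraction(7, 2)))                     # 3
```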
      
2.11.1.8
Exponentiation (power : REAL)

Definition:      The basis of exponentiation is the iterative multiplication of a real number, and extended to rational exponents as the inverse operation.

We only list certain common properties of exponentiation.

Definition 123:
invariant(REAL x, y, z, o, i)
    where x.nonNull.and(y.nonNull).and(z.nonNull)
     .and(o.isZero).and(i.isOne) {
  forall(INT n) where n.nonNull {
    n.greaterThan(o).implies(
       x.power(n).equals(x.times(x.power(n.predecessor))));
    n.lessThan(o).implies(
       x.power(n).equals(x.power(n.negated).inverted);
  }
  x.power(o).equals(i);
  x.power(i).equals(x);
  x.power(y).power(z).equals(x.power(y.times(z)));
  x.power(y).times(x.power(z)).equals(x.power(y.plus(z)));
  x.power(y).inverted.equals(x.power(y.negated));
  x.power(y).power(y.inverted).equals(x);
};
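
The recursive clauses of Definition 123 for integer exponents can be traced with a small sketch. It is illustrative only and not part of this specification: Python's `fractions.Fraction` stands in for REAL so that the arithmetic is exact, and the function name `power` simply mirrors the property name.

```python
from fractions import Fraction


def power(x: Fraction, n: int) -> Fraction:
    """Integer power of x, following the recursive clauses of Definition 123."""
    if n == 0:
        # x.power(o).equals(i)
        return Fraction(1)
    if n > 0:
        # x.power(n).equals(x.times(x.power(n.predecessor)))
        return x * power(x, n - 1)
    if x == 0:
        # Zero has no inverse element, so negative powers of zero are undefined.
        raise ZeroDivisionError("zero has no inverse element")
    # x.power(n).equals(x.power(n.negated).inverted)
    return 1 / power(x, -n)
```

With exact rationals, the listed property x.power(y).times(x.power(z)).equals(x.power(y.plus(z))) can be checked directly for integer exponents.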
      
2.11.1.9
Literal Form

The literal form of a real number is a string of decimal digits with an optional leading "+" or "-" sign, an optional decimal point, and optional exponential notation using a case-insensitive "e" between the mantissa and the exponent. The number of significant digits must conform to the precision property.

Definition 124:
REAL.literal ST {
  REAL : mantissa                   { $.equals($1); }
       | mantissa /[eE]/ INT        { $.equals($1
                                         .times(10.power($3))); };

  REAL mantissa
       : /0*/ 0                     { $.isZero; $.precision.equals(1); }
       | /0*/ "." /0*/              { $.isZero; $.precision.equals(
                                                  $3.length.successor); }
       | /0*/ "." /0*/ fractional   { $.equals($4);
                                      $.precision.equals($4.precision); }
       | integer                    { $.equals($1); }
       | integer "." fractional     { $.equals($1.plus($3));
                                      $.precision.equals($1.precision
                                        .plus($3.precision)); };

  REAL integer
       : uintval                    { $.equals($1); }
       | "+" uintval                { $.equals($2); }
       | "-" uintval                { $.equals($2.negated); };

  REAL uintval : /0*/ uint          { $.equals($2); };

  REAL uint : digit                 { $.equals($1);
                                      $.precision.equals(1); }
            | uint digit            { $.equals($1.times(10).plus($2));
                                      $.precision.equals(
                                        $1.precision.successor); };

  REAL fractional
       : digit                      { $.equals($1.times(10.inverted));
                                      $.precision.equals(1); }
       | digit fractional           { $.equals(
                                        $1.plus($2).times(10.inverted));
                                      $.precision.equals(
                                        $2.precision.successor); };

  INT digit : /[0-9]/               { $.equals($1); }
};
      

Examples of real literals for two thousand are 2000, 2000., 2e3, 2.0e+3, +2.0e+3.
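
As a minimal sketch of how such literals might be read, the following Python function checks a string against a regular expression and converts it to an exact decimal value. The regular expression is an approximation of Definition 124 chosen so that all five examples above are accepted; the names `REAL_LITERAL` and `parse_real` are hypothetical, and the sketch recovers only the value, not the precision property.

```python
import re
from decimal import Decimal

# Optional sign, digits with an optional decimal point (or a leading point),
# and an optional exponent introduced by a case-insensitive "e".
REAL_LITERAL = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")


def parse_real(literal: str) -> Decimal:
    """Return the value of a REAL literal, or raise ValueError if it is malformed."""
    if REAL_LITERAL.fullmatch(literal) is None:
        raise ValueError(f"not a REAL literal: {literal!r}")
    return Decimal(literal)
```

All five literals for two thousand map to the same value, which illustrates that they differ only in their representation.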

Note that the literal form does not carry type information. For example, "2000" is a valid representation of both a real number and an integer number. No trailing decimal point is used to disambiguate from integer numbers. An ITS that uses this literal form must recover the type information from other sources.

2.11.1.10
Precision of the Decimal Form (precision : INT)

Definition:      The number of significant digits of the decimal representation.

Precision is formally defined based on the REAL.literal.

The precision attribute is only the precision of a decimal digit representation, not the accuracy of the real number value.

The purpose of the precision property for the real number data type is to faithfully capture the whole information presented to humans in a number. The number of decimal digits shown conveys information about the uncertainty (i.e., precision and accuracy) of a measured value.

NOTE: the precision of the representation is independent of the uncertainty (precision and accuracy) of a measurement result. If the uncertainty of a measurement result is important, one should specify uncertain values as PPD.

The rules for what digits are significant are as follows:

  1. All non-zero digits are significant.


  2. All zeroes to the right of a significant digit are significant.


  3. When all digits in the number are zero, the zero digit immediately to the left of the decimal point is significant (and because of rule 2, all following zeroes are thus significant too).


NOTE: these rules of significance differ slightly from the more casual rules taught in school. Notably, trailing zeroes before the decimal point are consistently regarded as significant here. Elsewhere, e.g., 2000 is ambiguous as to whether the zeroes are significant. This deviation from the common custom is warranted for the purpose of unambiguous communication.
Table 29: Examples for the Precision of Real Number Literals.
Literal Number of Significant Digits
2000 has 4 significant digits.
2e3 has 1 significant digit, use this if one would naturally say "2000" but precision is only 1.
0.001 has 1 significant digit.
1e-3 has 1 significant digit, use this if one would naturally say "0.001" but precision is only 1.
0 has 1 significant digit.
0.0 has 2 significant digits.
000.0 has 2 significant digits.
0.00 has 3 significant digits.
4.10 has 3 significant digits.
4.09 has 3 significant digits.
4.1 has 2 significant digits.
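
The three rules of significance can be applied mechanically. The sketch below is one possible reading of them in Python, not a normative algorithm; the function name `significant_digits` and the regular expression are assumptions made for illustration. The exponent part of a literal is ignored because it does not contribute digits to the mantissa.

```python
import re

# Sign, integer digits, optional fraction digits, optional exponent.
_LITERAL = re.compile(r"[+-]?(\d*)(?:\.(\d*))?(?:[eE][+-]?\d+)?")


def significant_digits(literal: str) -> int:
    """Count significant digits of a REAL literal using rules 1 to 3."""
    match = _LITERAL.fullmatch(literal)
    if match is None or not (match.group(1) or match.group(2)):
        raise ValueError(f"not a REAL literal: {literal!r}")
    integer_part = match.group(1)
    fraction_part = match.group(2) or ""
    # Rules 1 and 2: everything from the first non-zero digit onward counts.
    from_first_nonzero = (integer_part + fraction_part).lstrip("0")
    if from_first_nonzero:
        return len(from_first_nonzero)
    # Rule 3: all digits are zero, so the zero left of the decimal point
    # and every zero after the point are significant.
    return 1 + len(fraction_part)
```

Applied to the literals of Table 29, this reading reproduces the listed counts.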

The precision of the representation should match the uncertainty of the value. However, precision of the representation and uncertainty of the value are separate independent concepts. Refer to Section 4.4.2 for details about uncertain real numbers.

For example, "0.123" has 3 significant digits in the representation, but the uncertainty of the value may be in any digit shown or not shown, i.e., the uncertainty may be 0.123±0.0005, 0.123±0.005 or 0.123±0.00005, etc. Note that external representations should adjust their representational precision to the uncertainty of the value. However, since the precision in the digit string is granular to 0.5 of the least significant digit, while uncertainty may be anywhere between these "grid lines", 0.123±0.005 would also be an adequate representation for the value between 0.118 and 0.128.

NOTE: on a character based Implementation Technology the ITS need not represent the precision as an explicit attribute if numbers are represented as decimal digit strings. In that case, the ITS must abide by the rules of an unambiguous determination of significant digits. A number representation must not produce more or less significant digits than were originally in that number. Conformance can be tested through round-trip encoding — decoding — encoding.
2.11.1.11
Equality (equals : BL, inherited from ANY)

Equality of real numbers is determined based on the value and precision. The value with the higher precision is rounded to the precision of the other value, and then the comparison is made.

Table 30: Examples for equality and inequalities of REAL values.
value precision equals value precision
3.14 3 true 3.14 3
3.140000 7 true 3.14 3
3.1415 5 true 3.14 3
3.1415 5 false 3.1400 5
4 1 false 3 1
NOTE: a raw equality test on real numbers is unreasonable for most practical purposes, since infinitesimal equality is rarely meaningful in practice but may lead to false negatives. This definition of equality is designed to be reasonably useful for simple cases. For more sophisticated cases it is recommended to compare decimal numbers based on intervals, that is, to test whether a real value falls within a certain range (interval).
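
The rounding rule for equality can be sketched with Python's `decimal` module. This is an illustration under stated assumptions rather than a prescribed implementation: the specification does not name a rounding mode, so half-up rounding is assumed here, and the helper names `round_to` and `real_equals` are hypothetical.

```python
from decimal import Context, Decimal, ROUND_HALF_UP


def round_to(value: Decimal, precision: int) -> Decimal:
    """Round value to the given number of significant digits (half-up assumed)."""
    return Context(prec=precision, rounding=ROUND_HALF_UP).plus(value)


def real_equals(a: Decimal, a_precision: int, b: Decimal, b_precision: int) -> bool:
    """Compare two REAL values after rounding to the lower of the two precisions."""
    common = min(a_precision, b_precision)
    return round_to(a, common) == round_to(b, common)
```

Under these assumptions the function reproduces the rows of Table 30, for example 3.1415 (precision 5) equals 3.14 (precision 3) but not 3.1400 (precision 5).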

2.12

Ratio (RTO)

Definition:      A quantity constructed as the quotient of a numerator quantity divided by a denominator quantity. Common factors in the numerator and denominator are not automatically cancelled out. The RTO data type supports titers (e.g., "1:128") and other quantities produced by laboratories that truly represent ratios. Ratios are not simply "structured numerics"; in particular, blood pressure measurements (e.g., "120/60") are not ratios. In many cases the REAL should be used instead of the RTO.

Ratios are different from rational numbers, i.e., in ratios common factors in the numerator and denominator never cancel out. A ratio of two real or integer numbers is not automatically reduced to a real number.

NOTE: This data type is not defined to generally represent rational numbers. It is used only if common factors in numerator and denominator are not supposed to cancel out. This is only rarely the case. For observation values, ratios occur almost exclusively with titers.
Definition 125:
type Ratio<QTY N, QTY D> alias RTO extends QTY {
  N numerator;
  D denominator;
  demotion  REAL;
  demotion  PQ;
};
    

The default value for both numerator and denominator is the integer number 1 (one.) The denominator may not be zero.

NOTE: This data type is defined as a generic data type but discussed in the context of the other quantity-related data types. The reason for defining RTO as a generic data type is so that it can be constrained precisely as to what the numerator and denominator types should be. (§ )

2.12.1

Properties of Ratio (RTO)

2.12.1.1
Numerator (numerator : N, default 1)

Definition:      The quantity that is being divided in the ratio. The default is the integer number 1 (one.)

2.12.1.2
Denominator (denominator : D, default 1)

Definition:      The quantity that divides the numerator in the ratio. The default is the integer number 1 (one.) The denominator must not be zero.

Definition 126:
invariant(RTO x) where x.nonNull {
  x.denominator.isZero.not;
};
      
2.12.1.3
Literal Form

A ratio literal form exists for all ratios where both numerator and denominator have literal forms. A ratio literal is simply the numerator literal, a colon as separator, and the denominator literal. When the colon and denominator are missing, the integer number 1 is assumed as the denominator.

Definition 127:
RTO.literal ST {
  RTO : QTY                 { $.numerator.equals($1);
                              $.denominator.equals((INT)1); }
      | QTY ":" QTY         { $.numerator.equals($1);
                              $.denominator.equals($3); };
};
     

For example, the rubella virus antibody titer value 1:64 could be represented using the literal "1:64".
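
A minimal sketch of reading a ratio literal follows. It is restricted to plain numeric numerators and denominators (the generic RTO admits any QTY), and the class name `Ratio` and function name `parse_ratio` are assumptions made for illustration. The point of the sketch is that common factors are kept, so "2:128" and "1:64" remain distinct ratios.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Ratio:
    numerator: Decimal
    denominator: Decimal = Decimal(1)  # default denominator is 1

    def __post_init__(self) -> None:
        # Definition 126: the denominator must not be zero.
        if self.denominator == 0:
            raise ValueError("the denominator must not be zero")


def parse_ratio(literal: str) -> Ratio:
    """Parse "numerator" or "numerator:denominator" without reducing it."""
    numerator, separator, denominator = literal.partition(":")
    if not separator:
        return Ratio(Decimal(numerator))
    return Ratio(Decimal(numerator), Decimal(denominator))
```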

2.13

Physical Quantity (PQ)

Definition:      A dimensioned quantity expressing the result of measuring.

Table 32: Property Summary of Physical Quantity
Name Type Description
value REAL The magnitude of the quantity measured in terms of the unit.
unit CS The unit of measure specified in the Unified Code for Units of Measure (UCUM) [http://aurora.rg.iupui.edu/UCUM].
translation SET<PQR> An alternative representation of the same physical quantity expressed in a different unit, of a different unit code system and possibly with a different value.
canonical PQ A physical quantity expressed in a canonical unit. In any given unit system, every physical dimension can be assigned one canonical unit. Defining the canonical unit is not the subject of this specification; it only asserts that such a canonical unit exists (and can be arbitrarily chosen) for every physical quantity. An abstract physical quantity is equal to its canonical form.
REAL
Definition 128:
type PhysicalQuantity alias PQ extends QTY {
            REAL  value;
            CS    unit;
            BL    equals(PQ x);
            BL    lessOrEqual(PQ x);
            BL    compares(PQ x);
            PQ    canonical;
            SET<PQR> translation;

  type      PQ    diff;
            diff  minus(PQ x);
            PQ    plus(diff x);
            PQ    negated;
            PQ    times(REAL x);
            PQ    times(PQ x);
            PQ    inverted;
            PQ    power(INT x);

  literal   ST;
  demotion  REAL;

            REAL  originalValue;
            CV    originalUnit;
};
    

2.13.1

Physical Quantity Representation (PQR) extends CV

Definition:      An extension of the coded value data type representing a physical quantity using a unit from any code system. Used to show an alternative representation for a physical quantity.

Table 33: Property Summary of Physical Quantity Representation
Name Type Description
value REAL The magnitude of the measurement value in terms of the unit specified by this code.
code ST The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
codeSystem UID Specifies the code system that defines the code.
codeSystemName ST A common name of the coding system.
codeSystemVersion ST If applicable, a version descriptor defined specifically for the given code system
displayName ST A name or title for the code, under which the sending system shows the code value to its users.
originalText ED The text or phrase used as the basis for the coding.
Definition 129:
type PhysicalQuantityRepresentation alias PQR extends CV {
  REAL value;
}
    
2.13.1.1
Value (value : REAL)

Definition:      The magnitude of the measurement value in terms of the unit specified by this code.

2.13.1.2
Code (code : ST, default NULL, inherited from CD)
2.13.1.3
Code System (codeSystem : UID, inherited from CD)
2.13.1.4
Code System Name (codeSystemName : ST, default NULL, inherited from CD)
2.13.1.5
Code System Version (codeSystemVersion : ST, default NULL, inherited from CD)
2.13.1.6
Display Name (displayName : ST, default NULL, inherited from CD)
2.13.1.7
Original Text (originalText : ST, default NULL, inherited from CD)

2.13.2

Properties of Physical Quantity (PQ)

2.13.2.1
Magnitude Value (value : REAL)

Definition:      The magnitude of the quantity measured in terms of the unit.

2.13.2.2
Unit of Measure (unit : CS, default 1)

Definition:      The unit of measure specified in the Unified Code for Units of Measure (UCUM) [http://aurora.rg.iupui.edu/UCUM].

NOTE: Equality of physical quantities does not require the values and units to be equal independently. Value and unit are only how we represent physical quantities. For example, 1 m equals 100 cm. Although the units are different and the values are different, the physical quantities are equal! Therefore one should never expect a particular unit for a physical quantity but instead provide automated conversion between different comparable units.
2.13.2.3
Translation (translation : SET<PQR>)

Definition:      An alternative representation of the same physical quantity expressed in a different unit, of a different unit code system and possibly with a different value.

Physical quantities semantically are the results of measurement acts. Although physical quantities are represented as pairs of value and unit, semantically, a physical quantity is more than that. To find out whether two physical quantities are equal, it is not enough to compare equality of their two values and units independently. For example, 100 cm equals 1 m although neither values nor units are equal. To define equality we introduce the notion of a canonical form.

2.13.2.4
Canonical Form (canonical : PQ)

Definition:      A physical quantity expressed in a canonical unit. In any given unit system, every physical dimension can be assigned one canonical unit. Defining the canonical unit is not the subject of this specification; it only asserts that such a canonical unit exists (and can be arbitrarily chosen) for every physical quantity. An abstract physical quantity is equal to its canonical form.

Definition 130:
invariant(PQ x, y) where x.nonNull.and(y.nonNull) {
  x.canonical.equals(x);
};
      

For example, for a unit system based on the Système International (SI) one can define the canonical form as (a) the product of only the base units; (b) without prefixes; where (c) only multiplication and exponents are used (no division operation); and (d) where the seven base units appear in a defined ordering (e.g., m, s, g, ...). Thus, 1 mm Hg would be expressed as 133322 m⁻¹ s⁻² g. As can be seen, the rules for building the canonical form of units may be quite complex. However, for the semantic specification it does not matter how the canonical form is built, nor what specific canonical form is chosen, only that some canonical form could be defined.

2.13.2.5
Equality (equals : BL, inherited from ANY)

Two physical quantities are equal if both the values and the units of their canonical forms are equal.

Definition 131:
invariant(PQ x, y) where x.nonNull.and(y.nonNull) {
  x.equals(y).equals(x.canonical.value.equals(y.canonical.value)
                .and(x.canonical.unit.equals(y.canonical.unit)));
};
      
2.13.2.6
Comparability (compares : BL, inherited from QTY)

Two physical quantities compare each other (and have an ordering and difference) if the units of their canonical forms are equal.

Definition 132:
invariant(PQ x, y) where x.nonNull.and(y.nonNull) {
  x.compares(y).equals(x.canonical.unit.equals(y.canonical.unit));
};
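
Definitions 130 to 132 can be illustrated with a small sketch in which canonicalization is a table lookup. Everything concrete here is an assumption made for illustration: the `CANONICAL` table covers only a few length and time units, metre and second are arbitrarily chosen as canonical units, and the class name `PQ` with methods `canonical`, `compares` and `equals` mirrors the property names only loosely. Precision is ignored; values are exact fractions.

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical conversion table: unit -> (factor to canonical unit, canonical unit).
CANONICAL = {
    "m": (Fraction(1), "m"),
    "cm": (Fraction(1, 100), "m"),
    "mm": (Fraction(1, 1000), "m"),
    "s": (Fraction(1), "s"),
    "min": (Fraction(60), "s"),
    "h": (Fraction(3600), "s"),
}


@dataclass(frozen=True)
class PQ:
    value: Fraction
    unit: str

    def canonical(self) -> "PQ":
        """The same quantity expressed in the canonical unit of its dimension."""
        factor, unit = CANONICAL[self.unit]
        return PQ(self.value * factor, unit)

    def compares(self, other: "PQ") -> bool:
        """Definition 132: comparable if the canonical units are equal."""
        return self.canonical().unit == other.canonical().unit

    def equals(self, other: "PQ") -> bool:
        """Definition 131: equal if canonical values and canonical units are equal."""
        a, b = self.canonical(), other.canonical()
        return a.value == b.value and a.unit == b.unit
```

With this table, `PQ(Fraction(1), "m").equals(PQ(Fraction(100), "cm"))` holds even though neither the values nor the units are equal on their own.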
      
2.13.2.7
Neutral Element of Multiplication (isOne : BL)

Definition:      A predicate indicating if this value is the number one, i.e., the neutral element of multiplication. There is exactly one physical quantity that has this property and is called the unity.

Definition 133:
invariant(PQ x, y) where x.nonNull.and(y.nonNull) {
  x.isOne.and(y.isOne).implies(x.equals(y));
  x.isOne.and(y.isZero).implies(x.equals(y).not);
};
      

Algebraic operations are defined for physical quantities because they are characterizing operations in the sense of ISO 11404 and because this specification makes use of them when defining the literal form.

2.13.2.8
Multiplication (times : PQ)

Definition:      The product of two physical quantities is the product of their values times the product of their units.

Definition 134:
invariant(PQ x, y, z, i, o)
    where x.nonNull.and(y.nonNull).and(z.nonNull)
     .and(o.isZero).and(i.isOne) {
  x.times(o).equals(o);
  x.times(i).equals(x);                           /* neutral element */
  x.times(y).times(z).equals(x.times(y.times(z)));/* associative */
  x.times(y).equals(y.times(x));                  /* commutative */

  o.lessOrEqual(x).and(o.lessOrEqual(y)).implies(o.lessOrEqual(x.times(y)));
};
      
2.13.2.9
Inverse Element of Multiplication (inverted : PQ)

Definition:      A PQ value which, when multiplied with another PQ value, yields one (the neutral element of multiplication). Zero (the neutral element of addition) has no inverse element. The quotient of two comparable quantities is comparable to the unity (the unit 1).

Definition 135:
invariant(PQ x, y, i)
    where x.nonNull.and(y.nonNull).and(i.isOne) {
  x.times(x.inverted).equals(i);
  x.compares(y).implies(x.times(y.inverted).compares(i));
};
      
2.13.2.10
Real Multiplication (times : PQ)

Definition:      Multiplication with a real number forms a scaled quantity. A scaled quantity is comparable to its original quantity.

If two quantities Q1 and Q2 compare each other, there exists a real number r such that r = Q1 / Q2.

Definition 136:
invariant(PQ x; REAL r)
    where x.nonNull.and(r.nonNull) {
  x.times(r).value.equals(x.value.times(r));
  x.times(r).compares(x);
};
      
2.13.2.11
Homomorphism of REAL into PQ ( : REAL)

A REAL value can be converted to a PQ value with the unity, i.e., the unit 1 (one). Likewise, a physical quantity that is comparable to the unity can be converted to a real number.

Definition 137:
invariant(PQ x, unity) 
    where x.nonNull.and(unity.isOne)
     .and(x.compares(unity)) {
  unity.times((REAL)x).equals(x);
};
      
2.13.2.12
Exponentiation (power : PQ)

Definition:      A physical quantity can be raised to an integer power.

Definition 138:
invariant (PQ x, i; INT n, o) 
    where x.nonNull.and(i.isOne)
     .and(n.nonNull).and(o.isZero) {
  x.power(o).equals(i);
  n.greaterThan(o).implies(
      x.power(n).equals(x.times(x.power(n.predecessor))));
  n.lessThan(o).implies(
      x.power(n).equals(x.power(n.negated).inverted));
}
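
The multiplicative operations of Definitions 134, 135 and 138 can be sketched by tracking unit exponents alongside the value. This is an illustrative model only: the class name `Quantity`, the representation of a unit as sorted (base unit, exponent) pairs, and the free functions below are assumptions, and all quantities are taken to be in canonical form already.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Quantity:
    value: Fraction
    dims: tuple = ()  # sorted (base unit, exponent) pairs, e.g. (("m", 1), ("s", -2))


def _dims(exponents: dict) -> tuple:
    """Normalize an exponent mapping: drop zero exponents and sort by unit."""
    return tuple(sorted((unit, e) for unit, e in exponents.items() if e != 0))


UNITY = Quantity(Fraction(1))  # the neutral element of multiplication


def times(x: Quantity, y: Quantity) -> Quantity:
    """Product of the values together with the product of the units."""
    exponents = dict(x.dims)
    for unit, e in y.dims:
        exponents[unit] = exponents.get(unit, 0) + e
    return Quantity(x.value * y.value, _dims(exponents))


def inverted(x: Quantity) -> Quantity:
    if x.value == 0:
        raise ZeroDivisionError("zero has no inverse element")
    return Quantity(1 / x.value, _dims({unit: -e for unit, e in x.dims}))


def power(x: Quantity, n: int) -> Quantity:
    """Integer power, following the recursive clauses of Definition 138."""
    if n == 0:
        return UNITY
    if n > 0:
        return times(x, power(x, n - 1))
    return inverted(power(x, -n))


def compares(x: Quantity, y: Quantity) -> bool:
    return x.dims == y.dims
```

In this model the quotient of two comparable quantities has no remaining unit exponents, so it is comparable to the unity, as stated for the inverse element.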
      
2.13.2.13
Addition (plus : PQ)

Definition:      Two physical quantities that compare each other can be added.

Definition 139:
invariant (PQ x, y)
    where x.compares(y) {
  x.canonical.plus(y.canonical).value
     .equals(x.canonical.value.plus(y.canonical.value));
};
      
2.13.2.14
Literal Form

The literal form for a physical quantity is a real number literal followed by optional white space and a character string representing a valid code in the Unified Code for Units of Measure (UCUM) [http://aurora.rg.iupui.edu/UCUM].

Definition 140:
PQ.literal ST {
  PQ : REAL unit    { $.value.equals($1);
                      $.unit.equals($2); };
  CS unit : ST      { $.value.equals($1);
                      $.codeSystem.equals(2.16.840.1.113883.6.8); };
};
      

For example 20 minutes is "20 min".
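
A minimal sketch of splitting such a literal into its value and unit parts follows. It does not validate the unit against UCUM; any run of non-space characters is accepted as a unit string. The names `PQ_LITERAL` and `parse_pq` are hypothetical, and falling back to the unit "1" when no unit is given is an assumption based on the default of the unit property.

```python
import re
from decimal import Decimal

PQ_LITERAL = re.compile(
    r"(?P<value>[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)"  # REAL literal
    r"\s*"                                                      # optional white space
    r"(?P<unit>\S*)"                                            # unit code, not validated
)


def parse_pq(literal: str) -> tuple:
    """Split a PQ literal into (value, unit); the unit defaults to "1"."""
    match = PQ_LITERAL.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a PQ literal: {literal!r}")
    return Decimal(match.group("value")), match.group("unit") or "1"
```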

2.14

Monetary Amount (MO)

Definition:      A monetary amount is a quantity expressing the amount of money in some currency. Currencies are the units in which monetary amounts are denominated in different economic regions. While the monetary amount is a single kind of quantity (money) the exchange rates between the different units are variable. This is the principal difference between physical quantity and monetary amounts, and the reason why currency units are not physical units.

Table 34: Property Summary of Monetary Amount
Name Type Description
value REAL The magnitude of the monetary amount in terms of the currency unit.
currency CS The currency unit as defined in ISO 4217.
Definition 141:
type MonetaryAmount alias MO extends QTY {
          REAL value;
          CS   currency;
  type    MO   diff;
          MO   plus(diff x);
          diff minus(MO x);
          MO   negated;
          MO   times(REAL x);
  literal ST;
};
    

2.14.1

Properties of Monetary Amount (MO)

2.14.1.1
Value (value : REAL)

Definition:      The magnitude of the monetary amount in terms of the currency unit.

NOTE: monetary amounts are usually precise to 0.01 (one cent, penny, paisa, etc.) For large amounts, it is important not to store monetary amounts in floating point registers, since this may lose precision. However, this specification does not define the internal storage of real numbers as fixed or floating point numbers.

The precision attribute of the real number type is the precision of the decimal representation, not the precision of the value. The real number type has no notion of uncertainty or accuracy. For example, "1.99 USD" (precision 3) times 7 is "13.93 USD" (precision 4) and should not be rounded to "13.9" to keep the precision constant.

2.14.1.2
Currency (currency : CS)

Definition:      The currency unit as defined in ISO 4217.

Table 35: Domain Currency:
code name definition
ARS Argentine Peso Argentine Peso, monetary currency of Argentina
AUD Australian Dollar Australian Dollar, monetary currency of Australia
BRL Brazilian Real Brazilian Real, monetary currency of Brazil
CAD Canadian Dollar Canadian Dollar, monetary currency of Canada
CHF Swiss Franc Swiss Franc, monetary currency of Switzerland
CLF Unidades de Formento Unidades de Formento, monetary currency of Chile
CNY Yuan Renminbi Yuan Renminbi, monetary currency of China
DEM Deutsche Mark Deutsche Mark, monetary currency of Germany
ESP Spanish Peseta Spanish Peseta, monetary currency of Spain
EUR Euro Euro, monetary currency of European Union
FIM Markka Markka, monetary currency of Finland
FRF French Franc French Franc, monetary currency of France
GBP Pound Sterling Pound Sterling, monetary currency of United Kingdom
ILS Shekel Shekel, monetary currency of Israel
INR Indian Rupee Indian Rupee, monetary currency of India
JPY Yen Yen, monetary currency of Japan
KRW Won Won, monetary currency of Korea (South)
MXN Mexican Nuevo Peso Mexican Nuevo Peso, monetary currency of Mexico
NLG Netherlands Guilder Netherlands Guilder, monetary currency of Netherlands
NZD New Zealand Dollar New Zealand Dollar, monetary currency of New Zealand
PHP Philippine Peso Philippine Peso, monetary currency of Philippines
RUR Russian Ruble Russian Ruble, monetary currency of Russian Federation
THB Baht Baht, monetary currency of Thailand
TRL Lira Lira, monetary currency of Turkey
TWD Taiwan Dollar Taiwan Dollar, monetary currency of Taiwan
USD US Dollar US Dollar, monetary currency of United States
ZAR Rand Rand, monetary currency of South Africa
2.14.1.3
Equality (equals : BL, inherited from ANY)

Two MO values are equal if both their values and their currency units are equal.

Definition 142:
invariant(MO x, y) where x.nonNull.and(y.nonNull) {
  x.equals(y).equals(x.value.equals(y.value)
                .and(x.currency.equals(y.currency)));
};
      
2.14.1.4
Comparability (compares : BL, inherited from QTY)

Two MO values compare each other (and have an ordering and difference) if their currency units are equal.

If the currencies are not equal, the amounts cannot be compared. Conversion between the currencies is outside the scope of this specification. In practice, foreign exchange rates are highly variable not only over long and short amounts of time, but also depending on location and access to currency trade markets.

Definition 143:
invariant(MO x, y) where x.nonNull.and(y.nonNull) {
  x.compares(y).equals(x.currency.equals(y.currency));
};
      
2.14.1.5
Addition (plus : MO)

Definition:      Two monetary amounts can be added if they are denominated in the same currency.

Definition 144:
invariant (MO x, y)
    where x.currency.equals(y.currency) {
  x.plus(y).currency.equals(x.currency);
  x.plus(y).value.equals(x.value.plus(y.value));
};
      
2.14.1.6
Real Multiplication (times : MO)

Definition:      Multiplication with a real number forms a scaled quantity. A scaled quantity is comparable to its original quantity.

Definition 145:
invariant(MO x; REAL r) where x.nonNull.and(r.nonNull) {
  x.times(r).value.equals(x.value.times(r));
  x.times(r).currency.equals(x.currency);
};
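
Definitions 143 to 145 can be sketched with exact decimal arithmetic. The class name `MO` and its methods mirror the property names, but the representation (a `Decimal` value and a plain currency code string) is an assumption made for illustration, not a prescribed storage format.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class MO:
    value: Decimal
    currency: str  # currency code, e.g. "USD"

    def compares(self, other: "MO") -> bool:
        """Definition 143: comparable only within the same currency."""
        return self.currency == other.currency

    def plus(self, other: "MO") -> "MO":
        """Definition 144: addition is defined for equal currencies only."""
        if not self.compares(other):
            raise ValueError("amounts in different currencies cannot be added")
        return MO(self.value + other.value, self.currency)

    def times(self, r: Decimal) -> "MO":
        """Definition 145: scaling keeps the currency."""
        return MO(self.value * r, self.currency)
```

Using decimal rather than binary floating point keeps the digits exact: scaling 1.99 USD by 7 yields exactly 13.93 USD, whereas in binary floating point even `0.1 + 0.2` does not compare equal to `0.3`.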
      
2.14.1.7
Literal Form

The literal form for a monetary amount consists of the currency code string, optional white space, and REAL literal amount.

Definition 146:
MO.literal ST {
  MO : currency value     { $.currency.equals($1);
                            $.value.equals($2); };
  CS currency : ST        { $.currency.value.equals($1);
                            $.currency.codeSystem
                               .equals(2.16.840.1.113883.6.9); };
  REAL value : REAL       { $.value.equals($1); };
};
      

For example, "USD189.95" is the literal for 189.95 U.S. Dollar.
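
A minimal sketch of reading such a literal follows. It assumes that the currency code is written as three upper-case letters, as ISO 4217 alphabetic codes are, and it does not check the code against the currency domain; the names `MO_LITERAL` and `parse_mo` are hypothetical.

```python
import re
from decimal import Decimal

MO_LITERAL = re.compile(
    r"(?P<currency>[A-Z]{3})"                                   # currency code
    r"\s*"                                                      # optional white space
    r"(?P<value>[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)"  # REAL literal
)


def parse_mo(literal: str) -> tuple:
    """Split an MO literal into (currency code, value)."""
    match = MO_LITERAL.fullmatch(literal)
    if match is None:
        raise ValueError(f"not an MO literal: {literal!r}")
    return match.group("currency"), Decimal(match.group("value"))
```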

2.15

Point in Time (TS)

Definition:      A quantity specifying a point on the axis of natural time. A point in time is most often represented as a calendar expression.

Semantically, however, time is independent from calendars and best described by its relationship to elapsed time (measured as a physical quantity in the dimension of time.) A point in time plus an elapsed time yields another point in time. Inversely, a point in time minus another point in time yields an elapsed time.

As nobody knows when time began, a point in time is conceptualized as the amount of time that has elapsed from some arbitrary zero-point, called an epoch. Because there is no absolute zero-point on the time axis natural time is a difference-scale quantity, where only differences are defined but no ratios. (For example, no point in time is — absolutely speaking — "twice as late" as another point in time.)

Given some arbitrary zero-point, one can express any point in time as an elapsed time measured from that offset. Such an arbitrary zero-point is called an epoch. This epoch-offset form is used as a semantic representation here, without implying that any system would have to implement the TS data type in that way. Systems that do not need to compute distances between points in time will not need any other representation than a calendar expression literal.

Definition 147:
type PointInTime alias TS extends QTY {
            PQ  offset;
            CS  calendar;
            INT precision;
            PQ  timezone;
            BL  equals(TS x);
            TS  plus(PQ x);
            PQ  minus(TS x);
  literal   ST;
  type      PQ  diff;
};
    

2.15.1

Properties of Point in Time (TS)

2.15.1.1
Offset from Epoch (offset : PQ)

Definition:      The elapsed time since any constant epoch, measured as a physical quantity in the dimension of time (i.e., comparable to one second.)

Definition 148:
invariant(TS x) where x.nonNull {
  x.offset.compares(1 s);
};
      

It is not necessary for this specification to define a canonical epoch; the semantics is the same for any epoch, as long as the epoch is constant.

NOTE: the offset property may be treated as a purely semantic property that is not represented in any way other than the calendar literal expression. However, an ITS may just as well choose to define a constant epoch and represent point-in-time values as elapsed time offsets relative to that epoch. However, an ITS using an epoch-offset representation would still need to communicate the calendar code and the precision of a calendar representation once other calendars are supported.
2.15.1.2
Equality (equals : BL, inherited from QTY)

Two point-in-time values are equal if and only if their offsets (relative to the same epoch) are equal.

Definition 149:
invariant(TS x, y) where x.nonNull.and(y.nonNull) {
  x.equals(y).equals(x.offset.equals(y.offset));
};
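
As an illustration of equality by offset, the sketch below converts two calendar expressions to elapsed seconds from one fixed epoch and compares the results. The choice of 1970-01-01 UTC as epoch is arbitrary, the digit layout of the literals (year, month, day, hour, minute, time zone) is inferred from the example literal "200005121800-0500" used in this document, and the function names are hypothetical.

```python
from datetime import datetime, timezone

# Arbitrary but constant epoch chosen for this sketch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def offset_seconds(literal: str) -> float:
    """Elapsed seconds between EPOCH and a literal such as "200005121800-0500"."""
    point = datetime.strptime(literal, "%Y%m%d%H%M%z")
    return (point - EPOCH).total_seconds()


def ts_equals(a: str, b: str) -> bool:
    """Definition 149: equal if and only if the offsets are equal."""
    return offset_seconds(a) == offset_seconds(b)
```

Two literals that look different, such as 18:00 at five hours behind UTC and 23:00 at UTC on the same day, have the same offset and are therefore equal.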
      
2.15.1.3
Calendar (calendar : CS, default GREG)

Definition:      A code specifying the calendar used in the literal representation of this point in time.

Table 36: Domain Calendar:
code name definition
GREG Gregorian The Gregorian calendar has been in effect in most countries of Christian influence since approximately 1582. This calendar superseded the Julian calendar.

The purpose of this property is mainly to faithfully convey what has been entered or seen by a user in a system originating such a point-in-time value. The calendar property also advises any system rendering a point-in-time value into a literal form of which calendar to use. However, this is only advice; any system that renders point-in-time values to users may choose to use the calendar and literal form demanded by its users rather than the calendar mentioned in the calendar property. Because the calendar property is thus not constant in communication between systems, the calendar is not part of the equality test.

For the purpose of defining the relationship between calendar expression and epoch/offset form, two private data types, Calendar (CAL) and CalendarCycle (CLCY), are defined. These calendar data types exist only for defining this specification. These private data types may not be used at all outside this specification.

2.15.1.4
Precision of the Calendar Literal Form (precision : INT)

Definition:      The number of significant digits of the calendar expression representation.

Precision is formally defined based on the TS.literal.

The precision attribute is only the precision of a decimal digit representation, not the accuracy of the point in time value.

The purpose of the precision property for the point in time data type is to faithfully capture the whole information presented to humans in a calendar expression. The number of digits shown conveys information about the uncertainty (i.e., precision and accuracy) of a measured point in time.

NOTE: the precision of the representation is independent of the uncertainty (precision and accuracy) of a measurement result. If the uncertainty of a measurement result is important, one should specify uncertain values as PPD.

The precision property is dependent on the calendar. A given precision value relative to one calendar does not mean the same in another calendar with different periods.

For example "20000403" has 8 significant digits in the representation, but the uncertainty of the value may be in any digit shown or not shown, i.e., the uncertainty may be to the day, to the week, or to the hour. Note that external representations should adjust their representational precision with the uncertainty of the value. However, since the precision in the digit string depends on the calendar and is granular to the calendar periods, uncertainty may not fall into that grid (e.g., 2000040317 is an adequate representation for the value between 2000040305 and 2000040405.)

NOTE: For a character based Implementation Technology the ITS need not represent the precision as an explicit attribute if point in time values are represented as literal calendar expressions. A point in time representation must not produce more or less significant digits than were originally in that value. Conformance can be tested through round-trip encoding - decoding - encoding.
2.15.1.5
Timezone Offset (timezone : PQ)

Definition:      The difference between the local time in that time zone and Coordinated Universal Time (UTC, formerly called Greenwich Mean Time, GMT). The time zone is a physical quantity in the dimension of time (i.e., comparable to one second). A zero time zone value specifies UTC. The time zone value does not permit conclusions about the geographical longitude or a conventional time zone name.

For example, 200005121800-0500 may be eastern standard time (EST) in Indianapolis, IN, or central daylight saving time (CDT) in Decatur, IL. Furthermore, in other countries at other latitudes the time zones may be named differently.

Definition 150:
invariant(TS x, y) where x.nonNull.and(y.nonNull) {
  x.timezone.compares(1 s);
};
      

When the time zone is NULL (unknown), "local time" is assumed. However, "local time" is always local to some place, and without knowledge of that place, the time zone is unknown. Hence, a local time cannot be converted into UTC. The time zone should be specified for all point in time values in order to avoid a significant loss of precision when points in time are compared. The difference of two local times where the locality is unknown has an error of ±12 hours.

In an administrative data context, some time values do not carry a time zone. For a date of birth in administrative data, for example, it would be incorrect to specify a time zone, since this may effectively change the date of birth when converted into other time zones. For such administrative data the time zone is NULL (not applicable.)

2.15.1.6
Addition (plus : QTY, inherited from QTY)

Definition:      A point in time plus an elapsed time (i.e., physical quantity in the dimension of time) is a point in time.

Definition 151:
invariant(TS x, PQ t)
    where x.nonNull.and(t.compares(1 s)) {
  x.plus(t).offset.equals(x.offset.plus(t));
};
      
2.15.1.7
Difference (minus : QTY, inherited from QTY)

Definition:      The difference between two points in time is an elapsed time.

Definition 152:
invariant(TS x, y) where x.nonNull.and(y.nonNull) {
  x.minus(y).offset.equals(x.offset.plus(y.offset.negated));
};
      
2.15.1.8
Literal Form

Point-in-time literals are simple calendar expressions, as defined by the calendar definition table. By default, the western (Gregorian) calendar shall be used (Table 37).

For the default Gregorian calendar the calendar expression literals of this specification conform to the constrained ISO 8601 that is defined in ISO 8824 (ASN.1) under clause 32 (generalized time) and to the HL7 version 2 TS data format.

Calendar expression literals are sequences of integer numbers ordered according to the "Counter/ord." column of Table 37. Periods with lower order numbers stand to the left of periods with higher order numbers. Periods with no assigned order number cannot occur in the calendar expression for points in time.

The "Counter/digits" column of Table 37 specifies the exact number of digits for the counter number for any period.

Thus, Table 37 specifies that western calendar expressions begin with the 4-digit year (beginning counting at zero); followed by the 2-digit month of the year (beginning counting at one); followed by the 2-digit day of the month (beginning with one); followed by the 2-digit hour of the day (beginning with zero); and so forth. For example, "200004010315" is a valid expression for April 1, 2000, 3:15 am.

A calendar expression can be of variable precision, omitting parts from the right.

For example, "20000401" is precise only to the day of the month.

The least defined calendar period may be written as a real number, with the number of integer digits specified, followed by the decimal point and any number of fractional digits.

For example, "20000401031520.34" means April 1, 2000, 3:15 and 20.34 seconds.

When other calendars will be used in the future, a prefix "GREG:" can be placed before the western (Gregorian) calendar expression to disambiguate from other calendars. Each calendar shall have its own prefix. However, the western calendar is the default if no prefix is present.

In the modern Gregorian calendar (and all calendars where time of day is based on UTC), the calendar expression may contain a time zone suffix. The time zone suffix begins with a plus (+) or minus (-) sign, followed by digits for the hour and minute cycles. UTC is designated as offset "+00" or "-00"; the ISO 8601 and ISO 8824 suffix "Z" for UTC is not permitted.

Definition 153:
TS.literal ST {
  TS : cal timestamp($1)              { $.equals($2); }
     | timestamp(GREG)                { $.equals($1); };

  TS timestamp(Calendar C)
  : cycles(C.head, C.epoch) zone(C)   { $.equals($1.minus($2));
                                        $.timezone.equals($2); }
  | cycles(C.head, C.epoch)           { $.equals($1);
                                        $.timezone.unknown; };
  Calendar cal
  : /[a-zA-Z_][a-zA-Z0-9_]*:/         { $.equals($1); };
  TS cycles(CalendarCycle c, TS t)
  : cycle(c, t) cycles(c.next, $1)    { $.equals($2); }
  | cycle(c, t) "." REAL.fractional   { $.equals(c.sum($1, $3));
                                        $.precision.equals(
                                          t.precision.plus($3.precision)); }
  | cycle(c, t)                       { $.equals($1); };
  TS cycle(CalendarCycle c, TS t)
  : /[0-9]{c.ndigits}/                { $.equals(c.sum(t, $1));
                                        $.precision.equals(
                                          t.precision.plus(c.ndigits)); };
  PQ zone(Calendar C)
  : "+" cycles(C.zonehead, C.epoch)   { $.equals($2.minus(C.epoch)); }
  | "-" cycles(C.zonehead, C.epoch)   { $.equals(C.epoch.minus($2)); };
}
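
The literal form described above can be approximated with a small parser. The following Python sketch is only an illustration for the default Gregorian calendar: the function name parse_ts, the returned dictionary layout, and the choice to report the time zone in minutes are assumptions made for this example, not part of the specification.

```python
import re

# Digit widths follow the "Counter/digits" column of Table 37.
FIELDS = [("year", 4), ("month", 2), ("day", 2),
          ("hour", 2), ("minute", 2), ("second", 2)]

LITERAL = re.compile(
    r"(?:(?P<cal>[A-Za-z_][A-Za-z0-9_]*):)?"     # optional calendar prefix
    r"(?P<digits>[0-9]+)"                        # calendar cycles, left to right
    r"(?:\.(?P<frac>[0-9]+))?"                   # fraction of the last cycle
    r"(?P<zone>[+-][0-9]{2}(?:[0-9]{2})?)?"      # time zone suffix, no "Z"
)

def parse_ts(literal):
    """Hypothetical parser for Gregorian point-in-time literals."""
    match = LITERAL.fullmatch(literal)
    if match is None:
        raise ValueError("not a point-in-time literal: %r" % literal)
    if match.group("cal") not in (None, "GREG"):
        raise ValueError("this sketch only handles the Gregorian calendar")
    digits = match.group("digits")
    result, pos = {}, 0
    for name, width in FIELDS:
        if pos == len(digits):
            break  # parts may be omitted from the right
        chunk = digits[pos:pos + width]
        if len(chunk) != width:
            raise ValueError("incomplete %s field" % name)
        result[name] = int(chunk)
        pos += width
    if pos != len(digits):
        raise ValueError("more digits than calendar cycles")
    frac = match.group("frac")
    result["fraction"] = frac
    result["precision"] = len(digits) + len(frac or "")
    zone = match.group("zone")
    if zone is None:
        result["zone_minutes"] = None  # time zone unknown
    else:
        sign = -1 if zone[0] == "-" else 1
        result["zone_minutes"] = sign * (int(zone[1:3]) * 60 + int(zone[3:5] or 0))
    return result

print(parse_ts("200004010315"))
print(parse_ts("200005121800-0500")["zone_minutes"])
```

The parser keeps the fraction as a string so that the number of fractional digits, and therefore the precision, is not lost.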
      

2.15.2

Calendar (CAL)

Definition:      A calendar is a concept of measuring time in various cycles. Such cycles are years, months, days, hours, minutes, seconds, and weeks. Some of these cycles are synchronized and some are not (e.g., weeks and months are not synchronized.)

After "rolling the time axis" into these cycles (see Figure 9), a calendar expresses a point in time as a sequence of integer counts of cycles, e.g., for year, month, day, hour, etc. The calendar is rooted in some conventional start point, called the "epoch."


Figure 9: A calendar "rolls" the time axis into a complex convolute according to the calendar periods year (blue), month (yellow), day (green), hour (red), etc. The cycles need not be aligned, for example, the week (not shown) is not aligned to the month. Imagine a special clock that measures those cycles, where the pointers are not all stacked on a common axis but each pointer is attached to the end of the pointer measuring the next larger cycle.

Calendar is defined as a set of calendar cycles, and has a name and a code. The head of the Calendar is the largest CalendarCycle, which appears leftmost in the calendar expression. The epoch is the beginning of that calendar, i.e., the point in time where all calendar cycles are zero.

Definition 154:
private type Calendar alias CAL extends SET<CLCY>  {
  CV   name;
  CLCY head;
  TS   epoch;
};

invariant(CAL c) where c.nonNull {
  c.name.nonNull;
  c.contains(c.head);
};
      

The calendar definition can be shown as a table; Table 37 does so for the modern Gregorian calendar. The calendar definition table lists one calendar cycle in each row. The calendar units depend on each other, as defined in the value column. The sequence column shows the relationship through the next property. The other columns are as in the formal calendar cycle definition.

Table 37: Domain CalendarCycle:
name | code 1 | code 2 | Counter/ord. | Counter/digits | Counter/start | value (condition)
year | Y | CY | 1 | 4 | 0 | MY12
month of the year | M | MY | 2 | 2 | 1 | MY01,03,05,07,08,10,12 → DM31; MY04,06,09,11 → DM30; MY02 Y/4 Y/100 → DM28; MY02 Y/4 → DM29; MY02 → DM28
month (continuous) | | CM | | | 0 | continuous MY
week (continuous) | W | CW | | | 0 | CD7
week of the year | | WY | | 2 | 1 | continuous DW7
day of the month | D | DM | 3 | 2 | 1 | HD24
day (continuous) | | CD | | | 0 | CH24
day of the year | | DY | | 3 | 1 | HD24
day of the week (begins with Monday) | J | DW | | 1 | 1 | HD24
hour of the day | H | HD | 4 | 2 | 0 | MH60
hour (continuous) | | CH | | | 0 | CN60
minute of the hour | N | NH | 5 | 2 | 0 | UTC leap second → SN61 → SN60
minute (continuous) | | CN | | | 0 | CS60
second of the minute | S | SN | 6 | 2 | 0 | CS1
second (continuous) | | CS | | | 0 | basis
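
The conditions listed for "month of the year" in Table 37 can be read as an ordered rule list. The following Python sketch evaluates them from top to bottom; the function name days_in_month and the flag gregorian_400_rule are hypothetical. Note that the table as printed does not list the exception for years divisible by 400, which the civil Gregorian calendar has; the sketch adds it behind a flag and labels it as an addition.

```python
def days_in_month(year, month, gregorian_400_rule=True):
    """Number of days (DM) in a month, following the Table 37 conditions in order."""
    if not 1 <= month <= 12:
        raise ValueError("month of the year counts from 1 to 12")
    if month in (1, 3, 5, 7, 8, 10, 12):   # MY01,03,05,07,08,10,12 -> DM31
        return 31
    if month in (4, 6, 9, 11):             # MY04,06,09,11 -> DM30
        return 30
    # Only MY02 remains.
    if gregorian_400_rule and year % 400 == 0:
        # Assumption: not listed in Table 37, added for the civil calendar.
        return 29
    if year % 4 == 0 and year % 100 == 0:  # MY02 Y/4 Y/100 -> DM28
        return 28
    if year % 4 == 0:                      # MY02 Y/4 -> DM29
        return 29
    return 28                              # MY02 -> DM28

print(days_in_month(2004, 2))
print(days_in_month(1900, 2))
```

With gregorian_400_rule set to False the function follows the printed table literally, which yields 28 days for February 2000.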

2.15.3

Calendar Cycle (CLCY)

Definition:      A calendar cycle defines one group of decimal digits in the calendar expression. Examples for calendar cycles are year, month, day, hour, minute, second, and week.

A calendar cycle has a name and two codes, a one-letter code and a two-letter code. The property ndigits is the number of decimal digits occupied in the calendar expression. The property start specifies where counting starts (i.e., at 0 or 1.) The next property is the next lower cycle in the order of the calendar expression. The max(t) property is the maximum number of cycles at time t (max depends on the time t to account for leap years and leap seconds.) The property value(t) is the integer number of cycles shown in the calendar expression of time t. The property sum(t, n) is the sum of n calendar cycles added to the time t.

Definition 155:
private type CalendarCycle alias CLCY extends ANY {
  CE    name;
  INT   ndigits;
  INT   start;
  CLCY  next;
  INT   max(TS);
  TS    sum(TS t, REAL r);
  INT   value(TS t);
};

invariant(CLCY c) where c.nonNull {
  c.name.nonNull;
  c.start.equals(0).or(c.start.equals(1));
  c.ndigits.greaterThan(0);
};
      

3

Generic Collections

Figure 10: Generic Collection Data Types

This section defines data types that can "collect" other data values: Set, Sequence, Bag, and Interval. These collection types are defined as generic (parameterized) types. The concept of generic types is described in (§ ).

3.1

Set (SET)

Definition:      A value that contains other distinct values in no particular order.

Definition 156:
template<ANY T>
type Set<T> alias SET<T> extends ANY {
            BL      contains(T element);
            BL      isEmpty;
            BL      nonEmpty;
            BL      contains(SET<T> subset);
            INT     cardinality;
            SET<T>  union(SET<T> otherset);
            SET<T>  except(T element);
            SET<T>  except(SET<T> otherset);
            SET<T>  intersection(SET<T> otherset);
  literal   ST;
  promotion SET<T>  (T x);
            IVL<T>  hull;
};
    

3.1.1

Properties of Set (SET)

3.1.1.1
Contains Element (contains : BL)

Definition:      A relation of the set with its elements, true if the given value is an element of the set.

This is the primitive semantic property of a set, based on which all other properties are defined.

A set may only contain distinct non-NULL elements. Exceptional values (NULL-values) cannot be elements of a set.

Definition 157:
invariant(SET<T> s, T n) where s.nonNull.and(n.isNull) {
  s.contains(n).not;
};
        
3.1.1.2
Contains Subset (contains : BL)

Definition:      The relation between a set and its subsets, where each element in the subset is also an element of the superset.

Definition 158:
invariant(SET<T> superset, subset)
    where superset.nonNull.and(subset.nonNull) {
  superset.contains(subset).equals(
         forall(T element) where subset.contains(element) {
           superset.contains(element);
         });
};
      

This implies that the empty set is a subset of every set including itself.

3.1.1.3
Non-Empty (nonEmpty : BL)

Definition:      A predicate indicating that this set contains elements.

Definition 159:
invariant(SET<T> set) where set.nonNull {
  set.nonEmpty.equals(exists(T element) { set.contains(element); });
};
      
3.1.1.4
The Empty Set (isEmpty : BL)

Definition:      A predicate indicating that this set has no elements (negation of the SET.nonEmpty predicate). The empty set is a proper set value, not an exceptional (NULL) value.

Definition 160:
invariant(SET<T> set) where set.nonNull {
  set.isEmpty.equals(set.nonEmpty.not);
};
      
3.1.1.5
Cardinality (cardinality : INT)

Definition:      The cardinality of a set is the number of distinct elements in the set.

Definition 161:
invariant(SET<T> set) where set.nonNull {
  exists(T element) where set.contains(element) {
    set.cardinality.equals(set.except(element).cardinality.successor);
  };
};
      

The cardinality definition is not sufficient since it doesn't converge for uncountably infinite sets (REAL, PQ, etc.) and it doesn't terminate for infinite sets. In addition, the definition of integer number type in this specification is incomplete for these cases, as it doesn't account for infinities. Finally the cardinality value is an example where it would be necessary to distinguish the cardinality ℵ0 (aleph0) of countably infinite sets (e.g., INT) from ℵ1 (aleph1), the cardinality of uncountable sets (e.g., REAL, PQ).

3.1.1.6
Union (union : SET<T>)

Definition:      The union of two sets (the component sets) is the set whose elements are exactly those values that are elements of at least one of the component sets.

Definition 162:
invariant(SET<T> x, y, z)
    where x.nonNull.and(y.nonNull).and(z.nonNull) {
  x.union(y).equals(z)
    .equals(forall(T e) {
              z.contains(e).equals(x.contains(e).or(y.contains(e)));
            });
};
      
3.1.1.7
Include Element (union : SET<T>)

Definition:      A union of a set and an element.

Definition 163:
invariant(SET<T> set, singletonset, T element)
    where set.nonNull
     .and(element.nonNull)
     .and(singletonset.cardinality.isOne)
     .and(singletonset.contains(element)) {
  set.union(element).equals(set.union(singletonset));
};
      
3.1.1.8
Set Difference (except : SET<T>)

Definition:      The difference of this set and its subtracting set is the set that contains all elements of this set that are not elements of the subtracting set.

Definition 164:
invariant(SET<T> x, y, z)
    where x.nonNull.and(y.nonNull).and(z.nonNull) {
  x.except(y).equals(z)
    .equals(forall(T e) {
              z.contains(e).equals(x.contains(e).and(y.contains(e).not));
            });
};
      
3.1.1.9
Exclude Element (except : SET<T>)

Definition:      The difference between this set and an element value is the set that contains all elements of this set except for the subtracting element value. If the element value is not contained in this set, the difference is equal to this set.

Definition 165:
invariant(SET<T> x, z; T d)
    where x.nonNull.and(z.nonNull).and(d.nonNull) {
  x.except(d).equals(z)
    .equals(forall(T e) {
              z.contains(e).equals(x.contains(e).and(d.equals(e).not));
            });
};
      
3.1.1.10
Intersection (intersection : SET<T>)

Definition:      The intersection between two sets is a set containing all and only those elements that are contained in both of the operand sets.

Definition 166:
invariant(SET<T> x, y, z)
    where x.nonNull.and(y.nonNull).and(z.nonNull) {
  x.intersection(y).equals(z)
    .equals(forall(T e) {
              z.contains(e).equals(x.contains(e).and(y.contains(e)));
            });
};
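
The union, difference, and intersection invariants above all have the same shape: the result set is characterized purely by a membership rule over candidate elements. As an informal check, the following Python sketch tests these rules against the built-in frozenset operators over a small, hypothetical universe of integers. The helper name satisfies is an assumption made for this example.

```python
x = frozenset({1, 3, 5, 7})
y = frozenset({3, 4, 5})
universe = range(0, 10)  # hypothetical finite universe of candidate elements

def satisfies(z, rule):
    """True if membership in z matches the rule for every candidate element."""
    return all((e in z) == rule(e) for e in universe)

# Union: e is in z exactly when it is in x or in y.
assert satisfies(x | y, lambda e: e in x or e in y)
# Difference (except): e is in z exactly when it is in x and not in y.
assert satisfies(x - y, lambda e: e in x and e not in y)
# Intersection: e is in z exactly when it is in both x and y.
assert satisfies(x & y, lambda e: e in x and e in y)
# Excluding an element that is absent leaves the set unchanged.
assert x - frozenset({4}) == x
# The empty set is a subset of every set, including itself.
assert frozenset() <= x and frozenset() <= frozenset()

print(sorted(x | y), sorted(x - y), sorted(x & y))
```

A finite universe is used only so the check terminates; the invariants themselves quantify over all values of type T.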
      
3.1.1.11
Literal Form

When the element type T has a literal form, the set of T elements has a literal form, wherein the elements of the set are enumerated within curly braces and separated by semicolon characters.

Definition 167:
SET<T>.literal ST {
  SET<T> : "{" elements "}"   { $.equals($2); };
  SET<T> elements
        : elements ";" T      { $.except($3).equals($1); }
        | T                   { $.contains($1);
                                $.except($1).isEmpty; };
};
      
NOTE: this literal form for sets is only practical for relatively small enumerable sets; this does not mean, however, that all sets are relatively small enumerations of elements.
Table 38: Example
literal meaning
{1; 3; 5; 7; 19} a set of integer numbers or real numbers
{3; 1; 5; 19; 7} the same set of integer numbers or real numbers
{1.2 m; 2.67 m; 17.8 m} a set of discrete physical quantities
{apple; orange; banana} a set of character strings
NOTE: a character-based ITS should choose a different literal form for sets if the Implementation Technology has a more native literal form for such collections.
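
A minimal sketch of reading the set literal form in Python follows. The function name parse_set_literal and its parse_element parameter are hypothetical. The sketch assumes element literals contain no semicolons or braces, and it accepts "{}" as the empty set, which is an assumption beyond the grammar in Definition 167 (that grammar requires at least one element).

```python
def parse_set_literal(text, parse_element=str.strip):
    """Parse '{a; b; c}' into a frozenset, using parse_element for each item."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("set literals are enclosed in curly braces")
    body = text[1:-1].strip()
    if not body:
        return frozenset()  # assumption: "{}" denotes the empty set
    # Duplicates collapse automatically because sets hold distinct elements.
    return frozenset(parse_element(part) for part in body.split(";"))

a = parse_set_literal("{1; 3; 5; 7; 19}", int)
b = parse_set_literal("{3; 1; 5; 19; 7}", int)
print(a == b)  # element order in the literal does not matter
print(sorted(parse_set_literal("{apple; orange; banana}")))
```

Because the result is an unordered set, the first two rows of Table 38 parse to the same value.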
3.1.1.12
Promotion of Element Values to Sets (promotion : SET<T>)

A data value of type T can be promoted into a trivial set of T with that data value as its only element.

Definition 168:
invariant(T x) {
  ((SET<T>)x).contains(x);
  ((SET<T>)x).except(x).isEmpty;
};
      
3.1.1.13
Convex Hull of Totally Ordered Sets (hull : IVL<T>)

Sets of quantities may be totally ordered sets when there is an order relationship defined between any two elements in the set. Note that "ordered set" does not mean the same as Sequence (LIST). For example, the set {3; 2; 4; 88; 1} is an ordered set. The ordering of the elements in the set notation is still irrelevant, but elements can be compared to establish an order (1; 2; 3; 4; 88).

Totally ordered sets have a convex hull. The convex hull of a totally ordered set S is the smallest interval that is a superset of S. This concept is going to be important later on.

Definition 169:
type Set<QTY> alias SET<QTY> {
            BL      totallyOrdered;
            IVL<T>  hull;
};

invariant(SET<QTY> s) where s.nonNull {
  s.totallyOrdered.equals(forall(QTY x, y) where s.contains(x)
                                             .and(s.contains(y)) {
                            x.compares(y); });
};

invariant(SET<QTY> s) where s.totallyOrdered {
  s.hull.contains(s);
  forall(T e) where s.contains(e) {
    s.hull.low.lessOrEqual(e);
    e.lessOrEqual(s.hull.high);
  };
};
      

Note that hull is defined if and only if the actual set is a totally ordered set. The data type of the elements itself need not be totally ordered. For example, the data type PQ is only partially ordered (since only quantities of the same kind can be compared), but a SET<PQ> may still be totally ordered (if it contains only comparable quantities.) For example, the convex hull of {4 s; 20 s; 55 s} is [4 s;55 s]; the convex hull of {"apples"; "oranges"; "bananas"} is undefined because the elements have no order relationship among them; and the convex hull of {2 m; 4 m; 8 s} is likewise undefined, because it is not totally ordered (seconds are not comparable with meters.)
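
The physical quantity examples above can be sketched in Python as follows. This is a simplified illustration, not a definition: quantities are modeled as hypothetical (value, unit) tuples, two quantities are treated as comparable only if their unit strings are identical (no unit conversion is attempted), and an undefined hull is reported as None.

```python
def hull(quantities):
    """Return (low, high) for a totally ordered set of (value, unit) pairs.

    Returns None when the hull is undefined in this simplified model:
    the set is empty or mixes units that are not comparable.
    """
    items = list(quantities)
    if not items:
        return None
    units = {unit for _, unit in items}
    if len(units) != 1:
        return None  # e.g. seconds are not comparable with meters
    unit = units.pop()
    values = [value for value, _ in items]
    return ((min(values), unit), (max(values), unit))

print(hull({(4, "s"), (20, "s"), (55, "s")}))
print(hull({(2, "m"), (4, "m"), (8, "s")}))
```

Every element of a totally ordered input lies between the returned low and high bounds, which is the condition stated by the invariant in Definition 169.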

Figure 11: Convex Hull of a Totally Ordered Set

3.2

Sequence (LIST)

Definition:      A value that contains other discrete values in a defined sequence.

Definition 170:
template<ANY T>
type Sequence<T> alias LIST<T> extends ANY {
            T         head;
            LIST<T>   tail;
            BL        isEmpty;
            BL        nonEmpty;
            T         item(INT index);
            BL        contains(T item);
            INT       length;
  literal   ST;
  promotion LIST<T>   (T x);
};
    

A sequence may contain NULL values as items.

3.2.1

Properties of Sequence (LIST)

3.2.1.1
Head Item (head : T)

Definition:      The first item in this sequence. This is a definitional property for the semantics of the sequence.

3.2.1.2
Tail Sequence (tail : LIST<T>)

Definition:      The sequence following the first item in this sequence. This is a definitional property for the semantics of the sequence.

3.2.1.3
Empty Sequence (isEmpty : BL)

Definition:      A predicate that is true if this sequence is an empty sequence, i.e., if it contains no items.

Notice the difference between empty-sequence and NULL: an empty sequence is a proper sequence, not a null-value.

Definition 171:
invariant(LIST<T> x) where x.isEmpty {
  x.head.isNull;
  x.tail.isNull;
};
      

Notice that head and tail being NULL is a necessary but not a sufficient condition for an empty list. Since a sequence may contain NULL values as items, this condition can also mean that the list has only a head item that happens to be NULL.

3.2.1.4
Non-Empty Sequence (nonEmpty : BL)

Definition:      A predicate that is true if this sequence is non-empty. Negation of LIST.isEmpty.

Definition 172:
invariant(LIST<T> x) where x.nonNull {
  x.nonEmpty.equals(x.isEmpty.not);
};
      
3.2.1.5
Item by Index (item : T)

Definition:      The item at the given sequential position (index) in the sequence. The index zero refers to the first element (head) of the sequence.

Definition 173:
invariant(LIST<T> list; INT index) 
     where list.nonNull.and(index.nonNegative) {
  list.isEmpty
    .implies(list.item(index).isNull);
  list.nonEmpty.and(index.isZero)
    .implies(list.item(index).equals(list.head));
  list.nonEmpty.and(index.nonZero)
    .implies(list.item(index).equals(list.tail.item(index.predecessor)));
};
      
3.2.1.6
Contains Item (contains : BL)

Definition:      A predicate that is true if this sequence contains the given item value.

Definition 174:
invariant(LIST<T> list; T item) 
     where list.nonNull {
  list.isEmpty
    .implies(list.contains(item).not);
  list.nonEmpty.and(item.nonNull)
    .implies(list.contains(item).equals(    list.head.equals(item)
                                       .or(list.tail.contains(item))));
  list.nonEmpty.and(item.isNull)
    .implies(list.contains(item).equals(    list.head.isNull
                                       .or(list.tail.contains(item))));
};
      
3.2.1.7
Length (length : INT)

Definition:      The number of elements in the sequence. NULL elements are counted as regular sequence elements.

Definition 175:
invariant(LIST<T> list) where list.nonNull {
  list.isEmpty.equals(list.length.isZero);
  list.nonEmpty.equals(list.length.equals(list.tail.length.successor));
};
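
The head/tail definitions of item and length are recursive, which the following Python sketch tries to mirror. The representation is an assumption made for this example: an empty sequence is the empty tuple, a non-empty sequence is a (head, tail) pair, and NULL is modeled as None. The names EMPTY, cons, item, and length are hypothetical.

```python
EMPTY = ()  # the empty sequence, distinct from NULL (None)

def cons(head, tail=EMPTY):
    """Build a non-empty sequence from a head item and a tail sequence."""
    return (head, tail)

def item(seq, index):
    """Item at a zero-based index; NULL (None) if the index is past the end."""
    if index < 0:
        raise ValueError("index must be non-negative")
    if seq == EMPTY:
        return None
    head, tail = seq
    if index == 0:
        return head
    return item(tail, index - 1)

def length(seq):
    """Number of items; NULL items count like any other item."""
    return 0 if seq == EMPTY else 1 + length(seq[1])

seq = cons(1, cons(None, cons(5)))  # the sequence (1; NULL; 5)
print(item(seq, 0), item(seq, 1), item(seq, 2), length(seq))
```

Since item returns None both for a NULL item and for an index past the end, the length is needed to tell the two cases apart, which parallels the remark about empty sequences and NULL heads above.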
      
3.2.1.8
Equality (equals : BL, inherited from ANY)

Two lists are equal if and only if they are both empty, or if both their head and their tail are equal.

Definition 176:
invariant(LIST<T> x, y) where x.nonNull.and(y.nonNull) {
  x.isEmpty.and(y.isEmpty)
     .implies(x.equals(y));
  x.nonEmpty.and(y.nonEmpty).and(x.head.nonNull)
     .implies(x.equals(y).equals(     x.head.equals(y.head)
                                 .and(x.tail.equals(y.tail))));
  x.nonEmpty.and(y.nonEmpty).and(x.head.isNull)
     .implies(x.equals(y).equals(     y.head.isNull
                                 .and(x.tail.equals(y.tail))));
};
      
3.2.1.9
Literal Form

When the element type T has a literal form, the sequence LIST<T> has a literal form. List elements are enumerated, separated by semicolon, and enclosed in parentheses.

Definition 177:
LIST<T>.literal ST {
  LIST<T>
  : "(" elements ")"          { $.equals($2); }
  | "(" ")"                   { $.isEmpty; };
  LIST<T> elements
  : T ";" elements            { $.head.equals($1);
                                $.tail.equals($3); }
  | T                         { $.head.equals($1);
                                $.tail.isEmpty; };
};
      
Table 39: Examples
literal meaning
(1; 3; 5; 7; 19) a sequence of integer numbers or real numbers
(3; 1; 5; 19; 7) a different sequence of integer numbers or real numbers
(1.2 m; 17.8 m; 2.67 m) a sequence of discrete physical quantities
(apple; orange; banana) a sequence of character strings
NOTE: a character-based ITS should choose a different literal form for sequences if the Implementation Technology has a more native literal form for such collections.
3.2.1.10
Promotion of Item Values to Sequences (promotion : LIST<T>)

A data value of type T can be promoted into a trivial sequence of T with that data value as its only item.

Definition 178:
invariant(T x) {
  ((LIST<T>)x).head.equals(x);
  ((LIST<T>)x).tail.isEmpty;
};
      

Specializations of Sequence (LIST)

3.2.2

GeneratedSequence (GLIST) restricts LIST

Definition:      A periodic or monotone sequence of values generated from a few parameters, rather than being enumerated. Used to specify regular sampling points for biosignals.

Definition 179:
type GeneratedSequence<QTY T> alias GLIST restricts LIST<T> {
        T       head;
        T.diff  increment;
        INT     period;
        INT     denominator;
};
    
Table 40: Property Summary of GeneratedSequence
Name Type Description
head T The first item in this sequence. This is a definitional property for the semantics of the sequence.
increment T.diff The difference between one value and its previous different value. For example, to generate the sequence (1; 4; 7; 10; 13; ...) the increment is 3; likewise to generate the sequence (1; 1; 4; 4; 7; 7; 10; 10; 13; 13; ...) the increment is also 3.
period INT If non-NULL, specifies that the sequence alternates, i.e., after this many increments, the sequence item values roll over to start from the initial sequence item value. For example, the sequence (1; 2; 3; 1; 2; 3; 1; 2; 3; ...) has period 3; the sequence (1; 1; 2; 2; 3; 3; 1; 1; 2; 2; 3; 3; ...) also has period 3.
denominator INT The integer by which the index for the sequence is divided, effectively the number of times the sequence generates the same sequence item value before incrementing to the next sequence item value. For example, to generate the sequence (1; 1; 1; 2; 2; 2; 3; 3; 3; ...) the denominator is 3.

The item at a certain index (i) in the list is calculated by performing an integer division of the index by the GLIST.denominator (d) and then taking the remainder of that quotient with respect to the GLIST.period (p). This value is multiplied by the GLIST.increment (Δx) and added to the GLIST.head (x0). If the period is NULL, the remainder step is omitted.

$x_i = x_0 + \Delta x \times \left( \lfloor i / d \rfloor \bmod p \right)$

Definition 180:
invariant(GLIST<T> list, INT index)
      where list.nonNull
       .and(index.nonNull) {
  list.period.nonNull
    .implies(list.item(index)
              .equals(list.head
                      .plus(list.increment
                              .times(index.dividedBy(list.denominator)
                                          .remainder(list.period)))));
  list.period.isNull
    .implies(list.item(index)
              .equals(list.head
                      .plus(list.increment
                              .times(index.dividedBy(list.denominator)))));
};
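
The item calculation for a generated sequence is short enough to sketch directly. The following Python function is an illustration under stated assumptions: the name glist_item and its keyword arguments are hypothetical, NULL is modeled as None, and only non-negative indexes are handled.

```python
from datetime import datetime, timedelta

def glist_item(head, increment, index, denominator=1, period=None):
    """Item at `index`: head + increment * ((index div denominator) mod period)."""
    if index < 0:
        raise ValueError("index must be non-negative")
    steps = index // denominator       # integer division by the denominator
    if period is not None:
        steps %= period                # roll over after `period` increments
    return head + increment * steps

print([glist_item(1, 3, i) for i in range(5)])                  # (1; 4; 7; 10; 13)
print([glist_item(1, 3, i, denominator=2) for i in range(6)])   # (1; 1; 4; 4; 7; 7)
print([glist_item(1, 1, i, period=3) for i in range(6)])        # (1; 2; 3; 1; 2; 3)

# The same arithmetic works for points in time with an elapsed-time increment.
start = datetime(1987, 6, 5, 19, 0)
print(glist_item(start, timedelta(hours=2), 3))                 # 1 AM on June 6
```

The head and increment only need to support addition and multiplication by an integer, which is why plain numbers and datetime/timedelta pairs both work here.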
    
3.2.2.1
Head Item (head : T, inherited from LIST)

This is the start-value of the generated list.

3.2.2.2
Increment (increment : T.diff)

Definition:      The difference between one value and its previous different value. For example, to generate the sequence (1; 4; 7; 10; 13; ...) the increment is 3; likewise to generate the sequence (1; 1; 4; 4; 7; 7; 10; 10; 13; 13; ...) the increment is also 3.

3.2.2.3
Period Step Count (period : INT, default )

Definition:      If non-NULL, specifies that the sequence alternates, i.e., after this many increments, the sequence item values roll over to start from the initial sequence item value. For example, the sequence (1; 2; 3; 1; 2; 3; 1; 2; 3; ...) has period 3; the sequence (1; 1; 2; 2; 3; 3; 1; 1; 2; 2; 3; 3; ...) also has period 3.

The period allows the same sample space to be sampled repeatedly. The "waveform" of this periodic generator is always a "saw", just like the x-function of your oscilloscope.

3.2.2.4
Denominator (denominator : INT, default 1)

Definition:      The integer by which the index for the sequence is divided, effectively the number of times the sequence generates the same sequence item value before incrementing to the next sequence item value. For example, to generate the sequence (1; 1; 1; 2; 2; 2; 3; 3; 3; ...) the denominator is 3.

The use of the denominator is to allow multiple generated sequences to periodically scan a multidimensional space. For example, an (abstract) TV screen uses 2 such generators for the columns and rows of pixels. For instance, if there are 200 scan lines and 320 raster columns, the column-generator would have denominator 1 and the line-generator would have denominator 320.

Table 41: Examples for Generated Sequences
head | increment | denominator | period | meaning
0 | 1 | 1 | | The identity-sequence where each item is equal to its index.
198706051900 | 2 hour | 1 | | Sequence starting on June 5, 1987 at 7 PM and incrementing every two hours: 9 PM, 11 PM, 1 AM (June 6), 3 AM, 5 AM, and so on.
0 V | 1 mV | 1 | 100 | The x-wave of a digital oscillograph scanning between 0 and 100 mV in 100 steps of 1 mV. The frequency is unknown from these data as we do not know how much time elapses between each step of the index.
200206292030 | 100 us | 1 | | A timebase from June 29, 2002 at 8:30 PM with 100 us between each step of the index. If combined with the previous generator as a second sampling dimension this would now describe our digital oscilloscope's x-timebase as 1 mV per 100 us. At 100 steps per period, the period is 10 ms, which is equal to a frequency of 100 Hz.
0 V | 1 mV | 100 | 100 | Combining this generator with the previous two generators could describe a three-dimensional sampling space with two voltages and time. This generator also steps at 1 mV and has 100 steps per period; however, it only steps every 100 index increments, so the first voltage generator makes one full cycle before this generator is incremented. One can think of the two voltages as "rows" and "columns" of a "sampling frame". With the previous generator as the timebase, this results in a scan of sampling frames of 100 mV × 100 mV with a framerate of 1 Hz.
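
The period and frame-rate figures in the last three rows of Table 41 follow from simple arithmetic on the generator parameters. The following Python sketch reproduces them; the names STEP_US and sample_point are hypothetical, voltages are counted in whole millivolts, and elapsed time in whole microseconds to keep the arithmetic exact.

```python
STEP_US = 100  # timebase increment: 100 us per index step

def sample_point(index):
    """Combine the three generators of Table 41 for one index value."""
    x_mv = (index // 1) % 100     # first voltage: denominator 1, period 100
    y_mv = (index // 100) % 100   # second voltage: denominator 100, period 100
    t_us = index * STEP_US        # timebase: no period, elapsed microseconds
    return x_mv, y_mv, t_us

# One sweep of the first voltage takes 100 steps of 100 us each.
line_period_us = 100 * STEP_US
line_frequency_hz = 1_000_000 / line_period_us

# One full frame takes 100 sweeps, i.e. 100 x 100 steps.
frame_period_us = 100 * 100 * STEP_US
frame_rate_hz = 1_000_000 / frame_period_us

print(line_period_us, line_frequency_hz)   # 10000 us (10 ms) and 100.0 Hz
print(frame_period_us, frame_rate_hz)      # 1000000 us (1 s) and 1.0 Hz
print(sample_point(99), sample_point(100))
```

After index 99 the first voltage rolls over to 0 mV while the second voltage steps to 1 mV, which is the row and column behavior described in the table.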

3.2.3

SampledSequence (SLIST) restricts LIST

Definition:      A sequence of sampled values scaled and translated from a list of integer values. Used to specify sampled biosignals.

Definition 181:
type SampledSequence<QTY T> alias SLIST restricts LIST<T> {
  T         origin;
  T.diff    scale;
  LIST<INT> digits;
};
    
Table 42: Property Summary of SampledSequence
Name Type Description
origin T The origin of the list item value scale.
scale T.diff A ratio quantity that is factored out of the digit sequence.
digits LIST<INT> A sequence of raw digits for the sample values. This is typically the raw output of an A/D converter.

The item at a certain index (i) in the list is calculated by multiplying the item at the same index in the SLIST.digits sequence (di) by the SLIST.scale (s) and then adding that value to the SLIST.origin (xo).

$x_i = x_o + s \times d_i$

Definition 182:
invariant(SLIST<T> list, INT index)
    where list.nonNull.and(index.nonNegative)
{
  list.item(index).equals(      list.scale.times(list.digits.item(index))
                          .plus(list.origin));
};
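
The scaling rule for a sampled sequence can be illustrated with a few lines of Python. The function names and the sample numbers below are hypothetical: the raw digits stand in for A/D converter output, and the origin and scale are expressed in whole microvolts so the arithmetic stays exact.

```python
def slist_item(origin, scale, digits, index):
    """Item at `index`: origin + scale * digits[index]."""
    if index < 0:
        raise ValueError("index must be non-negative")
    return origin + scale * digits[index]

def slist_items(origin, scale, digits):
    """All items of the sampled sequence, in order."""
    return [origin + scale * d for d in digits]

# Hypothetical signal: origin -2048 uV, 5 uV per raw count.
raw = [0, 409, 410, 1000]
print(slist_items(-2048, 5, raw))   # [-2048, -3, 2, 2952]
print(slist_item(-2048, 5, raw, 2))
```

Only the integer digits vary from sample to sample, so a long signal can be carried as one origin, one scale, and a compact list of integers.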
    
3.2.3.1
Scale Origin (origin : T)

Definition:      The origin of the list item value scale.

3.2.3.2
Scale Factor (scale : T.diff)

Definition:      A ratio quantity that is factored out of the digit sequence.

3.2.3.3
Sampled Digits (digits : LIST<INT>)

Definition:      A sequence of raw digits for the sample values. This is typically the raw output of an A/D converter.

3.3

Bag (BAG)

Definition:      An unordered collection of values, where each value can be contained more than once in the bag.

Definition 183:
template<ANY T>
type Bag<T> alias BAG<T> extends ANY {
            INT     contains(T kind);
            BL      isEmpty;
            BL      nonEmpty;
            BAG<T>  plus(BAG<T>);
            BAG<T>  minus(BAG<T>);
  promotion BAG<T>  (T x);
};
    
NOTE: a bag can be represented in two ways. Either as a simple enumeration of elements, including repeated elements, or as a "compressed bag" whereby the content of the bag is listed in pairs of element value and number. A histogram showing absolute frequencies is a bag represented in compressed form. The bag is therefore useful to communicate raw statistical data samples.

3.3.1

Properties of Bag (BAG)

3.3.1.1
Contains Item (contains : INT)

Definition:      The number of items in this bag with the given item value.

This is the primitive semantic property of a bag, based on which all other properties are defined.

Definition 184:
invariant(BAG<T> bag; T item) 
     where bag.nonNull.and(item.nonNull) {
  bag.contains(item).nonNegative;
  bag.isEmpty.equals(bag.contains(item).isZero);
};
      
3.3.1.2
Non-Empty (nonEmpty : BL)

Definition:      A predicate indicating that this bag contains at least one item.

Definition 185:
invariant(BAG<T> bag) where bag.nonNull {
  bag.nonEmpty.equals(exists(T item) { bag.contains(item); });
};
      
3.3.1.3
The Empty Bag (isEmpty : BL)

Definition:      A predicate indicating that this bag has no elements (negation of the BAG.nonEmpty predicate). The empty bag is a proper bag value, not an exceptional (NULL) value.

Definition 186:
invariant(BAG<T> bag) where bag.nonNull {
  bag.isEmpty.equals(bag.nonEmpty.not);
};
      
3.3.1.4
Addition (plus : BAG<T>)

Definition:      A bag that contains all items of the operand bags, i.e. the number of items of each item value are added.

Definition 187:
invariant(BAG<T> x, y, z) where x.nonNull.and(y.nonNull) {
  x.plus(y).equals(z)
    .equals(forall(T e) where e.nonNull {
              z.contains(e).equals(x.contains(e).plus(y.contains(e)));
            });
};
      
3.3.1.5
Subtraction (minus : BAG<T>)

Definition:      A bag that contains all items of this bag (minuend) diminished by the items in the other bag (subtrahend). Bags cannot carry deficits. When the subtrahend contains more items of one value than the minuend, the difference contains zero items of that value.

Definition 188:
invariant(BAG<T> x, y, z) where x.nonNull.and(y.nonNull) {
  x.minus(y).equals(z)
    .equals(forall(T e) where e.nonNull {
              exists(INT n)
                  where n.equals(x.contains(e).minus(y.contains(e))) {
                n.nonNegative.implies(z.contains(e).equals(n));
                n.isNegative.implies(z.contains(e).isZero);
              };
            });
};
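
The addition and subtraction semantics above happen to coincide with the + and - operators of Python's collections.Counter, so they can be sketched as follows. The sample bags are assumptions for illustration.

```python
from collections import Counter

x = Counter({"a": 3, "b": 1})
y = Counter({"a": 1, "b": 2, "c": 1})

total = x + y        # counts are added per item value
difference = x - y   # counts are subtracted; deficits are dropped, not kept

print(total)
print(difference)
```

Here the subtrahend holds more "b" and "c" items than the minuend, so the difference contains zero items of those values.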
      
3.3.1.6
Promotion of Item Values to Bags (promotion : )

A data value of type T can be promoted into a trivial bag of type T with that data value as its only item.

Definition 189:
invariant(T x) {
  ((BAG<T>)x).contains(x).equals(1);
  forall(T y) { ((BAG<T>)x).contains(y).implies(x.equals(y)) };
};
      

3.4

Interval (IVL)

Definition:      A set of consecutive values of an ordered base data type.

Any ordered type can be the basis of an interval; it does not matter whether the base type is discrete or continuous. If the base data type is only partially ordered, all elements of the interval must be elements of a totally ordered subset of the partially ordered data type.

For example, physical quantities are considered ordered. However, the ordering of physical quantities is only partial; a total order is only defined among comparable quantities (quantities of the same physical dimension). While an interval between 2 meters and 4 meters exists, there is no interval between 2 meters and 4 seconds.

Intervals are sets and have all the properties of sets. However, unions and differences of intervals may no longer be intervals, since the elements of these union and difference sets might not be contiguous. Intersections of intervals are always intervals.

Definition 190:
template<QTY T>
type Interval<T> alias IVL<T> extends SET<T> {
            T       low;
            BL      lowClosed;
            T       high;
            BL      highClosed;
            T.diff  width;
            T       center;
            IVL<T>  hull(IVL<T> x);
  literal   ST;
  promotion IVL<T>  (T x);
  demotion  T;
};
    

3.4.1

Properties of Interval (IVL)

3.4.1.1
Low Boundary (low : T)

Definition:      This is the low limit of the interval.

Definition 191:
invariant(IVL<T> x; T e) where x.nonNull.and(x.contains(e)) {
  x.low.lessOrEqual(e);
};
      
3.4.1.2
High Boundary (high : T)

Definition:      This is the high limit of the interval.

Definition 192:
invariant(IVL<T> x; T e) where x.nonNull.and(x.contains(e)) {
  e.lessOrEqual(x.high);
};
      
3.4.1.3
Width (width : T.diff)

Definition:      The difference between high and low boundary. The purpose of distinguishing a width property is to handle all cases of incomplete information symmetrically. In any interval representation only two of the three properties high, low, and width need to be stated and the third can be derived.

When both boundaries are known, width can be derived as high minus low. When one boundary and the width are known, the other boundary is also known. When no boundary is known, the width may still be known. For example, one knows that an activity takes about 30 minutes, but one may not yet know when that activity is started.

Note that the data type of the width is not always the same as for the boundaries. For ratio scale quantities (REAL, PQ, MO) it is the same. For difference scale quantities (e.g., TS) it is the data type of the difference (e.g., PQ in the dimension of time for TS). For discrete elements (INT) the width may be a REAL indicating the number of elements in the interval divided by 2.

Definition 193:
invariant(IVL<T> x) {
  x.low.lessOrEqual(x.high);
  x.width.equals(x.high.minus(x.low));
};
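
A minimal sketch of how the third of the properties low, high, and width can be derived from the other two. The function name complete_interval and the use of None for an unstated property are assumptions; the arithmetic works for any Python values that support + and - (numbers, or datetime with timedelta as a stand-in for TS and its difference type).

```python
def complete_interval(low=None, high=None, width=None):
    """Derive the missing one of low, high, width when two are given.

    Returns a (low, high, width) tuple; values that cannot be derived stay None.
    When both boundaries are given, the width is recomputed from them.
    """
    if low is not None and high is not None:
        if low > high:
            raise ValueError("low must not exceed high")
        width = high - low
    elif low is not None and width is not None:
        high = low + width
    elif high is not None and width is not None:
        low = high - width
    return low, high, width
```

When only the width is stated, as in the 30-minute activity example, the boundaries simply remain unknown.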
      
3.4.1.4
Central Value (center : T)

Definition:      The arithmetic mean of the interval (low plus high divided by 2). The purpose of distinguishing the center as a semantic property is for conversions of intervals from and to point values.

Note that a center doesn't always exist for every interval. Notably intervals that are infinite on one side do not have a center. Also intervals of discrete base types with an even number of elements do not have a center. If an interval is unknown on one (or both) boundaries, the center can still be asserted. In fact, the main use case for the center is to be asserted when no boundary is known.

Definition 194:
invariant(IVL<T> x) where x.low.nonNull.and(x.high.nonNull) {
  x.center.equals(x.low.plus(x.width.times(0.5)));
};
invariant(IVL<T> x) where x.low.isNull.or(x.high.isNull) {
  x.center.notApplicable;
};
      
3.4.1.5
Low Boundary Closed (lowClosed : BL, default true)

Definition:      Specifies whether the low limit is included in the interval (interval is closed) or excluded from the interval (interval is open).

Definition 195:
invariant(IVL<T> x) where x.nonNull {
  x.low.nonNull.implies(x.lowClosed.equals(x.contains(x.low)));
  x.low.isNull.implies(x.lowClosed.not);
};
      
3.4.1.6
High Boundary Closed (highClosed : BL, default true)

Definition:      Specifies whether the high limit is included in the interval (interval is closed) or excluded from the interval (interval is open).

Definition 196:
invariant(IVL<T> x) where x.nonNull {
  x.high.nonNull.implies(x.highClosed.equals(x.contains(x.high)));
  x.high.isNull.implies(x.highClosed.not);
};
      
3.4.1.7
Literal Form

The literal form for the interval data type is defined such that it is as intuitive to humans as possible. Five different forms are defined:48

  1. the interval form using square brackets, e.g., "[3.5; 5.5[";


  2. the dash-form, e.g., "3.5-5.5";


  3. the "comparator" form, using relational operator symbols, e.g., "<5.5";


  4. the center-width form, e.g., "4.5[2.0[";


  5. the width-only form using square brackets, e.g., "[2.0[".


Definition 197:
IVL<T>.literal ST {
  IVL<T> range
  : interval                { $.equals($1); }
  | dash                    { $.equals($1); }
  | comparator              { $.equals($1); }
  | center_width            { $.equals($1); }
  | width                   { $.equals($1); };

  IVL<T> interval
  : open T ";" T close      { $.low.equals($2);
                              $.high.equals($4);
                              $.lowClosed.equals($1);
                              $.highClosed.equals($5); };
  BL open : "["             { $.equals(true); }
          | "]"             { $.equals(false); };
  BL close : "]"            { $.equals(true); }
           | "["            { $.equals(false); };
  IVL<T> width
  : open T.diff close       { $.width.equals($2);
                              $.lowClosed.equals($1);
                              $.highClosed.equals($3); };
  IVL<T> center_width
  : T width                 { $.center.equals($1);
                              $.width.equals($2.width);
                              $.lowClosed.equals($2.lowClosed);
                              $.highClosed.equals($2.highClosed); };
  IVL<T> dash : T "-" T     { $.low.equals($1);
                              $.high.equals($3);
                              $.lowClosed.equals(true);
                              $.highClosed.equals(true); };
  IVL<T> comparator
  : "<"  T                  { $.high.equals($2);
                              $.highClosed.equals(false);
                              $.low.negativelyInfinite; }
  | ">"  T                  { $.low.equals($2);
                              $.lowClosed.equals(false);
                              $.high.positivelyInfinite; }
  | "<=" T                  { $.high.equals($2);
                              $.highClosed.equals(true);
                              $.low.negativelyInfinite; }
  | ">=" T                  { $.low.equals($2);
                              $.lowClosed.equals(true);
                              $.high.positivelyInfinite; };
};
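
The five literal forms can be recognized with a few regular expressions. The following is a rough sketch for an interval of REAL only; the function name parse_ivl, the dictionary layout, and the restriction to unsigned decimal numbers are assumptions of this sketch and not part of the grammar above.

```python
import math
import re

NUM = r"\d+(?:\.\d+)?"  # assumption: unsigned decimal literals only

def parse_ivl(text):
    """Parse an interval literal into a dict of the stated properties."""
    s = text.strip()
    # 1. interval form, e.g. "[3.5; 5.5["
    m = re.fullmatch(rf"([\[\]])\s*({NUM})\s*;\s*({NUM})\s*([\[\]])", s)
    if m:
        return {"low": float(m[2]), "high": float(m[3]),
                "lowClosed": m[1] == "[", "highClosed": m[4] == "]"}
    # 2. dash form, e.g. "3.5-5.5" (both boundaries closed)
    m = re.fullmatch(rf"({NUM})-({NUM})", s)
    if m:
        return {"low": float(m[1]), "high": float(m[2]),
                "lowClosed": True, "highClosed": True}
    # 3. comparator form, e.g. "<5.5" or ">=2"
    m = re.fullmatch(rf"(<=|>=|<|>)\s*({NUM})", s)
    if m:
        op, value = m[1], float(m[2])
        if op.startswith("<"):
            return {"low": -math.inf, "high": value,
                    "lowClosed": False, "highClosed": op == "<="}
        return {"low": value, "high": math.inf,
                "lowClosed": op == ">=", "highClosed": False}
    # 4. center-width form "4.5[2.0[" and 5. width-only form "[2.0["
    m = re.fullmatch(rf"({NUM})?([\[\]])\s*({NUM})\s*([\[\]])", s)
    if m:
        result = {"width": float(m[3]),
                  "lowClosed": m[2] == "[", "highClosed": m[4] == "]"}
        if m[1] is not None:
            result["center"] = float(m[1])
        return result
    raise ValueError(f"not a recognized interval literal: {text!r}")
```

Note that the dash form is ambiguous for negative numbers, which is one reason this sketch restricts itself to unsigned values.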
      
3.4.1.8
Promotion of Element Values to Intervals (promotion : )

A quantity of type T can be promoted into a trivial interval of T where low and high boundaries are equal and boundaries closed.

Definition 198:
invariant(T x) {
  ((IVL<T>)x).low.equals(x);
  ((IVL<T>)x).high.equals(x);
  ((IVL<T>)x).highClosed;
  ((IVL<T>)x).lowClosed;
};
      
3.4.1.9
Demotion of Intervals to a Representative Element Value (demotion : T)

An interval of T can be demoted to a simple quantity of type T that is representative for the whole interval. If both boundaries are finite, this is the IVL.center. If one boundary is infinite, the representative value is the other boundary. If both boundaries are infinite, the conversion to a point value is not applicable.

Definition 199:
invariant(IVL<T> x) where x.nonNull {
  x.low.nonNull.and(x.high.nonNull).implies(((T)x).equals(x.center));
  x.high.nonNull.and(x.low.isNull).implies(((T)x).equals(x.high));
  x.low.nonNull.and(x.high.isNull).implies(((T)x).equals(x.low));
  x.low.isNull.and(x.high.isNull).implies(((T)x).notApplicable);
};
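
The demotion rules can be sketched as a small function. The name demote and the convention of passing an unknown or infinite boundary as None or as an infinite float are assumptions for illustration.

```python
import math

def demote(low, high):
    """Pick a representative point for an interval, or None if not applicable.

    Unknown or infinite boundaries are passed as None or +/- math.inf.
    """
    def is_finite(v):
        return v is not None and not math.isinf(v)

    if is_finite(low) and is_finite(high):
        return low + (high - low) / 2  # the center
    if is_finite(high):
        return high
    if is_finite(low):
        return low
    return None  # both boundaries infinite: not applicable
```

For example, demote(3.5, 5.5) yields the center 4.5, while an interval such as "<5.5" is represented by its only finite boundary.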
      
3.4.1.10
Convex Hull (hull : IVL<T>, inherited from SET)

Definition:      A convex hull or "interval hull" of two intervals is the least interval that is a superset of its operands. This concept will play an important role later on.

Figure 12: Convex Hull of two Intervals

Definition 200:
invariant(IVL<T> h, IVL<T> i, j) where h.equals(i.hull(j)) {
  i.low.lessOrEqual(j.low).implies(h.low.equals(i.low));
  j.low.lessOrEqual(i.low).implies(h.low.equals(j.low));
  i.high.lessOrEqual(j.high).implies(h.high.equals(j.high));
  j.high.lessOrEqual(i.high).implies(h.high.equals(i.high));
};
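
As a simplified sketch, the hull of two intervals takes the smaller low boundary and the larger high boundary. The function name hull and the representation of an interval as a (low, high) pair are assumptions; boundary closedness is ignored here.

```python
def hull(i, j):
    """Convex hull of two intervals given as (low, high) pairs."""
    return (min(i[0], j[0]), max(i[1], j[1]))

# Two disjoint intervals; the hull also covers the gap between them.
print(hull((1, 3), (5, 8)))
```

Unlike the union, the hull is always an interval, even when the operands do not overlap.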
      

Specializations of Interval (IVL)

3.4.2

Interval of Physical Quantities (IVL<PQ>)

An interval of physical quantities is constructed from the generic interval type. However, recognizing that the unit can be factored from the boundaries, we add additional semantics and a separate literal form. The additional view of an interval of physical quantities is an interval of real numbers with one unit.

Definition 201:
type Interval<PQ> alias IVL<PQ> {
  IVL<REAL> value;
  CS  unit;
};
      

The unit applies to both low and high boundary.

Definition 202:
invariant(IVL<PQ> x) where x.nonNull {
  x.value.nonNull;
  x.low.value.equals(x.value.low);
  x.low.unit.equals(x.unit);
  x.lowClosed.equals(x.value.lowClosed);
  x.high.value.equals(x.value.high);
  x.high.unit.equals(x.unit);
  x.highClosed.equals(x.value.highClosed);
};
      

The special literal form is simply an interval of real numbers, a space, and the unit.

Definition 203:
IVL<PQ>.literal ST {
  IVL<PQ>
  : IVL<REAL> " " unit          { $.value.equals($1); $.unit.equals($3); }
  | IVL<REAL>                   { $.equals($1); };
  CS unit : ST                  { $.value.equals($1);
                                  $.codeSystem.equals(2.16.840.1.113883.3.2); };
};
      

For example: "[0;5] mmol/L" or "<20 mg/dL" are valid literal forms of intervals of physical quantities. The generic interval form, e.g., "[50 nm; 2 m]" is also allowed.

3.4.3

Interval of Point in Time (IVL<TS>)

The generic interval data type defines the interval of points in time too. However, there are some special considerations about literal representations and conversions of intervals of point in time, which are specified in this section.

Definition 204:
type Interval<TS> alias IVL<TS> {
  literal   ST;
  promotion IVL<TS> (TS x);
};
      
3.4.3.1
Promotion of Points in Time Values to Intervals (promotion : )

A TS can be promoted to an IVL<TS> whereby the low boundary is the TS value itself, and the width is inferred from the precision of the TS and the duration of the least significant calendar period specified. The high boundary is open. For example, the TS literal "200009" is converted to an IVL<TS> with low boundary 200009 and width 30 days, which is the interval "[200009;200010[".
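
The promotion can be sketched with Python's datetime module. The function name promote_ts and the FORMATS table are assumptions, and only literals with year, month, day, hour, or minute precision and no time zone are handled.

```python
from datetime import datetime, timedelta

# Assumed subset of TS literal precisions: year, month, day, hour, minute.
FORMATS = {4: "%Y", 6: "%Y%m", 8: "%Y%m%d", 10: "%Y%m%d%H", 12: "%Y%m%d%H%M"}

def promote_ts(literal):
    """Promote a TS literal to the interval literal "[low;high[" of equal precision."""
    fmt = FORMATS[len(literal)]
    low = datetime.strptime(literal, fmt)
    if len(literal) == 4:
        high = low.replace(year=low.year + 1)
    elif len(literal) == 6:
        # First day of the following month.
        high = (low.replace(day=28) + timedelta(days=4)).replace(day=1)
    else:
        step = {8: timedelta(days=1), 10: timedelta(hours=1),
                12: timedelta(minutes=1)}[len(literal)]
        high = low + step
    return f"[{literal};{high.strftime(fmt)}["
```

The width is never stored explicitly in this sketch; it follows from the calendar period that the least significant stated digits cover.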

3.4.3.2
Literal Form

The literal form for interval of point in time is exceptional.

In order to avoid syntactic conflicts with the timezone and slightly different usage profiles of the ISO 8601 that occur on some ITS platforms, the dash form of the interval is not permitted for IVL<TS>. The interval-form using square brackets is preferred.

Example: May 12, 1987 from 8 to 9:30 PM is "[198705122000;198705122130]".

NOTE: The precision of a stated interval boundary is irrelevant for the interval. One might wrongly assume that the interval "[19870901;19870930]" stands for the entire September 1987 until end of the day of September 30. However, this is not so! The proper way to denote an entire calendar cycle (e.g., hour, day, month, year, etc.) in the interval notation is to use an open high boundary. For example, all of September 1987 is denoted as "[198709;198710[".49

The "hull-form" of the literal is defined as the convex hull (see IVL.hull) of interval-promotions from two time stamps.

Definition 205:
  IVL<TS> hull : TS ".." TS { $.equals(((IVL<TS>)$1).hull((IVL<TS>)$3)); };
        

For example, "19870901..19870930" is a valid literal using the hull form. The value is equivalent to the interval form "[19870901;19871001[". 50

The hull-form further allows an abbreviation, where the higher timestamp literal does not need to repeat digits on the left that are the same as for the lower timestamp literal. The two timestamps are right-aligned and the digits to the left copied from the lower to the higher timestamp literal. This is a simple string operation and is not formally defined here.

Example: May 12, 1987 to May 23, 1987 is "19870512..23". However, note that May 12, 1987 to June 2, 1987 is "19870512..0602", and not "19870512..02".
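
Although the abbreviation is not formally defined, the string operation described above could look like the following sketch. The function name expand_hull is an assumption.

```python
def expand_hull(literal):
    """Expand an abbreviated hull-form literal such as "19870512..23"."""
    low, high = literal.split("..")
    if len(high) < len(low):
        # Right-align: copy the leading digits of the lower timestamp.
        high = low[: len(low) - len(high)] + high
    return low, high
```

Applied to "19870512..02", the sketch yields the higher timestamp 19870502, which lies before the lower one; this is why the June 2 example has to repeat the month digits.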

4

Generic Type Extensions

Generic type extensions are generic types with one parameter type, and that extend (specialize) their parameter type. In the formal data type definition language, generic type extensions follow the pattern: template<ANY T> type GenericTypeExtensionName extends T { ... }; These generic type extensions inherit most properties of their base type and add some specific feature to it. The generic type extension is a specialization of the base type, thus a value of the extension data type can be used instead of its base data type.

NOTE: values of extended types can be substituted for their base type. However, an ITS may make some constraints as to what extensions to accommodate. Particularly, extensions need not be defined for those components carrying the values of data value properties. Thus, while any data value can be annotated outside the data type specification, an ITS may not provide a way to annotate the value of a data value property. At this time HL7 does not permit use of generic type extensions, except where explicitly enabled (in this or another HL7 specification) for such use cases where this advanced functionality is important. In these cases, instances of these generic type extensions must be specifically and explicitly reflected in the HL7 RIM, MIM, RMIM and HMD (as applicable), as a result of balloted Technical Committee content.51

4.1

History Item (HXIT)

Definition:      A generic data type extension that tags a time range to any data value of any data type. The time range is the time in which the information represented by the value is (was) valid.

If the base type T does not possess a valid time property, the HXIT adds that property to the base type. If, however, the base type T does have a valid time property, that property can be mapped to the valid time property of the HXIT.52

Definition 206:
template<ANY T>
type HistoryItem<T> alias HXIT<T> extends T {
    IVL<TS> validTime;
};
    

4.1.1

Properties of History Item (HXIT)

4.1.1.1
Valid Time (validTime : IVL<TS>)

Definition:      The time interval during which the given information was, is, or is expected to be valid. The interval can be open or closed, and infinite or undefined on either side.

4.1.2

History (HIST)

Definition:      A set of data values that conform to the history item (HXIT) type, (i.e., that have a valid-time property). The history information is not limited to the past; expected future values can also appear.

Definition 207:
template<ANY T>
type History<T> alias HIST<T> extends SET<HXIT<T>> {
             HXIT<T>   earliest;
             HIST<T>   exceptEarliest;
             HXIT<T>   latest;
             HIST<T>   exceptLatest;
  demotion   HXIT<T>;
};
      

The semantics do not in principle forbid the time intervals to overlap. However, if two history items have the same low (high) boundary in the valid time interval, it is undefined which one is considered the earliest (latest).

Definition 208:
invariant(HIST<T> x) where x.nonNull {
  x.nonEmpty;
  ((HXIT<T>)x).equals(x.latest);
};
      
4.1.2.1
Earliest Item (earliest : HXIT<T>)

Definition:      The item in the set whose valid time's low boundary (validity start time) is less than or equal to (i.e., not after) that of any other history item in the set.

Definition 209:
invariant(HIST<T> x; HXIT<T> e) where x.contains(e) {
    x.earliest.validTime.low.lessOrEqual(e.validTime.low);
};
        
4.1.2.2
Latest Item (latest : HXIT<T>)

Definition:      The item in the set whose valid time's high boundary (validity end time) is greater than or equal to (i.e., not before) that of any other history item in the set.

Definition 210:
invariant(HIST<T> x; HXIT<T> e) where x.contains(e) {
    x.latest.validTime.high.greaterOrEqual(e.validTime.high);
};
        
4.1.2.3
Except Earliest Item (exceptEarliest : HIST<T>)

Definition:      The derived history that has the earliest item excluded.

Definition 211:
invariant(HIST<T> x) where x.nonNull {
  x.exceptEarliest.equals(x.except(x.earliest));
};
        
4.1.2.4
Except Latest Item (exceptLatest : HIST<T>)

Definition:      The derived history that has the latest item excluded.

Definition 212:
invariant(HIST<T> x) where x.nonNull {
  x.exceptLatest.equals(x.except(x.latest));
};
        
4.1.2.5
(demotion : HXIT<T>)

A type conversion between an entire history HIST<T> and a single history item HXIT<T>. This conversion takes the latest data from the history.

The purpose of this conversion is to allow an information producer to produce a history of any value instead of sending just one value. An information-consumer, who does not expect a history but a simple value, will convert the history to the latest value.
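
The earliest and latest properties and the conversion to the latest item can be sketched as follows. The HistoryItem tuple is a hypothetical stand-in for HXIT<T>, and the address values and dates are invented for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for HXIT<T>: a value plus the bounds of its validTime.
HistoryItem = namedtuple("HistoryItem", "value low high")

history = [
    HistoryItem("12 Main St", "19950101", "19991231"),
    HistoryItem("7 Oak Ave", "20000101", "20041231"),
    HistoryItem("3 Elm Rd", "20050101", "20101231"),
]

earliest = min(history, key=lambda item: item.low)
latest = max(history, key=lambda item: item.high)
except_latest = [item for item in history if item != latest]

# A consumer that expects a single value takes the latest item.
demoted = latest
```

The timestamps here all have the same precision, so plain string comparison orders them correctly; a real implementation would compare TS values.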

Note from the definition of history item (HXIT) that HXIT semantically extends T. This means, that the information-consumer expecting a T but given an HXIT extension of T will not recognize any difference (substitutability of specializations.)

4.2

Uncertain Value - Probabilistic (UVP)

Definition:      A generic data type extension used to specify a probability expressing the information producer's belief that the given value holds.

How the probability number was arrived at is outside the scope of this specification.

Probabilities are subjective and (as any data value) must be interpreted in their individual context, for example, when new information is found the probability might change. Thus, for any message (document, or other information representation) the information — and particularly the probabilities — reflect what the information producer believed was appropriate for the purpose and at the time the message (document) was created.

For example, at the beginning of the 2000 baseball season (May), the Las Vegas odds makers may have given the New York Yankees a probability of 1 in 10 (0.100) of winning the World Series. At the time of this writing, the Yankees and Mets have won their respective pennants, but the World Series has yet to begin. The probability of the Yankees winning the World Series is obviously significantly greater at this point in time, perhaps 6 in 10 (0.600). The context, and in particular the time of year, made all the difference in the world.

Since probabilities are subjective measures of belief, they can be stated without being "correct" or "incorrect" per se, let alone "precise" or "imprecise". Notably, one does not have to conduct experiments to measure a frequency of some outcome in order to specify a probability. In fact, whenever statements about individual people or events are made, it is not possible to confirm such probabilities with "frequentist" experiments.

Returning to our example, the Las Vegas odds makers can not insist on the Yankees and Mets playing 1000 trial games prior to the Series; even if they could, they would not have the fervor of the real Series and therefore not be accurate. Instead, the odds makers must derive the probability from past history, player statistics, injuries, etc.

Definition 213:
template<ANY T>
type UncertainValueProbabilistic<T> alias UVP<T> extends T {
    REAL  probability;
};
    

The type T is not formally constrained. In theory, discrete probabilities can only be stated for discrete data values. Thus, generally UVP should not be used with REAL, PQ, or values.

4.2.1

Properties of Uncertain Value - Probabilistic (UVP)

4.2.1.1
Probability (probability : REAL)

Definition:      The probability assigned to the value, a decimal number between 0 (very uncertain) and 1 (certain).

Definition 214:
invariant(UVP<T> x) where x.nonNull.and(x.probability.nonNull) {
  ((IVL<REAL>)[0;1]).contains(x.probability);
};
      

There is no "default probability" that one can assume when the probability is unstated. Therefore, it is impossible to make any semantic difference between a UVP of T without probability and a simple T. UVP of T does not mean "uncertain", and a simple T does not mean "certain". In fact, the probability of the UVP could be 0.999 or 1, which is quite certain, where a simple T value could be a very vague guess.

4.2.2

Non-Parametric Probability Distribution (NPPD)

Definition:      A set of uncertain values with probabilities (also known as a histogram). All the elements in the set are considered alternatives, and each is rated with a probability expressing the belief (or frequency) that the given value holds.

The purpose of the non-parametric probability distribution is chiefly to support statistical data reporting as it occurs in measurements taken from many subjects and consolidated in a histogram. This occurs in epidemiology, veterinary medicine, laboratory medicine, but also in cost controlling and business process engineering.

Semantically, the information of a stated value exists in contrast to the complement set of unstated possible values. Thus, semantically, a non-parametric probability distribution contains all possible values and assigns probabilities to each of them.

The easiest way to visualize this is a bar chart, as shown in Figure 13.

Figure 13: Example of a Histogram

This example illustrates the probability of selected major league baseball teams winning the World Series (prior to the season start). Each team is mutually exclusive, and were we to include all of the teams, the sum of the probabilities would equal 1 (i.e., it is certain that one of the teams will win).

NOTE: even though semantically the NPPD assigns probabilities to all possible values, not all values need to be represented explicitly. Those possible values that are not mentioned in a NPPD data structure will have the rest-probability distributed equally over all unmentioned values. For example, if the value set is {A; B; C; D} but the NPPD value states just {(B; 0.5); (C; 0.25)} then the rest-probability is 1 − 0.75 = 0.25, which is distributed evenly over the complement set: {(A; 0.125); (D; 0.125)}. Semantically, the NPPD is the union of the stated probability distribution and the unstated complement with rest-probability distributed evenly.
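
The rest-probability rule in the note above can be sketched as follows. The function name complete_nppd and the representation of an NPPD as a dictionary from value to probability are assumptions for illustration.

```python
def complete_nppd(stated, value_set):
    """Spread the rest-probability evenly over the values that were not stated."""
    unstated = [v for v in value_set if v not in stated]
    rest = 1.0 - sum(stated.values())
    full = dict(stated)
    for v in unstated:
        full[v] = rest / len(unstated)
    return full

print(complete_nppd({"B": 0.5, "C": 0.25}, ["A", "B", "C", "D"]))
```

For the value set {A; B; C; D} this reproduces the probabilities 0.125 for A and D given in the note.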

Definition 215:
template<ANY T>
type NonParametricProbabilityDistribution<T>
    alias NPPD<T> extends SET<UVP<T>> {
    SET<UVP<T>> mostLikely(INT n);
};
    

Just as with UVP, the type T is not formally constrained, even though there are reasonable and unreasonable uses. Typically one would use the NPPD for unordered types, if only a "small" set of possible values is assigned explicit probabilities, or if the probability distribution cannot (or should not) be approximated with parametric methods. For other cases, one may prefer PPD.

4.2.2.1
Most Likely (mostLikely : SET<UVP<T>>)

Definition 216:
invariant(NPPD<T> x) where x.nonNull {
  x.nonEmpty;
  forall(UVP<T> d, e; SET<UVP<T>> m; INT n)
      where x.contains(d)
       .and(m.equals(x.mostLikely(n)))
       .and(m.contains(e)) {
    x.contains(m);
    e.probability.greaterOrEqual(d.probability).or(m.contains(d));
  };
};
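
A minimal sketch of the mostLikely property, again assuming that an NPPD is held as a dictionary from value to probability. The function name most_likely is an assumption, and the order among values of equal probability is arbitrary.

```python
def most_likely(nppd, n):
    """Return the n entries with the highest probability as a dict."""
    ranked = sorted(nppd.items(), key=lambda pair: pair[1], reverse=True)
    return dict(ranked[:n])
```

Every value left out of the result has a probability no greater than that of any value included, which is the condition the invariant expresses.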
      

4.3

Parametric Probability Distribution (PPD)

Definition:      A generic data type extension specifying uncertainty of quantitative data using a distribution function and its parameters. Aside from the specific parameters of the distribution, a mean (expected value) and standard deviation are always given to help maintain a minimum layer of interoperability if receiving applications cannot deal with a certain probability distribution.

Definition 217:
template<QTY T>
type ParametricProbabilityDistribution<T> alias PPD<T> extends T {
            T.diff  standardDeviation;
            CS      distributionType;
            IVL<T>  confidenceInterval(REAL p);
            REAL    probability(IVL<T> x);
            PPD<T>  times(REAL x);
};
    

For example, the most common college entrance exam in the United States is the SAT, which is composed of two parts: verbal and math. Each part has a minimum score of 200 and a perfect score of 800. In 1998, according to the College Board, 1,172,779 college-bound seniors took the test. The mean score for the math portion of the test was 512, and the standard deviation 112. These parameter values (512, 112), tagged as the normal distribution parameters, paint a pretty good picture of the test score distribution. In most cases, there is no need to specify all 1-million+ points of data when just 2 parameters will do!

Figure 14: Example for a parametric probability distribution

Note that the normal distribution is only one of several distributions defined for HL7.

Since a PPD extends its parameter type T, a simple T value is the mean (expected value or first moment) of the probability distribution. Applications that cannot deal with distributions will take the simple T value neglecting the uncertainty. That simple value of type T is also used to standardize the data for computing the distribution.

Probability distributions are defined over integer or real numbers and normalized to a certain reference point (typically zero) and reference unit (e.g., standard deviation = 1). When other quantities defined in this specification are used as base types, the mean and the standard deviation are used to scale the probability distribution. For example, if a PPD of PQ for a length is given with mean 20 ft and a standard deviation of 2 in, the normalized distribution function f(x) that maps a real number x to a probability density would be translated to f′(x′) that maps a length x′ to a probability density as f′(x′) = f((x′ - μ ) / σ).
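
The scaling described above can be sketched for the length example, taking the standard normal density as the normalized distribution function f. The function names and the choice to express both parameters in inches are assumptions for illustration.

```python
import math

def standard_normal_pdf(x):
    """Normalized distribution function f(x): mean 0, standard deviation 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def scaled_pdf(f, mean, sd):
    """Build f'(x') = f((x' - mean) / sd) as described in the text."""
    return lambda x: f((x - mean) / sd)

# Mean 20 ft and standard deviation 2 in, both expressed in inches.
mean_in = 20 * 12
sd_in = 2
f_prime = scaled_pdf(standard_normal_pdf, mean_in, sd_in)
```

A length of 242 in lies one standard deviation above the mean, so f_prime(242) equals f(1). The sketch mirrors the formula as written; a density over x′ that integrates to 1 would additionally be divided by σ.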

Where applicable, the PPD specification conforms to the ISO Guide to the Expression of Uncertainty in Measurement (GUM) as reflected by NIST publication 1297 Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results. The PPD specification does not describe how uncertainty is to be evaluated but only how it is expressed. The concept of "standard uncertainty" as set forth by the ISO GUM corresponds to the "standard deviation" property of the PPD.

4.3.1

Properties of Parametric Probability Distribution (PPD)

4.3.1.1
Standard Deviation (standardDeviation : T.diff)

Definition:      The primary measure of variance/uncertainty of the value (the square root of the mean of the squared differences between all data points and the mean). The standard deviation is used to normalize the data for computing the distribution function. Applications that cannot deal with probability distributions can still get an idea about the confidence level by looking at the standard deviation.

The standard deviation of a probability distribution over a type T is of the related type T.diff that expresses differences between values of type T. If T is REAL or INT, T.diff is also REAL or INT respectively. However if T is a point in time (TS), T.diff is a physical quantity (PQ) in the dimension of time.

The standard deviation is what ISO GUM calls "standard uncertainty."

4.3.1.2
Probability Distribution Type (distributionType : CS)

Definition:      A code specifying the type of probability distribution. Possible values are as shown in the attached table. The NULL value (unknown) for the type code indicates that the probability distribution type is unknown. In that case, the standard deviation has the meaning of an informal guess.

Table 44 lists the defined probability distributions. Many distribution types are defined in terms of special parameters (e.g., the parameters α and β for the γ-distribution, number of degrees of freedom for the t-distribution, etc.) For all distribution types, however, the mean and standard deviation are defined.

Table 44: Domain ProbabilityDistributionType:
code name definition
(NULL) unknown Used to indicate that the mean is estimated without any closer consideration of its probability distribution. In this case, the meaning of the standard deviation is not crisply defined. However, interpretation should be along the lines of the normal distribution, e.g., the interval covered by the mean ±1 standard deviation should be at the level of about two thirds confidence.
U uniform The uniform distribution assigns a constant probability over the entire interval of possible outcomes, while all outcomes outside this interval are assumed to have zero probability. The width of this interval is 2 σ √3. Thus, the uniform distribution assigns the probability densities f(x) = (2 σ √3)-1 to values μ - σ √3 ≥ x ≤ μ + σ √3 and f(x) = 0 otherwise.
N normal (Gaussian) This is the well-known bell-shaped normal distribution. Because of the central limit theorem, the normal distribution is the distribution of choice for an unbounded random variable that is an outcome of a combination of many stochastic processes. Even for values bounded on a single side (i.e. greater than 0) the normal distribution may be accurate enough if the mean is "far away" from the bound of the scale measured in terms of standard deviations.
LN log-normal The logarithmic normal distribution is used to transform skewed random variable X into a normally distributed random variable U = log X. The log-normal distribution can be specified with the properties mean μ and standard deviation σ. Note however that mean μ and standard deviation σ are the parameters of the raw value distribution, not the transformed parameters of the lognormal distribution that are conventionally referred to by the same letters. Those log-normal parameters μlog and σlog relate to the mean μ and standard deviation σ of the data value through σlog2 = log (σ22 + 1) and μlog = log μ - σlog2/2.
G γ (gamma) The gamma-distribution is used for data that is skewed and bounded to the right, i.e. where the maximum of the distribution curve is located near the origin. The γ-distribution has two parameters α and β. The relationship to mean μ and variance σ² is μ = α β and σ² = α β².
E exponential Used for data that describes extinction. The exponential distribution is a special form of γ-distribution where α = 1; hence, the relationship to mean μ and variance σ² is μ = β and σ² = β².
X2 χ² Used to describe the sum of squares of random variables that occurs when a variance is estimated (rather than presumed) from the sample. The only parameter of the χ²-distribution is υ, the so-called number of degrees of freedom (which is the number of independent parts in the sum). The χ²-distribution is a special type of γ-distribution with parameters α = υ / 2 and β = 2. Hence, μ = υ and σ² = 2 υ.
T t (Student) Used to describe the quotient of a normal random variable and the square root of a χ² random variable. The t-distribution has one parameter υ, the number of degrees of freedom. The relationship to mean μ and variance σ² is: μ = 0 and σ² = υ / (υ - 2).
F F Used to describe the quotient of two χ² random variables. The F-distribution has two parameters υ₁ and υ₂, which are the numbers of degrees of freedom of the numerator and denominator variable respectively. The relationship to mean μ and variance σ² is: μ = υ₂ / (υ₂ - 2) and σ² = (2 υ₂² (υ₂ + υ₁ - 2)) / (υ₁ (υ₂ - 2)² (υ₂ - 4)).
B β (beta) The beta-distribution is used for data that is bounded on both sides and may or may not be skewed (e.g., occurs when probabilities are estimated.) Two parameters α and β are available to adjust the curve. The mean μ and variance σ² relate as follows: μ = α / (α + β) and σ² = α β / ((α + β)² (α + β + 1)).

The three distribution-types unknown (NULL), uniform and normal must be supported by every system that claims to support PPD. All other distribution types are optional. When a system interpreting a PPD representation encounters a distribution type that it does not recognize, it maps this type to the unknown (NULL) distribution-type.
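
The relationships in Table 44 between the special parameters and the mean and variance can be checked numerically. The following is a minimal Python sketch, not part of the specification; all function names are hypothetical and chosen only for illustration.

```python
import math

def uniform_bounds(mean, sd):
    """Bounds of the uniform distribution with the given mean and standard deviation."""
    half_width = sd * math.sqrt(3)          # the full width is 2 * sd * sqrt(3)
    return mean - half_width, mean + half_width

def lognormal_params(mean, sd):
    """Return (mu_log, sigma_log) from the raw-value mean and standard deviation."""
    var_log = math.log(sd**2 / mean**2 + 1)
    return math.log(mean) - var_log / 2, math.sqrt(var_log)

def gamma_moments(alpha, beta):
    """Return (mean, variance) of the gamma distribution."""
    return alpha * beta, alpha * beta**2

def chi2_moments(nu):
    """Chi-square as a gamma distribution with alpha = nu / 2 and beta = 2."""
    return gamma_moments(nu / 2, 2)

def f_moments(nu1, nu2):
    """Return (mean, variance) of the F-distribution; the variance needs nu2 > 4."""
    mean = nu2 / (nu2 - 2)
    var = (2 * nu2**2 * (nu2 + nu1 - 2)) / (nu1 * (nu2 - 2)**2 * (nu2 - 4))
    return mean, var

def beta_moments(alpha, beta):
    """Return (mean, variance) of the beta distribution."""
    s = alpha + beta
    return alpha / s, alpha * beta / (s**2 * (s + 1))
```

For instance, `chi2_moments(4)` yields a mean of 4 and a variance of 8, in agreement with μ = υ and σ² = 2 υ.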

4.3.1.3
Literal Form

The parametric probability distribution has a literal form. The general syntax is as follows:

Definition 218:
PPD<T>.literal ST {
  PPD<T> : T "(" type T.diff ")"  { ((T)$).equals($1);
                                    $.distributionType.equals($3);
                                    $.standardDeviation.equals($4); };
  CV type : ST                    { $.value.equals($1);
                                    $.system.equals(); };
};
      

Examples: an example for a PPD<REAL> is "1.23(N0.005)" for a normal distribution of a real number around 1.23 with a standard deviation of 0.005. An example for a PPD<PQ> is "1.23 m (5 mm)" for a distribution of unknown type around the length 1.23 meters with a standard deviation of 5 millimeters. An example for a PPD<TS> is "2000041113(U4 h)" for a uniform distribution around April 11, 2000 at 1 pm with a standard deviation of 4 hours.

4.3.2

Probability Distribution over Real Numbers (PPD_REAL)

Definition:     

The parametric probability distribution of real numbers is fully defined by the generic data type.

Definition 219:
type ParametricProbabilityDistribution<REAL> alias PPD<REAL>;
      

However, there are some special considerations about literal representations and conversions of probability distributions over real numbers, which are specified in this section.

4.3.2.1

Converting a real number (REAL) to an uncertain real number (PPD<REAL>)

When converting a REAL into a PPD<REAL>, the standard deviation is calculated from the REAL value's order of magnitude and precision (number of significant digits). Let x be a real number with precision n. We can determine the order of magnitude e of x as e = log₁₀ |x| where e is rounded to the next integer that is closer to zero (special case: if x is zero, e is zero.) The value of the least significant digit l is then l = 10^(e−n) and the standard deviation σ is σ = l / 2.

4.3.2.2

Concise Literal Form for PPD<REAL>

Besides the generic literal form of the PPD, a concise literal form is defined for PPD over real numbers. This concise literal form is defined such that the standard deviation can be expressed in terms of the least significant digit in the mantissa. This literal is defined as an extension of the REAL literal:

Definition 220:
PPD<REAL>.literal ST {
  PPD<REAL> mantissa
  : REAL.mantissa "(" type T.diff ")" { ((T)$).equals($1);
                                        $.distributionType.equals($3);
                                        $.standardDeviation.equals($4); }
  | REAL.mantissa                     { $.equals($1);
                                        $.distributionType.equals($3);
       $.standardDeviation.equals($1.leastSignificantDigit.times(0.5)); };
  CS type : ST                        { $.value.equals($1);
       $.system.equals(2.16.840.1.113883.5.1019); };
};
        

Examples: "1.23e-3 (U5e-6)" is the uniform distribution around 1.23 × 10⁻³ with 5 × 10⁻⁶ standard deviation in generic literal form. "1.230(U5)e-3" is the same value in concise literal form.
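
The relationship between the two forms in the example can be sketched in Python. This is an illustration under stated assumptions, not a conformant parser: the regular expression and the function names `expand_concise` and `implied_sd` are hypothetical, only single-letter distribution codes are recognized (a code such as X2 would need extra handling), and the digits in parentheses are interpreted as multiples of the mantissa's least significant digit, as the example suggests.

```python
import re
from decimal import Decimal

# Hypothetical pattern: mantissa "(" type-letters sd-digits ")" optional exponent
CONCISE = re.compile(r"^([+-]?\d+(?:\.\d+)?)\(([A-Z]*)(\d+)\)(?:e([+-]?\d+))?$")

def least_significant_digit(mantissa):
    """Value of the last written digit, e.g. 0.001 for "1.230"."""
    return Decimal(1).scaleb(Decimal(mantissa).as_tuple().exponent)

def implied_sd(mantissa):
    """Standard deviation when no explicit one is given: half the last digit."""
    return least_significant_digit(mantissa) * Decimal("0.5")

def expand_concise(literal):
    """Return (mean, distribution type or None, standard deviation)."""
    m = CONCISE.match(literal)
    if m is None:
        raise ValueError("not a concise PPD<REAL> literal: " + literal)
    mantissa, dist_type, sd_digits, exponent = m.groups()
    scale = Decimal(10) ** int(exponent or 0)
    mean = Decimal(mantissa) * scale
    sd = Decimal(sd_digits) * least_significant_digit(mantissa) * scale
    return mean, dist_type or None, sd
```

With these assumptions, `expand_concise("1.230(U5)e-3")` gives a mean of 1.23 × 10⁻³, type U, and a standard deviation of 5 × 10⁻⁶, matching the generic literal above.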

4.3.3

Parametric Probability Distributions over Physical Quantities (PPD_PQ)

Definition:     

A parametric probability distribution over physical quantities is constructed from the generic PPD type. However, recognizing that the unit can be factored from the boundaries, we add additional semantics and a separate literal form. The additional view of a probability distribution over physical quantities is a probability distribution over real numbers with one unit.

Definition 221:
type ParametricProbabilityDistribution<PQ> alias PPD<PQ> {
  PPD<REAL> value;
  CS  unit;
};
      

The unit applies to both mean and standard deviation.

Definition 222:
invariant(PPD<PQ> x) where x.nonNull {
  x.value.nonNull;
  ((REAL)x.value).equals(((PQ)x).value);
  x.unit.equals(((PQ)x).unit);
  x.value.standardDeviation.equals(x.standardDeviation.value);
  x.standardDeviation.unit.equals(x.unit);
};
      

4.3.3.1

Concise Literal Form for PPD<PQ>

A concise literal form for probability distributions of physical quantities is defined based on the concise literal form of PPD<REAL> where REAL is the value. This literal is defined as an extension of the PQ literal.

Definition 223:
PPD<PQ>.literal ST {
  PPD<PQ> : PPD<REAL> " " unit  { $.value.equals($1);
            $.unit.equals($3); }
};
        

Examples: "1.23e-3 m (N5e-6 m)" is the normal-distributed length of 1.23 × 10⁻³ m with 5 × 10⁻⁶ m standard deviation in generic literal form. "1.230(N5)e-3 m" is the same value in concise literal form. "1.23e-3(N0.005e-3) m" is also valid; it is the concise literal form for PPD<PQ> combined with the generic literal form for PPD<REAL>.

4.3.4

Probability Distribution over Time Points (PPD_TS)

Definition:     

The parametric probability distribution over time points is fully defined by the generic data type.

Definition 224:
type ParametricProbabilityDistribution<TS> alias PPD<TS>;
      

The standard deviation is of type TS.diff, which is a duration (a physical quantity in the dimension of time.)

4.3.4.1

Converting a point in time (TS) to an uncertain point in time

When converting a TS into a PPD<TS>, the standard deviation is calculated from the TS value's order of magnitude and precision (number of significant digits) such that two standard deviations span the maximal time range of the digits not specified. For example, in 20000609 the unspecified digits are hour of the day and lower. All these digits together span a duration of 24 hours, from 20000609000000.0000... up to 20000609999999.9999... (= 20000610), and thus the standard deviation σ is σ = 12 h.

This rule is different from real numbers in that the range of uncertainty lies above the time value specified. This is to go with the common sense judgment that June 9th spans all day of June 9th with noon as the center, not midnight.
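
The conversion rule can be sketched in Python as follows. This is a minimal illustration, assuming TS literals given to day, hour, minute or second precision only; the table `SPAN` and the function `ts_uncertainty` are hypothetical names.

```python
from datetime import datetime, timedelta

# Number of digits given -> (parse format, time range spanned by the unspecified digits)
SPAN = {
    8: ("%Y%m%d", timedelta(days=1)),
    10: ("%Y%m%d%H", timedelta(hours=1)),
    12: ("%Y%m%d%H%M", timedelta(minutes=1)),
    14: ("%Y%m%d%H%M%S", timedelta(seconds=1)),
}

def ts_uncertainty(literal):
    """Return (range start, range end, standard deviation) for a TS literal."""
    fmt, span = SPAN[len(literal)]
    low = datetime.strptime(literal, fmt)
    # Two standard deviations span the whole range, which lies above the value.
    return low, low + span, span / 2
```

For "20000609" this yields the range from June 9 to June 10, 2000 and a standard deviation of 12 hours, so the center of the range (start plus one standard deviation) is noon.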

5

Timing Specification

Figure 15: Overview of Timing Specification Data Types

The timing specification suite of data types is used to specify the complex timing of events and actions such as those that occur in order management and scheduling systems. It also supports the cyclical validity patterns that may exist for certain kinds of information, such as phone numbers (evening, daytime), addresses (so called "snowbirds," residing in the south during winter and north during summer) and office hours.

The timing specification data types include point in time (TS) and the interval of time (IVL<TS>) and add types that are specifically suited to repeated schedules. These additional types include PIVL, EIVL, and finally the GTS type itself. All these timing types describe the time distribution of repeating states or events.

5.1

Periodic Interval of Time (PIVL)

Definition:      An interval of time that recurs periodically. Periodic intervals have two properties, phase and period. The phase specifies the "interval prototype" that is repeated every period.

Table 45: Property Summary of Periodic Interval of Time
Name Type Description
phase IVL<T> A prototype of the repeating interval that specifies the duration of each occurrence and anchors the periodic interval sequence at a certain point in time.
period T.diff A time duration specified as a reciprocal measure of the frequency at which the periodic interval repeats.
alignment CS Specifies if and how the repetitions are aligned to the cycles of the underlying calendar (e.g., to distinguish every 30 days from "the 5th of every month".) A non-aligned periodic interval recurs independently from the calendar. An aligned periodic interval is synchronized with the calendar.
institutionSpecified BL Indicates whether the exact timing is up to the party executing the schedule (e.g., to distinguish "every 8 hours" from "3 times a day".)

For example, "every eight hours for two minutes" is a periodic interval where the interval's width equals 2 minutes and the period at which the interval recurs equals 8 hours.

The phase also marks the anchor point in time for the entire series of periodically recurring intervals. The recurrence of a periodic interval has no beginning or ending, but is infinite in both future and past.

Definition 225:
template<TS T>
protected type PeriodicInterval<T> alias PIVL<T> extends SET<T> {
            T.diff  period;
            IVL<T>  phase;
            CS      alignment;
            BL      institutionSpecifiedTime;
            BL      contains(TS);
  literal   ST;
};
    

A periodic interval is fully specified when both the period and the phase are fully specified. The interval may be only partially specified where either only the width or only one boundary is specified.

For example: "every eight hours for two minutes" specifies only the period and the phase's width but no boundary of the phase. Conversely, "every eight hours starting at 4 o'clock" specifies only the period and the phase's low boundary but not the phase's high boundary. "Every eight hours for two minutes starting at 4 o'clock" is fully specified since the period, and both the phase's low boundary and width are specified (low boundary and width implies the high boundary.)

The periodic interval of time is a generic type extension whose type parameter T is restricted to the point in time (TS) data type and its extensions. The parametric probability distribution of point in time (PPD<TS>) is an extension of point in time and therefore can be used to form periodic intervals of probability distributions of point in time (PIVL<PPD<TS>>) values (uncertain periodic interval.)

Oftentimes repeating schedules are only approximately specified. For instance, "three times a day for ten minutes each" does not usually mean a period of precisely 8 hours and often does not mean intervals of exactly 10 minutes. Rather, the distance between occurrences may vary between as little as 3 and as much as 12 hours, and the width of the interval may be less than 5 minutes or more than 15 minutes. An uncertain periodic interval can be used to indicate how much leeway is allowed or how "timing-critical" the specification is.

5.1.1

Properties of Periodic Interval of Time (PIVL)

5.1.1.1
Phase (phase : IVL<T>)

Definition:      A prototype of the repeating interval that specifies the duration of each occurrence and anchors the periodic interval sequence at a certain point in time.

The phase also marks the anchor point in time for the entire series of periodically recurring intervals. The recurrence of a periodic interval has no beginning or end but is infinite in both future and past. A phase must be specified for every non-NULL periodic interval. The width of the phase must be less than or equal to the period.

Definition 226:
invariant (PIVL<T> x) where x.nonNull {
  x.phase.nonNull;
  x.phase.width.lessOrEqual(x.period);
};
        
5.1.1.2
Period (period : T.diff)

Definition:      A time duration specified as a reciprocal measure of the frequency at which the periodic interval repeats.

The period is a physical quantity in the dimension of time (TS.diff.) For an uncertain periodic interval the period is a probability distribution over elapsed time.

Definition 227:
 
invariant(PIVL<T> x) where x.nonNull { 
  x.period.nonNull; 
};
        
5.1.1.3
Alignment to the Calendar (alignment : CS)

Definition:      Specifies if and how the repetitions are aligned to the cycles of the underlying calendar (e.g., to distinguish every 30 days from "the 5th of every month".) A non-aligned periodic interval recurs independently from the calendar. An aligned periodic interval is synchronized with the calendar.

The calendar alignment specifies a calendar cycle to which the periodic interval is aligned. The even flow of time will then be partitioned by the calendar cycle. The partitioning is called the calendar "grid" generated by the aligned-to calendar cycle. The boundaries of each occurrence interval will then have equal distance from the earliest point in each partition. In other words, the distance from the next lower grid-line to the beginning of the interval is constant.

For example, "every 5th of the month" is a calendar aligned periodic interval. The period spans 28 to 31 days depending on the calendar month. Conversely, "every 30 days" is an independent period that will fall on a different date each month.

For example, with “every 5th of the month” the alignment calendar cycle would be month of the year (MY.) The even flow of time is partitioned in months of the year. The distance between the beginning of each month and the beginning of its occurrence interval is 4 days (4 days because day of month (DM) starts counting with 1.) Thus, as months differ in their number of days, the distances between the recurring intervals will vary slightly, so that the interval occurs always on the 5th.
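
The difference between an aligned and a non-aligned recurrence can be made concrete with a small Python sketch. The two helper functions are hypothetical and only generate occurrence dates; they do not model the PIVL type itself, and `nth_of_month` assumes a day number that exists in every month.

```python
from datetime import date, timedelta

def every_n_days(start, n, count):
    """Non-aligned recurrence: a constant period of n days."""
    return [start + timedelta(days=n * i) for i in range(count)]

def nth_of_month(year, month, day, count):
    """Recurrence aligned to the month: constant distance (day - 1) from each month start."""
    occurrences = []
    for _ in range(count):
        occurrences.append(date(year, month, day))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return occurrences
```

Starting on January 5, 2000, `every_n_days(..., 30, 3)` drifts to February 4, whereas `nth_of_month(2000, 1, 5, 3)` stays on the 5th while the distance between occurrences varies (31 days, then 29 days).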

5.1.1.4
Institution Specified Timing (institutionSpecified : BL)

Definition:      Indicates whether the exact timing is up to the party executing the schedule (e.g., to distinguish "every 8 hours" from "3 times a day".)

For example, with a schedule "three times a day" the average time between repetitions is 8 hours. However, when the institution specified time indicator is true, the timing could follow some rule made by the executing person or organization ("institution"), e.g., that three-times-a-day schedules are executed at 7 am, noon, and 7 pm.

5.1.1.5
Literal Form

Generic Literal Form. The generic literal form for periodic intervals of time is as follows:

⟨phase : IVL<T>⟩ / ⟨period : T.diff⟩ [ @ ⟨alignment⟩ ] [ IST ].

Definition 228:
PIVL<T>.literal ST {
  PIVL<T>
  : S2                        { $.equals($1); }
  | S2 "IST"                  { $.phase.equals($1.phase);
                                $.period.equals($1.period);
                                $.institutionSpecified.equals(true); };
  PIVL<T> S2
  : S1                        { $.equals($1); }
  | S1 "@" "(" ST ")"         { $.phase.equals($1.phase);
                                $.period.equals($1.period);
                                $.alignment.equals($4); };
  PIVL<T> S1
  : IVL<T> "/" "(" T.diff ")" { $.phase.equals($1);
                                $.period.equals($4); }
  |        "/" "(" T.diff ")" { $.period.equals($3); };
};
      

For example, "[200004181100;200004181110]/(7 d)@DW" specifies every Tuesday from 11:00 to 11:10 AM. Conversely, "[200004181100;200004181110]/(1 mo)@DM" specifies every 18th of the month 11:00 to 11:10 AM.

See Table 37 for calendar-period codes defined for the Gregorian calendar. There are 1-character and 2-character symbols. The 2-character symbols are preferred for the alignment period identifier.

Calendar Pattern Form. This form is used to specify calendar-aligned timing more intuitively using "calendar patterns." The calendar pattern syntax is (semi-formally) defined as follows:

⟨anchor⟩ [ ⟨calendar digits⟩ [ .. ⟨calendar digits⟩ ]] / ⟨number : INT⟩ [ IST ]

A calendar pattern is a calendar date where the higher significant digits (e.g., year and month) are omitted. In order to interpret the digits, a period identifier is prefixed that identifies the calendar period of the left-most digits. This calendar period identifier anchors the calendar digits following to the right.

See Table 37 for calendar-period codes defined for the Gregorian calendar. There are 1-character and 2-character symbols. The 1-character symbols are preferred for the calendar pattern anchor.

For example: "M0219" is February 19 the entire day every year. This periodic interval has the February 19 of any year as its phase (e.g., "[19690219;19690220[" ), a period of one year, and alignment month of the year (M). The alignment calendar-cycle is the same as the anchor (e.g., in this example, month of the year.)

The calendar digits may also omit digits on the right. When digits are omitted on the right, this means the interval from lowest to highest for these digits. For example, "M0219" is February 19 the entire day; "M021918" is February 19, the entire hour between 6 and 7 PM.

In absence of a formal definition for this, the rules for parsing a calendar pattern are as follows (example is "M021918..21")

  1. Read the anchoring period identifier (e.g. "M")


  2. The PIVL's alignment is equal to this calendar period (e.g. month of the year)


  3. Use the current point in time and format a literal exact to the next higher significant calendar period from the anchoring calendar period (e.g. year, "2000", constructing "2000021918"); this is the "stem literal"


  4. Read this constructed literal (e.g., "2000021918") into a TS value and convert that value to an interval according to IVL<TS>.promotion (e.g., "[2000021918;2000021919["); this is the "low interval."


  5. If the hull-operator token ".." follows, read the following calendar digits (e.g., "21")


  6. Right-align the stem literal and the calendar digits just read
          "2000021918"
          "        21"
    


  7. and copy all digits from the stem literal that are missing to the left of the calendar digits just read (e.g., yields "2000021921".)


  8. Read this constructed literal (e.g., "2000021921") into a TS value and convert that value to an IVL<TS> according to IVL<TS>.promotion (e.g., "[2000021921;2000021922["); this is the "high interval."


  9. The phase interval is the convex hull of the low interval and the high interval (e.g., "[2000021918;2000021922[").


  10. If the hull-operator was not present, the phase is simply the low interval.
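
The parsing steps above can be sketched in Python. This is a simplified illustration, not a normative parser: it handles only the month-of-the-year anchor "M" with digits down to day, hour or minute, takes the year as an argument instead of reading the current time, and the names `promote` and `parse_month_pattern` are hypothetical.

```python
from datetime import datetime, timedelta

# Literal length -> (parse format, width of the interval produced by promotion)
PRECISION = {
    8: ("%Y%m%d", timedelta(days=1)),
    10: ("%Y%m%d%H", timedelta(hours=1)),
    12: ("%Y%m%d%H%M", timedelta(minutes=1)),
}

def promote(literal):
    """Promote a TS literal to the half-open interval [low; high[ of its precision."""
    fmt, width = PRECISION[len(literal)]
    low = datetime.strptime(literal, fmt)
    return low, low + width

def parse_month_pattern(pattern, year):
    """Return the phase (low, high) of a pattern such as "M021918..21"."""
    if not pattern.startswith("M"):                # steps 1 and 2: anchor and alignment
        raise ValueError("only the anchor M is sketched here")
    first, _, last = pattern[1:].partition("..")
    stem = f"{year:04d}{first}"                    # step 3: stem literal
    low_iv = promote(stem)                         # step 4: low interval
    if not last:
        return low_iv                              # step 10: no hull operator
    high_literal = stem[: len(stem) - len(last)] + last   # steps 5 to 7
    high_iv = promote(high_literal)                # step 8: high interval
    return min(low_iv[0], high_iv[0]), max(low_iv[1], high_iv[1])  # step 9: hull
```

Under these assumptions, `parse_month_pattern("M021918..21", 2000)` returns the phase from February 19, 2000 18:00 up to (but excluding) 22:00.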


Interleave. A calendar pattern followed by a slash and an integer number n indicates that the given calendar pattern is to apply every nth time.

For example: "D19/2" is the 19th of every second month.

A calendar pattern expression is evaluated at the time the pattern is first enacted. At this time, the calendar digits missing from the left are completed using the earliest date matching the pattern (and following a preceding pattern in a combination of time sets).

For example: "D19/2" is the 19th of every second month. If this expression is evaluated on March 14, 2000 the phase is completed to: "[20000319;20000320[/(2 mo)@DM" and thus the two-months cycle begins with March 19, followed by May 19, etc. If the expression were evaluated by March 20, the cycle would begin at April 19, followed by June 19, etc.
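
The completion rule in this example can be sketched in Python. The functions below are hypothetical helpers that cover only day-of-month patterns with an interleave, and they assume a day number that exists in every month.

```python
from datetime import date

def first_occurrence(day_of_month, evaluated_on):
    """Earliest date on or after the evaluation date that falls on the given day."""
    year, month = evaluated_on.year, evaluated_on.month
    if evaluated_on.day > day_of_month:            # this month's date has already passed
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return date(year, month, day_of_month)

def interleaved_days(day_of_month, every_n_months, evaluated_on, count):
    """First `count` occurrences of a pattern such as "D19/2"."""
    first = first_occurrence(day_of_month, evaluated_on)
    occurrences = []
    for i in range(count):
        months = first.month - 1 + i * every_n_months
        occurrences.append(date(first.year + months // 12, months % 12 + 1, day_of_month))
    return occurrences
```

Evaluating `interleaved_days(19, 2, date(2000, 3, 14), 3)` gives March 19, May 19 and July 19, while an evaluation date of March 20 shifts the cycle to April 19, June 19 and August 19.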

If no calendar digits follow after the calendar period identifier, the pattern matches any date. The integer number following the slash indicates the length of the cycle. The phase interval in these cases has only the width specified to be the duration of the anchoring calendar-cycle (e.g., in this example 1 day.)

For example: "CD/2" is every other day, "H/8" is every 8th hour, for the duration of one hour.

Institution Specified Time. Both a generic periodic interval literal and a calendar pattern may be followed by the three letters "IST" to indicate that within the larger calendar cycle (e.g., for "hour of the day" the larger calendar cycle is "day") the repeating events are to be appointed at institution specified times. This is used to specify such schedules as "three times a day" where the periods between two subsequent events may well vary between 4 hours (between breakfast and lunch) and 10 hours (overnight.)

5.1.2

Periodic Intervals as Sets

The essential property of a set is that it contains elements. For non-aligned periodic intervals, the contains-property is defined as follows. A point in time t is contained in the periodic interval of time if and only if there is an integer i for which t plus the period times i is an element of the phase interval.

Definition 229:
invariant (PIVL<TS> x, TS t) where x.nonNull.and(x.alignment.isNull) {
  x.contains(t).equals(exists(INT i) {
       x.phase.contains(t.plus(x.period.times(i)));
     });
};
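
For a non-aligned periodic interval with a closed phase, the existence of such an integer i can be tested with modular arithmetic, because the phase width never exceeds the period. The following Python sketch illustrates this; `pivl_contains` is a hypothetical name and the phase is passed as its two boundaries.

```python
from datetime import datetime, timedelta

def pivl_contains(phase_low, phase_high, period, t):
    """True if t plus some integer multiple of the period lies in [phase_low; phase_high]."""
    # The remainder is the non-negative distance from t back to the nearest
    # earlier repetition of the phase's low boundary.
    offset = (t - phase_low) % period
    return offset <= phase_high - phase_low
```

For the phase "[200004181100;200004181110]" with a period of 7 days, both April 25, 2000 11:05 and April 11, 2000 11:05 are contained, since the recurrence extends into the past as well as the future.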
      

For calendar-aligned periodic intervals the contains property is defined using the calendar-cycle's sum(t, n) property that adds n such calendar cycles to the time t.

Definition 230:
invariant (PIVL<TS> x, TS t, CalendarCycle c)
    where x.nonNull.and(c.equals(x.alignment)) {
  x.contains(t).equals(exists(INT i) {
        x.phase.contains(c.sum(t, i));
     });
};
      

5.2

Event-Related Periodic Interval of Time (EIVL)

Definition:      Specifies a periodic interval of time where the recurrence is based on activities of daily living or other important events that are time-related but not fully determined by time.

For example, "one hour after breakfast" specifies the beginning of the interval at one hour after breakfast is finished. Breakfast is assumed to occur before lunch but is not determined to occur at any specific time.

Definition 231:
template<TS T>
protected type EventRelatedPeriodicInterval<T> alias EIVL<T> extends SET<T>{
            CV          event;
            IVL<T.diff> offset;
            IVL<T>      occurrenceAt(TS eventTime);
            BL          contains(TS);
  literal   ST;
};
    

5.2.1

Properties of Event-Related Periodic Interval of Time (EIVL)

5.2.1.1
Event (event : CE)
A code for a common (periodical) activity of daily living based on which the event related periodic interval is specified.

Such events qualify for being adopted in the domain of this attribute for which all of the following is true:

Table 46: Domain TimingEvent:
code name definition
AC before meal (from lat. ante cibus)
ACD before lunch (from lat. ante cibus diurnus)
ACM before breakfast (from lat. ante cibus matutinus)
ACV before dinner (from lat. ante cibus vespertinus)
HS the hour of sleep (e.g., H18-22)
IC between meals (from lat. inter cibus)
ICD between lunch and dinner
ICM between breakfast and lunch
ICV between dinner and the hour of sleep
PC after meal (from lat. post cibus)
PCD after lunch (from lat. post cibus diurnus)
PCM after breakfast (from lat. post cibus matutinus)
PCV after dinner (from lat. post cibus vespertinus)
5.2.1.2
Offset (offset : IVL<T.diff>)

Definition:      An interval of elapsed time (duration, not absolute point in time) that marks the offsets for the beginning, width and end of the event-related periodic interval measured from the time each such event actually occurred.

For example: if the specification is "one hour before breakfast for 10 minutes" the offset's low boundary is −1 h and the offset's width is 10 min (consequently the offset's high boundary is −50 min.)

5.2.1.3
Literal Form

The literal form for an event related interval begins with the event code followed by an optional interval of the time-difference.

Definition 232:
EIVL<TS>.literal ST {
  EIVL<TS> : event    { $.event.equals($1); }
  | event offset      { $.event.equals($1); $.offset.equals($2); };
  CV event : ST       { $.code.equals($1);
                        $.codeSystem.equals(2.16.840.1.113883.5.1019); }
  IVL<TS.diff> offset
  : "+" IVL<TS.diff>  { $.equals($2); }
  | "-" IVL<TS.diff>  { $.low.equals($2.high.negate);
                        $.high.equals($2.low.negate);
                        $.width.equals($2.width);
                        $.lowClosed.equals($2.highClosed);
                        $.highClosed.equals($2.lowClosed); };
};
      

For example, one hour after meal would be "PC+[1h;1h]". One hour before bedtime for 10 minutes: "HS-[50min;1h]".
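
The effect of the minus sign in the literal is that the interval is mirrored around zero: the boundaries swap and change sign, while the width stays the same. A minimal Python sketch, with the hypothetical function `negate_interval` operating on plain boundary values:

```python
from datetime import timedelta

def negate_interval(low, high, low_closed=True, high_closed=True):
    """Mirror an interval of durations around zero, as done for a "-" offset."""
    return -high, -low, high_closed, low_closed
```

For "HS-[50min;1h]" the interval [50 min; 1 h] becomes an offset with low boundary −1 h and high boundary −50 min, which is still 10 minutes wide.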

5.2.2

Resolving the Event-Relatedness

An event-related periodic interval of time is a set of time, that is, one can test whether a particular time or time interval is an element of the set. Whether an event-related periodic interval of time contains a given interval of time is decided using a relation event × time referred to as EVENT(event, time). The property occurrenceAt(t) is the occurrence interval that would exist if the event occurred at time t.

Definition 233:
invariant(EIVL<T> x, T eventTime, IVL<T> v)
     where v.equals(x.occurrenceAt(eventTime)) {
  v.low.equals(eventTime.plus(x.offset.low));
  v.high.equals(eventTime.plus(x.offset.high));
  v.lowClosed.equals(x.offset.lowClosed);
  v.highClosed.equals(x.offset.highClosed);
};
      

Thus, an event related interval of time contains a point in time t if there is an event time e with an occurrence interval v such that v contains t.

Definition 234:
invariant(EIVL<T> x, T y) {
  x.contains(y).equals(exists(T e, IVL<T> v)
                           where EVENT(x.event, e)
                            .and(v.equals(x.occurrenceAt(e))) {
                         v.contains(y);
                       });
};
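
Resolving the event-relatedness can be illustrated with a short Python sketch. It assumes, purely for illustration, that the EVENT relation is available as a list of known event times and that the offset interval is closed on both sides; the function names are hypothetical.

```python
from datetime import datetime, timedelta

def occurrence_at(event_time, offset_low, offset_high):
    """Occurrence interval that would exist if the event occurred at event_time."""
    return event_time + offset_low, event_time + offset_high

def eivl_contains(event_times, offset_low, offset_high, t):
    """True if t lies in the occurrence interval of at least one known event time."""
    for event_time in event_times:
        low, high = occurrence_at(event_time, offset_low, offset_high)
        if low <= t <= high:
            return True
    return False
```

With an offset from −1 h to −50 min and a breakfast at 7:30, the occurrence interval runs from 6:30 to 6:40, so 6:35 is contained and 7:00 is not.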
      

5.3

General Timing Specification (GTS)

Definition:      A set of points in time, specifying the timing of events and actions and the cyclical validity-patterns that may exist for certain kinds of information, such as phone numbers (evening, daytime), addresses (so called "snowbirds," residing in the south during winter and north during summer) and office hours.

The GTS data type has the following aspects:

In all cases the GTS is defined as a set of point in time (SET<TS>). Using the set operations, union, intersection and difference, more complex sets of time can be constructed from simpler ones. Ultimately the building blocks from which all GTS values are constructed are interval, periodic interval, and event-related periodic interval. The construction of the GTS can be specified in the literal form. No special data type structure is defined that would generate a combination of simpler time-sets from a given GTS value. While any implementation would have to contain such a structured representation, it is not needed in order to exchange GTS values given the literal form.53

Definition 235:
type GeneralTimingSpecification alias GTS extends SET<TS> {
            IVL<TS>   hull;
            IVL<TS>   nextTo(TS x);
            IVL<TS>   nextAfter(TS x);
            GTS       periodicHull(GTS x);
            BL        interleaves(GTS x);
  demotion  LIST<IVL<TS>>;
  literal   ST;
};
    

The GTS data type is defined as using intervals, periodic intervals, and event-related periodic intervals. Intervals of time have been defined above.

5.3.1

Convex Hull

A convex hull is the least interval that is a superset of all occurrence intervals. As noted in Section 3.1.2, all totally ordered sets have a convex hull. Because a GTS is a SET<TS> and because a SET<TS> is a totally ordered set, all GTS values have a convex hull.

The convex hull of a GTS can less formally be called "outer bound interval". Thus, the convex hull of a GTS describes the absolute beginning and end of the repeating schedule. For infinite repetitions (e.g., a simple periodic interval) the convex hull has infinite bounds.

Figure 16: Convex Hull of a Schedule

5.3.2

GTS as a Sequence of Occurrence Intervals

A GTS value is a generator of a sequence of time intervals during which an event or activity occurs, or during which a state is effective.

The nextTo-property maps to every point in time t the greatest continuous subset (an "occurrence interval") v of the GTS value S, where v is the interval closest to t that begins later than t or that contains t.

Definition 236:
invariant(GTS S, TS t, IVL<TS> v) {
  v.equals(S.nextTo(t)).equals(
         S.contains(v)
    .and(forall(IVL<TS> u) where S.contains(u) {
           u.contains(v).implies(u.equals(v)); })
    .and(    v.contains(t)
         .or(forall(TS i) where t.lessOrEqual(i)
                           .and(i.lessThan(v.low)) {
               S.contains(i).not; })));
};
      

The nextAfter-property maps to every point in time t the greatest continuous subset (an "occurrence interval") v of the GTS value S, where v is the interval closest to t that begins later than t.

Definition 237:
invariant(GTS S, TS t) where {
  S.contains(t).not
     .implies(S.nextAfter(t).equals(S.nextTo(t)));
  S.contains(t)
     .implies(S.nextAfter(t).equals(S.except(S.nextTo(t)).nextTo(t)));
};
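
For a schedule with finitely many occurrence intervals, the two properties can be sketched in Python. The sketch assumes the GTS value is given as a sorted list of disjoint closed intervals (low, high); this list representation and the function names are assumptions made for illustration only.

```python
def next_to(intervals, t):
    """First occurrence interval that contains t or begins later than t, else None."""
    for low, high in intervals:
        if high >= t:
            return (low, high)
    return None

def next_after(intervals, t):
    """First occurrence interval that begins later than t, else None."""
    for low, high in intervals:
        if low > t:
            return (low, high)
    return None
```

When t lies outside every occurrence interval, both functions return the same interval; when t lies inside one, `next_to` returns that interval and `next_after` returns the following one.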
      

A GTS value can be converted into a generic sequence (LIST<IVL<TS>>) of occurrence intervals.

Definition 238:
invariant(GTS x) where x.isEmpty { ((LIST<IVL<TS>>)x).isEmpty; };

invariant(GTS x, IVL<TS> first)
    where x.nonEmpty
     .and(x.hull.low.nonNull)
     .and(first.equals(x.nextTo(x.hull.low))))
{
  ((LIST<IVL<TS>>)x).head.equals(first);
  ((LIST<IVL<TS>>)x).tail.equals((LIST<IVL<TS>>)x.except(first));
};
      

5.3.3

Interleaving Schedules and Periodic Hull

Figure 17: Interleaving Schedules and Periodic Hull

For two GTS values A and B we say that A interleaves B if their occurrence intervals interleave on the time line. This concept is visualized in Figure 17.

For the GTS values A and B to interleave, it must be possible to arrange the occurrence intervals of both in pairs of corresponding occurrence intervals. It must further hold that for all corresponding occurrence intervals a ∈ A and b ∈ B, a starts before b starts (or at the same time) and b ends after a ends (or at the same time).

The interleaves-relation holds when two schedules have the same average frequency, and when the second schedule never "outpaces" the first schedule. That is, no occurrence interval in the second schedule may start before its corresponding occurrence interval in the first schedule.

With two interleaving GTS values one can derive a periodic hull such that each occurrence interval of the periodic hull is the convex hull of the corresponding occurrence intervals.

The periodic hull is important for constructing a schedule by combining two GTS expressions. For example, to construct the periodic interval from Memorial Day to Labor Day every year, one first needs to set up the schedules M for Memorial Day (the last Monday in May) and L for Labor Day (the first Monday in September) and then combine these two schedules using the periodic hull of M and L.

Definition 239:
invariant(GTS A, B) where A.nonNull.and(B.nonNull) {
  A.interleaves(B).equals(
    forall(IVL<TS> a, b, c; TS t)
        where a.equals(A.nextTo(t))
         .and(b.equals(B.nextTo(a.low)))
         .and(c.equals(A.nextTo(b.high))) {
      b.equals(B.nextTo(a.high));
      a.low.lessOrEqual(b.low);
      c.equals(A.nextTo(b.high));
      c.equals(a).or(c.equals(A.nextAfter(a.high)));
    });
};
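For finite schedules, the pairwise conditions stated in the prose above can be checked with a short Python sketch. It is not a full rendering of Definition 239: it assumes that both schedules are given as lists sorted by start time, that the i-th intervals correspond to each other, and that integers stand in for TS values. The function name interleaves is borrowed from the property; everything else is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (low, high); integers stand in for TS values


def interleaves(a_occ: List[Interval], b_occ: List[Interval]) -> bool:
    """Check the pairwise start/end conditions for two finite schedules.

    Each a must start no later than its corresponding b, and each b must
    end no earlier than its corresponding a.
    """
    if len(a_occ) != len(b_occ):
        return False  # no one-to-one pairing of occurrence intervals
    return all(a[0] <= b[0] and a[1] <= b[1] for a, b in zip(a_occ, b_occ))


# Hypothetical schedules with a period of 10 time units.
A = [(0, 1), (10, 11), (20, 21)]
B = [(3, 4), (13, 14), (23, 24)]
```

With these values, interleaves(A, B) holds while interleaves(B, A) does not, because B's intervals start after A's.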
      

For two GTS values A and B where A interleaves B, a periodic hull is defined as the pairwise convex hull of the corresponding occurrence intervals of A and B.

Definition 240:
invariant(GTS A, B, C) where A.interleaves(B) {
  A.periodicHull(B).equals(C).equals(
    forall(IVL<TS> a, b, c; TS t)
        where a.equals(A.nextTo(t))
         .and(b.equals(B.nextTo(a.low))) {
      C.contains(c).equals(c.equals(a.hull(b)));
    });
};
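A minimal Python sketch of the periodic hull for finite schedules follows. It assumes sorted lists of (low, high) pairs with integer time points and treats the i-th intervals as corresponding; these are assumptions of the example, not part of the definition. The precondition check is a simplified stand-in for A.interleaves(B).

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (low, high); integers stand in for TS values


def periodic_hull(a_occ: List[Interval], b_occ: List[Interval]) -> List[Interval]:
    """Pairwise convex hull of corresponding occurrence intervals.

    Because a starts no later than b and b ends no earlier than a, the hull
    of each pair runs from a's low boundary to b's high boundary.
    """
    if len(a_occ) != len(b_occ) or any(
        a[0] > b[0] or a[1] > b[1] for a, b in zip(a_occ, b_occ)
    ):
        raise ValueError("periodic hull requires that A interleaves B")
    return [(a[0], b[1]) for a, b in zip(a_occ, b_occ)]


# Hypothetical schedules: A sets the start points, B sets the end points.
A = [(0, 1), (10, 11)]
B = [(3, 4), (13, 14)]
hull = periodic_hull(A, B)
```

Swapping the operands raises an error in this sketch, since B does not interleave A; the first operand supplies the start points and the second the end points.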
      

The interleaves-relation is reflexive, but neither symmetric nor transitive. The periodic hull operation is non-commutative and non-associative. [54]

5.3.4

Literal Form

The GTS literal allows specifying combinations of intervals, periodic intervals, and event related periodic intervals of time using the set operations union and intersection. This literal form is specified based on the simpler time set data types interval, periodic interval, and event related periodic interval. [55]

Unions are specified by a semicolon-separated list. Intersections are specified by a white space separated list. Intersection has higher priority than union. Exclusions (set differences) can be specified using a backslash; exclusions have an intermediate priority, i.e., weaker than intersection but stronger than union.

Table 47: Domain SetOperator
code | name | definition
A | intersect | Form the intersection with the value.
H | convex hull | Form the convex hull with the value. The convex hull is defined over ordered domains and is the smallest contiguous superset (interval) of all the operand sets.
P | periodic hull | Form the periodic hull with the value. The periodic hull is defined over ordered domains and is the periodic set that contains all contiguous supersets of pairs of intervals generated by the operand periodic intervals.

Parentheses can also be used to override operator precedence when necessary.

Definition 241:
GTS.literal ST {
  GTS : symbol                { $.equals($1); }
  | union                     { $.equals($1); }
  | exclusion                 { $.equals($1); };
  SET<TS> union
  : intersection ";" union    { $.equals($1.union($3)); }
  | intersection              { $.equals($1); };
  SET<TS> exclusion
  : exclusion "\" intersection { $.equals($1.except($3)); };
  SET<TS> intersection
  : factor intersection       { $.equals($1.intersection($2)); }
  | factor                    { $.equals($1); };
  SET<TS> hull
  : factor ".." hull          { $.equals($1.periodicHull($3)); }
  | factor                    { $.equals($1); };
  SET<TS> factor
  : IVL<TS>                   { $.equals($1); }
  | PIVL<TS>                  { $.equals($1); }
  | EIVL<TS>                  { $.equals($1); }
  | "(" GTS ")"               { $.equals($2); };
};
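The operator precedence described in the prose (intersection binds strongest, then exclusion, then union, with parentheses overriding) can be illustrated with a small recursive-descent evaluator in Python. This is a sketch and not an implementation of the GTS literal grammar: operands are hypothetical names bound to plain Python sets of integers instead of IVL, PIVL, or EIVL literals, the ".." hull operator is omitted, and no error handling is attempted.

```python
import re
from typing import Dict, List, Set

# Tokens: operand names, ";" (union), "\" (exclusion), and parentheses.
TOKEN = re.compile(r"[A-Za-z0-9_]+|[;\\()]")


def evaluate(text: str, env: Dict[str, Set[int]]) -> Set[int]:
    """Evaluate a set expression using the precedence rules of the prose."""
    tokens: List[str] = TOKEN.findall(text)
    pos = 0

    def peek() -> str:
        return tokens[pos] if pos < len(tokens) else ""

    def take() -> str:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor() -> Set[int]:
        tok = take()
        if tok == "(":
            value = union()
            take()  # consume the closing ")"
            return value
        return env[tok]

    def intersection() -> Set[int]:
        # White space separated factors are intersected (strongest binding).
        value = factor()
        while peek() not in ("", ";", "\\", ")"):
            value = value & factor()
        return value

    def exclusion() -> Set[int]:
        # Backslash forms the set difference (intermediate binding).
        value = intersection()
        while peek() == "\\":
            take()
            value = value - intersection()
        return value

    def union() -> Set[int]:
        # Semicolon forms the union (weakest binding).
        value = exclusion()
        while peek() == ";":
            take()
            value = value | exclusion()
        return value

    return union()


# Hypothetical operand sets used only for this illustration.
env = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4}, "D": {9}}
```

For instance, evaluate("A B \\ C ; D", env) is read as ((A intersect B) except C) union D, whereas evaluate("A (B ; D)", env) intersects A with the parenthesized union.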
      

The following table contains paradigmatic examples for complex GTS literals. For simpler examples, refer to the literal forms for interval, periodic interval, and event related interval.

5.3.4.1
Symbolic Abbreviations for GTS expressions.

The following Table 46 defines symbolic abbreviations for GTS values that can be used in GTS literals instead of their equivalent GTS term. Abbreviations are defined for common periods of the day (AM, PM), for periods of the week (business day, weekend), and for holidays. The computation of the dates of some holidays, namely the Easter holiday, involves some sophistication that goes beyond what one would represent in a GTS literal term. It is assumed that the dates of these holidays are drawn from some table or some generator module that is outside the scope of this specification.

These abbreviations are named GTS values and they can in turn be a factor of a GTS term. For example, one can say "JHCHRXME H08..12" to indicate that the office hours on Christmas Eve are from 8 AM to 1 PM only. And one can say "JHNUSMEM..JHNUSLBR" for the typical midwestern swimming pool season from Memorial Day to Labor Day.
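The reading of "JHCHRXME H08..12" as an intersection can be sketched in Python as follows. The sketch assumes that JHCHRXME denotes the whole day of 24 December and that H08..12 spans from the start of hour 8 to the end of hour 12, which is consistent with the "8 AM to 1 PM" reading given above. The function names and the half-open (start, end) representation are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

Span = Tuple[datetime, datetime]  # half-open interval [start, end)


def christmas_eve(year: int) -> Span:
    """Assumed meaning of JHCHRXME: the whole day of 24 December."""
    start = datetime(year, 12, 24)
    return (start, start + timedelta(days=1))


def hours_of_day(day_start: datetime, first_hour: int, last_hour: int) -> Span:
    """Hull of the hour-long intervals first_hour..last_hour on one day."""
    return (
        day_start + timedelta(hours=first_hour),
        day_start + timedelta(hours=last_hour + 1),
    )


def intersect(a: Span, b: Span) -> Optional[Span]:
    """Intersection of two spans, or None if they do not overlap."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low < high else None


day = christmas_eve(2024)
office_hours = intersect(day, hours_of_day(day[0], 8, 12))
```

Because hour 12 is itself an hour-long interval, the hull ends at 13:00, which is why the literal "H08..12" corresponds to office hours ending at 1 PM.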

NOTE: this table is not complete. It includes neither religious holidays other than Christian ones (of the Gregorian (western) tradition) nor national holidays of other countries. This is a limitation to be remedied by subsequent additions.
NOTE: holidays are locale-specific. Exactly which religious holidays are subsumed under JH depends on the locale and other traditions. For global interoperability, using constructed GTS expressions is safer than named holidays. However, some holidays that depend on moon phases (e.g., Easter) or ad-hoc decree cannot be easily expressed in a GTS form.

Endnotes

  1. [source] The HL7 Message Development Framework defines "update modes" for fields in a message. Note that because data values have neither identity nor state nor changing of state, these update modes do not apply for the properties of data values. Data values and their properties are never updated. A field of an object (e.g., a message) can be updated in which case the field's value is replaced by another value. But the value itself is never updated.
  2. [source] This is the reason why the ISO Abstract Syntax Notation 1 (ASN.1) is not an appropriate formalism for semantic data type specifications.
  3. [source] The data type definition language employed here is a conclusion of experiments and experience with various alternatives. These alternatives include data type definition tables and the use of the Object Management Group's (OMG) Interface Definition Language (IDL). The disadvantage of the data type definition tables was that they gave the wrong impression of this specification being a specification of abstract syntax rather than semantics. Conversely, the disadvantage with IDL was that IDL gave the wrong impression of this specification being an application programming interface (API) definition.

    The resulting data type definition language borrows significantly from IDL, the Object Constraint Language (OCL), JAVA, C++, and the parser generation tools LEX and YACC. It is inspired by the features and style of these languages but amalgamates and augments them into precisely what is needed for this data type specification. The goal was a language that is minimal and self-consistent. Also, as the main purpose of this language is to define data types, it tries to get by without any built-in data types.

  4. [source] As can be seen, the type keyword is in place of IDL's and Java's interface and C++ and Java's class keyword. The alias clause is unique to this specification as we do have the need for extremely short data type mnemonics in addition to more descriptive names. The extends clause is the same as JAVA's, which is preferred over C++ or IDL's colon clause as its meaning is more obvious.
  5. [source] Note that the IDL's notion of input and output arguments and IDL's, JAVA's and C++'s notion of return values and exceptions are all irrelevant concepts for this specification. The semantics of data types is not about procedure calls and parameter passing or normal and abnormal returns of control from a procedure body. Instead, each semantic property is conceptualized as a function that maps a value and optional arguments to another value. This mapping is not "computed" or "generated"; it logically exists and we do not need to "call" such a function to actualize the mapping.
  6. [source] "Extends" means "refines" or "specializes and adds properties." This kind of "extension" (specialization) has nothing to do with the "extensional" (vs. "intensional") definitions of data types.
  7. [source] The restriction variant of specialization deserves explanation. It is generally touted that inheritance should not retract properties that have been defined for the genus. This is still true for the restriction as properties are not actually retracted but constrained to a smaller value set. This may mean constraining properties to NULL, if NULL was an allowed value for that property in the parent type. In any case, logically, restriction is a specialization, with inheritance and substitutability. Furthermore extends and restricts are not hard opposites as a specialized type may both extend and constrain; the two keywords are mainly used to guide a human reader as to the intention of the design.
  8. [source] Note the meaning of protected is a little different from the accessibility qualifiers (public, package, protected, private) as known from JAVA and C++. The protection used here is not about hiding the type information or barring properties defined by a protected type from access outside of this specification "package." It mainly is a strong recommendation not to declare attributes or other features of such protected types. Protected types should be used as "wrapped" in other types. The protected type is still directly accessible within the "wrap," no notion of "delegated properties" exists.
  9. [source] The invariant statement syntax and semantics is similar to the OCL "inv" clause. However, we did not use OCL in this specification for several reasons. (1) OCL syntax has a Smalltalk style that does not fit the C++/Java style of the data type definition language. (2) OCL has many primitive constructs and data types, while this specification avoids primitives as much as possible. (3) In part because of the richness in primitive constructs, OCL is fairly complex, more than is needed in this specification.
  10. [source] This construct is somewhat cyclical; there is a preexisting notion of Boolean values even though the Boolean is a type defined just like any other type. In addition, since this data type definition language is written in character strings, the notion of character strings pre-exists the definition of the character string type. These two types, character string and Boolean are therefore exceptional, but on the surface, they are defined just like any other data type. Since this data type specification language is not meant to be implemented, the cyclicality is not a real issue. Even if this language was implemented, one can use a "bootstrapping" technique as is common, e.g., for compilers that compile themselves.
  11. [source] Most of these syntactic features are in the spirit of the JAVA language: use of argument lists, curly braces to enclose blocks, semicolon to finish a statement, and the period to reference value properties. The double colon :: as used by C++ or IDL to distinguish between member-references and value-references is not used (as in Java). Unlike Java but like C++ and IDL, every statement is ended by a semicolon, including type declarations. Implicit type conversion is also retained from C++.
  12. [source] This means that if one expects an ED value but actually has an ST value instead, one can turn the ST value into an ED.
  13. [source] The different grammars of literals are not meant to be combined into one overall HL7 value expression grammar. Although attempts have been made to resolve potential ambiguities between the literals of different types where they would be harmful, some of these ambiguities still remain. For example "1.2" can be a valid literal for both Object Identifier (OID) and a Real Number.
  14. [source] The BNF variant used here is similar to the YACC parser and LEX lexical analyzer generator languages but is simplified and made consistent to the syntax and declarative style of this data type definition language. The differences are that all symbols have exactly one attribute, their value strongly typed as one of the defined data types. Each symbol's type is declared in front of the symbol's definition (e.g.: INT digit : "0" | "1" | ... | "9";). The start symbol has no name but just a type (e.g., INT : digit | INT digit;). A data type name can occur as a symbol name meaning a literal of that data type.
  15. [source] Note that the equals property (defined for all data types, see Section 1.4.2.3) is a relation, a test for equality, not an assignment statement. One cannot assign a value to another value. Unlike YACC and LEX analyzers, this data type definition language is purely declarative; it has no concept of assignment. For this reason, the grammar rules define both parsing and building literal expressions.
  16. [source] Generic type extensions are sometimes called "mixins", since their effect is to mix certain properties into the preexisting data type.
  17. [source] RFC 1766 is the HL7-approved coding system for all reference to human languages, in data types and elsewhere.
  18. [source] For this reason, a system or site that does not deal with multilingual text or names in the real world can safely ignore the language property.
  19. [source] The cryptographically strong checksum algorithm Secure Hash Algorithm-1 (SHA-1) is currently the industry standard. It has superseded the MD5 algorithm only a couple of years ago, when certain flaws in the security of MD5 were discovered. Currently the SHA-1 hash algorithm is the default and the only permitted choice for the integrity check algorithm. However, there is no assurance that SHA-1 will not be superseded at any time, should flaws in it be discovered. In fact, by the time this specification reaches third ballot a new SHA-256 is beginning to pick up popularity.
  20. [source] Originally, the term thumbnail refers to an image in a lower resolution (or smaller size) than another image. However, the thumbnail concept can be metaphorically used for media types other than images. For example, a movie may be represented by a shorter clip; an audio-clip may be represented by another audio-clip that is shorter, has a lower sampling rate, or a lossy compression.
  21. [source] ISO/IEC 10646-1: 1993 defines a character as "A member of a set of elements used for the organization, control, or representation of data." ISO/IEC TR 15285, An operational model for characters and glyphs, discusses the problems involved in defining characters. Notably, characters are abstract entities of information, independent of type font or language. The ISO 10646 (UNICODE [http://www.unicode.org]) - or in Japan, JIS X0221 - is a globally applicable character set that uniquely identifies all characters of any language in the world.

    In this specification, ISO 10646 serves as a semantic model for character strings. The important point is that for semantic purposes, there is no notion of separate character sets and switching between character sets. Character set and character encoding are ITS layer considerations. The formal definition gives indication to this effect because each character is by itself an ST value that has a charset property. Thus, the binary encoding of each character is always understood in the context of a certain character set. This does not mean that the ITS should represent a character string as a sequence of full blown ED values. What it means is that on the application layer the notion of character encoding is irrelevant when we deal with character strings.

  22. [source] A character string literal is a conversion from a character string to another data type. Obviously, character string literals for character strings are a cyclical if not redundant feature. This literal form, therefore, mainly specifies how character strings are parsed in the data type specification language.
  23. [source] Although post-coding is often performed from free text information, such as documents, scanned images or dictation, multi-media data is explicitly not permitted as original text. Also, the original text property is not meant to be a link into the entire source document. The link between different artifacts of medical information (e.g., document and coded result) is outside the scope of this specification and is maintained elsewhere in the HL7 standards. The original text is an excerpt of the relevant information in the original sources, rather than a pointer or exact reproduction. Thus the original text is to be represented in plain text form.
  24. [source] The code system versions do not count in the equality test since by definition a code symbol must have the same meaning throughout all versions of a code system. Between versions, codes may be retired but not withdrawn or reused.
  25. [source] Translations are not included in the equality test of concept descriptors for safety reasons. An alternative would have been to consider two CD values equal if any of their translations are equal. However, some translations may be equal because the coding system of that translation is very coarse-grained. More sophisticated comparisons between concept descriptors are application considerations that are not covered by this specification.
  26. [source] NULL-values are exceptional values, not proper concepts. It would be unsafe to equate two values merely on the basis that both are exceptional (e.g., not codable or unknown.) Likewise there is no guarantee that original text represents a meaningful or unique description of the concept so that equality of that original text does not constitute concept equality. The reverse is also true: since there is more than one possible original text for a concept, the fact that original text differs does not constitute a difference of the concepts.
  27. [source] This ruling at design-time is necessary to prevent HL7 interfaces from being burdened by code literal style conversions at runtime. This is notwithstanding the fact that some applications may require mapping from one form into another if that application has settled with the representation option that was not chosen by HL7.
  28. [source] This is one reason why the CD.qualifiers for post-coordination are to be used sparingly and with caution. An additional problem of post-coordinated coding is that a general rule for equality may not exist at all.
  29. [source] The advantage of the concept descriptor data type is its expressiveness, however, if all of its features, such as coding exceptions, text, translations and qualifiers are used at all times, implementation and use become very difficult and unsafe. Therefore, the CD type is most often used in a restricted form with reduced features.
  30. [source] This is notwithstanding the fact that an externally referenced domain, such as the IETF MIME media type, may include an extension mechanism. These extended MIME type codes would not be considered "extensions" in the sense of violating the CNE provision. The CNE provision is only violated if an attempt is made to use a different code system (by means of the CD.codeSystem property), which is not possible with the CS data type.
  31. [source] The value/namespace view on ISO object identifiers has important semantic relevance. It represents the notion of identifier value versus identifier assigning authority (= namespace), which is common in healthcare information systems in general, and HL7 v2.x in particular.
  32. [source] DICOM objects are identified by UID only. For the purpose of DICOM/HL7 integration, it would be awkward if HL7 required the extension to be mandatory and to consider the UID only as an assigning authority. Since UID values are simpler and do not contain the risks of containing meaningless decoration, we do encourage systems to use simple UID identifiers as external references to their objects.
  33. [source] This ruling at design-time is necessary to prevent HL7 interfaces from being burdened by identifier literal style conversions at runtime. This is notwithstanding the fact that some applications may require mapping from one form into another if that application has settled with the representation option that was not chosen by HL7.
  34. [source] The data type of the is still CS and for HL7 purposes, the is a CNE domain. This appears to be at odds with the fact that there is no one official list of URL schemes, and so many URL schemes in use may be defined locally. However, we cannot allow extension of the URL scheme using the HL7 mechanism of local alternative code systems, which is why technically the is a CS data type.
  35. [source] Remember that semantic properties are bare of all control flow semantics. The AD.formatted could be implemented as a "procedure" that would "return" the formatted address, but it would not usually be a variable to which one could assign a formatted address. However, HL7 does not define applications but only the semantics of exchanged data values. Hence, the semantic model abstracts from concepts like "procedure", "return", and "assignment" but speaks only of property and value.
  36. [source] These rules for formatting addresses are part of the semantics of addresses because addresses are primarily defined as text displayed or printed and consumed by humans. Other uses (e.g., epidemiology) are secondary — although not forbidden, the AD data type might not serve these other use cases very well, and HL7 defines better ways to handle these use cases. Note that these formatting rules are not ITS issues, since this formatting applies to presentations for humans whereas ITS specifications are presentations for computer interchange.
  37. [source] The XML encoding shown here is according to the XML ITS only in order to avoid introducing another instance notation. This does not imply that the function would only work in XML, nor even that XML is the preferred representation.
  38. [source] This example shows the strength of the mark-up approach to addresses. A typical German system that stores house number and street name in separate fields would print the address with street name first followed by the house number. For U.S. addresses, this would be wrong as the house number in the U.S. is written before the street name. The marked-up address allows keeping the natural order of address parts and still understanding their role.
  39. [source] Remember that semantic properties are bare of all control flow semantics. The AD.formatted could be implemented as a "procedure" that would "return" the formatted address, but it would not usually be a variable to which one could assign a formatted address. However, HL7 does not define applications but only the semantics of exchanged data values. Hence, the semantic model abstracts from concepts like "procedure", "return", and "assignment" but speaks only of property and value.
  40. [source] These rules for formatting names are part of the semantics of names because the name parts have been designed with the important use case of displaying and rendering on labels. Note that these formatting rules are not ITS issues, since this formatting applies to presentations for humans whereas ITS specifications are presentations for computer interchange.
  41. [source] The quantity data type abstraction corresponds to the notion of difference scales in contrast to ordinal scales and ratio scales (Guttman and Stevens). A data type with only the order requirement but not the difference requirement would be an ordinal. Ordinals are not currently defined with a special data type. Instead, ordinals are usually coded values, where the underlying code system specifies ordinal semantics. This ordinal semantics, however, is not reflected in the HL7 data type semantics at this time.
  42. [source] H. Grassmann. Lehrbuch der Arithmetik. 1861. We prefer Grassmann's original axioms to the Peano axioms, because Grassmann's axioms work for all integers, not just for natural numbers. Also, "it is rather well-known, through Peano's own acknowledgment, that Peano borrowed his axioms from Dedekind and made extensive use of Grassmann's work in his development of the axioms." (Hao Wang. The Axiomatization of Arithmetic. J. Symb. Logic; 1957:22(2); p. 145.)
  43. [source] The term "Real" for a fractional number data type originates and is well established in the Algol, Pascal tradition of programming languages.
  44. [source] At this time, no other calendars than the Gregorian calendar are defined. However, the notion of a calendar as an arbitrary convention to specify absolute time is important to properly define the semantics of time and time-related data types. Furthermore, other calendars might be supported when needed to facilitate HL7's use in other cultures.
  45. [source] At present, the CalendarCycle properties sum and value are not formally defined. The computation of calendar digits involves some complex computation which to specify here would be hard to understand and evaluate for correctness. Unfortunately, no standard exists that would formally define the relationship between calendar expressions and elapsed time since an epoch. ASN.1, the XML Schema Data Type specification and SQL92 all refer to ISO 8601, however, ISO 8601 does only specify the syntax of Gregorian calendar expressions, but not their semantics. In this standard, we define the syntax and semantics formally, however, we presume the semantics of the sum-, and value-properties to be defined elsewhere.
  46. [source] In some programming languages, "collection types" are understood as containers of individually enumerated data items, and thus, an interval (low - high) would not be considered a collection. Such narrow interpretation of "collection" however is heavily representation/implementation dependent. From a data type semantics viewpoint, it doesn't matter whether an element of a collection "is actually contained in the collection" or not. There is no need for all elements in a collection to be individually enumerated.
  47. [source] Note the difference to the GTS. The GTS is a generator for a SET<TS> not for a LIST<TS>. A sequence of discrete values from a continuous domain does not make much sense other than in sampling applications. The SET<TS>, however, can be thought of as a sequence of IVL<TS>, which still is different from a LIST<TS>.
  48. [source] The presence of so many options deserves explanation. In principle, the interval form together with the width-only form would be sufficient. However, the interval form feels alien to many in the field of medical informatics. One important purpose of the literal forms is to eradicate non-compliance through making compliance easy, without compromising on the soundness of the concepts.

    Furthermore, the different literal forms all have strengths and weaknesses. The strength of the interval and center-width forms is that they are most exact, showing closed and open boundaries. The interval form's weakness, however, is that infinite boundaries require special symbols for infinities, not necessary in the "comparator" form. The center-width form cannot specify intervals with an infinite boundary at all. The "comparator" form, however, can only represent single-bounded intervals (i.e., where the other boundary is infinite or unknown.) The dash form, while being the weakest of all, is the most intuitive form for double bounded intervals.

  49. [source] This statement seems to directly contradict the ruling about the promotion of TS to IVL<>. However, there is no contradiction. The precision of a boundary does not have any relevance, but the precision of a simple timestamp (not as an interval boundary) is relevant, when that timestamp is promoted to an interval.
  50. [source] The hull form may appear superfluous for the simple interval all by itself. However, the hull form will become important for the periodic interval notation as it shortens the notation and (perhaps arguably) makes the notation of more complex timing structures more intuitive.
  51. [source] This specification imposes a self-restraint upon itself to allow existing systems a graceful transition. However, the formal specification keeps the generic type extensions as substitutable for their base types. This self-restraint may be omitted in the future. New implementations are advised to accommodate some generalizable support for these generic data type extensions.
  52. [source] Note that data types are specifications of abstract properties of values. This specification does not mandate how these values are represented in an ITS or implemented in an application. Specifically, it does not mandate how the represented components are named or positioned. In addition, the semantic generalization hierarchy may be different from a class hierarchy chosen for implementation (if the implementation technology has inheritance.) Keep the distinction between a type (interface) and an implementation (concrete data structure, class) in mind. The ITS must contain a mapping of ITS defined features of any data type to the semantic properties defined here.
  53. [source] The GTS is an example of a data type that is only defined algebraically without giving any definition of a data structure that might implement the behavior of such a data type. The algebraic definition looks extremely simple, so that one might assume it is incomplete. Since at this point we are relying entirely on the literal form to represent GTS values, all the definition of data structur
  54. [source] The interleaves property may appear overly constrained. However, these constraints are reasonable for the use case for which the interleaves and periodic hull properties are defined. To safely and predictably combine two schedules one would want to know which of the operands sets the start points and which sets the endpoints of the periodic hull's occurrence intervals.
  55. [source] This literal specification again looks surprisingly simple, so one might assume it is incomplete. However, the GTS literal is based on the TS, IVL, PIVL, and EIVL literals and does also imply the literals for the extensions of TS, notably the PPD_TS. The GTS literal specification itself only needs to tie the other literal forms together, which is indeed a fairly simple task by itself.