Chapter 2
OpenMath Objects

In this chapter we provide a self-contained description of OpenMath objects. We first do so by means of an abstract grammar description (Section 2.1) and then give a more informal description (Section 2.2).

2.1 Formal Definition of OpenMath Objects

OpenMath represents mathematical objects as terms or as labelled trees that are called OpenMath objects or OpenMath expressions. The definition of an abstract OpenMath object is then the following.

2.1.1 Basic OpenMath objects

The Basic OpenMath Objects form the leaves of the OpenMath Object tree. A Basic OpenMath Object is one of the following.

(i) Integer.
Integers in the mathematical sense, with no predefined range. They are "infinite precision" integers (also called "bignums" in computer algebra).
(ii) IEEE floating point number.
Double precision floating-point numbers following the IEEE 754-1985 standard [6].
(iii) Character string.
A Unicode Character string. This also corresponds to "characters" in XML.
(iv) Bytearray.
A sequence of bytes.
(v) Symbol.
A Symbol encodes three fields of information: a symbol name, a Content Dictionary name, and (optionally) a Content Dictionary base URI. The name of a symbol is a sequence of characters matching the regular expression described in Section 2.3. The Content Dictionary is the location of the definition of the symbol, consisting of a name (a sequence of characters matching the regular expression described in Section 2.3) and, optionally, a unique prefix called a cdbase which is used to disambiguate multiple Content Dictionaries of the same name. There are other properties of the symbol that are not explicit in these fields but whose values may be obtained by inspecting the Content Dictionary specified. These include the symbol definition, formal properties and examples and, optionally, a Role which is a restriction on where the symbol may appear in an OpenMath object. The possible roles are described in Section 2.1.4.
(vi) Variable.
A Variable must have a name which is a sequence of characters matching a regular expression, as described in Section 2.3.

2.1.2 Derived OpenMath Objects

Derived OpenMath objects currently serve to embed non-OpenMath data inside an OpenMath object. A derived OpenMath object is built as follows:

(i) If A is not an OpenMath object, then foreign(A) is an OpenMath foreign object. An OpenMath foreign object may optionally have an encoding field which describes how its contents should be interpreted.

2.1.3 OpenMath Objects

OpenMath objects are built recursively as follows.

(i) Basic OpenMath objects are OpenMath objects. (Note that derived OpenMath objects are not OpenMath objects, but are used to construct OpenMath objects as described below.)
(ii) If A1, …, An (n>0) are OpenMath objects, then application(A1, …, An) is an OpenMath application object.
(iii) If S1, …, Sn are OpenMath symbols, and A is an OpenMath object, and A1, …, An (n>0) are OpenMath objects or OpenMath derived objects, then attribution(A, S1 A1, …, Sn An) is an OpenMath attribution object.
A is the object stripped of attributions. S1, …, Sn are referred to as keys and A1, …, An as their associated values. If, after recursively applying stripping to remove attributions, the resulting un-attributed object is a variable, the original attributed object is called an attributed variable.
(iv) If B and C are OpenMath objects, and v1, …, vn (n ≥ 0) are OpenMath variables or attributed variables, then binding(B, v1, …, vn, C) is an OpenMath binding object.
(v) If S is an OpenMath symbol and A1, …, An (n ≥ 0) are OpenMath objects or OpenMath derived objects, then error(S, A1, …, An) is an OpenMath error object.

OpenMath objects that are constructed via rules (ii) to (v) are jointly called compound OpenMath objects.
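
The abstract grammar of Sections 2.1.1 to 2.1.3 can be sketched as a set of Python data classes. This is only an illustration under stated assumptions: the class and field names are invented here and are not part of OpenMath, and the Content Dictionary names fns1 and arith1 in the usage lines are assumed for the example.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Basic objects (Section 2.1.1). All class and field names are illustrative.
@dataclass(frozen=True)
class OMInteger:
    value: int                      # Python ints have no predefined range

@dataclass(frozen=True)
class OMFloat:
    value: float                    # assumed to be an IEEE 754 double

@dataclass(frozen=True)
class OMString:
    value: str

@dataclass(frozen=True)
class OMBytes:
    value: bytes

@dataclass(frozen=True)
class OMSymbol:
    name: str
    cd: str
    cdbase: Optional[str] = None    # the cdbase is optional

@dataclass(frozen=True)
class OMVariable:
    name: str

# Derived object (Section 2.1.2): a container for non-OpenMath data.
@dataclass(frozen=True)
class OMForeign:
    content: object
    encoding: Optional[str] = None

# Compound objects (Section 2.1.3, rules (ii) to (v)).
@dataclass(frozen=True)
class OMApplication:
    children: Tuple[OMObject, ...]

    def __post_init__(self):
        if not self.children:       # rule (ii) requires n > 0
            raise ValueError("an application needs at least one child")

@dataclass(frozen=True)
class OMAttribution:
    obj: OMObject
    pairs: Tuple[Tuple[OMSymbol, Union[OMObject, OMForeign]], ...]

    def __post_init__(self):
        if not self.pairs:          # rule (iii) requires n > 0
            raise ValueError("an attribution needs at least one pair")

@dataclass(frozen=True)
class OMBinding:
    binder: OMObject
    variables: Tuple[Union[OMVariable, OMAttribution], ...]   # n >= 0
    body: OMObject

@dataclass(frozen=True)
class OMError:
    symbol: OMSymbol
    args: Tuple[Union[OMObject, OMForeign], ...] = ()          # n >= 0

OMObject = Union[OMInteger, OMFloat, OMString, OMBytes, OMSymbol, OMVariable,
                 OMApplication, OMAttribution, OMBinding, OMError]

# application(sin, x) and binding(lambda, x, application(plus, x, 2))
sin_x = OMApplication((OMSymbol("sin", "transc1"), OMVariable("x")))
lam = OMBinding(
    OMSymbol("lambda", "fns1"),
    (OMVariable("x"),),
    OMApplication((OMSymbol("plus", "arith1"), OMVariable("x"), OMInteger(2))),
)
```

Note that OMForeign is deliberately left out of the OMObject union, mirroring the remark in rule (i) that derived objects are not themselves OpenMath objects.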

2.1.4 OpenMath Symbol Roles

We say that an OpenMath symbol is used to construct an OpenMath object if it is the first child of an OpenMath application, binding or error object, or an even-indexed child of an OpenMath attribution object (i.e. the key in a (key, value) pair). The role of an OpenMath symbol is a restriction on how it may be used to construct a compound OpenMath object and, in the case of the key in an attribution object, a clarification of how that attribution should be interpreted. The possible roles are:

binder: The symbol may appear as the first child of an OpenMath binding object.
attribution: The symbol may be used as key in an OpenMath attribution object, i.e. as the first element of a (key, value) pair, or in an equivalent context (for example to refer to the value of an attribution). This form of attribution may be ignored by an application, so should be used for information which does not change the meaning of the attributed OpenMath object.
semantic-attribution: This is the same as attribution except that it modifies the meaning of the attributed OpenMath object and thus cannot be ignored by an application without changing the meaning.
error: The symbol may appear as the first child of an OpenMath error object.
application: The symbol may appear as the first child of an OpenMath application object.
constant: The symbol cannot be used to construct an OpenMath compound object.

A symbol cannot have more than one role and cannot be used to construct a compound OpenMath object in a way which requires a different role (using the definition of construct given earlier in this section). This means that one cannot use a symbol which binds some variables to construct, say, an application object. However it does not prevent the use of that symbol as an argument in an application object (where by argument we mean a child with index greater than 1).

If no role is indicated then the symbol can be used anywhere. Note that this is not the same as saying that the symbol's role is constant.
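
The role rules above can be summarised as a small lookup table. The following is a minimal sketch: the position labels such as "binding-head" are hypothetical names chosen for this example, and a role of None stands for a symbol whose Content Dictionary indicates no role.

```python
from typing import Optional

# Constructing positions permitted by each role (Section 2.1.4).
# The position labels are hypothetical names used only in this sketch.
ALLOWED_POSITIONS = {
    "binder": {"binding-head"},
    "attribution": {"attribution-key"},
    "semantic-attribution": {"attribution-key"},
    "error": {"error-head"},
    "application": {"application-head"},
    "constant": set(),   # may not construct any compound object
}

def may_construct(role: Optional[str], position: str) -> bool:
    """Return True if a symbol with the given role may occupy `position`.

    Only constructing positions are covered; use as an argument
    (a child with index greater than 1) is never restricted by the role.
    """
    if role is None:     # no role indicated: the symbol can be used anywhere
        return True
    return position in ALLOWED_POSITIONS[role]
```

For instance, may_construct("binder", "application-head") is False, whereas may_construct(None, "application-head") is True, which shows why having no role differs from having the role constant.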

2.2 Further Description of OpenMath Objects

Informally, an OpenMath object can be viewed as a tree and is also referred to as a term. The objects at the leaves of OpenMath trees are called basic objects. The basic objects supported by OpenMath are:

Integer: Arbitrary Precision integers.
Float: OpenMath floats are IEEE 754 Double precision floating-point numbers. Other types of floating point number may be encoded in OpenMath by the use of suitable content dictionaries.
Character strings: are sequences of characters. These characters come from the Unicode standard [12].
Bytearrays: are sequences of bytes. There is no "byte" in OpenMath as an object of its own. However, a single byte can of course be represented by a bytearray of length 1. The difference between strings and bytearrays is the following: a character string is a sequence of bytes with a fixed interpretation (as characters; Unicode text may require several bytes to encode one character), whereas a bytearray is an uninterpreted sequence of bytes with no intrinsic meaning. Bytearrays could be used inside OpenMath errors to provide information to, for example, a debugger; they could also contain intermediate results of calculations, or "handles" into computations or databases.
Symbols: are uniquely defined by the Content Dictionary in which they occur and by a name. The form of these definitions is explained in Chapter 4. Each symbol has no more than one definition in a Content Dictionary. Different Content Dictionaries may define a symbol with the same name differently (e.g. the symbol union is defined as associative-commutative set theoretic union in a Content Dictionary set1, but another Content Dictionary, multiset1, might define a symbol union as the union of multi-sets).
Variables: are meant to denote parameters, variables or indeterminates (such as bound variables of function definitions, variables in summations and integrals, independent variables of derivatives).

Derived OpenMath objects are constructed from non-OpenMath data. They differ from bytearrays in that they can have any structure. Currently there is only one way of making a derived OpenMath object.

Foreign: is used to import a non-OpenMath object into an OpenMath attribution. Examples of its use could be to annotate a formula with a visual or aural rendering, an animation, etc. Foreign objects may also appear in OpenMath error objects, for example to allow an application to report an error in processing such an object.

The four following constructs can be used to make compound OpenMath objects out of basic or derived OpenMath objects.

Application: constructs an OpenMath object from a sequence of one or more OpenMath objects. The first child of an application is referred to as its "head" while the remaining objects are called its "arguments". An OpenMath application object can be used to convey the mathematical notion of application of a function to a set of arguments. For instance, suppose that the OpenMath symbol sin is defined in a suitable Content Dictionary, then application(sin, x) is the abstract OpenMath object corresponding to sin(x). More generally, an OpenMath application object can be used as a constructor to convey a mathematical object built from other objects such as a polynomial constructed from a set of monomials. Constructors build inhabitants of some symbolic type, for instance the type of rational numbers or the type of polynomials. The rational number, usually denoted as 1/2, is represented by the OpenMath application object application(Rational, 1, 2). The symbol Rational must be defined, by a Content Dictionary, as a constructor symbol for the rational numbers.

Figure 2.1 The OpenMath application and binding objects for sin(x) and λx.x + 2 in tree-like notation.

Binding: objects are constructed from an OpenMath object, and from a sequence of zero or more variables followed by another OpenMath object. The first OpenMath object is the "binder" object. Arguments 2 to n-1 are always variables to be bound in the "body" which is the nth argument object. It is allowed to have no bound variables, but the binder object and the body should be present. Binding can be used to express functions or logical statements. The function λx.x + 2, in which the variable x is bound by λ, corresponds to a binding object having as binder the OpenMath symbol lambda: binding(lambda, x, application(plus, x, 2)).

Phrasebooks are allowed to use α conversion in order to avoid clashes of variable names. Suppose an object Ω contains an occurrence of the object binding(B, v, C). This object binding(B, v, C) can be replaced in Ω by binding(B, z, C') where z is a variable not occurring free in C and C' is obtained from C by replacing each free (i.e., not bound by any intermediate binding construct) occurrence of v by z. This operation preserves the semantics of the object Ω. In the above example, a phrasebook is thus allowed to transform the object to, e.g., binding(lambda, z, application(plus, z, 2)).
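
A minimal sketch of α conversion over a reduced term model follows. The classes Var, Sym, App and Bind are invented for this example, attributed variables are left out, and integers stand in for OpenMath integers. As a deliberate simplification, the sketch insists that the new name does not occur in the body at all (free or bound), which is stricter than the condition stated above but rules out accidental capture by an inner binding.

```python
from dataclasses import dataclass
from typing import Tuple

# Reduced term model; all names here are illustrative assumptions.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Sym:
    name: str

@dataclass(frozen=True)
class App:
    children: Tuple[object, ...]

@dataclass(frozen=True)
class Bind:
    binder: object
    variables: Tuple[Var, ...]
    body: object

def var_names(t) -> frozenset:
    """All variable names occurring in t, whether free or bound."""
    if isinstance(t, Var):
        return frozenset({t.name})
    if isinstance(t, App):
        return frozenset().union(*(var_names(c) for c in t.children))
    if isinstance(t, Bind):
        bound = frozenset(v.name for v in t.variables)
        return var_names(t.binder) | bound | var_names(t.body)
    return frozenset()

def rename_free(t, old: str, new: str):
    """Replace each free occurrence of variable `old` in t by `new`."""
    if isinstance(t, Var):
        return Var(new) if t.name == old else t
    if isinstance(t, App):
        return App(tuple(rename_free(c, old, new) for c in t.children))
    if isinstance(t, Bind):
        binder = rename_free(t.binder, old, new)
        if any(v.name == old for v in t.variables):
            return Bind(binder, t.variables, t.body)  # `old` is rebound here
        return Bind(binder, t.variables, rename_free(t.body, old, new))
    return t

def alpha_convert(b: Bind, old: str, new: str) -> Bind:
    """Rename the bound variable `old` of binding b to `new`."""
    if not any(v.name == old for v in b.variables):
        raise ValueError(f"{old!r} is not bound by this binding")
    if new in var_names(b.body) or any(v.name == new for v in b.variables):
        raise ValueError(f"{new!r} already occurs in the binding")
    variables = tuple(Var(new) if v.name == old else v for v in b.variables)
    return Bind(b.binder, variables, rename_free(b.body, old, new))

# binding(lambda, x, application(plus, x, 2)) with x renamed to z
lam = Bind(Sym("lambda"), (Var("x"),), App((Sym("plus"), Var("x"), 2)))
renamed = alpha_convert(lam, "x", "z")
```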

Repeated occurrences of the same variable in a binding operator are allowed. An OpenMath application should treat a binding with multiple occurrences of the same variable as equivalent to the binding in which all but the last occurrence of each variable is replaced by a new variable which does not occur free in the body of the binding. For example, binding(lambda, v, v, application(times, v, v)) is semantically equivalent to binding(lambda, v', v, application(times, v, v)), so that the resulting function is actually constant in its first argument (v' does not occur free in the body application(times, v, v)).
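
The treatment of repeated bound variables can be sketched on plain lists of variable names. This is an illustration only: the function names are invented, and fresh names are formed by appending underscores (rather than the prime used in the text) so that they remain legal names under the grammar of Section 2.3.

```python
from typing import Iterable, List, Set

def fresh_name(base: str, taken: Set[str]) -> str:
    """Return a name derived from `base` that is not in `taken`."""
    candidate = base + "_"
    while candidate in taken:
        candidate += "_"
    return candidate

def normalise_bound_variables(names: List[str],
                              names_in_body: Iterable[str]) -> List[str]:
    """Replace all but the last occurrence of each repeated bound variable.

    `names` is the sequence of bound variable names of one binding and
    `names_in_body` the variable names occurring in its body; fresh names
    avoid both, so they cannot occur free in the body.
    """
    taken = set(names) | set(names_in_body)
    result = list(names)
    seen: Set[str] = set()
    for i in range(len(result) - 1, -1, -1):   # right to left: last one wins
        name = result[i]
        if name in seen:
            replacement = fresh_name(name, taken)
            taken.add(replacement)
            result[i] = replacement
        else:
            seen.add(name)
    return result

# binding(lambda, v, v, application(times, v, v))
print(normalise_bound_variables(["v", "v"], ["v"]))
```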
Attribution: decorates an object with a sequence of one or more pairs made up of an OpenMath symbol, the "attribute", and an associated object, the "value of the attribute". The value of the attribute can be an OpenMath attribution object itself. As an example of this, consider the OpenMath objects representing groups, automorphism groups, and group dimensions. It is then possible to attribute an OpenMath object representing a group by its automorphism group, itself attributed by its dimension.

OpenMath objects can be attributed with OpenMath foreign objects, which are containers for non-OpenMath structures. For example a mathematical expression could be attributed with its spoken or visual rendering.

Composition of attributions, as in attribution(attribution(A, S1 A1,…,Sh Ah), Sh+1 Ah+1, …, Sn An) is semantically equivalent to a single attribution, that is attribution(A, S1 A1, …, Sh Ah, Sh+1 Ah+1, …, Sn An). The operation that produces an object with a single layer of attribution is called flattening.
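
Flattening, together with the stripping operation of Section 2.1.3, can be sketched as follows. The Attribution class is a stand-in invented for this example, with strings standing in for symbols and objects; only the outer layers are flattened, and attributions nested inside values are left untouched.

```python
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class Attribution:
    obj: Any                              # the attributed object
    pairs: Tuple[Tuple[Any, Any], ...]    # (key, value) pairs, in order

def strip(obj):
    """Remove every outer layer of attribution."""
    while isinstance(obj, Attribution):
        obj = obj.obj
    return obj

def flatten(obj):
    """Merge nested outer attribution layers into a single layer."""
    pairs: Tuple[Tuple[Any, Any], ...] = ()
    while isinstance(obj, Attribution):
        pairs = obj.pairs + pairs         # inner pairs precede outer pairs
        obj = obj.obj
    return Attribution(obj, pairs) if pairs else obj

# attribution(attribution(A, S1 A1), S2 A2)
nested = Attribution(Attribution("A", (("S1", "A1"),)), (("S2", "A2"),))
flat = flatten(nested)
```

Here flat is the single-layer attribution(A, S1 A1, S2 A2), with the inner pairs kept ahead of the outer ones.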

Multiple attributes with the same name are allowed. While the order of the given attributes does not imply any notion of priority, potentially it could be significant. For instance, consider the case in which Sh = Sn (h < n) in the example above. Then, the object is to be interpreted as if the value An overwrites the value Ah. (OpenMath however does not mandate that an application preserves the attributes or their order.)
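
The overwrite rule for repeated keys can be sketched as a lookup that scans the flattened pairs from the end. The function name and the use of strings for keys and values are assumptions made for this example.

```python
from typing import Any, Sequence, Tuple

def attribute_value(pairs: Sequence[Tuple[Any, Any]], key: Any, default: Any = None) -> Any:
    """Return the value associated with `key` in a flattened pair sequence.

    If the key occurs several times, the last occurrence takes effect,
    matching the rule that An overwrites Ah when Sh = Sn and h < n.
    """
    for k, v in reversed(pairs):
        if k == key:
            return v
    return default

pairs = [("colour", "red"), ("type", "t"), ("colour", "blue")]
print(attribute_value(pairs, "colour"))
```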

Attribution acts as either adornment annotation or as semantical annotation. When the key has role attribution, then replacement of the attributed object by the object itself is not harmful and preserves the semantics. When the key has role semantic-attribution then the attributed object is modified by the attribution and cannot be viewed as semantically equivalent to the stripped object. If the attribute lacks the role specification then attribution is acting as adornment annotation.
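
An application that chooses to ignore adornments could filter the pairs by the role of each key. This is a sketch under assumptions: role_of is a hypothetical callback that returns the role recorded for a key, or None when no role is specified, and keys are plain strings.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

def drop_adornments(pairs: Sequence[Tuple[Any, Any]],
                    role_of: Callable[[Any], Optional[str]]) -> List[Tuple[Any, Any]]:
    """Keep only the pairs whose key has role semantic-attribution.

    Keys with role attribution, or with no role specified, act as
    adornment and can be dropped without changing the meaning.
    """
    return [(k, v) for k, v in pairs if role_of(k) == "semantic-attribution"]

# Hypothetical role table for the two example keys.
ROLES = {"colour": "attribution", "type": "semantic-attribution"}
kept = drop_adornments([("colour", "red"), ("type", "t"), ("note", "n")], ROLES.get)
```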

Objects can be decorated in a multitude of ways. An example of the use of an adornment attribution would be to indicate the colour in which an OpenMath object should be displayed, for example attribution(A, colour red). Note that both A and red are arbitrary OpenMath objects whereas colour is a symbol. An example of the use of a semantic attribution would be to indicate the type of an object. For example the object attribution(A, type t) represents the judgment stating that object A has type t. Note that both A and t are arbitrary OpenMath objects whereas type is a symbol.
Error: is made up of an OpenMath symbol and a sequence of zero or more OpenMath objects. This object has no direct mathematical meaning. Errors occur as the result of some treatment on an OpenMath object and are thus of real interest only when some sort of communication is taking place. Errors may occur inside other objects and also inside other errors. Error objects might consist only of a symbol as in the object: error(S).

2.3 Names

The names of symbols, variables and content dictionaries must conform to the production Name specified in the following grammar (which is identical to that for XML names in XML 1.1, [16]). Informally speaking, a name is a sequence of Unicode [12] characters which begins with a letter and cannot contain certain punctuation and combining characters. The notation #x... represents the hexadecimal value of the encoding of a Unicode character. Some of the character values or code points in the following productions are currently unassigned, but this is likely to change in the future as Unicode evolves^*1.

^*1 We note that in XML 1.0 the name production explicitly listed the characters that were allowed, so all the characters added in versions of Unicode after 2.0 (which amounted to tens of thousands of characters) were not allowed in names.

Name → NameStartChar (NameChar)*

NameStartChar → ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] |
                [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] |
                [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
                [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
                [#x10000-#xEFFFF]

NameChar → NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] |
           [#x203F-#x2040]
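
The Name production can be transcribed into a regular expression. The following is a sketch using Python's re module, which recognises \u and \U escapes in string patterns; the identifiers NAME_START, NAME_CHAR, NAME_RE and is_valid_name are invented for this example.

```python
import re

# Character class contents transcribed from the NameStartChar production.
NAME_START = (
    r":A-Z_a-z"
    r"\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    r"\u0370-\u037D\u037F-\u1FFF"
    r"\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
    r"\U00010000-\U000EFFFF"
)

# NameChar adds "-", ".", digits, #xB7 and two further ranges.
NAME_CHAR = NAME_START + r"\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile("[" + NAME_START + "][" + NAME_CHAR + "]*")

def is_valid_name(text: str) -> bool:
    """Return True if `text` matches the Name production in full."""
    return NAME_RE.fullmatch(text) is not None

for candidate in ("sin", "transc1", "1abc", "a b", ""):
    print(repr(candidate), is_valid_name(candidate))
```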

CD Base: A cdbase must conform to the grammar for URIs described in [7]. Note that if non-ASCII characters are used in a CD or symbol name then when a URI for that symbol is constructed it will be necessary to map the non-ASCII characters to a sequence of octets. The precise mechanism for doing this depends on the URI scheme.

Note on content dictionary names: It is a common convention to store a Content Dictionary in a file of the same name, which can cause difficulties on many file systems. If this convention is to be followed then OpenMath recommends that the name be restricted to the subset of the above grammar which is a legal POSIX [5] filename, namely:

Name → (PosixLetter | '_') (Char)*

Char → PosixLetter | Digit | '.' | '-' | '_'

PosixLetter → 'a' | 'b' | ... | 'z' | 'A' | 'B' | ... | 'Z'
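
The restricted grammar is small enough to check with an ASCII-only regular expression. This sketch assumes that Digit, which the productions above leave undefined, means the ASCII digits 0 to 9; the identifiers are invented for the example.

```python
import re

# (PosixLetter | '_') followed by any number of Char.
# Assumption: Digit stands for the ASCII digits 0-9.
POSIX_CD_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

def is_posix_cd_name(text: str) -> bool:
    """Return True if `text` fits the recommended POSIX-safe subset."""
    return POSIX_CD_NAME_RE.fullmatch(text) is not None

for candidate in ("set1", "multiset1", "1set", "s\u00e9t"):
    print(repr(candidate), is_posix_cd_name(candidate))
```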

Canonical URIs for Symbols: To facilitate the use of OpenMath within a URI-based framework (such as RDF [19] or OWL [18]), we provide the following scheme for constructing a canonical URI for an OpenMath Symbol:

URI = cdbase-value + '/' + cd-value + '#' + name-value

So for example the URI for the symbol with cdbase http://www.openmath.org/cd, cd transc1 and name sin is:

http://www.openmath.org/cd/transc1#sin
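
The scheme amounts to plain string concatenation, as this sketch shows. The function name is invented for the example, and the mapping of non-ASCII characters to octets is not handled because it depends on the URI scheme.

```python
def canonical_uri(cdbase: str, cd: str, name: str) -> str:
    """Build cdbase-value + '/' + cd-value + '#' + name-value."""
    return cdbase + "/" + cd + "#" + name

print(canonical_uri("http://www.openmath.org/cd", "transc1", "sin"))
```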

In particular, this now allows us to refer uniquely to an OpenMath symbol from a MathML document [17]:

<mathml:csymbol xmlns:mathml="http://www.w3.org/1998/Math/MathML"
                definitionURL="http://www.openmath.org/cd/transc1#sin">
  <mathml:mo> sin </mathml:mo>
</mathml:csymbol>

2.4 Summary

OpenMath supports basic objects like integers, symbols, floating-point numbers, character strings, bytearrays, and variables.
OpenMath compound objects are of four kinds: applications, bindings, errors, and attributions.
OpenMath objects may be attributed with non-OpenMath objects via the use of foreign OpenMath objects.
OpenMath objects have the expressive power to cover all areas of computational mathematics.

Observe that an OpenMath application object is viewed as a "tree" by software applications that do not understand Content Dictionaries, whereas a Phrasebook that understands the semantics of the symbols, as defined in the Content Dictionaries, should interpret the object as functional application, constructor, or binding accordingly. Thus, for example, for some applications, the OpenMath object corresponding to 2+5 may result in a command that writes 7.
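
The difference between treating an object as a bare tree and interpreting it through a phrasebook can be sketched with a toy evaluator. Everything here is an assumption made for illustration: objects are encoded as nested tuples, the Content Dictionary name arith1 is assumed for plus and times, and PHRASEBOOK is a hypothetical table mapping symbols to Python functions.

```python
import functools
import operator

# Hypothetical phrasebook: (cd, name) -> Python function.
PHRASEBOOK = {
    ("arith1", "plus"): operator.add,
    ("arith1", "times"): operator.mul,
}

def evaluate(obj):
    """Evaluate applications whose head the phrasebook understands.

    Objects are nested tuples: ("app", head, arg, ...), ("sym", cd, name),
    ("var", name); integers represent themselves. Anything that is not
    understood is returned unchanged, i.e. it stays a tree.
    """
    if isinstance(obj, tuple) and obj and obj[0] == "app":
        head, *args = obj[1:]
        if isinstance(head, tuple) and head[0] == "sym" and head[1:] in PHRASEBOOK:
            values = [evaluate(a) for a in args]
            if values and all(isinstance(v, int) for v in values):
                return functools.reduce(PHRASEBOOK[head[1:]], values)
    return obj

two_plus_five = ("app", ("sym", "arith1", "plus"), 2, 5)
print(evaluate(two_plus_five))
```

An application whose head is not in the table, such as ("app", ("sym", "transc1", "sin"), ("var", "x")), comes back unchanged.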
