A. Strotmann: Proposal presented at 5th OpenMath Workshop, Bath.
With corrections that came up during the discussion.
OpenMath Layer: Data Structure
Even though this is a realistic proposal for an OpenMath data structure
layer, other possibilities do exist in the literature, e.g. ASAP.
The primary purposes here are to provide an example that may help
clarify the distinction between OpenMath's data structure and expression
layers, and to serve as a starting point for the discussions during this
workshop.
Atomic Data Structures
- ints
  - bit, byte, 32-bit, 64-bit
  - signed and unsigned
  - perhaps: 24-bit, 'n'-bit
  - BigInts are not atomic: see Expression layer!
- floats
  - 32-bit, 64-bit, 128-bit
  - IEEE; perhaps native
  - BigFloats are not atomic: see Expression layer!
- chars
  - ASCII (7 bits); ISO 8-bit extensions; Unicode = ISO 10646 Basic
    Multilingual Plane(?) (16 bits); full ISO 10646 (32 bits)
  - perhaps national or industry-"standard" character sets
    (JISC 2022(?), Big5(?), EBCDIC) (8 or 16 bits)
  - perhaps different character set encodings for Unicode (say)
    (UTF-7, UTF-8 (8 bits))
- simple homogeneous vectors
  - elements are of homogeneous atomic data type
  - size (= number of elements) is arbitrary (possibly zero)
  - strings = simple homogeneous vectors of chars
    (therefore: several kinds of strings)
  Note: an encoding will usually give explicit size information, but may
  leave it unspecified if necessary
- (perhaps) simple homogeneous n-dimensional arrays
  - any number of dimensions, arbitrary size along each dimension
  - elements are of homogeneous atomic data type
  - size (= number of elements) is arbitrary (possibly zero)
  - "cuboid" only
  - may choose to use "pair of (simple homogeneous vector of
    ints, simple homogeneous vector of atomic data type)" instead
  Note: an encoding will usually give explicit size information, but may
  leave it unspecified if necessary
- non-homogeneous sequences (aka lists, n-tuples)
  - elements are arbitrary OpenMath data structures
  Note: an encoding will often give explicit size information, but may
  leave it unspecified if necessary
- Attribution
  Note: this is a binary data structure used for attaching a list of attributes
  (i.e. attribute/value pairs) to another data structure.
- Pairs (aka tuples?)
  Note: unlike in LISP, 'lists' are not made up of (dotted) pairs. Pairs
  are not lists of two elements, either. Pairs are a separate data type.
  Example uses: named arguments to operations, attribute/value pairs,
  exponent/coefficient pairs, "struct"s/records replacement
- Symbols are (in general) records with
  - a reference to the context the symbol is registered in (usually, a string
    or an int or a "path")
  - an id within that context (usually, a string or an int)
  - category information (explained in expression layer)
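The compound data structures listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not part of the proposal: the class and field names (Pair, Symbol, Attribution, NDArray, context, ident, category), the sample symbol ("arith", "plus", "operator"), the choice of Python's array typecodes, and the row-major storage order are all hypothetical.

```python
from array import array
from dataclasses import dataclass
from typing import Any

# Simple homogeneous vectors: every element has the same atomic type.
# Typecodes "i" (signed C int) and "d" (C double) are illustrative choices.
exponents = array("i", [0, 1, 2])
samples = array("d", [1.0, 2.5, -3.0])


@dataclass(frozen=True)
class Pair:
    """A separate data type, deliberately not a two-element list."""
    first: Any
    second: Any


@dataclass(frozen=True)
class Symbol:
    context: str   # where the symbol is registered (string, int or "path")
    ident: str     # id within that context
    category: str  # category information, defined by the expression layer


@dataclass
class Attribution:
    """Binary structure: attribute/value pairs plus the attributed object."""
    attributes: list  # list of Pair(attribute, value)
    body: Any


@dataclass
class NDArray:
    """'Cuboid' array as a pair (vector of ints, vector of atomic elements)."""
    dims: array
    data: array

    def __post_init__(self):
        expected = 1
        for d in self.dims:
            expected *= d
        if expected != len(self.data):
            raise ValueError("data length does not match dimensions")

    def get(self, *index):
        if len(index) != len(self.dims):
            raise IndexError("wrong number of indices")
        offset = 0
        for i, d in zip(index, self.dims):
            if not 0 <= i < d:
                raise IndexError("index out of range")
            offset = offset * d + i  # row-major order (an assumption)
        return self.data[offset]


# Usage: a 2 x 3 array of doubles, and an attributed symbol.
grid = NDArray(array("i", [2, 3]), array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
plus = Attribution([Pair("color", "red")], Symbol("arith", "plus", "operator"))
```

Keeping Pair as its own class, rather than reusing a tuple or list, mirrors the note that pairs are neither dotted pairs nor two-element lists.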
These data structures are intended to have the following properties:
- easy or even trivial to implement in C or any similar language
- easy to map to existing symbolic computation systems
- allow efficient implementations and provide support for efficient
  representations and encodings of things like large numeric data
  streams or image data
- support representation of symbolic data structures
- no "mathematics", just "computer science"
- no semantics, just syntax
Labels and backreferences should be added to the list of data structures above.
BigInts: there are several possible representations for these, e.g.
- (base, sign, simple-vector-of-ints) : "ordinary" representation
- (simple-vector-of-ints, simple-vector-of-ints) : "Chinese remainder" representation
- text string : "cop-out" representation, actually a
  special case of the first representation
Proposal: use such data structures plus the information that the
data structure is in fact meant to represent an object of mathematical
type Integer (in other words: the representation proper is on this
level, but assigning the meaning of "integer" to it is on the next level
up).
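The three BigInt representations can be illustrated as follows. This is a minimal sketch, assuming Python 3.8 or later (for math.prod and the modular inverse via pow); the function names, the least-significant-first digit order, and the default base of 2**32 are assumptions made for the example, not part of the proposal. In each case the data structure carries only the representation; that it "means" an integer would be stated one level up.

```python
from math import prod


def to_ordinary(n, base=2**32):
    """Return (base, sign, digits), digits least-significant first (assumed)."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    digits = []
    while n:
        n, d = divmod(n, base)
        digits.append(d)
    return base, sign, digits


def from_ordinary(rep):
    base, sign, digits = rep
    value = 0
    for d in reversed(digits):
        value = value * base + d
    return sign * value


def to_residues(n, moduli):
    """Return (moduli, residues); needs pairwise coprime moduli and
    0 <= n < prod(moduli) for the value to be recoverable."""
    return list(moduli), [n % m for m in moduli]


def from_residues(rep):
    """Reconstruct the value with the Chinese remainder theorem."""
    moduli, residues = rep
    total = prod(moduli)
    value = 0
    for m, r in zip(moduli, residues):
        partial = total // m
        value += r * partial * pow(partial, -1, m)  # inverse of partial mod m
    return value % total


# The text-string form is the ordinary form with base 10 and character digits.
def to_text(n):
    return str(n)


def from_text(s):
    return int(s)
```

For example, to_residues(1000, [7, 11, 13]) stores 1000 as the residues [6, 10, 12], and from_residues recovers 1000 because 1000 is smaller than 7 * 11 * 13 = 1001.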
Arrays, BigFloats, Handles (see above): similar to the BigInt issue
Symbols: may need to have additional info in a distributed computation
environment, such as handles referring to the owning process or link, scope...