In this document we concentrate on the lowest two or three layers: OpenMath Expression, OpenMath data-structure, and OpenMath encoding. During prototyping the design decisions concerning these layers usually involved a compromise between what seemed "best" and what could be achieved before a specific deadline.
We recall also that the Objectives Document stated that it was not necessary for the different layers to be readily distinguishable in an implementation, but that the important criterion was that the OpenMath object received should be semantically equivalent to the one sent. Indeed, in our implementations of prototype2.1 both the OpenMath object and the OpenMath expression layers had only ephemeral existences. The sections below describe some of the more abstract design decisions which were taken during the development of prototypes 2.0 and 2.1; greater detail will be given in the talk on OpenMath Prototypes.
In summary, the choice is not especially clear cut, but the differences between the two options are quite small in any case. Something I regard as an advantage of using trees is that shared substructure is explicit even above the data-structure level, but this is only an opinion.
The specifications for prototype2.1 describe OpenMath trees with 15 different types of node. Some of the node types are used directly to represent data while others have a more "managerial" role, for OpenMath is a complete protocol not merely a data representation. Most node types have fixed numbers of children, just two allow varied numbers.
OpenMath contexts play a vital role in the conversions between objects and trees. For the prototypes only a very simple scheme of contexts was used: a context was no more than an indexed list of symbols whose meanings were described in a human-only comprehensible form. As a more sophisticated internal structure for contexts becomes stable, the extra information thereby represented can be exploited during the conversion processes.
It is important to bear in mind that a context may be extended between the time that an OpenMath object is converted to an OpenMath tree and the time that the object is reconstructed from the tree: think of data stored in an archive. Thus the manner in which contexts are used for these conversions should permit extensions to the contexts without invalidating or altering the meaning of OpenMath data stored in an archive. This is also the reason why deletions from officially registered contexts are not allowed. Note: although it is an OpenMath encoding which is stored in an archive, the conversion between a tree and its linear encoding is independent of all contexts; thus it is fair to say that an archive contains OpenMath trees.
The difficulty with big integers is that different systems use different internal representations, and the cost of conversion is super-linear in the length of the integer: in general, the complexity of converting base of an integer of length n is the smaller of n*n and M(n)*log(n) where M(n) is the cost of multiplying together two integers of length n. Barring Maple, all computer algebra systems use a binary representation permitting a linear time inter-conversion. Imposing a binary representation would make Maple to Maple communication via OpenMath unreasonably slow if many large integers are to be exchanged; in this case a "decimal" representation is the only acceptable solution. Moreover, integer values sent to a typesetting system, say, must also be in a "decimal" form since it is not reasonable to require the typesetter to perform base conversion on arbitrary precision integers.
A similar problem arises for MPFP numbers. Here the problem is that numerical computations frequently involve very large numbers of floating point values, and if a large array is to be transmitted the cost of changing the format a MPFP number is multiplied manifold. A more serious aspect of changing format is that precision may be lost, or some values may be unrepresentable (e.g. magnitude is too great or too small). When both sender and receiver have the same internal format for MPFP numbers, OpenMath should allow them to use that format for data exchange.
The solution proposed is for every conversation to be preceded by a negotiation phase where the format to be used for big integers, and for MPFP numbers is determined at run-time. This does pose problems when OpenMath is to be used for archival or other "non-interactive" purposes such as drag-and-drop, and communication via email. In these cases the sender knows little about the receiver, but ultimate efficiency is also less important: a "decimal" representation for big integers must be used in case the receiver is a typesetter; it is much less clear what format to use for MPFP numbers.
Once the idea of negotiation is introduced, it can be extended to cover other aspects. One possible choice is the alphabet (and encoding) for strings: ASCII (or ISO-Latin-1) is an obvious choice for English speakers, while the full range of UniCode may be needed in some circumstances. Another candidate for negotiation is the overall encoding of the OpenMath tree: e.g. Lisp-like, based on ASAP, or based on MP. In the prototypes to date, renegotiation of encoding formats was not permitted during a conversation because it seemed unnecessary, and could be risky.
If run-time negotiation is to be adopted as the correct approach then it will be necessary to list all the various formats which may be chosen from during negotiation; this may be tricky for floating point numbers. Also a default format must be chosen for MPFP numbers to use in "non-interactive" situations; this default should be capable of representing without loss of precision any number from any of the current common formats for MPFP numbers (and any likely future format).
It was decided that the "naive" solution described above was too restrictive, and instead a more flexible, but hardly more complicated, scheme was devised: the linear encoding of an OpenMath tree may be "interrupted" at almost any point by an error message. This permits the server to interleave computation with sending back of the result: there is no problem if an error occurs after part of the result has already been sent back. It also provides a convenient mechanism for allowing the user (in an interactive environment) to terminate prematurely the transmission of an excessively long result.
The ability of an error report to "interrupt" the encoding of a tree probably cannot be exploited by "object compatible" data transfer. On the other hand, there is no obligation on the server to interleave computation and transmission of a result; the OpenMath error reporting mechanism merely permits it.