

Software Tools

The goal of this task was to provide software tools (programs and libraries) to help test the standards proposed in work package 1 and build the demonstrators in work package 3.

Important tools are certainly libraries to allow programs to read and write OpenMath objects conveniently. One goal of this work package was thus to provide a C and a Java library. C is important because it is very common and almost all other languages and compilers have facilities to link and access C code. Java is of course a very popular language today, particularly for web-based applications.

This work package was also responsible for providing two other important kinds of software tools that are necessary for the wide adoption of OpenMath: conversion tools (converting OpenMath to and from the most common math formats, necessary to allow the interoperability of OpenMath and the existing formats), and tools to edit and display OpenMath objects (necessary to build many other interactive tools and crucial to the planned demonstrators). Another kind of tool has also been produced as part of this work package: searching tools (for searching in a set of OpenMath-encoded mathematics). These differ from the others in that the problem of searching mathematics is not very well understood and raises several difficult issues. It remains a topic for research and hence the tools that we have developed are still experimental.

We initially planned to develop other software tools as part of task 2.2 that would be more directly useful for developing OpenMath applications (for example tools that could generate part of a phrase book). Instead we looked at developing ``generic'' phrase books in Java which could be customised for particular applications. These are useful in cases where the input language to the application is essentially a linear string (e.g. an interactive mathematics package), although of course they do not allow access to the application's internal data structures.
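
As an illustration of the linear-string case, the following Python sketch (not the project's Java code) turns an XML-encoded OpenMath object into an input string for a hypothetical interactive package. The table SYMBOLS and the function to_linear are assumptions made for this example, and the Content Dictionary and symbol names are only sample entries; customising such a phrase book for another application would amount to replacing the table.

```python
import xml.etree.ElementTree as ET

# Hypothetical customisation table: (cd, name) -> how the target
# application writes the symbol, and whether it is infix or prefix.
SYMBOLS = {
    ("arith1", "plus"): ("infix", "+"),
    ("arith1", "times"): ("infix", "*"),
    ("transc1", "sin"): ("prefix", "sin"),
}

def to_linear(node):
    """Translate one OpenMath XML element into a linear input string."""
    if node.tag == "OMOBJ":
        return to_linear(node[0])
    if node.tag == "OMI":
        return node.text.strip()
    if node.tag == "OMV":
        return node.get("name")
    if node.tag == "OMA":
        head, *args = list(node)
        if head.tag != "OMS":
            raise ValueError("this sketch only applies plain symbols")
        kind, text = SYMBOLS[(head.get("cd"), head.get("name"))]
        parts = [to_linear(arg) for arg in args]
        if kind == "infix":
            # Always parenthesise, so no precedence table is needed.
            return "(" + (" " + text + " ").join(parts) + ")"
        return text + "(" + ", ".join(parts) + ")"
    raise ValueError("element not handled by this sketch: " + node.tag)
```

The string is all the application ever sees, which is why this style of phrase book cannot reach the application's internal data structures.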

OpenMath libraries

In addition to the planned C, C++, Java and Aldor libraries we have also implemented a Standard ML and a Lisp library. The SML library has been implemented as part of task 2.4 (``Tools for searching mathematical databases and texts'') to turn our deductive database engine into an OpenMath program. The Lisp library has been used outside the project to provide OpenMath support to an optical formula recognition package (a program that can produce an OpenMath object from an image of a formula).

As part of the MathML and OpenMath alignment effort, we have studied the possible inclusion of a MathML encoding in our libraries. We designed such an encoding (as a joint activity with task 1.2) but after a careful examination of its possible uses, we found that it would have been too contrived to achieve the minimal level of interoperability that would have been necessary between a MathML application and a typical OpenMath application.

We should mention that the adoption of our API and libraries may be overtaken by the wide availability of XML libraries. Several people in the OpenMath community have expressed the idea that using an XML library to read and write OpenMath objects is indeed sufficient and one does not need a specific OpenMath library. Using an off-the-shelf XML library has its advantages, most notably the fact that you can read any XML document and use the whole power of XML in the encoding (including using any character encoding, adding new attributes and entity references) but it certainly has its drawbacks: the API is not tailored to OpenMath objects, you cannot use the binary encoding and it becomes quite difficult to enforce the standard (there is a potential threat to interoperability). Historically, the early XML encoding was in fact an SGML encoding and it was unrealistic to expect OpenMath applications to use a full SGML parser. The wide availability of reasonably good XML tools is a recent phenomenon, hence our decision at the outset of the project to produce our own.
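
The enforcement drawback can be made concrete with a small Python sketch using the standard xml.etree.ElementTree parser. The element list below is a deliberately partial set of OpenMath XML element names assumed for the example, and unknown_elements is a hypothetical helper: the point is only that a generic parser accepts any well-formed document, so the check against the standard has to be written separately.

```python
import xml.etree.ElementTree as ET

# Element names of the OpenMath XML encoding known to this sketch
# (assumed for the example, not taken from a library).
KNOWN_ELEMENTS = {"OMOBJ", "OMA", "OMS", "OMV", "OMI", "OMF", "OMSTR",
                  "OMB", "OMBIND", "OMBVAR", "OMATTR", "OMATP", "OME"}

def unknown_elements(xml_text):
    """Parse with a generic XML parser and list the non-OpenMath elements
    that the parser accepted without complaint."""
    root = ET.fromstring(xml_text)
    return sorted({el.tag for el in root.iter() if el.tag not in KNOWN_ELEMENTS})

# A well-formed OpenMath object, and a well-formed document that is not one.
GOOD = ('<OMOBJ><OMA><OMS cd="arith1" name="plus"/>'
        '<OMV name="x"/><OMI>1</OMI></OMA></OMOBJ>')
BAD = '<OMOBJ><OMA><plus/><OMV name="x"/><OMI>1</OMI></OMA></OMOBJ>'
```

Both strings parse successfully; only the extra check distinguishes them.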

The C library

We started this work from a C library that was previously developed at INRIA. We made its API more regular and we improved the robustness of the code. Various changes to the SGML and XML encodings were made following the progress of the standard (work package 1; the SGML encoding was then dropped in favor of the XML one). As part of this task we defined and experimented with the binary encoding of OpenMath objects. It appeared to be reasonably efficient and compact, between three and ten times more compact than the XML encoding. Compared to a good compression algorithm, it produces results half as compact as the GNU ZIP program (gzip) but the encoding and decoding processes are much more efficient.

The central abstraction in the C API is a device. An OpenMath object is read from (or written to) a device that hides both the particular encoding (binary, XML or SGML) and the way input or output is done at the lowest level (for example, devices can be created for I/O to strings, files, file descriptors, etc.). Objects are read and written at the token level (not as whole objects) to allow a more flexible integration with the application (most notably for memory management control and efficiency of phrase book translations).
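
The device idea can be sketched in Python as follows. This is not the C API: the class XMLDevice, its methods and the token names are all hypothetical, and only the XML encoding and a handful of elements are covered. It shows a reader that hides where the bytes come from and hands tokens to the caller one at a time instead of building a whole object.

```python
import io
import xml.etree.ElementTree as ET

class XMLDevice:
    """Hypothetical device: hides the encoding (XML here) and the I/O source."""

    def __init__(self, stream):
        self._stream = stream  # any binary file-like object

    @classmethod
    def from_string(cls, text):
        """Create a device that reads from an in-memory string."""
        return cls(io.BytesIO(text.encode("utf-8")))

    def tokens(self):
        """Yield (kind, value) tokens one by one, never a whole object."""
        for event, el in ET.iterparse(self._stream, events=("start", "end")):
            if event == "start":
                # Attributes are already available when an element opens.
                if el.tag in ("OMOBJ", "OMA"):
                    yield ("begin", el.tag)
                elif el.tag == "OMS":
                    yield ("symbol", (el.get("cd"), el.get("name")))
                elif el.tag == "OMV":
                    yield ("variable", el.get("name"))
            else:
                # Text content is only complete when the element closes.
                if el.tag == "OMI":
                    yield ("integer", int(el.text))
                elif el.tag in ("OMOBJ", "OMA"):
                    yield ("end", el.tag)
```

Because the caller consumes tokens as they arrive, it can build its own representation directly and decide for itself what to keep in memory.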

Our C library does not only read and write OpenMath objects, it also supports simple interprocess communication facilities (through sockets) that have been very useful in writing OpenMath clients and servers. Other alternatives include standard technologies such as KQML or FIPA or SOAP (an XML-based mechanism for expressing messages and communication).
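
A minimal Python sketch of socket-based exchange of OpenMath objects is given below. The length-prefixed framing and the function names send_object and recv_object are assumptions made for the illustration; they do not describe the protocol used by the C library.

```python
import socket
import struct

def send_object(sock, xml_text):
    """Send one XML-encoded object, prefixed by its length in bytes."""
    data = xml_text.encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)

def _recv_exact(sock, n):
    """Read exactly n bytes, since recv may return fewer than requested."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def recv_object(sock):
    """Receive one object sent by send_object."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length).decode("utf-8")
```

A client and a server would each hold one end of a connected socket; for a quick local experiment, socket.socketpair() provides two connected ends in a single process.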

The C library is very portable. It runs on several Unix variants (Linux, Digital Unix, AIX, IRIX, Solaris) and on Windows (NT, 98 and 2000 through the Win32 interface). Installation is quite easy with a configure script for Unix platforms (taking care of the various differences between them) and auto-testing capabilities.

This library has been used by our partners in a number of applications (see chapter 7), and by a number of people outside the project. A collaboration with ZIB (the Konrad Zuse Zentrum für Informationstechnik Berlin) produced a fairly complete OpenMath version of the Reduce computer algebra system. The library was also distributed to a software company that used it to build a prototype requiring communication with several mathematical software packages.

The C++ library

The C++ library is basically a set of classes built on top of the C library that provides a DOM-like interface for reading, writing and manipulating OpenMath objects (DOM is the Document Object Model, a World Wide Web Consortium interface).

The Java library

The API of the Java library is quite different from the C API as it tries to follow common Java conventions. The library is divided in two parts:

The API defined by these interfaces is structured in two levels. The lowest level is close to our C API, and exposes streams of tokens (following the model of SAX, the Simple API for XML). The higher level manipulates whole OpenMath objects as trees following an interface similar to the one provided by DOM (the Document Object Model of the W3C).
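
The relation between the two levels can be sketched in Python with the standard xml.sax module: the lower level is a stream of events, and the upper level is a tree assembled from that stream. The classes Node and TreeBuilder and the function parse_tree are hypothetical names for this illustration and are not the interfaces of the Java library.

```python
import xml.sax

class Node:
    """A minimal tree node, standing in for a DOM-like object."""
    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = dict(attrs.items())
        self.children = []
        self.text = ""

class TreeBuilder(xml.sax.ContentHandler):
    """Upper level: assembles the low-level event stream into a tree."""
    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []

    def startElement(self, name, attrs):
        node = Node(name, attrs)
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, content):
        if self._stack:
            self._stack[-1].text += content

    def endElement(self, name):
        self._stack.pop()

def parse_tree(xml_text):
    """Run the event-level parser and return the root of the built tree."""
    builder = TreeBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), builder)
    return builder.root
```

An application that only translates objects on the fly can implement its own handler and skip the tree entirely, while one that needs to manipulate whole objects can work on the tree.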

We initially planned to converge quickly to a common API with the Java library developed at Simon Fraser University (Canada) in the PolyMath group (a member of NAOMI, the North American OpenMath Initiative). However it appeared that their library is more oriented towards the effective high-level manipulation of OpenMath objects than we would like it to be. For most of the Java developments in the project (and we believe in most envisioned useful OpenMath applications) OpenMath objects are just used as convenient intermediate objects in the ``phrase book'' process (the conversion between OpenMath objects and the mathematical objects in the representation used by the application). We thus chose a different design but we still hope to share our experiences and come up with a standard OpenMath Java API in the near future (under the auspices of the OpenMath society).

The Aldor library

Although Aldor is not a widespread programming language it has unique characteristics and is used for several interesting and innovative projects in the computer algebra community. That is why we chose to provide an OpenMath library for Aldor.

A first version of this library was just a wrapper around the C library (the Aldor compiler has the ability to link in C code). A second version was pure Aldor but due to the nature of the language and the basic libraries available it was infeasible to include support for the OpenMath binary encoding in this version. The library was used to build a web-based computational server (see 7.2).

Searching tools

The work on searching tools started from the deductive database prototype previously developed at INRIA (during the PhD thesis of Claude Huchet). This prototype has been vastly improved during the project: we changed some internal structures to make it faster (the way algebraic expressions are represented is more efficient), cleaned up the code and enhanced the typing mechanism. The handling of higher order constructs (such as differentiation and indefinite integration) has been improved. We have also designed and implemented new algorithms to improve the efficiency of expression retrieval and the precision of the answers (filtering out the uninteresting solutions that are sometimes generated). We have also collected a first test suite.

Of course, a good deal of work has been spent to make the database an OpenMath application. We developed a Content Dictionary for expressing the query language of the database and wrote an OpenMath library in Standard ML (the programming language in which the database is written). The database was not turned into a full OpenMath server at the end of the first year as we had expected for two reasons. The first was the lack of a stable set of basic Content Dictionaries (in part because of the work required by the MathML alignment effort), the second was that some discussions in the project at this time led us to believe that there could have been changes in OpenMath that could have affected this task in important ways.

To use our initial set of data (formulae mostly taken from the ``Handbook of Mathematical Functions'' by Abramowitz and Stegun) we have developed a set of new content dictionaries (and additions to existing CDs), most notably for special functions. Independently of this, a group in Canada produced a similar content dictionary, and work is currently underway to merge the two.

The deductive database normally operates on a (large) set of true statements that are used to answer queries. We added the ability to search in a set of OpenMath formulas modulo the stored true statements. This enables the database to be used as a deductive search engine on OpenMath objects. A close integration with the JOME editor has been performed with the help of Ove. A first interface was demonstrated at the OpenMath industry day in Amsterdam. Searching in a set of OpenMath objects was demonstrated through another JOME interface at the Luxembourg review.

We were expecting free access to BIDS (Bath Information and Data Services) and its associated collection of mathematical abstracts to build an appropriate search engine based on our deductive database. Sadly this was not possible due to the unexpected privatisation of this service, which occurred during the course of the project. Bath decided to use a collection of LaTeX abstracts instead, from the LMS Journal of Computation and Mathematics, but this work began very late in the project, which left little time to tune our searching tools. INRIA is now continuing the work on searching tools by designing and implementing a dedicated toolkit (independent from any encoding or representation of mathematics) to build search engines working on mathematical formulas.

Multimedia Tools

Task 2.3 produced two Java applications, Stilo MathWriter and JOME. Both applications can edit and display OpenMath objects. They can be used as applets in Web browsers to render and interact with mathematical objects in Web pages. MathWriter includes support for MathML and has been designed as a commercial product (or to be included in other commercial products). JOME is oriented towards visual manipulation of (particularly large) expressions, and collaborative working.

Stilo MathWriter

Stilo MathWriter is a Java tool for the creation, editing, rendering and evaluation of MathML and OpenMath objects. In the early stages of its development, Stilo MathWriter was known as STARS. It consists of two co-operating Java applets:

  1. an extension to the publicly available WebEQ applet, providing user interaction and dynamic update and processing of the displayed mathematics within a web page, coupled with
  2. an Input Syntax handler applet which accepts a linear syntax based on TeX, with some extensions for disambiguation. The applet translates the linear input into OpenMath syntax for processing, and into MathML for display. It also accepts user input encoded in OpenMath or Content MathML (entered from the keyboard, pasted from another program or automatically transmitted) and converts this in the opposite direction into the linear syntax.

Stilo MathWriter has a public Java API which allows users to connect to the applet and receive automatic updates when the mathematics is changed. This was used in the NAG Multiple Integrators Demonstration (see section 7.1.4) where MathWriter was used to enter the expression to be integrated, and to display the result.

There is a JavaScript extension to the MathWriter technology, PageBuilder, which allows a user to build up and save a web page by adding mathematical formulae and text.

Stilo MathWriter has some support for evaluating expressions. If it is not able to evaluate the expression for all the operators, it is simplified in terms of understood operators and then displayed. For example 1 + x + 2 simplifies to x + 3 and sin(π/2) evaluates to 1.0. Stilo MathWriter has evaluation logic for the arithmetic operators + - / * log ln and most trigonometric functions.
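
The behaviour described above (evaluate when every operand is known, otherwise simplify and keep the rest symbolic) can be sketched in Python. This is an illustration only, not MathWriter's evaluation logic: the tuple representation of expressions and the names CONSTANTS, FUNCTIONS and simplify are assumptions made for the example, and only a few operators are covered.

```python
import math

# Names with a known numerical value; anything else is a free variable.
CONSTANTS = {"pi": math.pi, "e": math.e}
FUNCTIONS = {"sin": math.sin, "cos": math.cos, "ln": math.log}

def simplify(expr):
    """Evaluate what can be evaluated and leave the rest symbolic.

    Expressions are numbers, names (strings) or tuples (operator, arg, ...).
    """
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return CONSTANTS.get(expr, expr)
    op, *args = expr
    args = [simplify(a) for a in args]
    numbers = [a for a in args if isinstance(a, (int, float))]
    others = [a for a in args if not isinstance(a, (int, float))]
    if op == "+":
        total = sum(numbers)  # fold all numeric terms into one
        if not others:
            return total
        terms = others + ([total] if total != 0 else [])
        return terms[0] if len(terms) == 1 else ("+", *terms)
    if op == "/" and not others and len(args) == 2:
        return args[0] / args[1]
    if op in FUNCTIONS and not others:
        return FUNCTIONS[op](*args)
    return (op, *args)  # operator not understood, or operands unknown
```

With this representation, ("+", 1, "x", 2) becomes ("+", "x", 3), and ("sin", ("/", "pi", 2)) is evaluated numerically because every operand has a known value.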

Stilo MathWriter attempts to provide as ``natural'' looking an input syntax as possible. So, for example, to get sin x one simply types "sin x" or "sinx". "ab" is ab by default. There is no requirement to precede known functions with \ (as in TeX), or to surround arguments with () as in many computer algebra packages. Stilo MathWriter is controlled by a Syntax Table which defines the operators and rules understood. This table is a Java class which can be replaced in different implementations. The basic implementation understands all the operator elements of Content MathML. There are also extensions to support the formal theorem proving system COQ. Stilo MathWriter recognises MathML entities such as &alpha; for the Greek letter "alpha" and so on. In addition, the evaluation-value of certain mathematical constants is known (as described in the MathML Recommendation [12]). These include &ExponentialE; (e) and &pi; (π).
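
One way such table-driven input could work is sketched below in Python. It is not MathWriter's parser: SYNTAX_TABLE and tokenize are hypothetical, the table holds only a few function names, and reading "ab" as two adjacent variables is an assumption about the default behaviour. The sketch shows how known names can be recognised without a leading \ and without separating spaces.

```python
# Hypothetical syntax table: names recognised as functions.
SYNTAX_TABLE = ("sin", "cos", "tan", "log", "ln")

def tokenize(text):
    """Split linear input into (kind, text) tokens using the syntax table."""
    names = sorted(SYNTAX_TABLE, key=len, reverse=True)  # longest match first
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        match = next((n for n in names if text.startswith(n, i)), None)
        if match:
            tokens.append(("function", match))
            i += len(match)
        elif ch.isalpha():
            tokens.append(("variable", ch))  # single-letter variables
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(("number", text[i:j]))
            i = j
        else:
            tokens.append(("operator", ch))
            i += 1
    return tokens
```

Replacing the table changes which names are recognised, which mirrors the idea of swapping the Syntax Table for a different implementation.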

Stilo foresees that the technology developed on this project for MathWriter can be further developed and exploited in various ways:

Stilo MathWriter has been integrated as a front-end into the COQ formal theorem proving system (Technical University of Eindhoven), and is currently being integrated into the prototype online documentation system for the NAG Fortran Libraries (Numerical Algorithms Group, Oxford). Stilo MathWriter is being used by a number of educational institutions, including MIT, in the investigation of web-based interactive mathematical education. It is also on trial at some industrial research locations in the UK.


JOME

JOME (Java OpenMath Editor) is a self-contained software component written in Java (as a Java bean) dedicated to the visualisation and manipulation of mathematical formulae. Conceptually based on the Model-View-Controller design pattern, JOME naturally consists of the corresponding three entities, each being a Java bean. This makes it easy to add different kinds of representation for a formula and to provide different ways of editing it. This also makes using JOME to display or edit formulas in another Java application extremely simple (in an environment such as Symantec Visual Café or IBM Visual Age, the integration work can be carried out completely through the graphical interface).

JOME has some support for manipulating formulas with semantic drag and drop (the selection can be moved from one side of an operator to the other with a relevant mathematical transformation performed).
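
As a hedged illustration of what a semantic move can mean, the Python sketch below drags a term of a sum across an equals sign and negates it on the way, so that a + b = c becomes a = c - b. The data layout (lists of signed terms) and the functions move_term and show are assumptions made for this example; the transformations actually offered by JOME are not specified here.

```python
def move_term(lhs, rhs, index):
    """Move term `index` of the left-hand sum to the right-hand side.

    Each side is a list of (sign, name) pairs with sign +1 or -1.
    Crossing the '=' negates the sign, which keeps the equation equivalent.
    """
    sign, name = lhs[index]
    new_lhs = lhs[:index] + lhs[index + 1:]
    new_rhs = rhs + [(-sign, name)]
    return new_lhs, new_rhs

def show(terms):
    """Render a list of signed terms as a sum; an empty side is 0."""
    if not terms:
        return "0"
    out = ""
    for i, (sign, name) in enumerate(terms):
        if i == 0:
            out += ("-" if sign < 0 else "") + name
        else:
            out += (" - " if sign < 0 else " + ") + name
    return out
```

The user gesture is purely visual, while the editor applies the mathematical rule that makes the result still correct.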

Extensibility is achieved via a plug-in system. This system works via a combination of resource files and dynamic class instantiation. It allows an application to be updated to a new Content Dictionary dynamically.

JOME has been used to build an applet to enter and display mathematical expressions and in an interface to the tools created as part of task 2.4 (mathematical search engine). It has also been used as an applet to handle all the formulae in an electronic course.

Conversion tools

The conversion tools targeted two languages, LaTeX which is important today and MathML which should be very important in the (near) future. LaTeX is of course important because it is the de facto standard for mathematics and physics at the university level. MathML will be important because it is the W3C recommended way to write mathematics in XML documents (and thus probably in all future technical documents).

While converting OpenMath to LaTeX is relatively straightforward, going in the other direction is much more complicated as it amounts to adding semantics to the presentation markup. This translation is normally highly context sensitive. Two translators were built and an on-line demonstration is available via the project Web page (where a user can type a piece of LaTeX and look at the resulting translated OpenMath).

MathML has two subsets, the presentation part (which is close to LaTeX) and the content part (which is closer to OpenMath). Converting to and from MathML presentation is simpler than LaTeX because presentation MathML is much more structured. Content MathML is indeed very close to OpenMath (because of the alignment activities that occurred as part of tasks 1.2 and 1.4). The conversion tools for MathML have been developed as part of Stilo MathWriter. JOME also has the ability to generate MathML.
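
The closeness of Content MathML and OpenMath can be seen in a small Python sketch that maps one encoding onto the other almost element by element. This is not the conversion code of Stilo MathWriter or JOME: the table OM_TO_MATHML lists a few assumed symbol correspondences and to_content_mathml is a hypothetical function covering only applications, symbols, variables and integers.

```python
import xml.etree.ElementTree as ET

# Assumed correspondence between a few OpenMath symbols and Content MathML
# operator elements; a real converter would cover whole Content Dictionaries.
OM_TO_MATHML = {
    ("arith1", "plus"): "plus",
    ("arith1", "times"): "times",
    ("transc1", "sin"): "sin",
}

def to_content_mathml(node):
    """Convert an OpenMath XML element into a Content MathML element."""
    if node.tag == "OMOBJ":
        return to_content_mathml(node[0])
    if node.tag == "OMI":
        out = ET.Element("cn")
        out.text = node.text.strip()
        return out
    if node.tag == "OMV":
        out = ET.Element("ci")
        out.text = node.get("name")
        return out
    if node.tag == "OMS":
        return ET.Element(OM_TO_MATHML[(node.get("cd"), node.get("name"))])
    if node.tag == "OMA":
        out = ET.Element("apply")
        out.extend([to_content_mathml(child) for child in node])
        return out
    raise ValueError("element not handled by this sketch: " + node.tag)
```

Each OpenMath application becomes an apply element whose first child is the operator, so the tree shape is preserved and only the vocabulary changes.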

The project has also produced XSLT (the eXtensible Stylesheet Language, Transformations, a W3C recommendation) code to translate XML encoded OpenMath to and from MathML (this has been used to demonstrate a prototype OpenMath interface to a version of Reduce that has MathML import and export capabilities). When used in Java servlets, these stylesheets allow OpenMath objects to be converted dynamically to presentation MathML by an HTTP server to be displayed natively by a MathML capable browser (such as Mozilla or Amaya).
