If you write SGML documents using the DocBook DTD, the files offered here let you embed TeX equations directly in your SGML source files, and arrange for the mathematical notation to be fed directly to TeX on output — hence avoiding both (a) the need to code mathematics in MathML on the input side, and (b) the need to rely upon experimental and unfinished dsssl-based mathematical typesetting code. Provision is made for substituting graphical variants of mathematical formulae in the case of output to HTML.
SGML, using the DocBook DTD, provides an excellent canonical format for storing complex documents (e.g. the manual for the econometrics program gretl, which I maintain), especially if one wants to be able to produce both printable (e.g. PDF) and web-viewable (HTML) versions of these documents on demand.
If these documents contain a significant amount of mathematical notation, however, one is likely to run into problems. There is a means of dealing with mathematics in the context of (semi-)standard DocBook — namely, using MathML mark-up — but this approach has two big problems.
MathML mark-up is extremely verbose. It is impossible to write and edit MathML by hand, other than as an "exercise," or for very simple mathematical expressions.
Even if you can get your math into MathML notation, somehow or other, you then face serious problems in getting it properly typeset in the printable output.
I will expand on each of these problems in turn.
Producing MathML: The inordinate difficulty of writing in MathML directly means that for even moderately complex expressions one is forced to use a GUI equation editor of some sort. But this in turn has two drawbacks, from my point of view. First, in the realm of free, open-source software, such editors are hard to come by. The experimental Amaya is, to my knowledge, the only such open-source tool available. Second, even if a suitable equation editor is available, writing mathematics in this way does not sit well with my preferred mode of document preparation, namely WYSIWYW or "What You See Is What You Wrote," for example editing the SGML source in emacs with the help of a suitable mode.
Typesetting from MathML: A dsssl engine such as openjade can turn the MathML into TeX for you, but the results are likely to be disappointing, particularly if you are used to typesetting mathematics using TeX itself. TeX's native mathematical typesetting is near-perfect, only occasionally requiring manual tweaking to achieve optimal results; it is also rather comprehensive, with the aid of the AMS (American Mathematical Society) extensions if need be. But if you take the route of MathML to TeX via dsssl and jade, the specifics of the math typesetting must be handled by the dsssl stylesheet. David Carlisle put some work into this a few years back (for which we can be grateful), but he didn't finish the job and nobody else has done so since. Thus if you send MathML through jade to TeX you are likely to find (a) that those elements that are recognized by the stylesheets are typeset less adeptly than by TeX itself (with clumsy-looking spacing), while (b) various important elements may not be recognized at all. For example, in my field of statistics the overbar (denoting the arithmetic mean) is a common modifier, but it is simply ignored. Other formulations common in statistics are also ignored, or are not dealt with properly, so this route is not really usable for me.
"DBTeXMath" is implemented via (a) a minor hack to the standard DocBook DTD, (b) an equally minor hack to the DocBook dsssl stylesheets as offered by Norman Walsh, and (c) a couple of perl scripts, one to "unescape" TeX math passed through jade and one to generate graphical versions of TeX math for use in making HTML.
The DTD hack defines a new element, <texmath>, a container for literal TeX mathematical mark-up, for use within the various equation contexts in a DocBook document. The stylesheet hacks produce the effect that (a) the literal TeX math gets passed through when producing PDF or postscript, while (b) it gets ignored (in favor of a PNG rendition of the math) when producing HTML.
Using perl to post-process the output of a dsssl engine is truly an ugly hack, but I think I can rationalize it! While a "cleaner" solution would be to modify openjade (adding an appropriate switch) there are a couple of things against this: one, I'm no C++ jockey and I simply don't have time to learn how to hack on openjade effectively; and two, by offering "DBTeXMath" as a small package of scripts rather than as a patch to openjade of questionable acceptability I'm (hopefully) making it easier for others to try it out and see if they like it.
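As a rough illustration of what the "unescape" step has to undo, here is a minimal shell sketch. It rests on an assumption that should be checked against real jade output: that the TeX back-end writes the characters \, ^, _ and ~ as \charNN{} codes and puts a backslash in front of {, }, $, &, # and %. The function name unescape_texmath is hypothetical, and the perl script shipped in the package may well work differently.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: jade's TeX back-end escapes \ ^ _ ~ as
# \char92{} \char94{} \char95{} \char126{} and escapes { } $ & # %
# with a leading backslash. Verify against your own jade output.

# Hypothetical helper: reads escaped text on stdin, writes literal TeX.
unescape_texmath() {
    # The backslash-prefixed characters are restored first, so that a
    # backslash recovered from \char92{} is never mistaken for an escape.
    sed -e 's/\\\([{}$&#%]\)/\1/g' \
        -e 's/\\char92{}/\\/g' \
        -e 's/\\char94{}/^/g' \
        -e 's/\\char95{}/_/g' \
        -e 's/\\char126{}/~/g'
}

# Usage: feed one escaped formula through the filter.
printf '%s\n' '\$C = \char92{}alpha + \char92{}beta Y\char94{}\{\char92{}gamma\}\$' |
    unescape_texmath
# prints: $C = \alpha + \beta Y^{\gamma}$
```

A real filter would also have to confine itself to the texmath portions of the file, since the rest of the TeX output must stay escaped.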
The package consists of the following files:
dbtexmath.dtd: "driver" file that defines the new elements and calls the DocBook DTD.
TeXMath.dsl: dsssl stylesheet fragment for use in producing TeX output (for further processing by jadetex in order to make PDF or postscript).
HTMLMath.dsl: dsssl stylesheet fragment suitable for producing HTML output using jade.
sample-both.dsl: skeleton dsssl stylesheet that calls either TeXMath.dsl or HTMLMath.dsl depending on the sort of output that is required. I have kept this very simple; you can add your own stylistic customizations.
sample SGML source file showing the use of the texmath element.
perl script to be run after running jade but before running jadetex.
perl script to generate on the fly PNG images of equations in the SGML source file. Requires latex, dvips and the ImageMagick convert program.
sample configuration file for pdfjadetex.
Makefile: sample to show the intended order of events and dependencies when using this approach.
This explanatory document in PDF format.
dbtexmath.tar.gz: gzipped archive containing all of the above.
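The image-generation script in the list above relies on latex, dvips and ImageMagick's convert. The following shell sketch shows one way such a pipeline can be strung together for a single equation. It is not the package's perl script: the function names, the wrapper document, the file name cfunc and the density value are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: wrapper layout, names and the density value are
# assumptions, not taken from the package's perl script.

# Write a minimal LaTeX document around one piece of TeX math.
# $1 = the math, including its $...$ or \[...\] delimiters
# $2 = name of the .tex file to create
write_eq_tex() {
    {
        printf '%s\n' '\documentclass{article}'
        printf '%s\n' '\pagestyle{empty}'
        printf '%s\n' '\begin{document}'
        printf '%s\n' "$1"
        printf '%s\n' '\end{document}'
    } > "$2"
}

# Run latex, then dvips (-E asks for a tight bounding box),
# then convert to produce the PNG. $1 = base name without extension.
eq_to_png() {
    latex -interaction=batchmode "$1.tex" >/dev/null &&
    dvips -E -o "$1.eps" "$1.dvi" &&
    convert -density 120 "$1.eps" "$1.png"
}

write_eq_tex '$C = \alpha + \beta Y^{\gamma} + \epsilon$' cfunc.tex

# Only attempt the conversion if all three tools are installed.
if command -v latex >/dev/null 2>&1 &&
   command -v dvips >/dev/null 2>&1 &&
   command -v convert >/dev/null 2>&1; then
    eq_to_png cfunc || echo "conversion of cfunc failed" >&2
fi
```

The resulting cfunc.png is the kind of file that a fileref attribute in the SGML source would point to.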
To try it out you should first explode dbtexmath.tar.gz in a suitable location (it will create a subdirectory named dbtexmath). Then check the Makefile for compatibility with your system, and check the local paths to the DocBook DTD and stylesheets in dbtexmath.dtd and sample-both.dsl. You may then compile the sample document with make pdf (PDF output) or make html (HTML output to the subdirectory html_out). To compile your own SGML document using this system, copy the Makefile and edit it appropriately, copy your document into the dbtexmath directory, and compile it.
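The steps just described can be collected into a small shell function. This is only a sketch: the function name try_dbtexmath is hypothetical, and it assumes that tar and make are on the path and that you have already checked the paths in the Makefile, dbtexmath.dtd and sample-both.dsl after a first unpacking.

```shell
#!/bin/sh
# Sketch only: try_dbtexmath is a hypothetical helper name.

# Unpack the archive and build both versions of the sample document.
# $1 = path to the archive (defaults to dbtexmath.tar.gz)
try_dbtexmath() {
    archive=${1:-dbtexmath.tar.gz}
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    (
        # Unpacking creates the subdirectory dbtexmath.
        tar xzf "$archive" &&
        cd dbtexmath &&
        make pdf &&    # PDF output
        make html      # HTML output in the subdirectory html_out
    )
}

# Usage (run from the directory holding the archive):
#   try_dbtexmath dbtexmath.tar.gz
```

The build runs in a subshell so that the calling shell stays in its original directory.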
Here is a little bit of inline math: $C = \alpha + \beta Y^{\gamma} + \epsilon$. It is embedded in the text.
The SGML source looks like this:
<para>Here is a little bit of inline math:
  <inlineequation>
    <inlinegraphic fileref="figures/cfunc.png"/>
    <texmath> $C = \alpha + \beta Y^{\gamma} + \epsilon$ </texmath>
  </inlineequation>. It is embedded in the text.
</para>
The equation looks fine in PDF, but in the HTML version the math sits above the baseline. This should probably be fixable, but I haven't got to it yet.
New paragraph. Here is a bit of displayed math
\[F_{2,T-k}=\frac{(ESS_r-ESS_u)/2}{ESS_u/(T-k)}\]
Note that for this sort of display you get excessive vertical spacing if you bracket the math with double dollar signs, $$. The spacing is better if you use \[ and \] instead. The source is:
<para>New paragraph. Here is a bit of displayed math
  <informalequation>
    <graphic fileref="figures/fstat.png"/>
    <texmath> \[F_{2,T-k}=\frac{(ESS_r-ESS_u)/2}{ESS_u/(T-k)}\] </texmath>
  </informalequation>
  Note that for this sort of display you get excessive...
Now here is a slightly more complex piece of displayed math, for which we adjust the spacing using
\renewcommand{\arraystretch}{1.3}
\setlength{\arraycolsep}{.05in}
within the texmath element:
Finally we try using the <equation> tag with a title:
In all cases the graphic is ignored if you're using jade's TeX back-end, while the texmath is ignored and the graphic used when producing HTML. This document (as you might have guessed) was produced using this system.