Math Typesetting

Typesetting math in HTML was for a long time one of those ‘I can’t believe that hasn’t been solved by now!’ issues. It seemed a bit wrong – wasn’t the Internet more or less invented by math geeks? Did they give up using the web back in 1996 because it didn’t support math? (That would explain the aesthetic of many ‘home pages’ for math professors.)

MathML is the W3C-recommended standard markup for equations – its like ‘HTML tags for math.’ While MathML has a long history and has been established in XML workflows for quite some time, it was only with HTML5 that mathematics finally entered the web as a first-class citizen. This will hopefully lead to some interesting developments as more users explore MathML and actually use it as creatively as they’ve used ‘plain’ text.

However the support in browsers has been sketchy. Internet Explorer does not support MathML natively, Opera only supports a subset, and Konqueror does not support it. WebKit’s partial support only recently made it into Chrome 24 and was held back on Safari due to a font bug. Firefox wins the math geek trophy hands down with early (since FF 3) and best (though not yet complete) support for MathML.

What’s the hold up?

It seems that equation support is underwhelmingly resourced in the browser development world. WebKit’s great improvements this year (including a complete re-write + the effort to get it through Chrome’s code review) have been due to a single volunteer, Dave Barton. Frédéric Wang plays the same heroic (and unpaid) role for Firefox. The question is: Why is this critical feature being left to part-time voluntary developers? Aren’t there enough people and organisations out there that would need math support in the browser (think of all the university math departments just for starters…) to pay for development?

As a result we have no browser that fully supports MathML. While you cannot overstate the accomplishments of the volunteer work, it’s important to tell the full story. Recent cries that Chrome and Safari support MathML can give the wrong impression of the potential for MathML when users encounter bad rendering of basic constructs. The combination of native MathML support, font support and authoring tools (more on this in a later post) makes mathematical content one of the most complex situations for web typesetting.

There have been some valiant attempts to fix this lack of equation support with third-party math typesetting code (Javascript), notably Asciimath by Peter Jipsen with its human-readable syntax and image fallbacks, jqmath by Dave Barton (same as above), and MathQuill.

However by far the most comprehensive solution is the MathJax libraries (Javascript) released as Open Source. While MathJax is developed by a small team (including Frédéric Wang above) they are bigger than most projects, with a growing community and sponsors. MathJax renders beautiful equations as HTML-CSS (using webfonts) or as SVG. The mark up used can be either MathML, Asciimath or TeX. That means authors can write (ugly) mark up like this:

J\alpha(x) = \sum\limits{m=0}^\infty \frac{(-1)^m}{m! , \Gamma(m + \alpha + 1)}{\left({\frac{x}{2}}\right)}^{2 m + \alpha}

into an HTML page and it will be rendered into a lovely looking scalable, copy-and-past-able equation. You can see it in action here.

So why is this really interesting to publishing?

Well…While EPUB3 specifications support MathML this has not made it very far into ereader devices. However MathJax is being utilised in ereader software that has Javascript support. Since many ebook readers are built upon WebKit (notably Android and iOS apps, including iBooks) this strategy can work reasonably effectively. Image rendering of equations in ebook format is another strategy, and possibly the only guaranteed strategy at present, however you can’t copy, edit, annotate or scale the equations and you cannot expect proper accessibility with images.

The only real solution is full equation support in all browsers and ereaders either through native MathML support or through inclusion of libraries like MathJax. We can’t do anything to help the proprietary reader developments get this functionality other than by lobbying them to support EPUB3 (a worthwhile effort) but since more and more reading devices are using the Open Source WebKit we can influence that directly.

The publishing industry should really be asking itself why it is leaving such an important issue to under-resourced volunteers and small organisations like MathJax to solve. Are there no large publishers who need equation support in their electronic books that could step forward and put some money on the table to assist WebKit support of equations? Couldn’t more publishers be as enlightened as Springer and support the development of MathJax? You don’t even have to be a large publisher to help – any coding or financial support would be extremely useful.

It does seem that this is a case of an important technological issue central to the development of the publishing industry ‘being left to others’. I have the impression that it is because the industry as a whole doesn’t actually understand what technologies are at the center of their business. Bold statement as it is, I just don’t see how such important developments could go otherwise untouched by publishers. It could all be helped enormously with relatively small amounts of cash. Hire a coder and dedicate them to assist these developments. Contact Dave Barton or MathJax and ask them how your publishing company can help move this along and back that offer up with cash or coding time. It’s actually that simple.

Many thanks to Peter Krautzberger for fact checking and technical clarifications.

Published on O'Reilly, November 26 2012 http://toc.oreilly.com/2012/11/math-typesetting.html