Using Jade to generate LaTeX

Jade's TeX back-end supports (a subset of) the DSSSL style language, but it requires a special version of TeX. This document describes an alternative approach, using a LaTeX back-end to generate arbitrary LaTeX, in the spirit of Jade's transformation back-end. That is, rather than doing the styling within Jade, using the DSSSL style language, we transform the SGML document into (more-or-less) high-level LaTeX markup, and do all the formatting within LaTeX. You can generate LaTeX using the transformation back-end plus a lot of (literal ...), but that's rather error-prone, and has potential whitespace problems.

This approach means that (a) you can generate LaTeX which is itself distributable, because it doesn't rely on a particular TeX setup, and (b) you can offload the formatting magic into a LaTeX class, which might be more comfortable for you. This approach runs counter to DSSSL's supposedly central virtue, of generating high-quality print output from a format-independent stylesheet. However, as the SGML Transformation back-end has shown, Jade and the DSSSL Style Language together make a powerful general SGML processor, which needn't be limited to final (ie, immediately printable) versions of documents.

With the patched sources, the version reported by jade -v is `Jade version "1.3.2-starlink-2"'.

Flow object extensions

The extensions described here are available in conjunction with the LaTeX back-end, selected with the -t latex option. This requires an appropriately patched version of Jade, built with the --enable-latex option to ./configure.

The extensions, the code which supports them, and even this documentation, are closely based on James Clark's SGML Transformation back-end (qv).

command and empty-command
These flow objects result in LaTeX commands in the output. command is a compound flow object, whilst empty-command is an atomic flow object (it may not have child flow objects). Both flow objects have the following non-inherited characteristics:
name
This is a string-valued characteristic that specifies the name of the command. It defaults to the generic identifier of the current node.
parameters
This specifies the command's parameters as a list of strings. A plain string is written as a required parameter. If the string starts with a `?' character, the parameter value is the remainder of the string, written as an optional parameter. If the string starts with `!', the remainder is written as a required parameter, which allows a required parameter value to start with a `?' character.

For example, the following would generate the documentclass line at the beginning of a LaTeX file:

    (make empty-command name: "documentclass"
          parameters: (list "?a4paper" "article"))
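
The `!' prefix matters only when a required parameter value itself begins with `?'. A minimal sketch, in which the command name and parameter values are purely illustrative, and the output shown in the comment is what the description above implies rather than verified output:

    ;; "?Short title" -> optional parameter "Short title"
    ;; "!?Why"        -> required parameter "?Why" (the `!' is dropped)
    ;; Expected, per the description above: \section[Short title]{?Why}
    (make empty-command name: "section"
          parameters: (list "?Short title" "!?Why"))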
  

The difference between the two flow objects is that the content of the command flow object appears as the command's final parameter. Thus, the following are equivalent:

(element em
  (make command name: "emph"
    (process-children)))
and
(element em
  (make empty-command name: "emph"
    parameters: (list (data (current-node)))))
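
Parameters and content can be combined: by the rule above, the content of a command flow object follows any listed parameters as the final parameter. A sketch, assuming a hypothetical element type warning and a LaTeX setup that provides \textcolor (for instance via the color package):

    ;; Expected, per the rule above: \textcolor{red}{...content of warning...}
    (element warning
      (make command name: "textcolor"
            parameters: (list "red")
        (process-children)))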

environment
This flow object results in an environment in the output. It is a compound flow object (it can have child flow objects). It has the following non-inherited characteristics:
name
As for command, except that it names the environment
parameters
As for command
brackets
This allows you to specify explicitly what the beginning and end of the environment should be. The characteristic's value is a list consisting of two strings, the first of which is inserted at the beginning, and the second at the end, of the environment. For example, you might specify that an element type mequation should be transformed into a LaTeX displayed equation via a flow object constructor such as:
  (element mequation
    (make environment brackets: '("\\[" "\\]")
          (process-children)))
  
recontrol
The LaTeX generated by the back-end has line-breaks inserted to avoid the lines becoming too long. If these are inappropriate for some reason, you can override the line-breaking by giving a value to the recontrol characteristic. Its value is a string of three characters, which control the line-breaking before, in the middle of (ie, between the \begin or \end and the environment name), and after the opening and closing of the environment. If the corresponding character is `/', an RE (record end, that is, a line-break) is inserted at the appropriate point; if it is `-', no RE is inserted. For example, a verbatim element type might be supported via:
  (element verbatim
    (make environment
      name: "verbatim"
      recontrol: "/-/"
      (process-children)))
  
Thus, the \begin{verbatim} and \end{verbatim} will be on lines by themselves, with no RE between the \begin or \end and the environment name.
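
Since name and parameters behave as for command, an environment that takes arguments can be produced in the same way. A sketch, assuming a hypothetical element type simpletable whose children generate the table rows; the column specification is fixed here only for brevity:

    ;; Expected, per the descriptions above:
    ;;   \begin{tabular}{lll} ...rows... \end{tabular}
    (element simpletable
      (make environment
        name: "tabular"
        parameters: (list "lll")
        (process-children)))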

The entity and formatting-instruction flow object classes are available exactly as in the SGML transformation back-end.
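
As a reminder of how formatting-instruction is used there, a sketch follows. It assumes that the public identifier and the data: characteristic are the same as those documented for the transformation back-end, and that gap is a hypothetical empty element type. It makes no claim about how the inserted string interacts with the escaping of special characters:

    (declare-flow-object-class formatting-instruction
      "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction")

    ;; Insert a fixed piece of LaTeX for an empty element.
    ;; "\\" in a DSSSL string denotes a single backslash.
    (element gap
      (make formatting-instruction data: "\\bigskip "))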

In addition, there is the following characteristic:

escape-tex?
This is an inherited boolean characteristic. When true (the default), characters which are significant to LaTeX, such as backslashes or dollar signs, are escaped on output. If false, this escaping is turned off.

For example, the verbatim element type mentioned above would best be implemented as:

(element verbatim
  (make environment
    name: "verbatim"
    recontrol: "/-/"
    escape-tex?: #f
    (process-children)))
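
The same characteristic is useful wherever element content is already LaTeX. A sketch, assuming that the content of the mequation element type used earlier is written in LaTeX math notation, so that characters such as `\', `^' and `_' must reach the output unescaped:

    (element mequation
      (make environment brackets: '("\\[" "\\]")
            escape-tex?: #f
        (process-children)))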

The flow object classes and the escape-tex? characteristic must be declared, using declare-flow-object-class and declare-characteristic respectively, in any DSSSL specification that makes use of them. A suitable set of declarations is:

(declare-flow-object-class command
  "UNREGISTERED::Norman Gray//Flow Object Class::command")
(declare-flow-object-class empty-command
  "UNREGISTERED::Norman Gray//Flow Object Class::empty-command")
(declare-flow-object-class environment
  "UNREGISTERED::Norman Gray//Flow Object Class::environment")
(declare-characteristic escape-tex?
  "UNREGISTERED::Norman Gray//Characteristic::escape-tex?"
  #t)
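
Putting the pieces together, a minimal stylesheet might look like the following sketch. The element types doc and em are hypothetical, the declarations above are assumed to come first, and the file names in the comment are placeholders:

    ;; Run with something like: jade -t latex -d style.dsl doc.sgml
    ;; (the declarations listed above go here)

    ;; Root element: emit the documentclass line, then wrap the
    ;; whole document in a document environment.
    (element doc
      (make sequence
        (make empty-command name: "documentclass"
              parameters: (list "?a4paper" "article"))
        (make environment name: "document"
          (process-children))))

    ;; Inline emphasis becomes \emph{...}.
    (element em
      (make command name: "emph"
        (process-children)))

All of the layout decisions are then left to the LaTeX class named in the documentclass line.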

Paragraphs and white space

The back-end attempts to suppress redundant whitespace. It removes blank lines, because LaTeX would interpret them as spurious paragraph breaks. Paragraphs are delimited purely by \par commands. The back-end provides minimal support for the paragraph flow object class, so you should support paragraph elements with a construction like:

(element para
  (make paragraph
    (process-children)))

The suppression of blank lines is not terribly sophisticated (it should really be supported at a slightly lower level than it is), and I would be grateful to hear of any problems with it. If there are problems, a workaround is to insert a command \catcode`\^^M=10 near the beginning of the LaTeX file. This will cause all newlines to be interpreted as simple spaces, so that blank lines will no longer be interpreted as delimiting paragraphs.

Bugs

This is a patch to the official openjade release. I've submitted the patch to the openjade team, but unless and until it's incorporated into the main distribution, any bugs in the LaTeX support should be reported to me rather than to them.

Norman Gray
9 February 2000