% etex_gen.tex --- How to generate e-TeX --- last modified 06 Mar 1998 \font\eighttt= cmtt8 \font\eightrm= cmr8 \font\rtitlefont= cmr7 scaled\magstep5 \font\ititlefont= cmmi7 scaled\magstep5 \def\titlefont{\rtitlefont \textfont1=\ititlefont} \def\eTeX{$\varepsilon$-\TeX} \def\NTS{NTS} \let\mc=\eightrm \rm \let\mainfont=\tenrm \def\.#1{\hbox{\tt#1}} \def\\#1{\hbox{\it#1\/\hskip.05em}} % italic type for identifiers \parskip 2pt plus 1pt \baselineskip 12pt plus .25pt \output{\shipout\box255\global\advance\pageno by 1} % for the title page only \null \vfill \centerline{\titlefont How to generate \eTeX} \vskip 6pt \centerline{({\sl Version 2, March 1998\/})} \vskip 18pt \centerline{by The \NTS\ Team} \vskip 6pt \centerline{Peter Breitenlohner, Max-Planck-Institut f\"ur Physik, M\"unchen} \vskip 6pt \centerline{Philip Taylor, RHBNC, University of London} \vfill \centerline{\vbox{\hsize 4in \noindent Given an implementation of \TeX82 for a particular system, this report describes how to generate a corresponding implementation of \eTeX.}} \vskip 24pt {\baselineskip 9pt \eightrm\noindent The preparation of this report was supported in part by DANTE, Deutschsprachige Anwendervereinigung \TeX\ e.V.\hfil\break `\TeX' is a trademark of the American Mathematical Society. }\pageno=0\eject \output{\shipout\vbox{ % for subsequent pages \baselineskip0pt\lineskip0pt \hbox to\hsize{\strut \ifodd\pageno \hfil\eightrm\firstmark\hfil \mainfont\the\pageno \else\mainfont\the\pageno\hfil \eightrm\firstmark\hfil\fi} \vskip 10pt \box255} \global\advance\pageno by 1} \let\runninghead=\mark \outer\def\section#1.{\noindent{\bf#1.}\quad \runninghead{\uppercase{#1} }\ignorespaces} \section Introduction. Let us first review the process of generating an implementation of \TeX82 for a particular system from the source files as, e.g., described in the \.{WEB} manual [1]. The system-independent source file \.{tex.web} must remain unmodified. All changes to \.{tex.web} necessary for a particular operating system and\slash or compiler are to be collected in a system-dependent change file, typically named \.{tex.ch}. Both files \.{tex.web} and \.{tex.ch} are effectively merged when input by the \.{WEB} system programs \.{WEAVE} and \.{TANGLE}. When \.{WEAVE} processes this merged input, a file \.{tex.tex} is produced. Further processing by \TeX\ yields a `pretty printed' version of the input together with an index. When \.{TANGLE} processes the merged input, a string pool file \.{tex.pool} and a Pascal file \.{tex.pas} (or similar) are produced. The Pascal file can then be further processed by a suitable compiler and\slash or language converter such as \.{web2c}, and eventually yields an executable program. There are actually three versions of the program: First there is \.{INITEX} with its capability to initialize all of \TeX's tables and to write them in compact form to a format file. Next there is the production version \.{VIRTEX} requiring a format file as input. Finally there is \.{TRIPTEX}, a version of \.{INITEX} with special values for some of \TeX's parameters, for the \.{TRIP} test [2] that should be used to validate a \TeX\ implementation. Depending on the capabilities of the compiler, these three versions of the program are generated from three slightly different change files or they are generated from one change file with different compiler options. They might even be one and the same executable program used with different run time options. \vskip 24pt plus 24pt \section Generating \eTeX. The process of generating \eTeX\ is essentially the same as that of generating \TeX\ as described above. Conceptually there is a system-independent source file \.{etex.web} and a system-dependent change file \.{etex.sys}. Processing these two files by \.{TANGLE} yields a string pool file \.{etex.pool} as well as a Pascal file \.{etex.pas}, whilst processing by \.{WEAVE} produces a \TeX\ source file, \.{etex.tex}. It may, however, be necessary to increase some of the constants defined in \.{TANGLE} and \.{WEAVE}. The following values should suffice in most cases: $$ \vcenter{\halign{$#$\hfil\qquad&#\hfil\cr \\{max\_bytes}\times\\{ww}=100~000&\.{TANGLE} and \.{WEAVE}\cr \\{max\_texts}=2~500&\.{TANGLE} and \.{WEAVE}\cr \\{max\_toks}\times\\{zz}=180~000&\.{TANGLE}\cr \\{max\_names}=5~000&\.{TANGLE}\cr \\{max\_scraps}=3~000&\.{WEAVE}\cr \\{stack\_size}=300&\.{WEAVE}\cr }} $$ The source file \.{etex.web} for \eTeX\ does not, however, exist as a physical file. It is the hypothetical file obtained by applying the changes in the actual source file \.{etex.ch} to \.{tex.web}. Thus \.{etex.web} inherits the bulk of code from \.{tex.web}, whilst the system-independent source file \.{etex.ch} for \eTeX\ defines the differences between \.{etex.web} and \.{tex.web}. In order to generate an implementation of \eTeX\ two change files have to be applied to \.{tex.web}, one after the other (the actual file names may differ): $$ \vcenter{\halign{#\hfil&\qquad\.{#}\hfil&\qquad#\hfil\cr 0.&tex.web&system-independent \.{WEB} source for \TeX\cr 1.&etex.ch&system-independent changes for \eTeX\cr 2.&etex.sys&system-dependent changes for \eTeX\cr }} $$ The process of merging several change files into \.{tex.web} should certainly not be performed by hand. There are programs such as \.{TIE} and \.{PATCHWEB} that perform this process automatically. If no such program is available, a \TeX\ program \.{WEBMERGE} can be used. \.{WEBMERGE} reads \.{tex.web} and up to nine change files and produces a merged change file that can then be processed, together with \.{tex.web}, by \.{TANGLE} and \.{WEAVE}. [On systems such as VMS, use of \.{WEBMERGE} can leave a large number of temporary files lying around; this can be avoided by setting a version limit (e.g.~1) on any existing versions of those files, or by setting a version limit on the directory in which they will be created. On other systems, it will probably leave one large temporary file.] Every implementor of \eTeX\ is responsible for creating and maintaining a suitable \.{etex.sys} in the same way as every implementor of \TeX\ is responsible for creating and maintaining \.{tex.ch}. Since the bulk of code in \.{etex.web} is identical to that in \.{tex.web} the bulk of the system-dependent changes in \.{etex.sys} for a particular system will be identical to those in \.{tex.ch} for the same system. In the following we try to give some hints where \.{etex.sys} for a particular system might deviate from the corresponding \.{tex.ch}. First, it might be necessary to increase the size of \TeX's string pool in order to accommodate \eTeX's additional strings (message texts as well as multi-letter control sequences). If this turns out to be necessary for \eTeX\ it would certainly not be harmful to do it for \TeX\ as well. \TeX\ and \eTeX\ use three constants related to the string pool: \\{max\_strings} the maximal number of strings, \\{pool\_size} the maximal number of string characters, and \\{string\_vacancies} the minimal number of available string characters in addition to those occupied by strings from the pool file. It is, therefore, sufficient to increase \\{pool\_size} (or reduce \\{string\_vacancies}) by the number of \eTeX's additional string characters and to increase \\{max\_strings} by the number of \eTeX's additional strings. The latter will, however, be unnecessary for most implementations as \\{max\_strings} is usually increased substantially beyond its standard value in order to accommodate large \TeX\ macro packages. For Version~2 of \eTeX, there are about 100 additional strings with about 1500 additional string characters. The precise numbers can be obtained by running \.{POOLTYPE} on \TeX's and \eTeX's pool files (\.{POOLTYPE} reports the total number of strings and string characters in a pool file). Next, \.{etex.sys} may contain a system-dependent modification of the \\{eTeX\_banner} string. The modified \\{eTeX\_banner} string must contain `\.{e-TeX}' as well as the \eTeX\ version number. Note, however, that the \\{banner} string modified by \.{tex.ch} will not be referenced by \eTeX\ unless the implementor intentionally changes that aspect of \eTeX's functionality: therefore \.{etex.sys} can modify the \\{banner} string in the same way as does \.{tex.ch}. Then, \.{etex.sys} might deviate from \.{tex.ch} in order to use a different pool file name and\slash or format file extension (see below). Finally, \.{etex.sys} will necessarily deviate whenever \.{etex.ch} and \.{tex.ch} try to change the same piece of \.{WEB} code or when the system-independent \eTeX\ changes from \.{etex.ch} and the system-dependent \TeX\ changes from \.{tex.ch} interfere in some other way. In case of any such interference implementors must first of all determine how to combine the respective changes from \.{etex.ch} and \.{tex.ch} in order to obtain \eTeX's functionality for a particular system. Obviously, this process cannot be automated since it requires human insight. The \NTS\ team has tried to formulate \.{etex.ch} such that interferences with system-dependent change files \.{tex.ch} are unlikely. Suggestions by implementors how any remaining such interferences could be avoided by a reformulation of \.{etex.ch} will be taken into serious consideration. Such interferences can be further reduced by reformulating the system-dependent change file \.{tex.ch} for \TeX, e.g.\ by reducing the range of change entries from entire modules to the pieces of code that are actually changed. Implementors might prefer to maintain the system-dependent change file \.{etex.sys} not as a physical file but as a hypothetical file defined through its deviation from \.{tex.ch}. If there are no interferences of the kind mentioned above, then the effect of applying the changes from the hypothetical \.{etex.sys} to the hypothetical \.{etex.web} can be achieved by applying 3 change files, one after the other, to \.{tex.web} (using some tool such as \.{TIE}, \.{PATCHWEB}, or \.{WEBMERGE}): $$ \vcenter{\halign{#\hfil&\qquad\.{#}\hfil&\qquad#\hfil\cr 0.&tex.web&system-independent \.{WEB} source for \TeX\cr 1.&etex.ch&system-independent changes for \eTeX\cr 2.&tex.ch&system-dependent changes for \TeX\cr 3.&tex.ech&additional system-dependent changes for \eTeX\cr }} $$ The third change file \.{tex.ech} will be rather short and contains just the differences between \.{etex.sys} and \.{tex.ch}. It is recommended that implementors try to remove all interferences between \.{etex.ch} and \.{tex.ch} and use this method to generate \eTeX. As with \TeX\ there are three versions of \eTeX: \.{e-INITEX}, \.{e-VIRTEX}, and \.{e-TRIPTEX}. Depending on the implementation they will again be generated from the three slightly different versions of \.{tex.ch} or with different compiler options or they may be one and the same program used with different run time options. \vskip 24pt plus 24pt \section \eTeX\ modes. In order to ensure maximal compatibility with \TeX, \eTeX\ can run in either compatibility mode or extended mode. The possibility of this choice is, of course, an extended feature of \eTeX. Once \eTeX\ has chosen compatibility mode it is, however, a legitimate implementation of \TeX\ (assuming the \TeX\ implementation itself is legitimate). The only differences between \eTeX\ in compatibility mode and \TeX\ are those allowed by D.~E.~Knuth [2] between different implementations of \TeX. An \.{e-TRIP} test suite [3] defines the criteria by which a program can qualify to use the name `\eTeX'. Part of the \.{e-TRIP} test consists of the standard \.{TRIP} test for \.{e-TRIPTEX} in compatibility and extended mode. \eTeX\ can therefore be used instead of \TeX\ without the necessity to maintain both programs. For the case that both programs should nevertheless co-exist on a system, it might be a good idea to name the pool file for \eTeX\ \.{etex.pool} instead of \.{tex.pool} and to use an extension other than \.{.fmt}, e.g., \.{.efmt} for \eTeX\ format files. (Format files for \TeX\ and \eTeX\ are incompatible). All this will require additional changes in the file \.{tex.ech}. When \.{INITEX} or \.{VIRTEX} start, they inspect the first non-blank character from the command line or in response to the \.{**} prompt. This may be an \.{\&} immediately followed by the name of a format file to be loaded. Otherwise \.{VIRTEX} loads a default format, whereas \.{INITEX} starts without loading a format. When \.{e-INITEX} or \.{e-VIRTEX} start, they inspect the first non-blank character from the command line or in response to the \.{**} prompt. This may again be an \.{\&} immediately followed by the name of a format file to be loaded; otherwise \.{e-VIRTEX} loads a default format. For \.{e-INITEX} the first non-blank character may be an \.{*} immediately followed by what would normally be the input for \.{INITEX} (without intervening blanks). \.{e-INITEX} enters extended mode in response to the \.{*}, or compatibility mode otherwise. This mode is recorded in format files produced by \.{e-INITEX} and entered again when such a format is loaded (by either \.{e-INITEX} or \.{e-VIRTEX}). \vskip 24pt plus 24pt \section References. \item {[1]} {\sl The \.{WEB} system of structured documentation\/}, by Donald E.~Knuth,\hfil\break Stanford Computer Science Report~980. \item {[2]} {\sl A torture test for \TeX\/}, by Donald E.~Knuth, Stanford Computer Science Report~1027. \item {[3]} {\sl A torture test for \eTeX\/}, by The \NTS\ Team (Peter Breitenlohner and Bernd Raichle). \end