% language=uk

\environment luatex-style

\startcomponent luatex-enhancements

\startchapter[reference=enhancements,title={Basic \TEX\ enhancements}]

\startsection[title={Introduction}]

\startsubsection[title={Primitive behaviour}]

From day one, \LUATEX\ has offered extra features compared to the superset of
\PDFTEX, which includes \ETEX, and \ALEPH. This has not been limited to the
possibility to execute \LUA\ code via \prm {directlua}, but \LUATEX\ also adds
functionality via new \TEX|-|side primitives or extensions to existing ones.

When \LUATEX\ starts up in \quote {iniluatex} mode (\type {luatex -ini}), it
defines only the primitive commands known by \TEX82 and the one extra command
\prm {directlua}. As is fitting, a \LUA\ function has to be called to add the
extra primitives to the user environment. The simplest method to get access to
all of the new primitive commands is by adding this line to the format generation
file:

\starttyping
\directlua { tex.enableprimitives('',tex.extraprimitives()) }
\stoptyping

But be aware that the curly braces may not have the proper \prm {catcode}
assigned to them at this early time (giving a \quote {Missing number} error), so
you may need to put these assignments before the above line:

\starttyping
\catcode `\{=1
\catcode `\}=2
\stoptyping

More fine|-|grained control over primitives is possible and you can look up the
details in \in {section} [luaprimitives]. For simplicity's sake, this manual
assumes that you have executed the \prm {directlua} command as given above.

The startup behaviour documented above is considered stable in the sense that
there will not be backward|-|incompatible changes any more. We have promoted some
rather generic \PDFTEX\ primitives to core \LUATEX\ ones, and the few that we
inherited from \ALEPH\ (\OMEGA) are also promoted. Effectively this means that we
now only have the \type {tex}, \type {etex} and \type {luatex} sets left.
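
As an illustration of the finer control over primitives mentioned in this
subsection, the following sketch enables only a subset and adds a prefix to one
name. The prefix \type {LuaTeX} and the choice of \type {formatname} are picked
for illustration only; the call signatures are those of \type
{tex.enableprimitives} and \type {tex.extraprimitives} as used above.

\starttyping
\catcode `\{=1
\catcode `\}=2
\directlua {
    tex.enableprimitives('', tex.extraprimitives('etex'))
    tex.enableprimitives('LuaTeX', { 'formatname' })
}
\stoptyping

With this setup the \ETEX\ primitives become available under their usual names,
while \type {\formatname} is reachable as \type {\LuaTeXformatname}. Note that no
\LUA\ line comments are used inside the argument, because at this stage line
endings are still turned into spaces.
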
In \in {Chapter} [modifications] we discuss several primitives that are derived from \PDFTEX\ and \ALEPH\ (\OMEGA). Here we stick to real new ones. In the chapters on fonts and math we discuss a few more new ones. \stopsubsection \startsubsection[title={Version information}] \startsubsubsection[title={\lpr {luatexbanner}, \lpr {luatexversion} and \lpr {luatexrevision}}] \topicindex{version} \topicindex{banner} There are three new primitives to test the version of \LUATEX: \unexpanded\def\VersionHack#1% otherwise different luatex and luajittex runs {\ctxlua{% local banner = "\luatexbanner" local banner = string.match(banner,"(.+)\letterpercent(") or banner context(string.gsub(banner ,"jit",""))% }} \starttabulate[|l|l|pl|] \DB primitive \BC value \BC explanation \NC \NR \TB \NC \lpr {luatexbanner} \NC \VersionHack{\luatexbanner} \NC the banner reported on the command line \NC \NR \NC \lpr {luatexversion} \NC \the\luatexversion \NC a combination of major and minor number \NC \NR \NC \lpr {luatexrevision} \NC \luatexrevision \NC the revision number, the current value is \NC \NR \LL \stoptabulate The official \LUATEX\ version is defined as follows: \startitemize \startitem The major version is the integer result of \lpr {luatexversion} divided by 100. The primitive is an \quote {internal variable}, so you may need to prefix its use with \prm {the} depending on the context. \stopitem \startitem The minor version is the two|-|digit result of \lpr {luatexversion} modulo 100. \stopitem \startitem The revision is reported by \lpr {luatexrevision}. This primitive expands to a positive integer. \stopitem \startitem The full version number consists of the major version, minor version and revision, separated by dots. \stopitem \stopitemize \stopsubsubsection \startsubsubsection[title={\lpr {formatname}}] \topicindex{format} The \lpr {formatname} syntax is identical to \prm {jobname}. In \INITEX, the expansion is empty. 
Otherwise, the expansion is the value that \prm {jobname} had during the \INITEX\ run that dumped the currently loaded format. You can use this token list to provide your own version info. \stopsubsubsection \stopsubsection \stopsection \startsection[title={\UNICODE\ text support}] \startsubsection[title={Extended ranges}] \topicindex{\UNICODE} Text input and output is now considered to be \UNICODE\ text, so input characters can use the full range of \UNICODE\ ($2^{20}+2^{16}-1 = \hbox{0x10FFFF}$). Later chapters will talk of characters and glyphs. Although these are not interchangeable, they are closely related. During typesetting, a character is always converted to a suitable graphic representation of that character in a specific font. However, while processing a list of to|-|be|-|typeset nodes, its contents may still be seen as a character. Inside \LUATEX\ there is no clear separation between the two concepts. Because the subtype of a glyph node can be changed in \LUA\ it is up to the user. Subtypes larger than 255 indicate that font processing has happened. A few primitives are affected by this, all in a similar fashion: each of them has to accommodate for a larger range of acceptable numbers. For instance, \prm {char} now accepts values between~0 and $1{,}114{,}111$. This should not be a problem for well|-|behaved input files, but it could create incompatibilities for input that would have generated an error when processed by older \TEX|-|based engines. The affected commands with an altered initial (left of the equal sign) or secondary (right of the equal sign) value are: \prm {char}, \prm {lccode}, \prm {uccode}, \lpr {hjcode}, \prm {catcode}, \prm {sfcode}, \lpr {efcode}, \lpr {lpcode}, \lpr {rpcode}, \prm {chardef}. As far as the core engine is concerned, all input and output to text files is \UTF-8 encoded. Input files can be pre|-|processed using the \type {reader} callback. This will be explained in \in {section} [iocallback]. 
Normalization of the \UNICODE\ input is on purpose not built|-|in and can be
handled by a macro package during callback processing. We have made some
practical choices and the user has to live with those.

Output in byte|-|sized chunks can be achieved by using characters just outside of
the valid \UNICODE\ range, starting at the value $1{,}114{,}112$ (0x110000). When
the time comes to print a character $c \geq 1{,}114{,}112$, \LUATEX\ will actually
print the single byte corresponding to $c$ minus $1{,}114{,}112$.

Output to the terminal uses \type {^^} notation for the lower control range
($c<32$), with the exception of \type {^^I}, \type {^^J} and \type {^^M}. These
are considered \quote {safe} and therefore printed as|-|is. You can disable
escaping with \type {texio.setescape(false)} in which case you get the normal
characters on the console.

\stopsubsection

\startsubsection[title={\lpr {Uchar}}]

\topicindex{\UNICODE}

The expandable command \lpr {Uchar} reads a number between~0 and $1{,}114{,}111$
and expands to the associated \UNICODE\ character.

\stopsubsection

\startsubsection[title={Extended tables}]

All traditional \TEX\ and \ETEX\ registers can now be addressed with 16-bit
numbers. The affected commands are:

\startfourcolumns
\startlines
\prm {count}
\prm {dimen}
\prm {skip}
\prm {muskip}
\prm {marks}
\prm {toks}
\prm {countdef}
\prm {dimendef}
\prm {skipdef}
\prm {muskipdef}
\prm {toksdef}
\prm {insert}
\prm {box}
\prm {unhbox}
\prm {unvbox}
\prm {copy}
\prm {unhcopy}
\prm {unvcopy}
\prm {wd}
\prm {ht}
\prm {dp}
\prm {setbox}
\prm {vsplit}
\stoplines
\stopfourcolumns

Because font memory management has been rewritten, character properties in fonts
are no longer shared among font instances that originate from the same metric
file. Of course we share fonts in the backend when possible so that the resulting
\PDF\ file is as efficient as possible, but for instance also expansion and
protrusion no longer use copies as in \PDFTEX.
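
A minimal sketch of what the extended register range permits is shown below. The
register numbers are arbitrary picks for illustration, and the sketch assumes
that the macro package has not allocated them for its own use.

\starttyping
\count  40000 = 123
\dimen  65535 = 10pt
\toks   50000 = {some tokens}
\setbox 60000 = \hbox {boxed}

\the\count 40000, \the\dimen 65535, \the\toks 50000, \box 60000
\stoptyping

In a traditional 8-bit engine each of these assignments would fail because the
register number exceeds 255.
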
\stopsubsection

\stopsection

\startsection[title={Attributes}]

\startsubsection[title={Nodes}]

\topicindex {nodes}

When \TEX\ reads input it will interpret the stream according to the properties
of the characters. Some signal a macro name and trigger expansion, others open
and close groups, trigger math mode, etc. What's left over becomes the typeset
text. Internally we get a linked list of nodes. Characters become \nod {glyph}
nodes that have for instance a \type {font} and \type {char} property and
\typ {\kern 10pt} becomes a \nod {kern} node with a \type {width} property.
Spaces are alien to \TEX\ as they are turned into \nod {glue} nodes. So, a simple
paragraph is mostly a mix of sequences of \nod {glyph} nodes (words) and \nod
{glue} nodes (spaces).

The sequences of characters at some point are extended with \nod {disc} nodes
that relate to hyphenation. After that font logic can be applied and we get a
list where some characters can be replaced, for instance multiple characters can
become one ligature, and font kerns can be injected. This is driven by the
font properties.

Boxes (like \prm {hbox} and \prm {vbox}) become \nod {hlist} or \nod {vlist}
nodes with \type {width}, \type {height}, \type {depth} and \type {shift}
properties and a pointer \type {list} to their actual content. Boxes can be
constructed explicitly or can be the result of subprocesses. For instance, when
paragraphs are broken into lines, the lines are a linked list of \nod {hlist}
nodes.

So, to summarize: all that you enter as content eventually becomes a node, often
as part of a (nested) list structure. Nodes have a relatively small memory
footprint and carry only the minimal amount of information needed. In traditional
\TEX\ a character node only held the font and slot number, in \LUATEX\ we also
store some language related information, the expansion factor, etc.
Now that we have access to these nodes from \LUA\ it makes sense to be able to
carry more information with a node and this is where attributes kick in.

\stopsubsection

\startsubsection[title={Attribute registers}]

\topicindex {attributes}

Attributes are a completely new concept in \LUATEX. Syntactically, they behave a
lot like counters: attributes obey \TEX's nesting stack and can be used after
\prm {the} etc.\ just like the normal \prm {count} registers.

\startsyntax
\attribute <16-bit number> <optional equals> <32-bit number>!crlf
\attributedef <csname> <optional equals> <16-bit number>
\stopsyntax

Conceptually, an attribute is either \quote {set} or \quote {unset}. Unset
attributes have a special negative value to indicate that they are unset; that
value is the lowest legal value: \type {-"7FFFFFFF} in hexadecimal, a.k.a.
$-2147483647$ in decimal. It follows that the value \type {-"7FFFFFFF} cannot be
used as a legal attribute value, but you {\it can\/} assign \type {-"7FFFFFFF} to
\quote {unset} an attribute. All attributes start out in this \quote {unset}
state in \INITEX.

Attributes can be used as extra counter values, but their usefulness comes mostly
from the fact that the numbers and values of all \quote {set} attributes are
attached to all nodes created in their scope. These can then be queried from any
\LUA\ code that deals with node processing. Further information about how to use
attributes for node list processing from \LUA\ is given in~\in {chapter}[nodes].

Attributes are stored in sorted (sparse) linked lists that are shared when
possible. This permits efficient testing and updating. You can define many
thousands of attributes but normally such a large number makes no sense and is
also not that efficient because each node carries a (possibly shared) link to a
list of currently set attributes. But they are a convenient extension and one of
the first extensions we implemented in \LUATEX.
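
The set and unset states described above can be sketched as follows. The name
\type {\MyAttribute} and the number 500 are hypothetical and assumed not to be
used by the macro package; \type {node.has_attribute} is the \LUA\ function that
returns the value of an attribute on a node, or \type {nil} when it is unset.

\starttyping
\attributedef\MyAttribute = 500

\MyAttribute = 1
\setbox0 = \hbox {x}

\MyAttribute = -"7FFFFFFF
\setbox2 = \hbox {x}

\directlua {
    tex.print(tostring(node.has_attribute(tex.box[0].list, 500)))
    tex.print(tostring(node.has_attribute(tex.box[2].list, 500)))
}
\stoptyping

The node in box~0 was created while the attribute was set, so the first query
reports its value. The node in box~2 was created after the attribute was unset,
so the second query yields \type {nil}.
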
\stopsubsection \startsubsection[title={Box attributes}] \topicindex {attributes} \topicindex {boxes} Nodes typically receive the list of attributes that is in effect when they are created. This moment can be quite asynchronous. For example: in paragraph building, the individual line boxes are created after the \prm {par} command has been processed, so they will receive the list of attributes that is in effect then, not the attributes that were in effect in, say, the first or third line of the paragraph. Similar situations happen in \LUATEX\ regularly. A few of the more obvious problematic cases are dealt with: the attributes for nodes that are created during hyphenation, kerning and ligaturing borrow their attributes from their surrounding glyphs, and it is possible to influence box attributes directly. When you assemble a box in a register, the attributes of the nodes contained in the box are unchanged when such a box is placed, unboxed, or copied. In this respect attributes act the same as characters that have been converted to references to glyphs in fonts. For instance, when you use attributes to implement color support, each node carries information about its eventual color. In that case, unless you implement mechanisms that deal with it, applying a color to already boxed material will have no effect. Keep in mind that this incompatibility is mostly due to the fact that separate specials and literals are a more unnatural approach to colors than attributes. It is possible to fine-tune the list of attributes that are applied to a \type {hbox}, \type {vbox} or \type {vtop} by the use of the keyword \type {attr}. The \type {attr} keyword(s) should come before a \type {to} or \type {spread}, if that is also specified. 
An example is:

\startbuffer[tex]
\attribute997=123
\attribute998=456
\setbox0=\hbox {Hello}
\setbox2=\hbox attr 999 = 789 attr 998 = -"7FFFFFFF{Hello}
\stopbuffer

\startbuffer[lua]
for b=0,2,2 do
    for a=997, 999 do
        tex.sprint("box ", b, " : attr ",a," : ",tostring(tex.box[b]     [a]))
        tex.sprint("\\quad\\quad")
        tex.sprint("list ",b, " : attr ",a," : ",tostring(tex.box[b].list[a]))
        tex.sprint("\\par")
    end
end
\stopbuffer

\typebuffer[tex]

Box 0 now has attributes 997 and 998 set. Box 2 has attributes 997 and 999 set,
while the nodes inside that box all have attributes 997 and 998 set. Assigning
the maximum negative value causes an attribute to be ignored.

To give you an idea of what this means at the \LUA\ end, take the following
code:

\typebuffer[lua]

Later we will see that you can access properties of a node. The boxes here are so
called \nod {hlist} nodes that have a field \type {list} that points to the
content. Because the attributes are a list themselves you can access them by
indexing the node (here we do that with \type {[a]}). Running this snippet gives:

\start
    \getbuffer[tex]
    \startpacked \tt
        \ctxluabuffer[lua]
    \stoppacked
\stop

Because some values are not set we need to apply the \type {tostring} function
here so that we get the word \type {nil}.

\stopsubsection

\stopsection

\startsection[title={\LUA\ related primitives}]

\startsubsection[title={\prm {directlua}}]

In order to merge \LUA\ code with \TEX\ input, a few new primitives are needed.
The primitive \prm {directlua} is used to execute \LUA\ code immediately. The
syntax is

\startsyntax
\directlua <general text>!crlf
\directlua <16-bit number> <general text>
\stopsyntax

The \syntax {<general text>} is expanded fully, and then fed into the \LUA\
interpreter. After reading and expansion have been applied to the \syntax
{<general text>}, the resulting token list is converted to a string as if it was
displayed using \type {\the\toks}. On the \LUA\ side, each \prm {directlua} block
is treated as a separate chunk.
In such a chunk you can use the \type {local} directive to keep your variables
from interfering with those used by the macro package.

The conversion to and from a token list means that you normally cannot use
\LUA\ line comments (starting with \type {--}) within the argument. As there
typically will be only one \quote {line} the first line comment will run on
until the end of the input. You will either need to use \TEX|-|style line
comments (starting with \%), or change the \TEX\ category codes locally. Another
possibility is to say:

\starttyping
\begingroup
\endlinechar=10
\directlua ...
\endgroup
\stoptyping

Then \LUA\ line comments can be used, since \TEX\ does not replace line endings
with spaces. Of course such an approach depends on the macro package that you
use.

The \syntax {<16-bit number>} designates a name of a \LUA\ chunk and is taken
from the \type {lua.name} array (see the documentation of the \type {lua} table
further in this manual). When a chunk name starts with a \type {@} it will be
displayed as a file name. This is a side effect of the way \LUA\ implements error
handling.

The \prm {directlua} command is expandable. Since it passes \LUA\ code to the
\LUA\ interpreter its expansion from the \TEX\ viewpoint is usually empty.
However, there are some \LUA\ functions that produce material to be read by \TEX,
the so called print functions. The simplest use of these is \type
{tex.print(<string> s)}. The characters of the string \type {s} will be placed on
the \TEX\ input buffer, that is, \quote {before \TEX's eyes} to be read by \TEX\
immediately. For example:

\startbuffer
\count10=20
a\directlua{tex.print(tex.count[10]+5)}b
\stopbuffer

\typebuffer

expands to

\getbuffer

Here is another example:

\startbuffer
$\pi = \directlua{tex.print(math.pi)}$
\stopbuffer

\typebuffer

will result in

\getbuffer

Note that the expansion of \prm {directlua} is a sequence of characters, not of
tokens, contrary to all \TEX\ commands.
So formally speaking its expansion is null, but it places material on a
pseudo-file to be immediately read by \TEX, like \ETEX's \prm {scantokens} does.
For a description of print functions look at \in {section} [sec:luaprint].

Because the \syntax {<general text>} is a chunk, the normal \LUA\ error handling
is triggered if there is a problem in the included code. The \LUA\ error messages
should be clear enough, but the contextual information is still pretty bad.
Often, you will only see the line number of the right brace at the end of the
code.

While on the subject of errors: some of the things you can do inside \LUA\ code
can break \LUATEX\ pretty badly. If you are not careful while working with the
node list interface, you may even end up with assertion errors from within the
\TEX\ portion of the executable.

\stopsubsection

\startsubsection[title={\lpr {latelua} and \lpr {lateluafunction}}]

Contrary to \prm {directlua}, \lpr {latelua} stores \LUA\ code in a whatsit that
will be processed at the time of shipping out. Its intended use is a cross
between \PDF\ literals (often available as \orm {pdfliteral}) and the
traditional \TEX\ extension \prm {write}. Within the \LUA\ code you can print
\PDF\ statements directly to the \PDF\ file via \type {pdf.print}, or you can
write to other output streams via \type {texio.write} or simply using \LUA\ \IO\
routines.

\startsyntax
\latelua <general text>!crlf
\latelua <16-bit number> <general text>
\stopsyntax

Expansion of macros in the final \type {<general text>} is delayed until just
before the whatsit is executed (like in \prm {write}). With regard to the \PDF\
output stream \lpr {latelua} behaves like \PDF\ page literals. The \syntax
{name <general text>} and \syntax {<16-bit number>} behave in the same way as
they do for \prm {directlua}.

The \lpr {lateluafunction} primitive takes a number and is similar to \lpr
{luafunction} but gets delayed to shipout time. It's just there for completeness.
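
The difference in timing between the immediate and the delayed variants can be
sketched as follows. The messages and the function slot~10 are made up for
illustration; the sketch assumes a setup where \type {\count0} holds the page
number and where the macro package does not manage the function table itself.

\starttyping
\directlua {
    texio.write_nl("log", "directlua: executed while the page is built")
    local t = lua.get_functions_table()
    t[10] = function() texio.write_nl("log", "lateluafunction: called") end
}

\latelua {
    texio.write_nl("log", "latelua: executed at shipout of page " .. tex.count[0])
}

\lateluafunction 10
\stoptyping

The first message is written as soon as \TEX\ reads the \prm {directlua} call.
The other two only show up in the log when the page that contains the whatsits
is shipped out.
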
\stopsubsection

\startsubsection[title={\lpr {luaescapestring}}]

\topicindex {escaping}

This primitive converts a \TEX\ token sequence so that it can be safely used as
the contents of a \LUA\ string: embedded backslashes, double and single quotes,
and newlines and carriage returns are escaped. This is done by prepending an
extra token consisting of a backslash with category code~12, and for the line
endings, converting them to \type {n} and \type {r} respectively. The token
sequence is fully expanded.

\startsyntax
\luaescapestring <general text>
\stopsyntax

Most often, this command is not actually the best way to deal with the
differences between \TEX\ and \LUA. In very short bits of \LUA\ code it is often
not needed, and for longer stretches of \LUA\ code it is easier to keep the code
in a separate file and load it using \LUA's \type {dofile}:

\starttyping
\directlua { dofile('mysetups.lua') }
\stoptyping

\stopsubsection

\startsubsection[title={\lpr {luafunction}, \lpr {luafunctioncall} and \lpr {luadef}}]

The \prm {directlua} command involves tokenization of its argument (after
picking up an optional name or number specification). The token list is then
converted into a string and given to \LUA\ to turn into a function that is
called. The overhead is rather small but when you have millions of calls it can
have some impact. For this reason there is a variant call available: \lpr
{luafunction}. This command is used as follows:

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[1] = function() tex.print("!") end
    t[2] = function() tex.print("?") end
}

\luafunction1
\luafunction2
\stoptyping

Of course the functions can also be defined in a separate file. There is no limit
on the number of functions apart from normal \LUA\ limitations. Of course there
is the limitation of no arguments but that would involve parsing and thereby give
no gain. The function, when called, in fact gets one argument, being the index, so
in the following example the number \type {8} gets typeset.
\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[8] = function(slot) tex.print(slot) end
}

\luafunction8
\stoptyping

The \lpr {luafunctioncall} primitive does the same but is unexpandable, for
instance in an \prm {edef}. In addition \LUATEX\ provides a definer:

\starttyping
                 \luadef\MyFunctionA 1
          \global\luadef\MyFunctionB 2
\protected\global\luadef\MyFunctionC 3
\stoptyping

You should really use these commands with care. Some references get stored in
tokens and assume that the function is available when that token expands. On the
other hand, as we have tested this functionality in relatively complex situations
normal usage should not give problems.

\stopsubsection

\startsubsection[title={\lpr {luabytecode} and \lpr {luabytecodecall}}]

Analogous to the function callers discussed in the previous section we have byte
code callers. Again the call variant is unexpandable.

\starttyping
\directlua {
    lua.bytecode[9998] = function(s)
        tex.sprint(s*token.scan_int())
    end
    lua.bytecode[5555] = function(s)
        tex.sprint(s*token.scan_dimen())
    end
}
\stoptyping

This works with:

\starttyping
\luabytecode    9998 5  \luabytecode    5555 5sp
\luabytecodecall9998 5  \luabytecodecall5555 5sp
\stoptyping

The variable \type {s} in the code is the number of the byte code register that
can be used for diagnostic purposes. The advantage of bytecode registers over
function calls is that they are stored in the format (but without upvalues).

\stopsubsection

\stopsection

\startsection[title={Catcode tables}]

\startsubsection[title={Catcodes}]

\topicindex {catcodes}

Catcode tables are a new feature that allows you to switch to a predefined
catcode regime in a single statement. You can have a practically unlimited number
of different tables. This subsystem is backward compatible: if you never use the
following commands, your document will not notice any difference in behaviour
compared to traditional \TEX.
The contents of each catcode table are independent of any other catcode table,
and they are stored in and retrieved from the format file.

\stopsubsection

\startsubsection[title={\lpr {catcodetable}}]

\startsyntax
\catcodetable <15-bit number>
\stopsyntax

The primitive \lpr {catcodetable} switches to a different catcode table. Such a
table has to be previously created using one of the two primitives below, or it
has to be zero. Table zero is initialized by \INITEX.

\stopsubsection

\startsubsection[title={\lpr {initcatcodetable}}]

\startsyntax
\initcatcodetable <15-bit number>
\stopsyntax

The primitive \lpr {initcatcodetable} creates a new table with catcodes
identical to those defined by \INITEX. The new catcode table is allocated
globally: it will not go away after the current group has ended. If the supplied
number is identical to the currently active table, an error is raised. The
initial values are:

\starttabulate[|c|c|l|l|]
\DB catcode \BC character               \BC equivalent \BC category             \NC \NR
\TB
\NC  0 \NC \tttf \letterbackslash       \NC            \NC \type {escape}       \NC \NR
\NC  5 \NC \tttf \letterhat\letterhat M \NC return     \NC \type {car_ret}      \NC \NR
\NC  9 \NC \tttf \letterhat\letterhat @ \NC null       \NC \type {ignore}       \NC \NR
\NC 10 \NC \tttf <space>                \NC space      \NC \type {spacer}       \NC \NR
\NC 11 \NC {\tttf a} \endash\ {\tttf z} \NC            \NC \type {letter}       \NC \NR
\NC 11 \NC {\tttf A} \endash\ {\tttf Z} \NC            \NC \type {letter}       \NC \NR
\NC 12 \NC everything else              \NC            \NC \type {other}        \NC \NR
\NC 14 \NC \tttf \letterpercent         \NC            \NC \type {comment}      \NC \NR
\NC 15 \NC \tttf \letterhat\letterhat ? \NC delete     \NC \type {invalid_char} \NC \NR
\LL
\stoptabulate

\stopsubsection

\startsubsection[title={\lpr {savecatcodetable}}]

\startsyntax
\savecatcodetable <15-bit number>
\stopsyntax

\lpr {savecatcodetable} copies the current set of catcodes to a new table with
the requested number. The definitions in this new table are all treated as if
they were made in the outermost level.
The new table is allocated globally: it will not go away after the current group has ended. If the supplied number is the currently active table, an error is raised. \stopsubsection \stopsection \startsection[title={Suppressing errors}] \startsubsection[title={\lpr {suppressfontnotfounderror}}] \topicindex {errors} If this integer parameter is non|-|zero, then \LUATEX\ will not complain about font metrics that are not found. Instead it will silently skip the font assignment, making the requested csname for the font \prm {ifx} equal to \prm {nullfont}, so that it can be tested against that without bothering the user. \startsyntax \suppressfontnotfounderror = 1 \stopsyntax \stopsubsection \startsubsection[title={\lpr {suppresslongerror}}] \topicindex {errors} If this integer parameter is non|-|zero, then \LUATEX\ will not complain about \prm {par} commands encountered in contexts where that is normally prohibited (most prominently in the arguments of macros not defined as \prm {long}). \startsyntax \suppresslongerror = 1 \stopsyntax \stopsubsection \startsubsection[title={\lpr {suppressifcsnameerror}}] \topicindex {errors} If this integer parameter is non|-|zero, then \LUATEX\ will not complain about non-expandable commands appearing in the middle of a \prm {ifcsname} expansion. Instead, it will keep getting expanded tokens from the input until it encounters an \prm {endcsname} command. If the input expansion is unbalanced with respect to \prm {csname} \ldots \prm {endcsname} pairs, the \LUATEX\ process may hang indefinitely. \startsyntax \suppressifcsnameerror = 1 \stopsyntax \stopsubsection \startsubsection[title={\lpr {suppressoutererror}}] \topicindex {errors} If this new integer parameter is non|-|zero, then \LUATEX\ will not complain about \prm {outer} commands encountered in contexts where that is normally prohibited. 
\startsyntax
\suppressoutererror = 1
\stopsyntax

\stopsubsection

\startsubsection[title={\lpr {suppressmathparerror}}]

\topicindex {errors}
\topicindex {math}

The following setting will permit \prm {par} tokens in a math formula:

\startsyntax
\suppressmathparerror = 1
\stopsyntax

So, the next code is valid then:

\starttyping
$ x + 1 =

a $
\stoptyping

\stopsubsection

\startsubsection[title={\lpr {suppressprimitiveerror}}]

\topicindex {errors}
\topicindex {primitives}

When set to a non|-|zero value the following command will not issue an error:

\startsyntax
\suppressprimitiveerror = 1

\primitive\notaprimitive
\stopsyntax

\stopsubsection

\stopsection

\startsection[title={Fonts}]

\startsubsection[title={Font syntax}]

\topicindex {fonts}

\LUATEX\ will accept a braced argument as a font name:

\starttyping
\font\myfont = {cmr10}
\stoptyping

This allows for embedded spaces, without the need for double quotes. Macro
expansion takes place inside the argument.

\stopsubsection

\startsubsection[title={\lpr {fontid} and \lpr {setfontid}}]

\startsyntax
\fontid\font
\stopsyntax

This primitive expands into a number. It is not a register so there is no need to
prefix with \prm {number} (and using \prm {the} gives an error). The currently
used font id is \fontid\font. Here are some more:

\starttabulate[|l|c|c|]
\DB style \BC command \BC font id \NC \NR
\TB
\NC normal      \NC \type {\tf} \NC \tf \fontid\font \NC \NR
\NC bold        \NC \type {\bf} \NC \bf \fontid\font \NC \NR
\NC italic      \NC \type {\it} \NC \it \fontid\font \NC \NR
\NC bold italic \NC \type {\bi} \NC \bi \fontid\font \NC \NR
\LL
\stoptabulate

These numbers depend on the macro package used because each one has its own way
of dealing with fonts. They can also differ per run, as they can depend on the
order of loading fonts. For instance, when in \CONTEXT\ virtual math \UNICODE\
fonts are used, we can easily get over a hundred ids in use. Not all ids have to
be bound to a real font, after all it's just a number.
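
Because \lpr {fontid} is expandable, its result can be stored in a macro and
handed to \lpr {setfontid} later on. The following is only a sketch: the names
\type {\MyFont} and \type {\MyFontId} are made up for illustration and the font
\type {cmr10} is assumed to be available.

\starttyping
\font\MyFont = {cmr10}

\edef\MyFontId{\fontid\MyFont}

\message{font id: \MyFontId}

\setfontid\MyFontId \relax
\stoptyping

The \type {\relax} ends the number that \lpr {setfontid} scans, so that
following text is not expanded while the engine looks for more digits.
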
The primitive \lpr {setfontid} can be used to enable a font with the given id,
which of course needs to be a valid one.

\stopsubsection

\startsubsection[title={\lpr {noligs} and \lpr {nokerns}}]

\topicindex {ligatures+suppress}
\topicindex {kerns+suppress}

These primitives prohibit ligature and kerning insertion at the time when the
initial node list is built by \LUATEX's main control loop. You can enable these
primitives when you want to do node list processing of \quote {characters}, where
\TEX's normal processing would get in the way.

\startsyntax
\noligs <integer>!crlf
\nokerns <integer>
\stopsyntax

These primitives can also be implemented by overloading the ligature building and
kerning functions, i.e.\ by assigning dummy functions to their associated
callbacks. Keep in mind that when you define a font (using \LUA) you can also
omit the kern and ligature tables, which has the same effect as the above.

\stopsubsection

\startsubsection[title={\type{\nospaces}}]

\topicindex {spaces+suppress}

This new primitive can be used to overrule the usual \prm {spaceskip} related
heuristics when a space character is seen in a text flow. The value~\type{1}
triggers no injection while \type{2} results in injection of a zero skip. In \in
{figure} [fig:nospaces] we see the results for four characters separated by a
space.
\startplacefigure[reference=fig:nospaces,title={The \lpr {nospaces} options.}] \startcombination[3*2] {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=0\relax x x x x \par}\hss}} {\type {0 / hsize 10mm}} {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=1\relax x x x x \par}\hss}} {\type {1 / hsize 10mm}} {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=2\relax x x x x \par}\hss}} {\type {2 / hsize 10mm}} {\ruledhbox to 5cm{\vtop{\hsize 1mm\nospaces=0\relax x x x x \par}\hss}} {\type {0 / hsize 1mm}} {\ruledhbox to 5cm{\vtop{\hsize 1mm\nospaces=1\relax x x x x \par}\hss}} {\type {1 / hsize 1mm}} {\ruledhbox to 5cm{\vtop{\hsize 1mm\nospaces=2\relax x x x x \par}\hss}} {\type {2 / hsize 1mm}} \stopcombination \stopplacefigure \stopsubsection \stopsection \startsection[title={Tokens, commands and strings}] \startsubsection[title={\lpr {scantextokens}}] \topicindex {tokens+scanning} The syntax of \lpr {scantextokens} is identical to \prm {scantokens}. This primitive is a slightly adapted version of \ETEX's \prm {scantokens}. The differences are: \startitemize \startitem The last (and usually only) line does not have a \prm {endlinechar} appended. \stopitem \startitem \lpr {scantextokens} never raises an EOF error, and it does not execute \prm {everyeof} tokens. \stopitem \startitem There are no \quote {\unknown\ while end of file \unknown} error tests executed. This allows the expansion to end on a different grouping level or while a conditional is still incomplete. \stopitem \stopitemize \stopsubsection \startsubsection[title={\lpr {toksapp}, \lpr {tokspre}, \lpr {etoksapp}, \lpr {etokspre}, \lpr {gtoksapp}, \lpr {gtokspre}, \lpr {xtoksapp}, \lpr {xtokspre}}] Instead of: \starttyping \toks0\expandafter{\the\toks0 foo} \stoptyping you can use: \starttyping \etoksapp0{foo} \stoptyping The \type {pre} variants prepend instead of append, and the \type {e} variants expand the passed general text. The \type {g} and \type {x} variants are global. 
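
The differences between the variants can be sketched with a small sequence of
assignments. The macro \type {\foo} is only there for illustration, and the
comments show what the register holds after each step.

\starttyping
\def\foo{d}

\toks0 = {b}
\toksapp  0 {c}     % bc
\tokspre  0 {a}     % abc
\toksapp  0 {\foo}  % abc\foo    (the macro is appended as is)
\etoksapp 0 {\foo}  % abc\foo d  (the macro is expanded first)

\begingroup
    \gtoksapp 0 {!} % survives the end of the group
\endgroup
\stoptyping

The register contents can be inspected with \type {\showthe\toks0}.
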
\stopsubsection

\startsubsection[title={\prm {csstring}, \lpr {begincsname} and \lpr {lastnamedcs}}]

These are somewhat special. The \prm {csstring} primitive is like
\prm {string} but it omits the leading escape character. This can be
somewhat more efficient than stripping it afterwards.

The \lpr {begincsname} primitive is like \prm {csname} but doesn't create
a relaxed equivalent when there is no such name. It is equivalent to

\starttyping
\ifcsname foo\endcsname
    \csname foo\endcsname
\fi
\stoptyping

The advantage is that it saves a lookup (don't expect much speedup) but, more
importantly, it avoids using the \prm {if} test. The \lpr {lastnamedcs}
primitive is one that should be used with care. The above example could be
written as:

\starttyping
\ifcsname foo\endcsname
    \lastnamedcs
\fi
\stoptyping

This is slightly more efficient than constructing the string twice (deep down in
\LUATEX\ this also involves some \UTF8 juggling), but probably more relevant is
that it saves a few tokens and can make code a bit more readable.

\stopsubsection

\startsubsection[title={\lpr {clearmarks}}]

\topicindex {marks}

This primitive complements the \ETEX\ mark primitives and clears a mark class
completely, resetting all three connected mark texts to empty. It is an
immediate command.

\startsyntax
\clearmarks <16-bit number>
\stopsyntax

\stopsubsection

\startsubsection[title={\lpr {alignmark} and \lpr {aligntab}}]

The primitive \lpr {alignmark} duplicates the functionality of \type {#} inside
alignment preambles, while \lpr {aligntab} duplicates the functionality of \type
{&}.

\stopsubsection

\startsubsection[title={\lpr {letcharcode}}]

This primitive can be used to assign a meaning to an active character, as in:

\starttyping
\def\foo{bar}
\letcharcode123=\foo
\stoptyping

This can be a bit nicer than using the uppercase tricks (which rely on the
property of \prm {uppercase} that it treats active characters specially).
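
A minimal sketch of how this might look in practice, next to the traditional
approach. It assumes that character~124 (the vertical bar) is free to be made
active in your setup and that \type {~} is an active character, as it is in most
macro packages; the macro name \tex {foo} is only an example.

\starttyping
\def\foo{bar}

\catcode 124=13        % make | an active character

% with the new primitive:
\letcharcode 124=\foo  % the active | now has the meaning of \foo

% the traditional trick that achieves the same:
\begingroup
    \uccode`\~=124
\uppercase{\endgroup\let~}\foo
\stoptyping

The traditional variant needs a group and a temporary \prm {uccode} assignment
just to get hold of the active character token, which is what \lpr
{letcharcode} avoids.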
\stopsubsection

\startsubsection[title={\lpr {glet}}]

This primitive is similar to:

\starttyping
\protected\def\glet{\global\let}
\stoptyping

but faster (only measurable with millions of calls) and probably more convenient
(after all we also have \type {\gdef}).

\stopsubsection

\startsubsection[title={\lpr {expanded}, \lpr {immediateassignment} and \lpr {immediateassigned}}]

\topicindex {expansion}

The \lpr {expanded} primitive takes a token list and expands its content, which
can come in handy: it avoids a tricky mix of \prm {expandafter} and \prm
{noexpand}. You can compare it with what happens inside the body of an \prm
{edef}. But this kind of expansion still doesn't expand some primitive
operations.

\startbuffer
\newcount\NumberOfCalls
\def\TestMe{\advance\NumberOfCalls1 }
\edef\Tested{\TestMe foo:\the\NumberOfCalls}
\edef\Tested{\TestMe foo:\the\NumberOfCalls}
\edef\Tested{\TestMe foo:\the\NumberOfCalls}
\meaning\Tested
\stopbuffer

\typebuffer

The result is a macro that has the unexpanded code in its body:

\getbuffer

Instead we can define \tex {TestMe} in a way that expands the assignment
immediately. You of course need to prevent look|-|ahead interference by using a
space or \tex {relax} (often an expression works better as it doesn't leave a
\tex {relax}).
\startbuffer
\def\TestMe{\immediateassignment\advance\NumberOfCalls1 }
\edef\Tested{\TestMe foo:\the\NumberOfCalls}
\edef\Tested{\TestMe foo:\the\NumberOfCalls}
\edef\Tested{\TestMe foo:\the\NumberOfCalls}
\meaning\Tested
\stopbuffer

\typebuffer

This time the counter gets updated and we don't see interference in the
resulting \tex {Tested} macro:

\getbuffer

Here is a somewhat silly example of expanded comparison:

\startbuffer
\def\expandeddoifelse#1#2#3#4%
  {\immediateassignment\edef\tempa{#1}%
   \immediateassignment\edef\tempb{#2}%
   \ifx\tempa\tempb
     \immediateassignment\def\next{#3}%
   \else
     \immediateassignment\def\next{#4}%
   \fi
   \next}
\edef\Tested
  {(\expandeddoifelse{abc}{def}{yes}{nop}/%
    \expandeddoifelse{abc}{abc}{yes}{nop})}
\meaning\Tested
\stopbuffer

\typebuffer

It gives:

\getbuffer

A variant is:

\starttyping
\def\expandeddoifelse#1#2#3#4%
  {\immediateassigned{
     \edef\tempa{#1}%
     \edef\tempb{#2}%
   }%
   \ifx\tempa\tempb
     \immediateassignment\def\next{#3}%
   \else
     \immediateassignment\def\next{#4}%
   \fi
   \next}
\stoptyping

The possible error messages are the same as when using assignments in preambles
of alignments and after the \prm {accent} command. The supported assignments are
the so|-|called prefixed commands (except box assignments).

\stopsubsection

\startsubsection[title={\lpr {ifcondition}}]

\topicindex {conditions}

This is a somewhat special one. When you write macros, conditions need to be
properly balanced in order to let \TEX's fast branch skipping work well. This new
primitive is basically a no||op flagged as a condition so that the scanner can
recognize it as an if|-|test. However, when a real test takes place the work is
done by what follows, in the next example \tex {something}.
\starttyping
\unexpanded\def\something#1#2%
  {\edef\tempa{#1}%
   \edef\tempb{#2}%
   \ifx\tempa\tempb}

\ifcondition\something{a}{b}%
    \ifcondition\something{a}{a}%
        true 1
    \else
        false 1
    \fi
\else
    \ifcondition\something{a}{a}%
        true 2
    \else
        false 2
    \fi
\fi
\stoptyping

If you are familiar with \METAPOST, this is a bit like \type {vardef} where the
macro has a return value. Here the return value is a test.

\stopsubsection

\stopsection

\startsection[title={Boxes, rules and leaders}]

\startsubsection[title={\lpr {outputbox}}]

\topicindex {output}

This integer parameter allows you to alter the number of the box that will be
used to store the page sent to the output routine. Its default value is 255, and
the acceptable range is from 0 to 65535.

\startsyntax
\outputbox = 12345
\stopsyntax

\stopsubsection

\startsubsection[title={\prm {vpack}, \prm {hpack} and \prm {tpack}}]

These three primitives are like \prm {vbox}, \prm {hbox} and \prm {vtop}
but don't apply the related callbacks.

\stopsubsection

\startsubsection[title={\prm {vsplit}}]

\topicindex {splitting}

The \prm {vsplit} primitive has to be followed by a specification of the
required height. As an alternative to the \type {to} keyword you can use \type
{upto} to get a split of the given size, but the result then keeps its natural
dimensions.

\stopsubsection

\startsubsection[title={Images and reused box objects},reference=sec:imagedandforms]

These two concepts are now core concepts and no longer whatsits. They are in fact
now implemented as rules with special properties. Normal rules have subtype~0,
saved boxes have subtype~1 and images have subtype~2. This has the positive side
effect that whenever we need to take content with dimensions into account, when we
look at rule nodes, we automatically also deal with these two types.

The syntax of the \type {\save...resource} is the same as in \PDFTEX\ but you
should consider them to be backend specific.
This means that a macro package should treat them as such and check for the
current output mode if applicable.

\starttabulate[|l|p|]
\DB command \BC explanation \NC \NR
\TB
\NC \lpr {saveboxresource}             \NC save the box as an object to be included later \NC \NR
\NC \lpr {saveimageresource}           \NC save the image as an object to be included later \NC \NR
\NC \lpr {useboxresource}              \NC include the saved box object here (by index) \NC \NR
\NC \lpr {useimageresource}            \NC include the saved image object here (by index) \NC \NR
\NC \lpr {lastsavedboxresourceindex}   \NC the index of the last saved box object \NC \NR
\NC \lpr {lastsavedimageresourceindex} \NC the index of the last saved image object \NC \NR
\NC \lpr {lastsavedimageresourcepages} \NC the number of pages in the last saved image object \NC \NR
\LL
\stoptabulate

\LUATEX\ accepts optional dimension parameters for \type {\use...resource} in the
same format as for rules. With images, these dimensions are then used instead of
the ones given to \lpr {saveimageresource} but the original dimensions are not
overwritten, so that a \lpr {useimageresource} without dimensions still
provides the image with dimensions defined by \lpr {saveimageresource}. These
optional parameters are not implemented for \lpr {saveboxresource}.

\starttyping
\useimageresource width 20mm height 10mm depth 5mm \lastsavedimageresourceindex
\useboxresource   width 20mm height 10mm depth 5mm \lastsavedboxresourceindex
\stoptyping

The box resources are of course implemented in the backend and therefore we do
support the \type {attr} and \type {resources} keys that accept a token list. New
is the \type {type} key. When set to non|-|zero the \type {/Type} entry is
omitted. A value of 1 or 3 still writes a \type {/BBox}, while 2 or 3 will write
a \type {/Matrix}.
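
The following sketch shows how these commands could be combined for a box
resource. It assumes that the \PDF\ backend is the one in use, which is why the
code is guarded by a test on \lpr {outputmode}; box register~0 and the macro name
\tex {MyBoxIndex} are arbitrary choices made for this illustration.

\starttyping
\ifnum\outputmode>0
    \setbox0\hbox{Some reusable content}
    \saveboxresource 0
    % remember the index, as later saves will change the last saved index
    \edef\MyBoxIndex{\the\lastsavedboxresourceindex}
    % include the object with its natural dimensions
    \useboxresource \MyBoxIndex
    % include the same object again, now with an explicit width
    \useboxresource width 20mm \MyBoxIndex
\fi
\stoptyping

Storing the index in a macro right after saving keeps it available when more
resources are saved later on.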
\stopsubsection

\startsubsection[title={\lpr {nohrule} and \lpr {novrule}}]

\topicindex {rules}

Because introducing a new keyword can cause incompatibilities, two new primitives
were introduced: \lpr {nohrule} and \lpr {novrule}. These can be used to
reserve space. This is often more efficient than creating an empty box with fake
dimensions.

\stopsubsection

\startsubsection[title={\lpr {gleaders}}]

\topicindex {leaders}

This type of leaders is anchored to the origin of the box to be shipped out. So
they are like normal \prm {leaders} in that they align nicely, except that the
alignment is based on the {\it largest\/} enclosing box instead of the {\it
smallest\/}. The \type {g} stresses this global nature.

\stopsubsection

\stopsection

\startsection[title={Languages}]

\startsubsection[title={\lpr {hyphenationmin}}]

\topicindex {languages}
\topicindex {hyphenation}

This primitive can be used to set the minimal word length, so setting it to a value
of~$5$ means that only words of 6 characters and more will be hyphenated, of course
within the constraints of the \prm {lefthyphenmin} and \prm {righthyphenmin}
values (as stored in the glyph node). This primitive accepts a number and stores
the value with the language.

\stopsubsection

\startsubsection[title={\prm {boundary}, \prm {noboundary}, \prm {protrusionboundary} and \prm {wordboundary}}]

The \prm {noboundary} command formerly injected a whatsit node but now injects a
normal node with type \nod {boundary} and subtype~0. In addition you can say:

\starttyping
x\boundary 123\relax y
\stoptyping

This has the same effect but the subtype is now~1 and the value~123 is stored.
The traditional ligature builder still sees this as a cancel boundary directive
but at the \LUA\ end you can implement different behaviour. The added benefit of
passing this value is a side effect of the generalization. The subtypes~2 and~3
are used to control protrusion and word boundaries in hyphenation and have
related primitives.
\stopsubsection

\startsubsection[title={\prm {glyphdimensionsmode}}]

Already in the early days of \LUATEX\ the decision was made to calculate the
effective height and depth of glyphs in a way that reflected the applied vertical
offset. The height got that offset added, the depth only when the offset was
larger than zero. We can now control this in more detail with this mode
parameter. An offset is added to the height and|/|or subtracted from the depth.
The effective values are never negative. The zero mode is the default.

\starttabulate[|l|pl|]
\DB value     \BC effect \NC\NR
\TB
\NC \type {0} \NC the old behaviour: add the offset to the height and subtract
                  the offset from the depth only when it is positive \NC \NR
\NC \type {1} \NC add the offset to the height and subtract it from the depth \NC \NR
\NC \type {2} \NC add the offset to the height and subtract it from the depth but
                  keep the maxima of the current and previous results \NC \NR
\NC \type {3} \NC use the height and depth of the glyph, so no offset is applied \NC \NR
\LL
\stoptabulate

\stopsubsection

\stopsection

\startsection[title={Control and debugging}]

\startsubsection[title={Tracing}]

\topicindex {tracing}

If \prm {tracingonline} is larger than~2, the node list display will also print
the node number of the nodes.

\stopsubsection

\startsubsection[title={\lpr {outputmode}}]

\topicindex {output}
\topicindex {backend}

The \lpr {outputmode} variable tells \LUATEX\ what it has to produce:

\starttabulate[|l|l|]
\DB value     \BC output \NC \NR
\TB
\NC \type {0} \NC \DVI\ code \NC \NR
\NC \type {1} \NC \PDF\ code \NC \NR
\LL
\stoptabulate

\stopsubsection

\startsubsection[title={\lpr {draftmode}}]

The value of the \lpr {draftmode} counter signals to the backend whether it
should output less. The \PDF\ backend accepts a value of~1, while the \DVI\
backend ignores the value. This is not a critical feature, so we may remove it
in a future version if that makes the backend cleaner.
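
Both \lpr {outputmode} and \lpr {draftmode} are set with plain integer
assignments. The sketch below selects \PDF\ output and asks the backend to
produce less, for instance during an intermediate run of a multi|-|pass job;
that use case is an assumption of this example and not something prescribed
here.

\starttyping
\outputmode = 1 % produce PDF code
\draftmode  = 1 % signal the PDF backend that it should output less
\stoptyping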
\stopsubsection

\stopsection

\startsection[title={Files}]

\startsubsection[title={File syntax}]

\topicindex {files+names}

\LUATEX\ will accept a braced argument as a file name:

\starttyping
\input {plain}
\openin 0 {plain}
\stoptyping

This allows for embedded spaces, without the need for double quotes. Macro
expansion takes place inside the argument.

The \lpr {tracingfonts} primitive that has been inherited from \PDFTEX\ has
been adapted to support variants in reporting the font. The reason for this
extension is that a csname does not always make sense. The zero case is the
default.

\starttabulate[|l|l|]
\DB value    \BC reported \NC \NR
\TB
\NC \type{0} \NC \type{\foo xyz} \NC \NR
\NC \type{1} \NC \type{\foo (bar)} \NC \NR
\NC \type{2} \NC \type{<bar> xyz} \NC \NR
\NC \type{3} \NC \type{<bar @ ..pt> xyz} \NC \NR
\NC \type{4} \NC \type{<id>} \NC \NR
\NC \type{5} \NC \type{<id: bar>} \NC \NR
\NC \type{6} \NC \type{<id: bar @ ..pt> xyz} \NC \NR
\LL
\stoptabulate

\stopsubsection

\startsubsection[title={Writing to file}]

\topicindex {files+writing}

You can now open up to 127 files with \prm {openout}. When no file is open,
writes will go to the console and log. As a consequence a system command is
no longer possible, but one can use \type {os.execute} to do the same.

\stopsubsection

\stopsection

\startsection[title={Math}]

\topicindex {math}

We will cover math extensions in their own chapter, not only because the font
subsystem and spacing model have been enhanced (thereby introducing many new
primitives) but also because some more control has been added to existing
functionality. Much of this relates to the different approaches of traditional
\TEX\ fonts and \OPENTYPE\ math.

\stopsection

\stopchapter

\stopcomponent