General overview of the program.

(1) We read in the conversion directives and store them.

(2) We read in the font information in the ChiWriter file and match it
    up against the information in the conversion directives. A
    translation table mapping
       ChiWriter file font numbers <---> Conversion file font numbers
    is built.

(3) We read in the ChiWriter document, one line at a time, into a
    buffer. A ChiWriter line may contain several rows of
    super/subscripts. They are put into a rectangular buffer (or rather
    2 parallel ones, one for font, one for character code).

       +---------------------------------+
       |  2    2              2ãi/n      |
       | x  + y  = 1 if é  = e      and  |  <-baseRow
       |                 i    i          |  <-lastRow
       +---------------------------------+

(4) Looking only at the buffer rectangle, we convert and write one line
    at a time. First we search for block substitutions. After that we
    search for the boundaries of the indices.

    The indices of the current level must lie in a rectangle defined as
    follows. Its left side is the leftmost column that has a non-empty
    symbol above or below the current baserow. Above and below, it is
    bounded by the rectangle of the previous level and by the current
    baserow. Its right boundary lies either just before the first symbol
    after the spaces in the current baserow, or is defined by the
    attributes of the letters in the current baserow.

    If the leftmost column of this rectangle contains only one symbol,
    the row of that symbol becomes the current baserow of the new
    rectangle. Otherwise the rectangle is split into narrower ones along
    the empty rows, if this option is specified and a split is possible.

    The stacked symbols are searched for at the same time as the
    boundaries of the indices.

Syntax of ChiWriter Files

   \+          superscript row follows
   \-          subscript row follows
   \=          end of text block
   \Hx, \Fx    header/footer, x=D for default, E for even,
               1..9 for page 1..9
   \S          separator
   \Nx         footnote text
   \Uxname     x=1..0, !..*, name = font name:
               use font# x for the font with the given name
   \0 ... \9   font change (1-10)
   \! ... \*   font change (11-20)
   \           soft space
   \,          hard return
   \/          no/soft page break
   \G          graphics
   \A          after soft hyphen
   \\          backslash
   \@          page number
   \^          expanding marker (for centering)
   \[          tab
   \]x         reverse tab
   \Fx         footnote
-------------------------------------------------------------------------*/

Syntax of Conversion Map file

Each line of the map file contains a command, starting with a 2-letter
command code. The code is followed by arguments that depend on the
command.

CH ch out-string [a] [b] [c] [M] [m] [C nnn]
      [P before-string after-string]
      [B before-string after-string] [N]

   The a, b and c options govern the recognition of index boundaries.
      a - search after
      b - search before
      c - don't search between two letters with this option
          (even if the a or b option is present)

   The P and B options govern the search for stacked symbols.
      P - if this symbol is over another one (which is on the temporary
          baseline of the index search) and inside the current rectangle
          of this search, then insert before-string and after-string
          around the symbol on the baseline.
      B - the same with "under".
      N - don't search for stacked symbols above and under this symbol.

   C nnn - denotes the category of this character for the
           word-searching algorithm.
      * 0 | l | L  means letter
      * 1 | ,      means punctuation (non-.)
      * 2 | ?      means unknown
      * 3 | +      means math (default <=>+)
      * 4 | .      means dot
      * 5 | )      means )
      * 6 | n | N  means nonmath (default ")
      * 7 | !      means non-math letter (as †)
      * 8 | '      means ' (affects single letters;
      *            compare I'm, I'd)
      * 9 | -      means -

   M(ultiline), m(ultiline) - mark the beginning and the end of a
           multiline search

Block substitution is allowed. It is done BEFORE the sub- and
superscript search, so it affects that search. The search is anchored at
the upper-left corner of the block (or at an explicitly specified ^
point), so it proceeds recursively from left to right and from top to
bottom, without stepping back. The font of the key letter (the one at
the upper-left corner) must be specified exactly.
Syntax (for a 3x3 block):

   BB [a] [b] [nnnn]
   B1 from11 from12 from13
   B1 from21 from22 from23
   B1 from31 from32 from33
   B2 to11 to12 to13
   B2 to21 to22 to23
   B2 to31 to32 to33
   BE

   where from/toIJ is
      SP (for space) || * (for everything) || CH ft ch || CH * ch || FT ft#

A block descriptor may use ^ (as a modifier in the first part and as a
field in the second). In the first part of the descriptor it marks the
letter from which the search must begin; in the second part it stands
for the substitution of the letter marked by the ^ modifier in the
first part.

After BB you can specify the options [a] [b] [nnnn]:
   a    - the baseline must be after the block
   b    - the baseline must be before the block
   nnnn - the baseline must be in the nnnn'th line of the block

RC ch           - character to denote FakeReturn
TC ch           - character to denote FakeTAB
RT CH ft# ch    - character to replace HardReturn (will be decoded later)
TA CH ft# ch    - character to replace HardTAB (will be decoded later)
CT CH ft# ch    - character to replace the page counter
                  (will be decoded later)
CE CH ft# ch    - character to replace the centering sign
                  (will be decoded later)
SE str1 str2    - strings to surround the separator block
FR str1 str2 str3
                - strings to surround the footer block
                  (str2 is inserted after the footer number)
HE str1 str2 str3
                - strings to surround the header block
                  (str2 is inserted after the header number)
FN str1 str2 str3 ft#
                - the same as for header/footer, plus the font to
                  replace the reference (will be decoded later)
TX str1 str2    - strings to surround a text-type font block inside math
MU str1 str2 str3
                - strings to begin, separate rows of, and end a
                  multi-row deciphering session (which begins and ends
                  with characters that have the 'M', 'm' attributes)
MB str1 str2 str3
                - strings to begin, separate rows of, and end a
                  multi-row deciphering session in subscripts
MP str1 str2 str3
                - strings to begin, separate rows of, and end a
                  multi-row deciphering session in superscripts
MC nnn          - maximal length of a block substitution cycle at one
                  location
FT xx in-string out-string [ t | p ]
                  't' denotes that this pattern is a text pattern and,
                  in math surroundings, must be surrounded by additional
                  TX attributes
                  'p' is for a phantom font, which can later be
                  translated into math or text
VE ...          - this line will be printed
; ...           - comment line - not inside the block!
SP ch           - the fakeSPACE symbol
EM ch           - the fakeEmpty symbol, which denotes the empty string
ES ch           - the Escape symbol, which denotes the beginning and end
                  of special fonts
FO ...          - multiple references to patterns are admissible in FO
                  strings
AS ft1 ft2      - makes the character substitution table of ft1 the same
                  as that of ft2 (no changes, please! They refer to the
                  same table)
LL xx           - line length
MA xx           - number of the font with plain in- and out-strings for
                  math
OR xx           - number of the font without in- and out-strings for
                  ends of paragraphs
RN ftfrom ftto  - renames ftfrom to ftto in math words (i.e., in short
                  words outside the nonmath list)
NW              - starts the input of a NEW nonmath list, for a new font
NM word         - nonmath list entry (in alphabetical order)
EN              - end of the nonmath list
TI ft1 ch1 ft2 ch2 ...
                - list of tie characters; affects the deciphering of
                  punctuation marks and short words

Notes.

FO fontname font# ...
   Declares a font. The numbers should be between 1 and 20. They are
   used in other directives to avoid constantly writing out the full
   font name. If the numbers differ from the order of fonts in the
   ChiWriter file, they are automatically rearranged.
   e.g.  FO ITALIC 3
         FO GREEK 7
         FO LINEDRAW 8
         FO MATHII 10
   STANDARD should always be 1.

CH font# char-replacement ...
   e.g.  CH 7 a \alpha
   replaces every Greek "a" by "\alpha".
   If the font number is 0, the replacement is done for all fonts.

FT font# On-command Off-command ...
   Toggles a font on/off.
   e.g.  FT 3 \it \rm

To embed a blank into a TeX code word, use a þ (ASCII 254) (or the
character specified by the SP directive). Fascinating trivia fact: On
some European keyboards, this code is produced when hitting the "umlaut"
key and [Space].
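
The FO and CH notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual code: the function names (parse_map, build_translation, replace_char) are hypothetical, font names are assumed to match case-insensitively, any FO or CH arguments beyond the ones shown in the notes are ignored, and the "CH 0 % \%" line is an invented example of a font-0 rule.

```python
# Hypothetical sketch: parse FO/CH directives, build the font translation
# table (ChiWriter file font number -> map-file font number), and look up
# a character replacement with font 0 acting as "all fonts".

def parse_map(lines):
    fonts = {}          # font name -> map-file font number (1..20)
    replacements = {}   # (map-file font number, char) -> output string
    for line in lines:
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "FO":
            number = int(parts[2])
            if not 1 <= number <= 20:
                raise ValueError(f"font number out of range: {line!r}")
            fonts[parts[1].upper()] = number   # assumption: case-insensitive
        elif len(parts) >= 4 and parts[0] == "CH":
            # Form taken from the Notes: CH font# char replacement
            replacements[(int(parts[1]), parts[2])] = parts[3]
        # Every other line (comments starting with ';', other
        # directives) is skipped in this sketch.
    return fonts, replacements


def build_translation(doc_font_names, fonts):
    # doc_font_names: font names in the order they occur in the ChiWriter
    # file, so position 1 is the document's font 1, and so on.
    return {doc_no: fonts[name.upper()]
            for doc_no, name in enumerate(doc_font_names, start=1)
            if name.upper() in fonts}


def replace_char(doc_font, char, translation, replacements):
    map_font = translation.get(doc_font)
    # Try the font-specific rule first, then the font-0 rule.
    for key in ((map_font, char), (0, char)):
        if key in replacements:
            return replacements[key]
    return char   # no rule: keep the character unchanged


MAP_LINES = [
    "FO STANDARD 1",
    "FO ITALIC 3",
    "FO GREEK 7",
    "; a comment line",
    "CH 7 a \\alpha",
    "CH 0 % \\%",      # invented example of a rule for all fonts
]
fonts, replacements = parse_map(MAP_LINES)
# Suppose the document lists its fonts as STANDARD, GREEK, ITALIC.
translation = build_translation(["STANDARD", "GREEK", "ITALIC"], fonts)
print(translation)
print(replace_char(2, "a", translation, replacements))
```

Here the document's font 2 (GREEK) maps to map-file font 7, so its "a" picks up the CH 7 rule, while the same letter in any other font is left alone.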