% This file generates the user manual; TeX it, don't read it!
\def\tangref{3} % where the main explanation of CTANGLing is given

\input cwebmac
\pdffalse\acrohintfalse
\def\page{\box255 } \normalbottom
\parskip 0pt plus 1pt
\def\RA{\char'31 } % right arrow
\def\hang{\hangindent 4em\ignorespaces}
\font\eightrm=cmr8
\font\ninerm=cmr9
\font\ninett=cmtt9
\font\eighttt=cmtt8
\font\twelvett=cmtt12
\font\quoterm=cmssq8
\font\quoteit=cmssqi8
\font\authorfont=cmr12
\font\sectionfont=cmbx12
\def\pb{\.{|...|}}
\def\v{\.{\char'174}} % vertical (|) in typewriter font
\def\lpile{\def\cr{\hfill\endline}\matrix} % I only use \lpile by itself
\abovedisplayskip=.5\abovedisplayskip
\belowdisplayskip=.5\belowdisplayskip
\abovedisplayshortskip=.5\abovedisplayshortskip
\belowdisplayshortskip=.5\belowdisplayshortskip
\advance\baselineskip by -.5pt
\advance\pageheight by \baselineskip % the manual just got a bit longer
\advance\fullpageheight by \baselineskip
\setpage

\outer\def\section #1.{\penalty-500\bigskip
  \centerline{\sectionfont\def\.##1{{\twelvett##1}} #1}\nobreak\vskip 6pt
  \everypar{\hskip-\parindent\everypar{}}}

\def\lheader{\mainfont\the\pageno\hfill\sc\runninghead\hfill}
\def\rheader{\hfill\sc\runninghead\hfill\mainfont\the\pageno}
\def\runninghead{{\tentt CWEB} USER MANUAL (VERSION 4.11)}

% This verbatim mode assumes that ! marks are !! in the text being copied.
\def\verbatim{\begingroup
  \def\do##1{\catcode`##1=12 } \dospecials
  \parskip 0pt \parindent 0pt \let\!=!
  \catcode`\ =13 \catcode`\^^M=13
  \tt \catcode`\!=0 \verbatimdefs \verbatimgobble}
{\catcode`\^^M=13{\catcode`\ =13\gdef\verbatimdefs{\def^^M{\ \par}\let =\ }} %
  \gdef\verbatimgobble#1^^M{}}

\null\vfill
\centerline{\titlefont The {\ttitlefont CWEB} System of Structured Documentation}
\vskip 18pt\centerline{(Version 4.11 --- December 2023)}
\vskip 24pt
\centerline{\authorfont Donald E. Knuth and Silvio Levy}
\vfill

\noindent
\TeX\ is a trademark of the American Mathematical Society.
\bigskip\noindent The printed form of this manual is copyright \copyright\ 1994 by Addison-Wesley Publishing Company, Inc. All rights reserved. \smallskip\noindent The electronic form is copyright \copyright\ 1987, 1990, 1993, 2000 by Silvio Levy and Donald E. Knuth. \bigskip\noindent Permission is granted to make and distribute verbatim copies of the electronic form of this document provided that the electronic copyright notice and this permission notice are preserved on all copies. \smallskip\noindent Permission is granted to copy and distribute modified versions of the electronic form of this document under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. \smallskip\noindent Individuals may make copies of the documentation from the electronic files for their own personal use. \smallskip\noindent Internet page \.{http://www-cs-faculty.stanford.edu/\char`\~knuth/cweb.html} contains current info about \.{CWEB} and related topics. \smallskip\noindent From there you can also reach the \.{CWEB} development page \.{https://github.com/ascherer/cweb} with the really current news. \pageno=0 \titletrue\eject \titletrue \centerline{\titlefont The {\ttitlefont CWEB} System of Structured Documentation} \vskip 15pt plus 3pt minus 3pt \centerline{\authorfont Donald E. Knuth and Silvio Levy} \vskip 24pt plus 3pt minus 3pt \noindent This document describes a version of Don Knuth's \.{WEB} system, adapted to \CEE/ by Silvio Levy. Since its creation in 1987, \.{CWEB} has been revised and enhanced in various ways, by both Knuth and Levy. We now believe that its evolution is near an end; however, bug reports, suggestions and comments are still welcome, and should be sent to the \TeX-related mailing list \.{tex-k@tug.org}. 
Readers who are familiar with Knuth's memo ``The \.{WEB} System of Structured Documentation'' will be able to skim this material rapidly, because \.{CWEB} and \.{WEB} share the same philosophy and (essentially) the same syntax. In some respects \.{CWEB} is a simplification of \.{WEB}: for example, \.{CWEB} does not need \.{WEB}'s features for macro definition and string handling, because \CEE/ and its preprocessor already take care of macros and strings. Similarly, the \.{WEB} conventions of denoting octal and hexadecimal constants by \.{@'77} and \.{@"3f} are replaced by \CEE/'s conventions \.{077} and \.{0x3f}. All other features of \.{WEB} have been retained, and new features have been added. We thank all who contributed suggestions and criticism to the development of \.{CWEB}. We are especially grateful to Steve Avery, Nelson Beebe, Hans-Hermann Bode, Klaus Guntermann, Norman Ramsey, Joachim Schnitter, and Saroj Mahapatra, who contributed code, and to Cameron Smith, who made many suggestions improving the manual. Ramsey has made literate programming accessible to users of yet other languages by means of his \.{SPIDER} system [see {\sl Communications of the ACM\/ \bf32} (1989), 1051--1055]. The book {\sl Literate Programming\/} by Knuth (1992) contains a comprehensive bibliography of related early work. Bode, Schnitter, and Mahapatra adapted \.{CWEB} so that it works for \CPLUSPLUS/ as well; therefore in the text below you can read \CPLUSPLUS/ for \CEE/ if you so desire. \section Introduction. The philosophy behind \.{CWEB} is that programmers who want to provide the best possible documentation for their programs need two things simultaneously: a language like \TEX/ for formatting, and a language like \CEE/ for programming. Neither type of language can provide the best documentation by itself. But when both are appropriately combined, we obtain a system that is much more useful than either language separately. 
The structure of a software program may be thought of as a ``web'' that is made up of many interconnected pieces. To document such a program, we want to explain each individual part of the web and how it relates to its neighbors. The typographic tools provided by \TEX/ give us an opportunity to explain the local structure of each part by making that structure visible, and the programming tools provided by \CEE/ make it possible for us to specify the algorithms formally and unambiguously. By combining the two, we can develop a style of programming that maximizes our ability to perceive the structure of a complex piece of software, and at the same time the documented programs can be mechanically translated into a working software system that matches the documentation. The \.{CWEB} system consists of two programs named \.{CWEAVE} and \.{CTANGLE}. When writing a \.{CWEB} program the user keeps the \CEE/ code and the documentation in the same file, called the \.{CWEB} file and generally named \.{something.w}. The command `\.{cweave} \.{something}' creates an output file \.{something.tex}, which can then be fed to \TEX/, yielding a ``pretty printed'' version of \.{something.w} that correctly handles typographic details like page layout and the use of indentation, italics, boldface, and mathematical symbols. The typeset output also includes extensive cross-index information that is gathered automatically. Similarly, if you run the command `\.{ctangle} \.{something}' you will get a \CEE/ file \.{something.c}, which can then be compiled to yield executable code. Besides providing a documentation tool, \.{CWEB} enhances the \CEE/ language by providing the ability to permute pieces of the program text, so that a large system can be understood entirely in terms of small sections and their local interrelationships. 
The \.{CTANGLE} program is so named because it takes a given web and moves the sections from their web structure into the order required by \CEE/; the advantage of programming in \.{CWEB} is that the algorithms can be expressed in ``untangled'' form, with each section explained separately. The \.{CWEAVE} program is so named because it takes a given web and intertwines the \TEX/ and \CEE/ portions contained in each section, then it knits the whole fabric into a structured document. (Get it? Wow.) Perhaps there is some deep connection here with the fact that the German word for ``weave'' is ``{\it webe\/}'', and the corresponding Latin imperative is ``{\it texe\/}''! A user of \.{CWEB} should be fairly familiar with the \CEE/ programming language. A minimal amount of acquaintance with \TEX/ is also desirable, but in fact it can be acquired as one uses \.{CWEB}, since straight text can be typeset in \TEX/ with virtually no knowledge of that language. To someone familiar with both \CEE/ and \TEX/ the amount of effort necessary to learn the commands of \.{CWEB} is small. \section Overview. Two kinds of material go into \.{CWEB} files: \TEX/ text and \CEE/ text. A programmer writing in \.{CWEB} should be thinking both of the documentation and of the \CEE/ program being created; i.e., the programmer should be instinctively aware of the different actions that \.{CWEAVE} and \.{CTANGLE} will perform on the \.{CWEB} file. \TEX/ text is essentially copied without change by \.{CWEAVE}, and it is entirely deleted by \.{CTANGLE}; the \TEX/ text is ``pure documentation.'' \CEE/ text, on the other hand, is formatted by \.{CWEAVE} and it is shuffled around by \.{CTANGLE}, according to rules that will become clear later. For now the important point to keep in mind is that there are two kinds of text. Writing \.{CWEB} programs is something like writing \TEX/ documents, but with an additional ``\CEE/ mode'' that is added to \TEX/'s horizontal mode, vertical mode, and math mode. 
A \.{CWEB} file is built up from units called {\sl sections\/} that are more or less self-contained. Each section has three parts: \yskip\item{$\bullet$} A \TEX/ part, containing explanatory material about what is going on in the section. \item{$\bullet$} A middle part, containing macro definitions that serve as abbreviations for \CEE/ constructions that would be less comprehensible if written out in full each time. They are turned by \.{CTANGLE} into preprocessor macro definitions. \item{$\bullet$} A \CEE/ part, containing a piece of the program that \.{CTANGLE} will produce. This \CEE/ code should ideally be about a dozen lines long, so that it is easily comprehensible as a unit and so that its structure is readily perceived. \yskip\noindent The three parts of each section must appear in this order; i.e., the \TEX/ commentary must come first, then the middle part, and finally the \CEE/ code. Any of the parts may be empty. A section begins with either of the symbols `\.{@\ }' or `\.{@*}', where `\.{\ }' denotes a blank space. A section ends at the beginning of the next section (i.e., at the next `\.{@\ }' or `\.{@*}'), or at the end of the file, whichever comes first. The \.{CWEB} file may also contain material that is not part of any section at all, namely the text (if any) that occurs before the first section. Such text is said to be ``in limbo''; it is ignored by \.{CTANGLE} and copied essentially verbatim by \.{CWEAVE}, so its function is to provide any additional formatting instructions that may be desired in the \TEX/ output. Indeed, it is customary to begin a \.{CWEB} file with \TEX/ code in limbo that loads special fonts, defines special macros, changes the page sizes, and/or produces a title page. Sections are numbered consecutively, starting with 1. 
These numbers appear at the beginning of each section of the \TEX/ documentation output by \.{CWEAVE}, and they appear as bracketed comments at the beginning and end of the code generated by that section in the \CEE/ program output by \.{CTANGLE}. \section Section Names. Fortunately, you never mention these numbers yourself when you are writing in \.{CWEB}. You just say `\.{@\ }' or `\.{@*}' at the beginning of each new section, and the numbers are supplied automatically by \.{CWEAVE} and \.{CTANGLE}. As far as you are concerned, a section has a {\sl name\/} instead of a number; its name is specified by writing `\.{@<}' followed by \TEX/ text followed by `\.{@>}'. When \.{CWEAVE} outputs a section name, it replaces the `\.{@<}' and `\.{@>}' by angle brackets and inserts the section number in small type. Thus, when you read the output of \.{CWEAVE} it is easy to locate any section that is referred to in another section. For expository purposes, a section name should be a good description of the contents of that section; i.e., it should stand for the abstraction represented by the section. Then the section can be ``plugged into'' one or more other sections in such a way that unimportant details of its inner workings are suppressed. A section name therefore ought to be long enough to convey the necessary meaning. Unfortunately, it is laborious to type such long names over and over again, and it is also difficult to specify a long name twice in exactly the same way so that \.{CWEAVE} and \.{CTANGLE} will be able to match the names to the sections. To ameliorate this situation, \.{CWEAVE} and \.{CTANGLE} let you abbreviate a section name, so long as the full name appears somewhere in the \.{CWEB} file; you can type simply `\.{@<$\alpha$...@>}', where $\alpha$ is any string that is a prefix of exactly one section name appearing in the file. For example, `\.{@<Clear the arrays@>}' can be abbreviated to `\.{@<Clear...@>}' if no other section name begins with the five letters `\.{Clear}'.
Elsewhere you might use the abbreviation `\.{@<Clear t...@>}', and so on. Section names must otherwise match character for character, except that consecutive characters of white space (spaces, tab marks, newlines, and/or form feeds) are treated as equivalent to a single space, and such spaces are deleted at the beginning and end of the name. Thus, `\.{@< Clear { }the arrays @>}' will also match the name in the previous example. Spaces following the ellipsis in abbreviations are ignored as well, but not those before, so that `\.{@<Clear t ...@>}' would not match `\.{@<Clear the arrays@>}'. \section What \.{CTANGLE} Does. We have said that a section begins with `\.{@\ }' or `\.{@*}', but we didn't say how it gets divided up into a \TEX/ part, a middle part, and a \CEE/ part. The middle part begins with the first appearance of `\.{@d}' or `\.{@f}' in the section, and the \CEE/ part begins with the first appearance of `\.{@c}' or `\.{@<section name@>=}'. In the latter case you are saying, in effect, that the section name stands for the \CEE/ text that follows. Alternatively, if the \CEE/ part begins with `\.{@c}' instead of a section name, the current section is said to be {\sl unnamed}.
The construct `\.{@<section name@>}' can appear any number of times in the \CEE/ part of a section: Subsequent appearances indicate that a named section is being ``used'' rather than ``defined.'' In other words, the \CEE/ code for the named section, presumably defined elsewhere, should be spliced in at this point in the \CEE/ program. Indeed, the main idea of \.{CTANGLE} is to make a \CEE/ program out of individual sections, named and unnamed. The exact way in which this is done is this: First all the macro definitions indicated by `\.{@d}' are turned into \CEE/ preprocessor macro definitions and copied at the beginning. Then the \CEE/ parts of unnamed sections are copied down, in order; this constitutes the first-order approximation to the text of the program. (There should be at least one unnamed section, otherwise there will be no program.) Then all section names that appear in the first-order approximation are replaced by the \CEE/ parts of the corresponding sections, and this substitution process continues until no section names remain. All comments are removed, because the \CEE/ program is intended only for the eyes of the \CEE/ compiler. If the same name has been given to more than one section, the \CEE/ text for that name is obtained by putting together all of the \CEE/ parts in the corresponding sections. This feature is useful, for example, in a section named `Global variables', since one can then declare global variables in whatever sections those variables are introduced. When several sections have the same name, \.{CWEAVE} assigns the first section number as the number corresponding to that name, and it inserts a note at the bottom of that section telling the reader to `See also sections so-and-so'; this footnote gives the numbers of all the other sections having the same name as the present one.
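
\yskip
To make these rules concrete, here is a minimal sketch of a complete \.{CWEB} file. It is only an illustration: the program, the section names, and the identifiers \\{limit}, \\{count}, and \\{step} are invented for this example. The file has one unnamed section and two sections named `Global variables', the second of which uses an abbreviated name.
\yskip
\verbatim
@* A tiny example. This hypothetical program prints a count.

@d limit 10

@c
#include <stdio.h>
@<Global variables@>
int main(void)
{
  @<Print the count@>;
  return 0;
}

@ @<Global variables@>=
int count=limit;

@ Here the name is abbreviated, and the code is appended.
@<Global...@>=
int step=1;

@ @<Print the count@>=
printf("%d\n",count+step)
!endgroup
\yskip\noindent
Following the procedure described above, \.{CTANGLE} should put the \.{\#define} for \\{limit} first and then copy the unnamed section, splicing in both declarations (in the order of their sections) where `Global variables' is used and the \.{printf} call where `Print the count' is used.
\yskip
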
The \CEE/ text corresponding to a section is usually formatted by \.{CWEAVE} so that the output has an equivalence sign in place of the equals sign in the \.{CWEB} file; i.e., the output says `$\langle\,$section name$\,\rangle\equiv\null$\CEE/ text'. However, in the case of the second and subsequent appearances of a section with the same name, this `$\equiv$' sign is replaced by `$\mathrel+\equiv$', as an indication that the following \CEE/ text is being appended to the \CEE/ text of another section. As \.{CTANGLE} enters and leaves sections, it inserts preprocessor \.{\#line} commands into the \CEE/ output file. This means that when the compiler gives you error messages, or when you debug your program, the messages refer to line numbers in the \.{CWEB} file, and not in the \CEE/ file. In most cases you can therefore forget about the \CEE/ file altogether. \section What \.{CWEAVE} Does. The general idea of \.{CWEAVE} is to make a \.{.tex} file from the \.{CWEB} file in the following way: The first line of the \.{.tex} file tells \TEX/ to input a file with macros that define \.{CWEB}'s documentation conventions. The next lines of the file will be copied from whatever \TEX/ text is in limbo before the first section. Then comes the output for each section in turn, possibly interspersed with end-of-page marks. Finally, \.{CWEAVE} will generate a cross-reference index that lists each section number in which each \CEE/ identifier appears, and it will also generate an alphabetized list of the section names, as well as a table of contents that shows the page and section numbers for each ``starred'' section. What is a ``starred'' section, you ask? A section that begins with `\.{@*}' instead of `\.{@\ }' is slightly special in that it denotes a new major group of sections. The `\.{@*}' should be followed by the title of this group, followed by a period. 
Such sections will always start on a new page in the \TEX/ output, and the group title will appear as a running headline on all subsequent pages until the next starred section. The title will also appear in the table of contents, and in boldface type at the beginning of its section. Caution: Do not use \TEX/ control sequences in such titles, unless you know that the \.{cwebmac} macros will do the right thing with them. The reason is that these titles are converted to uppercase when they appear as running heads, and they are converted to boldface when they appear at the beginning of their sections, and they are also written out to a table-of-contents file used for temporary storage while \TEX/ is working; whatever control sequences you use must be meaningful in all three of these modes. The \TEX/ output produced by \.{CWEAVE} for each section consists of the following: First comes the section number (e.g., `\.{\\M123.}' at the beginning of section 123, except that `\.{\\N}' appears in place of `\.{\\M}' at the beginning of a starred section). Then comes the \TEX/ part of the section, copied almost verbatim except as noted below. Then comes the middle part and the \CEE/ part, formatted so that there will be a little extra space between them if both are nonempty. The middle and \CEE/ parts are obtained by inserting a bunch of funny-looking \TEX/ macros into the \CEE/ program; these macros handle typographic details about fonts and proper math spacing, as well as line breaks and indentation. \section C Code in \TEX/ Text and Vice Versa. When you are typing \TEX/ text, you will probably want to make frequent reference to variables and other quantities in your \CEE/ code, and you will want those variables to have the same typographic treatment when they appear in your text as when they appear in your program. Therefore the \.{CWEB} language allows you to get the effect of \CEE/ editing within \TEX/ text, if you place `\.|' marks before and after the \CEE/ material. 
For example, suppose you want to say something like this: $$\hbox{If \PB{\\{pa}} is declared as `\PB{\&{int} ${}{*}\\{pa}$}', the assignment \PB{$\\{pa}\K{\AND}\|a[\T{0}]$} makes \PB{\\{pa}} point to the zeroth element of \PB{\|a}.}$$ The \TEX/ text would look like this in your \.{CWEB} file: $$\lpile{\.{If |pa| is declared as `|int *pa|', the assignment}\cr \.{|pa=\&a[0]| makes |pa| point to the zeroth element of |a|.}\cr}$$ And \.{CWEAVE} translates this into something you are glad you didn't have to type: $$\lpile{\.{If \\PB\{\\\\\{pa\}\} is declared as `\\PB\{\\\&\{int\} \$\{\}\{*\}\\\\\{pa\}\$\}', the}\cr \.{assignment \\PB\{\$\\\\\{pa\}\\K\{\\AND\}\\|a[\\T\{0\}]\$\} makes \\PB\{\\\\\{pa\}\} point}\cr \.{to the zeroth element of \\PB\{\\|a\}.}\cr}$$ Incidentally, the cross-reference index that \.{CWEAVE} would make, in the presence of a comment like this, would include the current section number as one of the index entries for \\{pa}, even though \\{pa} might not appear in the \CEE/ part of this section. Thus, the index covers references to identifiers in the explanatory comments as well as in the program itself; you will soon learn to appreciate this feature. However, the identifiers \&{int} and \|a\ would not be indexed, because \.{CWEAVE} does not make index entries for reserved words or single-letter identifiers. Such identifiers are felt to be so ubiquitous that it would be pointless to mention every place where they occur. Although a section begins with \TEX/ text and ends with \CEE/ text, we have noted that the dividing line isn't sharp, since \CEE/ text can be included in \TEX/ text if it is enclosed in `\pb'. Conversely, \TEX/ text appears frequently within \CEE/ text, because everything in comments (i.e., between \.{/*} and \.{*/}, or following \.{//}) is treated as \TEX/ text. Likewise, the text of a section name consists of \TEX/ text, but the construct \.{@<section name@>} as a whole is expected to be found in \CEE/ text; thus, one typically goes back and forth between the \CEE/ and \TEX/ environments in a natural way, as in these examples: $$ \displaylines{ \hbox{\.{if} \.{(x==0)} \.{@<Empty} \.{the} \.{|buffer|} \.{array@>}} \cr \hbox{\.{...} \.{using} \.{the} \.{algorithm} \.{in} \.{|@<Linear} \.{search@>|.}} } $$
The first of these excerpts would be found in the \CEE/ part of a section, into which the code from the section named ``Empty the \\{buffer} array'' is being spliced. The second excerpt would be found in the \TEX/ part of the section, and the named section is being ``cited'', rather than defined or used. (Note the `\pb' surrounding the section name in this case.) \section Macros. The control code \.{@d} followed by $$\\{identifier}\.{ }\hbox{\CEE/ text}\qquad\hbox{or by}\qquad \\{identifier}\.(\\{par}_1,\ldots,\\{par}_n\.{) }\hbox{\CEE/ text}$$ (where there is no blank between the \\{identifier} and the parentheses in the second case) is transformed by \.{CTANGLE} into a preprocessor command, starting with \.{\#define}, which is printed at the top of the \CEE/ output file as explained earlier. A `\.{@d}' macro definition can go on for several lines, and the newlines don't have to be protected by backslashes, since \.{CTANGLE} itself inserts the backslashes. If for any reason you need a \.{\#define} command at a specific spot in your \CEE/ file, you can treat it as \CEE/ code, instead of as a \.{CWEB} macro; but then you do have to protect newlines yourself. \section Strings and constants. If you want a string to appear in the \CEE/ file, delimited by pairs of \.' or \." marks as usual, you can type it exactly so in the \.{CWEB} file, except that the character `\.@' should be typed `\.{@@}' (it becomes a control code, the only one that can appear in strings; see below). Strings should end on the same line as they begin, unless there's a backslash at the end of lines within them.
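
\yskip
As a small sketch of the macro and string conventions just described, consider the following hypothetical section; the macro \\{swap}, the variables, and the message are invented for illustration. The macro body continues on a second line without a backslash, and the `\.@' in the string is doubled.
\yskip
\verbatim
@ A hypothetical section with a macro and a string.

@d swap(a,b)
  {int t=a; a=b; b=t;}

@c
#include <stdio.h>
int main(void)
{
  int x=1, y=2;
  swap(x,y);
  printf("Send reports to tex-k@@tug.org (%d %d)\n",x,y);
  return 0;
}
!endgroup
\yskip\noindent
Here \.{CTANGLE} should produce a \.{\#define} for \\{swap}, supplying any backslashes itself, and should write a single `\.@' in the string that reaches the \CEE/ compiler.
\yskip
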
\TEX/ and \CEE/ have different ways to refer to octal and hex constants, because \TEX/ is oriented to technical writing while \CEE/ is oriented to computer processing. In \TEX/ you make a constant octal or hexadecimal by prepending \.' or \.", respectively, to it; in \CEE/ the constant should be preceded by \.0 or \.{0x}. In \.{CWEB} it seems reasonable to let each convention hold in its respective realm; so in \CEE/ text you get $40_8$ by typing `\.{040}', which \.{CTANGLE} faithfully copies into the \CEE/ file (for the compiler's benefit) and which \.{CWEAVE} prints as $\T{\~40}$. Similarly, \.{CWEAVE} prints the hexadecimal \CEE/ constant `\.{0x20}' as \T{\^20}. The use of italic font for octal digits and typewriter font for hexadecimal digits makes the meaning of such constants clearer in a document. For consistency, then, you should type `\.{|040|}' or `\.{|0x20|}' in the \TEX/ part of the section. And if you type a binary constant like `\.{0b00101010}', \.{CWEAVE} prints it as $\T{\\00101010}$. In all numeric literals you may add \.' separators for improved readability. \section Control codes. A \.{CWEB} {\sl control code\/} is a two-character combination of which the first is `\.@'. We've already seen the meaning of several control codes; it's time to list them more methodically. In the following list, the letters in brackets after a control code indicate in what contexts that code is allowed. $L$ indicates that the code is allowed in limbo; $T$ (for \TEX/), $M$ (for middle), and $C$ (for \CEE/) mean that the code is allowed in each of the three parts of a section, at top level---that is, outside such constructs as `\pb' and section names. An arrow $\to$ means that the control code terminates the present part of the \.{CWEB} file, and inaugurates the part indicated by the letter following the arrow. 
Thus $[LTMC\to T]$ next to \.{@\ } indicates that this control code can occur in limbo, or in any of the three parts of a section, and that it starts the (possibly empty) \TEX/ part of the following section. Two other abbreviations can occur in these brackets: The letter $r$ stands for {\it restricted context}, that is, material inside \CEE/ comments, section names, \CEE/ strings and control texts (defined below); the letter $c$ stands for {\it inner \CEE/ context}, that is, \CEE/ material inside `\pb' (including `\pb's inside comments, but not those occurring in other restricted contexts). An asterisk $*$ following the brackets means that the context from this control code to the matching \.{@>} is restricted. Control codes involving letters are case-insensitive; thus \.{@d} and \.{@D} are equivalent. Only the lowercase versions are mentioned specifically below. \gdef\@#1[#2] {\penalty-50\yskip\hangindent 2em\noindent\.{@#1\unskip \spacefactor1000{ }}$[#2]$\quad} \def\more{\hangindent 2em \hangafter0} \def\subsec{\penalty-300\medskip\noindent} \@@ [LTMCrc] A double \.@ denotes the single character `\.@'. This is the only control code that is legal everywhere. Note that you must use this convention if you are giving an internet email address in a \.{CWEB} file (e.g., \.{tex-k@@tug.org}). \subsec Here are the codes that introduce the \TEX/ part of a section. \@\ [LTMC\to T] This denotes the beginning of a new (unstarred) section. A tab mark or form feed or end-of-line character is equivalent to a space when it follows an \.@ sign (and in most other cases). \@* [LTMC\to T] This denotes the beginning of a new starred section, i.e., a section that begins a new major group. The title of the new group should appear after the \.{@*}, followed by a period. As explained above, \TEX/ control sequences should be avoided in such titles unless they are quite simple. 
When \.{CWEAVE} and \.{CTANGLE} read a \.{@*}, they print an asterisk on the terminal followed by the current section number, so that the user can see some indication of progress. The very first section should be starred. \more You can specify the ``depth'' of a starred section by typing \.* or a decimal number after the \.{@*}; this indicates the relative ranking of the current group of sections in the program hierarchy. Top-level portions of the program, introduced by \.{@**}, get their names typeset in boldface type in the table of contents; they are said to have depth~$-1$. Otherwise the depth is a nonnegative number, which governs the amount of indentation on the contents page. Such indentation helps clarify the structure of a long program. The depth is assumed to be 0 if it is not specified explicitly; when your program is short, you might as well leave all depths zero. A starred section always begins a new page in the output, unless the depth is greater than~1. \subsec The middle part of each section consists of any number of macro definitions (beginning with \.{@d}) and format definitions (beginning with \.{@f} or \.{@s}), intermixed in any order. \@d [TM\to M] Macro definitions begin with \.{@d}, followed by an identifier and optional parameters and \CEE/ text as explained earlier. \@f [TM\to M] Format definitions begin with \.{@f}; they cause \.{CWEAVE} to treat identifiers in a special way when they appear in \CEE/ text. The general form of a format definition is `\.{@f} \|l \|r', followed by an optional comment enclosed between \.{/*} and \.{*/}, where \|l and \|r are identifiers; \.{CWEAVE} will subsequently treat identifier \|l as it currently treats \|r. This feature allows a \.{CWEB} programmer to invent new reserved words and/or to unreserve some of \CEE/'s reserved identifiers. 
For example, the common words `error' and `line' have been given a special meaning in the \CEE/ preprocessor, so \.{CWEAVE} is set up to format them specially; if you want a variable named \\{error} or \\{line}, you should say $$\.{@f error normal}\qquad\qquad\.{@f line normal}$$ somewhere in your program. \more If \|r is the special identifier `\\{TeX}', identifier \|l will be formatted as a \TEX/ control sequence; for example, `\.{@f foo TeX}' in the \.{CWEB} file will cause identifier \\{foo} to be output as \.{\\foo} by \.{CWEAVE}. The programmer should define \.{\\foo} to have whatever custom format is desired, assuming \TEX/ math mode. (Each underline character is converted to \.{x} when making the \TEX/ control sequence, and each dollar sign is converted to~\.X; thus \\{foo\_bar} becomes \.{\\fooxbar}. Other characters, including digits, are left untranslated, so \TEX/ will consider them as macro parameters, not as part of the control sequence itself. For example, $$\.{\\def\\x\#1\{x\_\{\#1\}\} @f x1 TeX @f x2 TeX}$$ will format \.{x1} and \.{x2} not as \\{x1} and \\{x2} but as $x_1$ and $x_2$.) \more If \|r is the special identifier `\\{make\_pair}', identifier \|l will be treated as a \CPLUSPLUS/ function template. For example, after \.{@f}~\.{convert}~\.{make\_pair} one can say `\.{convert<int>(2.5)}' without having \.< and \.> misunderstood as less-than and greater-than signs. \more \.{CWEAVE} knows that identifiers being defined with a \&{typedef} should become reserved words; thus you don't need format definitions very often. \@s [TM\to M;\;L] Same as \.{@f}, but \.{CWEAVE} does not show the format definition in the output, and the optional \CEE/ comment is not allowed. This is used mostly in \.{@i} files. \subsec Next come the codes that govern the \CEE/ part of a section. \@{c @p} [TM\to C] The \CEE/ part of an unnamed section begins with \.{@c} (or with \.{@p} for ``program''; both control codes do the same thing).
This causes \.{CTANGLE} to append the following \CEE/ code to the first-order program text, as explained on page~\tangref. Note that \.{CWEAVE} does not print a `\.{@c}' in the \TEX/ output, so if you are creating a \.{CWEB} file based on a \TEX/-printed \.{CWEB} documentation you have to remember to insert \.{@c} in the appropriate places of the unnamed sections. \@< [TM\to C;\; C;\; c] $*$ This control code introduces a section name (or unambiguous prefix, as discussed above), which consists of \TEX/ text and extends to the matching \.{@>}. The whole construct \.{@<...@>} is conceptually a \CEE/ element. The behavior is different depending on the context: \more A \.{@<} appearing in contexts $T$ and $M$ attaches the following section name to the current section, and inaugurates the \CEE/ part of the section. The closing \.{@>} should be followed by \.{=} or \.{+=}. \more In context $C$, \.{@<} indicates that the named section is being used---its \CEE/ definition is spliced in by \.{CTANGLE}, as explained on page~\tangref. As an error-detection measure, \.{CTANGLE} and \.{CWEAVE} complain if such a section name is followed by \.=, because most likely this is meant as the definition of a new section, and so should be preceded by \.{@\ }. If you really want to say $\langle\,$foo$\,\rangle=\\{bar}$, where $\langle\,$foo$\,\rangle$ is being used and not defined, put a newline before the \.=. \more Finally, in inner \CEE/ context (that is, within `\pb' in the \TEX/ part of a section or in a comment), \.{@<...@>} means that the named section is being cited. Such an occurrence is ignored by \.{CTANGLE}. Note that even here we think of the section name as being a \CEE/ element, hence the \pb. \@( [TM\to C;\;C;\;c] $*$ A section name can begin with \.{@(}. Everything works just as for \.{@<}, except that the \CEE/ code of the section named \.{@(foo@>} is written by \.{CTANGLE} to file \.{foo}. In this way you can get multiple-file output from a single \.{CWEB} file. 
(The \.{@d} definitions are not output to such files, only to the master \.{.c} file.) One use of this feature is to produce header files for other program modules that will be loaded with the present one. Another use is to produce a test routine that goes with your program. By keeping the sources for a program and its header and test routine together, you are more likely to keep all three consistent with each other. Notice that the output of a named section can be incorporated in several different output files, because you can mention \.{@<foo@>} in both \.{@(bar1@>} and \.{@(bar2@>}. \@h [Cc] Causes \.{CTANGLE} to insert at the current spot the \.{\#define} statements from the middle parts of all sections, and {\it not\/} to write them at the beginning of the \CEE/ file. Useful when you want the macro definitions to come after the include files, say. (Ignored by \.{CTANGLE} inside `\pb'.) \subsec The next several control codes introduce ``control texts,'' which end with the next `\.{@>}'. The closing `\.{@>}' must be on the same line of the \.{CWEB} file as the line where the control text began. The context from each of these control codes to the matching \.{@>} is restricted. \@\^ [TMCc] $*$ The control text that follows, up to the next `\.{@>}', will be entered into the index together with the identifiers of the \CEE/ program; this text will appear in roman type. For example, to put the phrase ``system dependencies'' into the index that is output by \.{CWEAVE}, type `\.{@\^system dependencies@>}' in each section that you want to index as system dependent. \@. [TMCc] $*$ The control text that follows will be entered into the index in \.{typewriter} \.{type}. \@: [TMCc] $*$ The control text that follows will be entered into the index in a format controlled by the \TEX/ macro `\.{\\9}', which you should define as desired. \@t [MCc] $*$ The control text that follows will be put into a \TEX/ \.{\\hbox} and formatted along with the neighboring \CEE/ program.
This text is ignored by \.{CTANGLE}, but it can be used for various purposes within \.{CWEAVE}. For example, you can make comments that mix \CEE/ and classical mathematics, as in `$\\{size}<2^{15}$', by typing `\.{|size < 2@t\$\^\{15\}\$@>|}'. \@= [MCc] $*$ The control text that follows will be passed verbatim to the \CEE/ program. \@q [LTMCc] $*$ The control text that follows will be totally ignored---it's a comment for readers of the \.{CWEB} file only. A file intended to be included in limbo, with \.{@i}, can identify itself with \.{@q} comments. Another use is to balance unbalanced parentheses in \CEE/ strings, so that your text editor's parenthesis matcher doesn't go into a tailspin. \@! [TMCc] $*$ The section number in an index entry will be underlined if `\.{@!}' immediately precedes the identifier or control text being indexed. This convention is used to distinguish the sections where an identifier is defined, or where it is explained in some special way, from the sections where it is used. A~reserved word or an identifier of length one will not be indexed except for underlined entries. An `\.{@!}' is implicitly inserted by \.{CWEAVE} when an identifier is being defined or declared in \CEE/ code; for example, the definition $$\hbox{\&{int} \\{array}[\\{max\_dim}], \\{count}${}=\\{old\_count};$}$$ makes the names \\{array} and \\{count} get an underlined entry in the index. Statement labels, function definitions like \\{main}(\&{int}~\\{argc},\,\&{char}~$*$\\{argv}[\,]), and \&{typedef} definitions also imply underlining. An old-style function definition (without prototyping) doesn't define its arguments; the arguments will, however, be considered to be defined (i.e., their index entries will be underlined) if their types are declared before the body of the function in the usual way (e.g., `\&{int}~\\{argc}; \&{char}~${*}\\{argv}[\,]$; $\{\,\ldots\,\}$'). 
Thus \.{@!} is not needed very often, except in unusual constructions or in cases like $$\.{enum boolean \{@!false, @!true\};}$$ here \.{@!} gives the best results because individual constants enumerated by \.{enum} are not automatically underlined in the index at their point of definition. \subsec We now turn to control codes that affect only the operation of \.{CTANGLE}. \@' [MCc] This control code is dangerous because it has quite different meanings in \.{CWEB} and the original \.{WEB}. In \.{CWEB} it produces the decimal constant corresponding to the ASCII code for a string of length~1 (e.g., \.{@'a'} is \.{CTANGLE}d into \.{97} and \.{@'\\t'} into \.9). You might want to use this if you need to work in ASCII on a non-ASCII machine; but in most cases the \CEE/ conventions of \.{<ctype.h>} are adequate for character-set-independent programming. \@\& [MCc] The \.{@\&} operation causes whatever is on its left to be adjacent to whatever is on its right, in the \CEE/ output. No spaces or line breaks will separate these two items. \@l [L] \.{CWEB} programmers have the option of using any 8-bit character code from the often-forbidden range 128--255 within \TEX/ text; such characters are also permitted in strings and even in identifiers of the \CEE/ program. Under various extensions of the basic ASCII standard, the higher 8-bit codes correspond to accented letters, letters from non-Latin alphabets, and so on. When such characters occur in identifiers, \.{CTANGLE} must replace them by standard ASCII alphanumeric characters or \.{\_}, in order to generate legal \CEE/ code. It does this by means of a transliteration table, which by default associates the string \.{Xab} with the character with ASCII code \T{\^}$ab$ (where $a$ and $b$ are hexadecimal digits, and $a\ge8$). By placing the construction \.{@l\ ab\ newstring} in limbo, you are telling \.{CTANGLE} to replace this character by \.{newstring} instead.
For example, the ISO Latin-1 code for the letter `\"u' is \T{\^FC} (or \.{'\char`\\374'}), and \.{CTANGLE} will normally change this code to the three-character sequence \.{XFC} if it appears in an identifier. If you say \.{@l} \.{fc} \.{ue}, the code will be transliterated into \.{ue} instead. \more \.{CWEAVE} passes 8-bit characters straight through to \TEX/ without transliteration; therefore \TEX/ must be prepared to receive them. If you are formatting all your nonstandard identifiers as ``custom'' control sequences, you should make \TEX/ treat all their characters as letters. Otherwise you should either make your 8-bit codes ``active'' in \TEX/, or load fonts that contain the special characters you need in the correct positions. (The font selected by \TEX/ control sequence \.{\\it} is used for identifiers.) Look for special macro packages designed for \.{CWEB} users in your language; or, if you are brave, write one yourself. \subsec The next eight control codes (namely `\.{@,}', `\.{@/}', `\.{@|}', `\.{@\#}', `\.{@+}', `\.{@;}', `\.{@[}', and `\.{@]}') have no effect on the \CEE/ program output by \.{CTANGLE}; they merely help to improve the readability of the \TEX/-formatted \CEE/ that is output by \.{CWEAVE}, in unusual circumstances. \.{CWEAVE}'s built-in formatting method is fairly good when dealing with syntactically correct \CEE/ text, but it is incapable of handling all possible cases, because it must deal with fragments of text involving macros and section names; these fragments do not necessarily obey \CEE/'s syntax. Although \.{CWEB} allows you to override the automatic formatting, your best strategy is not to worry about such things until you have seen what \.{CWEAVE} produces automatically, since you will probably need to make only a few corrections when you are touching up your documentation. \@, [MCc] This control code inserts a thin space in \.{CWEAVE}'s output. 
Sometimes you need this extra space if you are using macros in an unusual way, e.g., if two identifiers are adjacent. \@/ [MC] This control code causes a line break to occur within a \CEE/ program formatted by \.{CWEAVE}. Line breaks are chosen automatically by \TEX/ according to a scheme that works 99\%\ of the time, but sometimes you will prefer to force a line break so that the program is segmented according to logical rather than visual criteria. If a comment follows, say `\.{@/@,}' to break the line before the comment. \@| [MC] This control code specifies an optional line break in the midst of an expression. For example, if you have a long expression on the right-hand side of an assignment statement, you can use `\.{@|}' to specify breakpoints more logical than the ones that \TEX/ might choose on visual grounds. \@\# [MC] This control code forces a line break, like \.{@/} does, and it also causes a little extra white space to appear between the lines at this break. You might use it, for example, between groups of macro definitions that are logically separate but within the same section. \.{CWEB} automatically inserts this extra space between functions, between external declarations and functions, and between declarations and statements within a function. \@+ [MC] This control code cancels a line break that might otherwise be inserted by \.{CWEAVE}, e.g., before the word `\&{else}', if you want to put a short if--else construction on a single line. If you say `\.{\{@+}' at the beginning of a compound statement that is the body of a function, the first declaration or statement of the function will appear on the same line as the left brace, and it will be indented by the same amount as the second declaration or statement on the next line. \@; [MC] This control code is treated like a semicolon, for formatting purposes, except that it is invisible. 
You can use it, for example, after a section name or macro when the \CEE/ text represented by that section or macro is a compound statement or ends with a semicolon. Consider constructions like $$\lpile{\.{if (condition) macro @;}\cr \.{else break;}\cr}$$ where \\{macro} is defined to be a compound statement (enclosed in braces). This is a well-known infelicity of \CEE/ syntax. You can add a visible semicolon with \.{@t;@>} (before \.{@;}). \@{[} [MC] See \.{@]}. \@] [MC] Place \.{@[...@]} brackets around program text that \.{CWEAVE} is supposed to format as an expression, if it doesn't already do so. (This occasionally applies to unusual macro arguments.) Also insert `\.{@[@]}' between a simple type name and a left parenthesis when declaring a pointer to a function, as in $$\.{int @[@] (*f)();}$$ otherwise \.{CWEAVE} will confuse the first part of that declaration with the \CPLUSPLUS/ expression `\&{int}($*f$)'. Another example, for people who want to use low-level \.{\#define} commands in the midst of \CEE/ code and the definition begins with a cast: $$\.{\#define foo @[(int)(bar)@]}$$ \subsec The remaining control codes govern the input that \.{CWEB} sees. \@{x @y @z}[\\{change\_file}] \.{CWEAVE} and \.{CTANGLE} are designed to work with two input files, called \\{web\_file} and \\{change\_file}, where \\{change\_file} contains data that overrides selected portions of \\{web\_file}. The resulting merged text is actually what has been called the \.{CWEB} file elsewhere in this report. \more Here's how it works: The change file consists of zero or more ``changes,'' where a change has the form `\.{@x}$\langle$old lines$\rangle$\.{@y}$\langle$new lines$\rangle$\.{@z}'. The special control codes \.{@x}, \.{@y}, \.{@z}, which are allowed only in change files, must appear at the beginning of a line; the remainder of such a line is ignored.
The $\langle$old lines$\rangle$ represent material that exactly matches consecutive lines of the \\{web\_file}; the $\langle$new lines$\rangle$ represent zero or more lines that are supposed to replace the old. Whenever the first ``old line'' of a change is found to match a line in the \\{web\_file}, all the other lines in that change must match too. \more Between changes, before the first change, and after the last change, the change file can have any number of lines that do not begin with `\.{@x}', `\.{@y}', or~`\.{@z}'. Such lines are bypassed and not used for matching purposes. \more This dual-input feature is useful when working with a master \.{CWEB} file that has been received from elsewhere (e.g., \.{ctangle.w} or \.{cweave.w} or \.{ctex.w}), when changes are desirable to customize the program for your local computer system. You will be able to debug your system-dependent changes without clobbering the master web file; and once your changes are working, you will be able to incorporate them readily into new releases of the master web file that you might receive from time to time. \@i [\\{web\_file}] Furthermore the \\{web\_file} itself can be a combination of several files. When either \.{CWEAVE} or \.{CTANGLE} is reading a file and encounters the control code \.{@i} at the beginning of a line, it interrupts normal reading and starts looking at the file named after the \.{@i}, much as the \CEE/ preprocessor does when it encounters an \.{\#include} line. After the included file has been entirely read, the program goes back to the next line of the original file. The file name following \.{@i} can be surrounded by \." characters, but such delimiters are optional. Include files can nest. \more Change files can have lines starting with \.{@i}. In this way you can replace one included file with another. Conceptually, the replacement mechanism described above does its work first, and its output is then checked for \.{@i} lines. 
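For instance, a change of roughly the following shape (the file names \.{english.w} and \.{deutsch.w} are hypothetical, chosen only for illustration) would substitute one included file for another: $$\lpile{\.{@x the rest of this line is ignored}\cr \.{@i english.w}\cr \.{@y}\cr \.{@i deutsch.w}\cr \.{@z}\cr}$$ Here the old line is assumed to match a line `\.{@i}~\.{english.w}' of the \\{web\_file} exactly.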
If \.{@i} \.{foo} occurs between \.{@y} and \.{@z} in a change file, individual lines of file \.{foo} and files it includes are not changeable; but changes can be made to lines from files that were included by unchanged input. \more On \UNIX/ systems (and others that support environment variables), if the environment variable \.{CWEBINPUTS} is set, or if the compiler flag of the same name was defined at compile time, \.{CWEB} will look for include files in the directory thus named, if it cannot find them in the current directory. \section Additional features and caveats. 1. In certain installations of \.{CWEB} that {\def\\#1#2{`{\tentex\char'#1#2}'}%
have an extended character set, the characters \\13, \\01, \\31, \\32, \\34, \\35, \\36, \\37, \\04, \\20, and \\21} can be typed as abbreviations for `\.{++}', `\.{--}', `\.{->}', `\.{!=}', `\.{<=}', `\.{>=}', `\.{==}', `\.{\v\v}', `\.{\&\&}', `\.{<<}', and `\.{>>}', respectively. 2. If you have an extended character set, you can use it with only minimal restrictions, as discussed under the rules for \.{@l} above. But you should stick to standard ASCII characters if you want to write programs that will be useful to all the poor souls out there who don't have extended character sets. 3. The \TEX/ file output by \.{CWEAVE} is broken into lines having at most 80 characters each. When \TEX/ text is being copied, the existing line breaks are copied as well. If you aren't doing anything too tricky, \.{CWEAVE} will recognize when a \TEX/ comment is being split across two or more lines, and it will append `\.\%' to the beginning of such continued comments. 4. \CEE/ text is translated by a ``bottom up'' procedure that identifies each token as a ``part of speech'' and combines parts of speech into larger and larger phrases as much as possible according to a special grammar that is explained in the documentation of \.{CWEAVE}.
It is easy to learn the translation scheme for simple constructions like single identifiers and short expressions, just by looking at a few examples of what \.{CWEAVE} does, but the general mechanism is somewhat complex because it must handle much more than \CEE/ itself. Furthermore the output contains embedded codes that cause \TEX/ to indent and break lines as necessary, depending on the fonts used and the desired page width. For best results it is wise to avoid enclosing long \CEE/ texts in \pb, since the indentation and line breaking codes are omitted when the \pb\ text is translated from \CEE/ to \TEX/. Stick to simple expressions or statements. If a \CEE/ preprocessor command is enclosed in \pb, the \.\# that introduces it must be at the beginning of a line, or \.{CWEAVE} won't print it correctly. 5. Comments are not permitted in \pb\ text. After a `\.|' signals the change from \TEX/ text to \CEE/ text, the next `\.|' that is not part of a string or control text or section name ends the \CEE/ text. 6. A comment must have properly nested occurrences of left and right braces, otherwise \.{CWEAVE} will complain. But it does try to balance the braces, so that \TEX/ won't foul up too much. 7. When you're debugging a program and decide to omit some of your \CEE/ code, do NOT simply ``comment it out.'' Such comments are not in the spirit of \.{CWEB} documentation; they will appear to readers as if they were explanations of the uncommented-out instructions. Furthermore, comments of a program must be valid \TEX/ text; hence \.{CWEAVE} will get confused if you enclose \CEE/ statements in \.{/*...*/} instead of in \.{/*|...|*/}. If you must comment out \CEE/ code, you can surround it with preprocessor commands like \.{\#if 0==1} and \.{\#endif}. 8. The \.{@f} feature allows you to define one identifier to act like another, and these format definitions are carried out sequentially. 
In general, a given identifier has only one printed format throughout the entire document, and this format is used even before the \.{@f} that defines it. The reason is that \.{CWEAVE} operates in two passes; it processes \.{@f}'s and cross-references on the first pass and it does the output on the second. (However, identifiers that implicitly get a boldface format, thanks to a \.{typedef} declaration, don't obey this rule; they are printed differently before and after the relevant \.{typedef}. This is unfortunate, but hard to fix. You can get around the restriction by saying, say, `\.{@s} \.{foo} \.{int}', before or after the \.{typedef}.) 9. Sometimes it is desirable to insert spacing into formatted \CEE/ code that is more general than the thin space provided by `\.{@,}'. The \.{@t} feature can be used for this purpose; e.g., `\.{@t\\hskip 1in@>}' will leave one inch of blank space. Furthermore, `\.{@t\\4@>}' can be used to backspace by one unit of indentation, since the control sequence \.{\\4} is defined in \.{cwebmac} to be such a backspace. (This control sequence is used, for example, at the beginning of lines that contain labeled statements, so that the label will stick out a little at the left.) You can also use `\.{@t\}\\3\{-5@>}' to force a break in the middle of an expression. 10. Each identifier in \.{CWEB} has a single formatting convention. Therefore you shouldn't use the same identifier to denote, say, both a type name and part of a \.{struct}, even though \CEE/ does allow this. \section Running the programs. The \UNIX/ command line for \.{CTANGLE} is $$\.{ctangle [options] webfile[.w] [\{changefile[.ch]|-\} [outfile[.c]]]}$$ and the same conventions apply to \.{CWEAVE}. If `\.-' or no change file is specified, the change file is null. The extensions \.{.w} and \.{.ch} are appended only if the given file names contain no dot. If the web file defined in this way cannot be found, the extension \.{.web} will be tried. 
For example, `\.{cweave} \.{cob}' will try to read \.{cob.w}; failing that, it will try \.{cob.web} before giving up. If no output file name is specified, the name of the \CEE/ file output by \.{CTANGLE} is obtained by appending the extension \.{.c}; the name of the \TEX/ file output by \.{CWEAVE} gets the extension \.{.tex}. Index files output by \.{CWEAVE} replace \.{.tex} by \.{.idx} and \.{.scn}. Programmers who like terseness might choose to set up their operating shell so that `\.{wv}' expands to `\.{cweave -bhp}'; this will suppress most terminal output from \.{CWEAVE} except for error messages. Options are introduced either by a \.- sign, to turn an option off, or by a \.+ sign to turn one on. For example, `\.{-fb}' turns off options \.f and \.b; `\.{+s}' turns on option \.s. Options can be specified before the file names, after the file names, or both. The following options are currently implemented: \yskip \def\option#1 {\textindent{\.#1}\hangindent2\parindent} \option b Print a banner line at the beginning of execution. (On by default.) \option e Enclose \CEE/ material formatted by \.{CWEAVE} in brackets \.{\\PB\{...\}}, so that special hooks can be used. (On by default.) (Has no effect on \.{CTANGLE}.) \option f Force line breaks after each \CEE/ statement formatted by \.{CWEAVE}. (On by default; \.{-f} saves paper but looks less \CEE/-like to some people.) (Has no effect on \.{CTANGLE}.) \option h Print a happy message at the conclusion of a successful run. (On by default.) \option k Keep single quotes (\.') in numeric literals in the \CEE//\CPLUSPLUS/ output. (Off by default.) (\.{CTANGLE} only.) \option p Give progress reports as the program runs. (On by default.) \option s Show statistics about memory usage after the program runs to completion. (Off by default.) If you have large \.{CWEB} files or sections, you may need to see how close you come to exceeding the capacity of \.{CTANGLE} and/or \.{CWEAVE}. 
\option t Treat \&{typename} in a template like \&{typedef}. (Off by default.) (Has no effect on \.{CTANGLE}.) \option x Include indexes and a table of contents in the \TEX/ file output by \.{CWEAVE}. (On by default.) (Has no effect on \.{CTANGLE}.) \section Further details about formatting. You may not like the way \.{CWEAVE} handles certain situations. If you're desperate, you can customize \.{CWEAVE} by changing its grammar. This means changing the source code, a task that you might find amusing. A table of grammar rules appears in the \.{CWEAVE} source listing, and you can make a separate copy of that table by copying the file \.{prod.w} found in the \.{CWEB} sources and saying `\.{cweave}~\.{-x}~\.{prod}', followed by `\.{tex}~\.{prod}'. You can see exactly how \.{CWEAVE} is parsing your \CEE/ code by preceding it with the line `\.{@ @c @2}'. (The control code `\.{@2}' turns on a ``peeping'' mode, and `\.{@0}' turns it off.) For example, if you run \.{CWEAVE} on the file
\medskip
\begingroup
\verbatim
@ @c @2
main (argc,argv)
char **argv;
{ for (;argc>0;argc--) printf("%s\n",argv[argc-1]); }
!endgroup
\endgroup
\medskip\noindent
you get the following gibberish on your screen:
\medskip
\begingroup
\verbatim
[...]
14:*exp +(+ exp...
11:*exp +exp+ raw...
10:*+exp+ raw +ubinop?+...
[...]
60: +fn_decl+*+{+ -stmt- +}-
55:*+fn_decl+ -stmt-
52:*+function-
[...]
!endgroup
\endgroup
\medskip
The first line says that grammar rule 14 has just been applied, and \.{CWEAVE} currently has in its memory a sequence of chunks of \TEX/ code (called ``scraps'') that are respectively of type \\{exp} (for expression), open-parenthesis, \\{exp} again, close-parenthesis, and further scraps that haven't yet been considered by the parser. (The \.+ and \.- signs stipulate that \TEX/ should be in or out of math mode at the scrap boundaries. The \.* shows the parser's current position.) Then rule 11 is applied, and the sequence $(\,exp\,)$ becomes an \\{exp} and so on.
In the end the whole \CEE/ text has become one big scrap of type \\{function}. Sometimes things don't work as smoothly, and you get a bunch of lines lumped together. This means that \.{CWEAVE} could not digest something in your \CEE/ code. For instance, suppose `\.{@<Argument declarations@>}' had appeared instead of `\.{char **argv;}' in the program above. Then \.{CWEAVE} would have been somewhat mystified, since it thinks that section names are just \\{exp}s. Thus it would tell \TEX/ to format `\X2:Argument declarations\X' on the same line as `$\\{main}(\\{argc},\39\\{argv}{}$)'. In this case you should help \.{CWEAVE} by putting `\.{@/}' after `\.{main(argc,argv)}' (plus `\.{@t\\qquad@>}' for consistent indentation). \.{CWEAVE} automatically inserts a bit of extra space between declarations and the first apparent statement of a block. One way to defeat this spacing locally is $$\vbox{\halign{#\hfil\cr \.{int x;@+@t\}\\6\{@>}\cr \.{@<Other locals@>@;@\#}\cr}}$$ the `\.{@\#}' will put extra space after `$\langle\,$Other locals$\,\rangle$'. \section Hypertext and hyperdocumentation. Many people have of course noticed analogies between \.{CWEB} and the World Wide Web. The \.{CWEB} macros are in fact set up so that the output of \.{CWEAVE} can be converted easily into Portable Document Format, with clickable hyperlinks that can be read with your favorite {\mc PDF} viewer, using a widely available open-source program called \.{dvipdfm} developed by Mark~A. Wicks. After using \.{CWEAVE} to convert \.{cob.w} into \.{cob.tex}, you can prepare a hypertext version of the program by giving the commands $$\vbox{\halign{\.{#}\hfil\cr tex "\\let\\pdf+ \\input cob"\cr dvipdfm cob\cr}}$$ instead of invoking \TeX\ in the normal way. (Thanks to Hans Hagen, C\'esar Augusto Rorato Crusius, and Julian Gilbey for the macros that make this work.) Alternatively, thanks to H\`an Th\^e\kern-.3em\raise.3ex\hbox{\'{}} Th\`anh and Andreas Scherer, you can generate \.{cob.pdf} in one step by simply saying `\.{pdftex}~\.{cob}'.
Alternative ways to create {\mc PDF} output from \.{CWEB} input are to say `\.{xetex}~\.{cob}' or `\.{luatex}~\.{cob}'. Similar output for ``smart'' devices can be created with Martin Ruckert's Hi\TeX\ and its dynamic \.{HINT} format; just say `\.{hitex}~\.{cob}'. \.{HINT} files can be viewed with the \.{hintview} program, which is available from \.{https://hint.userweb.mwn.de/hint/hintview.html}. A more elaborate system called \.{CTWILL}, which extends the usual cross references of \.{CWEAVE} by preparing links from the uses of identifiers to their definitions, is also available---provided that you are willing to work a bit harder in cases where an identifier is multiply defined. \.{CTWILL} is intended primarily for hardcopy output, but its principles could be used for hypertext as well. See Chapter 11 of {\sl Digital Typography\/} by D.~E. Knuth (1999), and the program sources at \.{ftp://ftp.cs.stanford.edu/pub/ctwill}. \section Appendices. As an example of a real program written in \.{CWEB}, Appendix~A contains an excerpt from the \.{CWEB} program itself. The reader who examines the listings in this appendix carefully will get a good feeling for the basic ideas of \.{CWEB}. Appendix B displays the files that set \TEX/ up to accept the output of \.{CWEAVE}, and Appendix~C discusses how to use some of those macros to vary the output formats. A ``long'' version of this manual, which can be produced from the \.{CWEB} sources via the \UNIX/ command \.{make} \.{fullmanual}, also contains appendices D, E, and~F, which exhibit the complete source code for \.{CTANGLE} and \.{CWEAVE}. \vfil\eject\titletrue \def\runninghead{APPENDIX A --- {\tentt CWEB} FILE FORMAT} \section Appendix A: Excerpts from a \.{CWEB} Program. This appendix consists of four listings. The first shows the \.{CWEB} input that generated sections 27--31 of the file \.{common.w}, which contains routines common to \.{CWEAVE} and \.{CTANGLE}. 
Note that some of the lines are indented to show the program structure; the indentation is ignored by \.{CWEAVE} and \.{CTANGLE}, but users find that \.{CWEB} files are quite readable if they have some such indentation. The second and third listings show corresponding parts of the \CEE/ code output by \.{CTANGLE} and of the \TEX/ code output by \.{CWEAVE}, when run on \.{common.w}. The fourth listing shows how that output looks in print. \vskip 6pt \begingroup \def\tt{\eighttt} \baselineskip9pt \verbatim @ Procedure |prime_the_change_buffer| sets |change_buffer| in preparation for the next matching operation. Since blank lines in the change file are not used for matching, we have |(change_limit==change_buffer && !!changing)| if and only if the change file is exhausted. This procedure is called only when |changing| is |true|; hence error messages will be reported correctly. @c static void prime_the_change_buffer(void) { change_limit=change_buffer; /* this value is used if the change file ends */ @@; @@; @@; } @ @=@+static void prime_the_change_buffer(void); @ While looking for a line that begins with \.{@@x} in the change file, we allow lines that begin with \.{@@}, as long as they don't begin with \.{@@y}, \.{@@z}, or \.{@@i} (which would probably mean that the change file is fouled up). @= while(true) { change_line++; if (!!input_ln(change_file)) return; if (limit } } @ Here we are looking at lines following the \.{@@x}. @= do { change_line++; if (!!input_ln(change_file)) { err_print("!! Change file ended after @@x"); @.Change file ended...@> return; } } while (limit==buffer); @ @= change_limit=change_buffer+(ptrdiff_t)(limit-buffer); strncpy(change_buffer,buffer,(size_t)(limit-buffer+1)); !endgroup \endgroup \vfill\eject \def\runninghead{APPENDIX A --- TRANSLATION BY {\tentt CTANGLE}} Here's the portion of the \CEE/ code generated by \.{CTANGLE} that corresponds to the source on the preceding page. 
Notice that sections~29, 30 and~31 have been tangled into section~27. \vskip6pt \begingroup \def\tt{\eighttt} \baselineskip9pt \verbatim /*:23*//*27:*/ #line 227 "common.w" static void prime_the_change_buffer(void) { change_limit= change_buffer; /*29:*/ #line 243 "common.w" while(true){ change_line++; if(!!input_ln(change_file))return; if(limit}', the backslash character gets in the way, and this entry wouldn't appear in the index with the T's. The solution is to use the `\.{@:}' feature, declaring a macro that simply removes a sort key as follows: $$\.{\\def\\9\#1\{\}}$$ Now you can say, e.g., `\.{@:TeX\}\{\\TeX@>}' in your \.{CWEB} file; \.{CWEAVE} puts it into the index alphabetically, based on the sort key, and produces the macro call `\.{\\9\{TeX\}\{\\TeX\}}' which will ensure that the sort key isn't printed. A similar idea can be used to insert hidden material into section names so that they are alphabetized in whatever way you might wish. Some people call these tricks ``special refinements''; others call them ``kludges.'' \point 12. The control sequence \.{\\secno} is set to the number of the section being typeset. \point 13. If you want to list only the sections that have changed, together with the index, put the command `\.{\\let\\maybe=\\iffalse}' in the limbo section before the first section of your \.{CWEB} file. It's customary to make this the first change in your change file. This feature has a \TeX nical limitation, however: You cannot use it together with control sequences like \.{\\proclaim} or \.{\\+} or \.{\\newcount} that plain \TeX\ has declared to be `\.{\\outer}', because \TeX\ refuses to skip silently over such control sequences. One way to work around this limitation is to say $$\.{\\fi \\let\\proclaim\\relax \\def\\proclaim\{...\} \\ifon}$$ where \.{\\proclaim} is redefined to be the same as usual but without an \.{\\outer} qualification. (The \.{\\fi} here stops the conditional skipping, and the \.{\\ifon} turns it back on again.) 
Similarly, $$\.{\\fi \\newcount\\n \\ifon}$$ is a safe way to use \.{\\newcount}. Plain \TeX\ already provides a non-outer macro \.{\\tabalign} that does the work of \.{\\+}; you can say $$\postdisplaypenalty=10000 \.{\\fi \\let\\+\\tabalign \\ifon}$$ if you prefer the shorter notation \.{\\+}. \point 14. To get output in languages other than English, redefine the macros \.{\\A}, \.{\\As}, \.{\\ATH}, \.{\\ET}, \.{\\ETs}, \.{\\Q}, \.{\\Qs}, \.{\\U}, \.{\\Us}, \.{\\ch}, \.{\\fin}, \.{\\con}, \.{\\today}, \.{\\datethis}, and \.{\\datecontentspage}. \.{CWEAVE} itself need not be changed. \point 15. Some output can be selectively suppressed with the macros \.{\\noatl}, \.{\\noinx}, \.{\\nosecs}, \.{\\nocon}. \point 16. All accents and special text symbols of plain \TEX/ format will work in \.{CWEB} documents just as they are described in Chapter~9 of {\sl The \TEX/book}, with one exception. The dot accent (normally \.{\\.}) must be typed \.{\\:} instead. \point 17. Several commented-out lines in \.{cwebmac.tex} are suggestions that users may wish to adopt. For example, one such line inserts a blank page if you have a duplex printer. Appendices D, E, and F of the complete version of this manual are printed using a commented-out option that substitutes `$\gets$' for `$=$' in the program listings. Looking at those appendices might help you decide which format you like better. \point 18. 
Andreas Scherer has contributed a macro called \.{\\pdfURL} with which one can say things like the following, anywhere in the \TeX\ parts or the \CEE/ comments of a \.{CWEB} file: $$\vbox{\halign{\.{#}\hfil\cr You can send email to \\pdfURL\{the author\}\{mailto:andreas\\UNDER/github@@freenet.de\}\cr or visit \\pdfURL\{his home page\}\{https://github.com/ascherer\}\cr}}$$ In a {\mc PDF} document, the first argument will appear in blue as clickable text; the {\mc PDF} viewer, if correctly configured, will then redirect those links to the user's browser and open either the email client or the {\mc HTML} viewer. In a hardcopy document, both arguments will be printed ({\tt the second in parentheses and typewriter type}). Certain special characters in an Internet address need to be handled in a somewhat awkward way, so that \.{CWEAVE} and/or \TeX\ will not confuse them with formatting controls: Use \.{@@} for \.@ and \.{\\TILDE/} for \.\~ and \.{\\UNDER/} for \.\_. \point 19. {\mc PDF} documents contain bookmarks that list all the major group titles in the table of contents, some of which will be subsidiary to others if the depth feature of \.{@*} has been used. Such bookmark entries are also known as ``outlines.'' Moreover, the final group title, `Names of the sections', can be opened up to list every section name; users can therefore navigate easily to any desired section. The macros of \.{cwebacromac.tex} are careful to ``sanitize'' all the names that appear as bookmarks, by removing special characters and formatting codes that are inappropriate for the limited typographic capabilities of {\mc PDF} outlines. For example, one section of \.{CWEAVE} is named `Cases for \\{case\_like}', which is represented by the \TeX\ code `\.{Cases} \.{for} \.{\\PB\{\\\\\{case\\\_like\}\}}' in \.{cweave.tex}; its sanitized name is simply `\.{Cases} \.{for} \.{case\_like}'. 
(When \.{.pdf} files are produced, the fifth parameter of every \.{\\ZZ} in the \.{.toc} file is set to the sanitized form of the first parameter; see point~9 above and point~20 below.) In general, sanitization removes \TeX\ control sequences and braces, except for control sequences defined by \.{CWEB} itself. Such a translation works most of the time, but you can override the defaults and obtain any translation that you want by using \TeX nical tricks. For example, after $$\.{\\sanitizecommand\\foo\{bar\}}$$ the control sequence \.{\\foo} will sanitize to `\.{bar}'. And after $$\.{\\def\\kluj\#1\\\\\{foo\}}$$ the \TeX\ code `\.{\\kluj bar\\\\}' will print as `foo' but sanitize to `\.{bar}', because the control sequences \.{\\kluj} and \.{\\\\} are removed by sanitization. \point 20. Furthermore, group titles can be converted to an arbitrary sanitized text while also changing their form in running headlines, by using \.{\\ifheader}. Consider, for example, a \.{CWEB} source file that begins with the two lines $$\lpile{\.{\\def\\klujj\#1\\\\\{\\ifheader FOO\\else foo\\fi\}}\cr \.{@*Chinese \\klujj bar\\\\.}\cr}$$ This coding introduces a major group entitled `{\bf Chinese foo}', with running headline `{\eightrm CHINESE FOO}' and table-of-contents entry `Chinese foo'. The corresponding bookmark is, however, `\.{Chinese} \.{bar}'. And the corresponding \.{.toc} file entry is `\.{\\ZZ \{Chinese \\klujj bar\\\\\}\{1\}\{1\}\{1\}\{Chinese bar\}}'. \vfill\end