55

I am more and more interested in the TeX system (and LaTeX) and I want to study it more deeply. I am not saying that I want to learn how to use it; I want to understand how it works internally.

The first thing I can think of is to download its source code, Plain TeX. But

...the current TeX software is written in WEB, a mixture of documentation written in TeX and a Pascal subset in order to ensure portability... (Wikipedia)

I am an amateur C programmer with a little experience of Win32 API, and I always use Microsoft Visual Studio to do programming.

I have absolutely no idea what to do next with the source code above: for example, how to compile it and get the outputs (on Windows and on Linux), and then use those outputs to compile a .tex file and get the result.

Could some expert(s) explain a little about this beautiful system or, at least, point to tutorials on how to build it, please?

  • Unless you plan to do something very unusual, understanding the source of TeX is not necessary to use TeX. Certainly I would not claim to know much Pascal, but I'm quite good at programming TeX. Commented Apr 29, 2013 at 12:29
  • 2
    One piece of advice from an old fart: Concentrate on what is (more or less) immediately useful. That way you have more motivation, and the risk of getting sidetracked is much smaller. TeX is not a good way to learn how to build packages: it uses a very idiosyncratic language (WEB), and it was written to install on all sorts of extremely weird machines and operating systems (most of them long defunct). Commented Apr 29, 2013 at 18:51
  • 3
    Like you, I also wanted to study the TeX code. After all, it was written by an eminent computer scientist, and, for a large piece of software, it is remarkably free of bugs. It looked like something significant could be learned. But, I gave up after a while. Mostly because the code is full of clever tricks to pack data into memory locations to save space. This made sense 30 years ago, but not so much today, and it makes the code very difficult to read (for me, anyway). There is a system called JavaTex, by Tim Murphy, which might be a bit easier to read, maybe. But I ran out of time. Commented Apr 30, 2013 at 10:31
  • 1
    Also, I agree with vonbrand -- TeX includes a lot of exotic code that was required to make things portable 30 years ago. Commented Apr 30, 2013 at 10:34
  • 1
    ctan.math.illinois.edu/info/knuth-pdf/tex/tex.pdf Commented May 31, 2021 at 18:58

7 Answers

33

You are linking there to plain.tex, which is a file written in TeX, not the source of tex-the-program (which is tex.web).

These days if you want to compile from source it is probably best to start with a full download of the texlive build sources.

The sources are at

the description has hints about where to start if you want to compile. See in particular:

https://www.tug.org/texlive/build.html

Good luck:-)

  • 2
    @user565739 Building just tex without minimal supporting material for fonts and so on will not give you a workable system. Perhaps take a look at KerTeX if you want something smaller that you can compile. Commented Apr 29, 2013 at 12:32
  • 3
    @user565739 Well, the original TeX sources are tex.web, which is Pascal, more or less. But you will be hard pushed to find a usable Pascal compiler these days, so you probably want at least web2c, if not the whole of TeX Live. web2c, as its name suggests, translates WEB to C so that it can be compiled with a C compiler, but you also need the kpathsea file search library if you want a TeX that can handle modern file systems, and so it goes. It is probably easier to start with TeX Live and cut things out of the makefiles than to get just the original Knuth sources and try to build them on a current system. Commented Apr 29, 2013 at 12:33
  • @JosephWright: (Maybe very stupid) After compiling the source, won't I get the fonts? Commented Apr 29, 2013 at 12:33
  • 1
    @DavidCarlisle: So now I am curious how TeX Live deals with tex.web. When I build TeX Live, does it also compile tex.web? Commented Apr 29, 2013 at 12:37
  • 4
    @user565739 Yes, of course: the heart of the sources is still tex.web, but you need to build the scaffolding before you build the building, and web2c provides that. TeX Live itself then goes on to provide all the TeX packages, update mechanisms and so on, which you don't need. Still, it's easier to download a working build setup, run the build scripts, and watch and learn what happens than to download just the raw WEB sources and a Pascal compiler and then try to recreate a working system as it was in 1982. Commented Apr 29, 2013 at 12:42
33

The original sources for TeX (and friends) can be found in Knuth's directory on CTAN. All the sources have very comprehensive documentation, but trying to compile them is still an epic task.

TeX is written in WEB, a programming language invented by Knuth. So first we're going to need WEB.

WEB is itself written in WEB. It consists of two programs: weave, which produces the TeX documentation from a WEB program, and tangle, which produces Pascal code from a WEB program. To compile the WEB system we need an implementation of tangle; you can either get one from an existing TeX package, or you can compile it from Pascal source.

Note that if you want to read about the implementation of any WEB program, you can use weave and TeX to produce copious documentation; this is a good starting point for Knuth's code. (For TeX and Metafont you can buy a printed version of the WEB output as Computers and Typesetting volumes B and D respectively.)
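As a rough sketch of the two paths through a WEB file: the helper below only prints the commands, it does not run them. The `tangle file.web file.ch` and `weave file.web file.ch` argument order is the one used by the web2c builds of these tools, which is an assumption about your installation (Knuth's original programs are wired to their files differently), and `web_plan` is a name made up for this illustration.

```shell
# web_plan NAME: print (do not run) the commands that turn NAME.web
# into a program and into typeset documentation.
# Assumes web2c-style command lines; web_plan itself is hypothetical.
web_plan() {
    name=$1
    printf '%s\n' "tangle $name.web $name.ch   # -> $name.p (Pascal), plus $name.pool if strings are pooled"
    printf '%s\n' "weave  $name.web $name.ch   # -> $name.tex (documentation source)"
    printf '%s\n' "tex    $name.tex            # -> $name.dvi (typeset listing)"
}

web_plan tex
```

The same `.web` file feeds both branches, which is why a working tangle is the first thing any from-source build has to obtain.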

Now we need to talk a little about the dialect of Pascal that Knuth uses, which he calls Pascal H. TeX was written before Pascal was standardised; to my knowledge no native Pascal H compiler exists, and the dialect is not compatible with modern Pascal compilers. However, Knuth wrote the programs in a relatively portable way, so it's only moderately Herculean to port them. At this point you have some choices:

  1. Write a compiler for Knuth's Pascal H
  2. Port the WEB source to an existing Pascal dialect using change files
  3. Translate Knuth's Pascal H to another programming language

tex-gpc takes approach 2; TeX Live (and MiKTeX) take approach 3 via web2c.
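To make approach 2 a little more concrete: a change file is a sequence of `@x` ... `@y` ... `@z` blocks, and tangle and weave replace the lines between `@x` and `@y` (which must match lines of the WEB source) with the lines between `@y` and `@z`. The sketch below writes a tiny change file and counts its blocks. The file name `demo.ch` and the matched and replacement lines are invented for illustration; they are not copied from tex.web or from tex-gpc.

```shell
# Write a hypothetical change file (the lines are illustrative only).
ch_file=${TMPDIR:-/tmp}/demo.ch
cat > "$ch_file" <<'EOF'
@x
@!buf_size=500; {hypothetical line that must match the WEB source}
@y
@!buf_size=5000; {hypothetical replacement for a larger machine}
@z
EOF

# count_changes FILE: number of @x...@z blocks in a change file.
count_changes() {
    grep -c '^@x' "$1"
}

count_changes "$ch_file"
```

Keeping the port in a separate change file is what lets the original `.web` source stay untouched.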

Now if you can do this the actual process of initialising and running TeX (initex, fonts, etc.) will be relatively easy; make sure you validate your build against TRIP (see the TeX sources). If you're feeling adventurous do the same for Metafont, the TeX tools, Metafont tools, and WEB.

About the design of TeX and Metafont: these programs were designed by Knuth to be highly robust, efficient and portable in the late 1970s. Today programmers take for granted the speed of modern processors and programming standards that allow them to write adequately functioning programs much more quickly. Much of what happens in these programs (e.g. carefully enumerating character codes, statically allocating memory at compile time, on-line error recovery) rarely happens in today's programming; and many of the modern annoyances (having to compile LaTeX twice for back references, difficulty with fonts, the intricacy of the macro language) are a result of these design goals and decisions. I wouldn't advocate Knuth's methods for most projects today involving multiple people, efficient computers, and tight deadlines. Still TeX is among the oldest programs to still be running today (and into the future unless LuaTeX supplants it), delicately designed, intricately implemented, pretty portable and copiously documented.

Good luck!

25

David Carlisle explains how to compile the sources for the modern versions of TeX that are the basis for TeX Live (pdfTeX, XeTeX, and LuaTeX, among others). These derive from Karl Berry's Web2c fork of Knuth's source code, which provides a mechanical way of translating WEB sources to C code that can be compiled just about anywhere.

If you want to compile sources that are closer to what Knuth wrote (and documents in The TeXbook), take a look at:

http://www.ctan.org/tex-archive/systems/unix/tex-gpc/

This project allows you to compile Pascal WEB sources directly, using GNU Pascal. This apparently wasn't trivial; as the author, Wolfgang Helbig, writes:

I was somewhat intrigued while building TeX from its sources, since some of these depend on others to be built and installed. Knuth wrote these programs in the WEB language (WEB is only remotely related to the last W from CERN's WWW). WEB programs are converted to Pascal sources by tangle and to a TeX input file by weave. Of course, tangle and weave are WEB programs as well. So one needs tangle to build tangle---and weave and TeX to read a beautifully typeset WEB program. But don't despair, I cut this indefinite recursion and provided tangle.p, the Pascal source of tangle, and tex.pdf. It shows what, why and how I changed Knuth's program.

His tex.pdf documents these changes in minute detail.
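The dependency loop Helbig describes can be laid out as a small graph. As an illustration (not part of tex-gpc), the sketch below feeds "A B" pairs, meaning A is needed before B, to the standard `tsort` utility. Because `tangle.p` is supplied ready-made, the graph has no cycle and a build order exists. The node names are labels chosen for this example.

```shell
# Each line "A B" means: A must exist before B can be built.
build_order() {
    tsort <<'EOF'
tangle.p tangle
tangle tex.p
tangle weave.p
tex.p tex
weave.p weave
weave tex.tex
tex tex.dvi
tex.tex tex.dvi
EOF
}

build_order
```

Adding the pair `tangle tangle.p` (tangle needed to produce its own Pascal source) would reintroduce the cycle, and `tsort` would then complain that the input contains a loop.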

17

All the programs in the TeX series (TeX itself, but also METAFONT and all the auxiliaries) are written in WEB. To ease translation, D. E. Knuth did not really write in a particular flavour of Pascal but in a very abstract Pascal, much like Algol: few Pascal idiosyncrasies are left, specifically so that the programs can be converted to whatever is needed. The simplest approach nowadays is to convert the Pascal-like code to raw C.

In kerTeX this is done by kertex_M, the matrix: the tools needed to obtain C code. The matrix tools are compiled on the matrix to produce C code that can then be compiled for whatever target, which allows cross-compilation.

If you want to see this, just use kerTeX: when the get_mk_install.sh (or rc) program asks whether it should remove the intermediate products (including the *.c files in this case), answer NO to keep them.

For the bootstrapping problem (tangling tangle), D. E. Knuth has written that the first version was done by hand: he mimicked by hand what tangle would do. (This is why such a program has to be basic, in the same way that a compiler has to be able to bootstrap from a more basic version of itself before it can compile a more complex one.) This can also be done with text tools (I have sketched such a program in kertex_M/bin1/tangle/tangleboot.sh; it is not used, and it lacks change file support).

The conversion from Pascal to C was needed because Pascal was not standardized, Pascal compilers were not ubiquitous, etc. Several years ago D. E. Knuth also tried to encourage keeping it possible to compile TeX et al. with a Pascal compiler, just to be sure his programs would not become unavailable because of some compiling nightmare. And it was a pain.

It is simpler to convert to C since the code is, on purpose, very basic Pascal (BibTeX is far harder to convert because it uses Pascal idiosyncrasies).

For information, the conversion from WEB to C was done by web-to-c, originally by Tomas Rokicki. This is still what is used by TeX Live, and what has been used as a basis for kerTeX.

  • No use of ctangle and its ilk then in TeXlive? Strange... Commented Apr 29, 2013 at 18:46
15

Talk is cheap. Show me the code. ~Linus Torvalds

  1. Install Free Pascal Compiler (fpc) on the system.

    pacman -S --noconfirm fpc # Arch
    apt-get install fpc # Debian
  2. Download, then bootstrap tangle and font-related programs.

     export TEX_HOME="$HOME/dev/tex"
     export PATH="${PATH}:${TEX_HOME}/work/bin"
     mkdir -p $TEX_HOME
     cd $TEX_HOME
     wget http://mirrors.ctan.org/systems/knuth/dist.zip
     wget http://mirrors.ctan.org/systems/knuth/local.zip
     wget http://mirrors.ctan.org/systems/unix/tex-fpc.zip
     for i in *.zip; do unzip $i; done
     rm *.zip
     mkdir -p work && cd work
     mkdir -p TeXinputs TeXformats TeXfonts PKfonts DVIPSconf
     mkdir -p MFbases MFinputs
     mkdir -p bin
     cd ..
     cp dist/lib/*.tex work/TeXinputs
     cp dist/cm/*mf dist/lib/*mf work/MFinputs
     cp local/cm/*mf local/lib/*mf work/MFinputs
     cp tex-fpc/shell/* work/bin
     fpc tex-fpc/tangle.p
     mv tex-fpc/tangle work/bin
     itgl dist/mf/mf.web tex-fpc/mf.ch
     itgl dist/tex/tex.web tex-fpc/tex.ch
     mv inimf initex work/bin
     mv tex.pool work/TeXformats
     mv mf.pool work/MFbases
  3. Update tex-fpc/local.mf to define printer settings. See comments in modes.mf for details. For example:

     % Brother HL-L2320D (name must be one to eight letters, lowercase)
     mode_def brhlld =
       proofing := 0;
       fontmaking := 1;
       tracingtitles := 0;
       pixels_per_inch := 600;
       o_correction := 1;
     enddef;
     localfont := brhlld;
  4. Build the tex and weave programs.

     cd work
     inimf ../dist/lib/plain input ../tex-fpc/local dump
     mv plain.base MFbases
     cd ..
     sed -i '99,106s/\t//' tex-fpc/mf.ch
     tgl dist/mf/mf.web tex-fpc/mf.ch
     mv mf work/bin
     cd dist/mfware
     tgl gftopk.web ../../tex-fpc/gftopk.ch
     mv gftopk ../../work/bin
     cd ../../work
     ../tex-fpc/MFT/plainfonts
     ../tex-fpc/MFT/webfonts
     ../tex-fpc/MFT/mfwebfonts
     ../tex-fpc/MFT/manfonts
     ../tex-fpc/MFT/logmacfonts
     ../tex-fpc/MFT/tripmanfonts
     cp ../dist/lib/hyphen.tex TeXinputs
     cp ../tex-fpc/webmac-fpc.tex TeXinputs
     initex ../dist/lib/plain \\dump
     mv plain.fmt TeXformats
     cd ..
     sed -i '198,204s/\t//' tex-fpc/tex.ch
     tgl dist/tex/tex.web tex-fpc/tex.ch
     mv tex work/bin
     tgl dist/web/weave.web tex-fpc/weave.ch
     mv weave work/bin
  5. Build the xdvi program.

     cd tex-fpc/xdvi
     wget https://math.berkeley.edu/~vojta/xdvi/xdvi-22.86.tar.gz
     tar xf *gz
     rm *gz
     mkdir -p $TEX_HOME/xdvi/bin
     mkdir -p $TEX_HOME/xdvi/man/man1
     mkdir -p $TEX_HOME/xdvi/share/dvips/config
     cd xdvi*
     CPPFLAGS='-DBDPI=600 -DMAKEPK=4' ./configure --prefix=$TEX_HOME/xdvi \
       --with-default-texmf-path=$TEX_HOME/work/DVIPSconf \
       --disable-freetype \
       --enable-old-make-pk \
       --enable-extra-app-defaults=$TEX_HOME/work/DVIPSconf \
       --with-default-font-path=$TEX_HOME/work/PKfonts \
       --with-default-header-path=$TEX_HOME/work/DVIPSconf \
       --with-default-fig-path=.:$TEX_HOME/work/TeXinputs \
       --without-mfmode
     make && make install
     cd ../../..
     mv xdvi/bin/xdvi* work/bin
     cd work
  6. Weave the TeX-FPC document.

     wve ../dist/tex/tex.web ../tex-fpc/tex.ch && xdvi tex 

This results in:

[Image: TeX-FPC]

See Wolfgang Helbig's TeX-FPC README file for details.
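Before starting the steps above, it can save time to confirm that the tools the commands rely on are installed. A minimal sketch, assuming a POSIX shell; `missing_tools` is a helper name invented here, and the tool list is simply read off the commands in this answer.

```shell
# missing_tools TOOL...: print each tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools used by the steps above (list derived from the commands shown).
missing=$(missing_tools fpc wget unzip sed make tar)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose: one output line per tool
    printf 'missing: %s\n' $missing
else
    echo 'all build tools found'
fi
```

Building xdvi in step 5 additionally needs a C compiler and the X11 development headers, which this check does not cover.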

3

I followed the answer shared by Dave Jarvis and the TeX-FPC README file to build the most recent version of TeX (released in February 2021). With the latest version, I encountered a memory issue when generating tex.dvi and had to apply a patch for webmac written by Joachim Kuebart.

I made the build available in a Docker image. It is documented here.

If you want to build it without Docker, then here are the steps.

Step 1 Customize these variables and save them to your ~/.bashrc file. Restart the terminal after doing so.

    export TEX_HOME="/root/tex"
    export PATH="${PATH}:${TEX_HOME}/distro/bin/"
    export PATH="${PATH}:${TEX_HOME}/tex-fpc/shell/"
    export PATH="${PATH}:${TEX_HOME}/tex-fpc/MFT/"
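Appending to `PATH` in `~/.bashrc` adds the same directories again whenever the file is sourced from a shell that already has them. One way to avoid that, sketched here for a POSIX-compatible shell, is a small helper that appends a directory only if it is not already listed. `path_append` is a hypothetical name, the fallback value of `TEX_HOME` is an assumption, and the directories are the ones from Step 1.

```shell
# path_append DIR: add DIR to PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

TEX_HOME="${TEX_HOME:-$HOME/tex}"    # fallback location is an assumption
path_append "$TEX_HOME/distro/bin"
path_append "$TEX_HOME/tex-fpc/shell"
path_append "$TEX_HOME/tex-fpc/MFT"
export TEX_HOME PATH
```

Sourcing this twice leaves `PATH` unchanged the second time.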

Step 2 Install the build dependencies. These are the commands that I use for Debian.

    apt-get update -y && apt-get upgrade -y
    apt-get install patch fpc zip unzip procps ed tree -y --no-install-recommends
    apt-get install wget -y

Step 3 Run the prepare-build.sh script.

    #!/usr/bin/env bash
    set -e
    mkdir -p "$TEX_HOME"
    cd "$TEX_HOME"

    # get the source files
    wget --no-verbose http://mirrors.ctan.org/systems/knuth/dist.zip
    wget --no-verbose http://mirrors.ctan.org/systems/knuth/local.zip
    wget --no-verbose http://mirrors.ctan.org/systems/unix/tex-fpc.zip
    for i in *.zip; do unzip -q $i; done
    rm *.zip

    # base folders that will be required for metafont and tex
    mkdir distro
    cd distro
    mkdir -p TeXinputs TeXformats TeXfonts MFbases MFinputs bin
    cd "$TEX_HOME"
    cp -r dist/* tex-fpc

    # build tangle, which converts .web + .ch files into Pascal files
    fpc ./tex-fpc/tangle.p
    mv tex-fpc/tangle distro/bin/

    # build weave, which converts .web + .ch files into .tex files
    cd $TEX_HOME/tex-fpc/web
    cp ../weave.ch .
    ../ch.ch/mkprod weave
    tgl weave.web weave.ch
    mv weave ../../distro/bin/

Step 4 Run the build-mf.sh script.

    #!/usr/bin/env bash
    set -e

    # build inimf (the initialization version of metafont, which supports the dump command)
    cd "$TEX_HOME"
    itgl ./tex-fpc/mf/mf.web ./tex-fpc/mf.ch
    mv mf.pool distro/MFbases/
    mv inimf distro/bin/

    # build plain.base (base files to metafont are like format files to tex)
    # also note the use of the inimf dump command
    cd "$TEX_HOME"
    cp /tmp/local.mf tex-fpc/MFT/
    cd distro
    inimf ../tex-fpc/lib/plain input ../tex-fpc/MFT/local dump
    mv plain.base MFbases/

    # build the production version of metafont
    cd $TEX_HOME/tex-fpc/mf/
    cp ../mf.ch .
    ../ch.ch/mkprod mf
    tgl mf.web mf.ch
    mv mf ../../distro/bin/

    # get the source font files
    cd $TEX_HOME
    mv local/cm/*mf local/lib/*mf distro/MFinputs/
    cp tex-fpc/lib/manfnt.mf distro/MFinputs/
    cp tex-fpc/lib/logo10.mf distro/MFinputs/
    cp tex-fpc/lib/logo.mf distro/MFinputs/

    # use metafont to build the fonts that are required for plain.fmt
    cd $TEX_HOME/tex-fpc/cm/
    ln -s ../../distro/MFbases/ .
    ln -s ../../distro/MFinputs/ .
    ln -s ../../distro/TeXfonts/ .
    plainfonts
    manfonts
    webfonts
    cd $TEX_HOME/distro/
    mkfont manfnt
    mkfont logo10

Step 5 Create the webmac-memory.patch file.

    @@ -81,18 +81,17 @@
     \outer\def\N#1.#2.{\MN#1.\vfil\eject % beginning of starred section
       \def\rhead{\uppercase{\ignorespaces#2}} % define running headline
       \message{*\modno} % progress report
       \edef\next{\write\cont{\Z{#2}{\modno}{\the\pageno}}}\next % to contents file
       \ifon\startsection{\bf\ignorespaces#2.\quad}\ignorespaces}
     \def\MN#1.{\par % common code for \M, \N
       {\xdef\modstar{#1}\let\*=\empty\xdef\modno{#1}}% remove \* from section name
       \ifx\modno\modstar \onmaybe \else\ontrue \fi
    -  \mark{{{\tensy x}\modno}{\rhead}}}
    - % each \mark is {section reference or null}{group title}
    +  \mark{{\tensy x}\modno}}
     \def\O#1{\hbox{\rm\char'23\kern-.2em\it#1\/\kern.05em}} % octal constant
     \def\P{\rightskip=0pt plus 100pt minus 10pt % go into Pascal mode
       \sfcode`;=3000
       \pretolerance 10000
       \hyphenpenalty 10000 \exhyphenpenalty 10000
       \global\ind=2 \1\ \unskip}
     \def\Q{\rightskip=0pt % get out of Pascal mode
       \sfcode`;=1500 \pretolerance 200 \hyphenpenalty 50 \exhyphenpenalty 50 }
    @@ -116,31 +115,29 @@
     \let\*=*
     \def\onmaybe{\let\ifon=\maybe} \let\maybe=\iftrue
     \newif\ifon \newif\iftitle \newif\ifpagesaved
     \def\lheader{\mainfont\the\pageno\eightrm\qquad\rhead
       \hfill\title\qquad\mainfont\topsecno} % top line on left-hand pages
     \def\rheader{\mainfont\topsecno\eightrm\qquad\title\hfill
       \rhead\qquad\mainfont\the\pageno} % top line on right-hand pages
    -\def\topsecno{\expandafter\takeone\topmark}
    -\def\takeone#1#2{#1}
    -\def\taketwo#1#2{#2}
    +\let\topsecno=\topmark
     \def\nullsec{\eightrm\kern-2em} % the \kern-2em cancels \qquad in headers
     
     \def\page{\box255 }
     \def\normaloutput#1#2#3{\ifodd\pageno\hoffset=\pageshift\fi
       \shipout\vbox{
         \vbox to\fullpageheight{
           \iftitle\global\titlefalse
           \else\hbox to\pagewidth{\vbox to10pt{}\ifodd\pageno #3\else#2\fi}\fi
           \vfill#1}} % parameter #1 is the page itself
       \global\advance\pageno by1}
     
     \def\rhead{\.{WEB} OUTPUT} % this running head is reset by starred sections
    -\mark{\noexpand\nullsec{\rhead}}
    +\mark{\noexpand\nullsec}
     \def\title{} % an optional title can be set by the user
     \def\topofcontents{\centerline{\titlefont\title}
       \vfill} % this material will start the table of contents page
     \def\botofcontents{\vfill} % this material will end the table of contents page
     \def\contentspagenumber{0} % default page number for table of contents
     \newdimen\pagewidth \pagewidth=6.5in % the width of each page
     \newdimen\pageheight \pageheight=8.7in % the height of each page
     \newdimen\fullpageheight \fullpageheight=9in % page height including headlines

Step 6 Run the build-tex.sh script.

    #!/usr/bin/env bash
    set -e

    # build initex (an initialization version of tex, which supports the dump command to create formats)
    cd $TEX_HOME
    itgl ./tex-fpc/tex/tex.web ./tex-fpc/tex.ch
    mv tex.pool distro/TeXformats/
    mv initex distro/bin/

    # create the plain format (uses the dump command from initex)
    cd $TEX_HOME/
    cp ./tex-fpc/lib/hyphen.tex distro/TeXinputs/
    cd ./tex-fpc/tex
    ln -s ../../distro/TeXformats/ .
    ln -s ../../distro/TeXfonts/ .
    ln -s ../../distro/TeXinputs/ .
    initex ../lib/plain \\dump
    mv plain.fmt TeXformats/

    # build the production version of tex
    cd $TEX_HOME/tex-fpc/tex
    cp ../tex.ch .
    ../ch.ch/mkprod tex
    tgl tex.web tex.ch
    mv tex ../../distro/bin/

    # add the webmac files and apply the patch by Joachim Kuebart
    cd $TEX_HOME/distro
    cp ../tex-fpc/webmac-fpc.tex ./TeXinputs/
    cp ../tex-fpc/lib/webmac.tex ./TeXinputs/
    patch ./TeXinputs/webmac.tex -i /tmp/webmac-memory.patch

    # build the tex.dvi document
    cd $TEX_HOME/distro
    weave ../tex-fpc/tex/tex.web ../tex-fpc/tex/tex.ch tex.tex
    tex tex.tex

    # list the files in your distro tree

By performing these steps, you will be able to compile .tex documents into .dvi using the tex command or, preferably, the tex.sh script. The tex.sh script creates soft links for the directories TeXformats and TeXfonts in the work directory and then invokes tex.
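The actual tex.sh is not reproduced here. As a sketch of the behaviour just described, assuming the `distro` layout from the build scripts: link `TeXformats` and `TeXfonts` into the current directory if they are missing, then run `tex` with the given arguments. `run_tex` and the `TEX_BIN` override are names invented for this illustration.

```shell
# run_tex ARGS...: make TeXformats and TeXfonts visible in the current
# directory, then invoke tex. Sketch only; not the tex.sh from the image.
run_tex() {
    distro="${TEX_HOME:-$HOME/tex}/distro"
    for dir in TeXformats TeXfonts; do
        # -e follows symlinks; -L also catches a dangling link
        if [ ! -e "$dir" ] && [ ! -L "$dir" ]; then
            ln -s "$distro/$dir" "$dir"
        fi
    done
    # TEX_BIN is a hypothetical override, handy for dry runs
    "${TEX_BIN:-tex}" "$@"
}
```

A call such as `run_tex hello.tex` from any working directory would then find the format and font metric files through the links.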

1

This post on StackExchange describes how to build TeX on Windows with the Free Pascal Compiler but without the CTAN package TeX-FPC. It is a short and clear way to build TeX, because it applies fewer modifications to the original source code and uses precompiled fonts to avoid compiling Metafont.

I ported the files to Linux and applied further simplifications. The three files that are required to compile TeX with FPC are tangle.p, tex.ch and weave.ch. These files and further explanations can be found in my git repository.

Here are the steps:

  1. Install the Free Pascal Compiler and other dependencies (adapt for other Linux distributions)

    apt-get install fpc wget unzip git 
  2. Download source code

    mkdir build
    cd build
    wget https://mirrors.ctan.org/systems/knuth/dist.zip
    unzip dist.zip
    wget https://mirrors.ctan.org/systems/knuth/local.zip
    unzip local.zip
    git clone https://github.com/bobbl/play_with_tex.git
    git -C play_with_tex switch --detach aa2a21c6c881 # remove this line for latest version
  3. Bootstrap TANGLE from modified source code

    cp ./play_with_tex/tex82/tangle.p .
    fpc tangle.p
  4. Compile INITEX and TEX

    mkdir -p TeXformats
    ./tangle ./dist/tex/tex.web ./play_with_tex/tex82/tex.ch tex.p TeXformats/tex.pool
    fpc -dinitex tex.p -oinitex
    fpc tex.p
  5. Download precompiled metric font files

    wget https://mirrors.ctan.org/fonts/cm/tfm.zip
    unzip tfm.zip
    wget https://mirrors.ctan.org/fonts/manual.zip
    unzip manual.zip
    wget https://mirrors.ctan.org/fonts/mflogo.zip
    unzip mflogo.zip
    mkdir -p TeXfonts
    cp tfm/*.tfm TeXfonts/
    cp manual/tfm/*.tfm TeXfonts/
    cp mflogo/tfm/*.tfm TeXfonts/
  6. Make plain.fmt with INITEX

    cp dist/lib/plain.tex .
    cp dist/lib/hyphen.tex .
    ./initex plain \\dump
    mv plain.fmt TeXformats/plain.fmt

The TeX build is finished with this step. The last two steps are just to test it:

  1. Compile WEAVE for the test

    ./tangle ./dist/web/weave.web ./play_with_tex/tex82/weave.ch weave.p /dev/null
    fpc weave.p
    cd ..
  2. Generate the TeX source code documentation with WEAVE and run TeX on it

    ln -s ./build/TeXformats
    ln -s ./build/TeXfonts
    ./build/weave ./build/dist/tex/tex.web ./build/play_with_tex/tex82/tex.ch tex.tex
    cp ./build/dist/lib/webmac.tex .
    ./build/tex tex.tex
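After step 6 the `build` directory should contain the two binaries, the string pool, the format file and the font metric files. A small check along these lines reports whatever is missing. This is a sketch: `check_build` is a made-up helper, the file list is read off the steps above, and `cmr10.tfm` stands in for the whole set of metric files.

```shell
# check_build DIR: print each expected artifact that is absent from DIR
# and return non-zero if any is missing.
check_build() {
    dir=$1
    status=0
    for f in initex tex TeXformats/tex.pool TeXformats/plain.fmt TeXfonts/cmr10.tfm; do
        if [ ! -f "$dir/$f" ]; then
            printf 'missing: %s\n' "$f"
            status=1
        fi
    done
    return $status
}
```

Run from the directory that contains `build`, `check_build ./build && echo 'build looks complete'` prints the confirmation only when every file is present.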
