A wrapper around pdflatex which gives some flexibility for compiling large files.
Let say you have a (very) large LaTeX document that is divided into dozen or hundreds of .tex files (like this one for example -- 3500 pages). The usual mechanism that allows to compile only a part of the document is \includeonly.
The problems are :
\includeonlyimplies that you input your.texfiles with\include, which basically restricts you to only one file by chapter.- Each time you want to compile an other part of the document, you have to change the
\includeonlyline.
pytex is a python script that generates on the fly an intermediate .tex file and then launches pdflatex on it.
Suppose to have the LaTeX document main.tex as
\documentclass[a4paper,oneside,11pt]{book}
\input{preamble}
\chapter{First chapter}
\input{foo1}
\input{foo2}
\chapter{Second chapter}
\input{bar1}
\input{bar2}
\chapter{Third chapter}
\input{foo3}
\input{foo4}If you want to compile the files premable.tex, foo1 and foo2 you creates the following file named lst_example.py :
#! /usr/bin/python
# -*- coding: utf8 -*-
from __future__ import unicode_literals
import LaTeXparser
import LaTeXparser.PytexTools
myRequest = LaTeXparser.PytexTools.Request()
myRequest.original_filename="main.tex"
myRequest.ok_filenames_list=["preamble"]
myRequest.ok_filenames_list.extend(["foo1"])
myRequest.ok_filenames_list.extend(["foo3"])
myRequest.new_output_filename="MySmallPart.pdf"and you compile with
pytex lst_example.py
This will generates MySmallPart.pdf containing the result of the compilation with only premable, foo1 and foo2 (and also some intermediate files).
- Download pytex and save it somewhere bash will be able to find.
-
pytexgenerates on the fly an intermediate.texfile that contains the requested\inputlines. You can perform arbitrary string manipulations in Python on that file before the compilation. Some are predefined. -
pytexwill compile as much times as necessary for all the cross-references to be done. -
pytexreads the.auxfile and presents the missing and multiple labels in a convenient way. -
The option
--verifchecks if the document contains\refor\eqreffor which the corresponding\labellies later in the document (in a text math, one should refer to theorems that will be proven later). You can define exceptions : sentences that you allow to refer to "future" label.
-
The documents le frido and mazhe are created from the same main LaTeX file.
pytexperforms quite a lot of "pre-compilation" work on the fly. Notice by example the fact that the first is not divided in parts while the second is. -
The paper BTZ black hole from the structure of so(2,n) is divided in two parts : one "short version" and one "long version" that share a lot of text. Believe it or not : there is no code duplucation on my computer. I wrote only once each statements and
pytexmade the work or recomposing thetexfile. There are of course a lot of code duplication in thetexfile I uploaded, which was automatically generated.
LaTeXParser is a small parser of LaTeX written in Python. It allows to know, in Python, where such and such macro is used and replace the occurrence of a macro by an user-defined string.
The aim is to help writing pre-(LaTeX)compilation scripts in Python for complex documents. Examples are :
- There is in fact no code duplication in the sources of the preprint BTZ black hole from the structure of the algebra so(2,n)
- Extracting a managing the source code of Le Frido from the ones of mazhe
The XML file in which are recorded the sha1sum of the followed files is of the form
<?xml version="1.0" ?>
<Followed_files>
<fichier name="ess.py" sha1sum="a329313819092a183ca8b08bb7c178807a1a68b7"/>
<fichier name="ess.aux" sha1sum="be730c54ff1d1a75398a496283efe45c675dc54f"/>
</Followed_files>The principal XML object is got by root = minidom.parse()
Then the «list of lists» of elements "Followed_files" is got by fileNodes = root.getElementsByTagName("Followed_files")
In the example above, there is only one. At this point fileNodes is a list whole element 0 represents the lines
Each element in these lines has the tag "fichier". Then the list is given by fileNode.getElementsByTagName("fichier")
The first element of that list represents the line
If F = fileNode.getElementsByTagName("fichier")[0], then we get the name by F.getAttribute("sha1sum")
See the "DOM example" in "Python Library Reference Release 2.3.5".
The file containing the pieces of LaTeX code have the structure +++++++++++++++++++++++++++++++++++++++++++
Bonjour Au revoir +++++++++++++++++++++++++++++++++++++++++++We extract the interesting informations in the following way :
dom = minidom.parse("ess.xml") for box in dom.getElementsByTagName("CodeBox"): print box.getAttribute("label") text = getText(box.childNodes) print "\n".join(text.split("\n")[1:-1]) # Because minidom adds an empty line at first and last position.
See also tests.py and magical_box.tex