Python module#
Overview#
Manipulate LaTeX files and BibTeX databases#
| texplain.TeX | Interpret TeX file to allow simple manipulations. |
| texplain.bib_select | Limit a BibTeX file to a list of keys. |
Indent LaTeX files#
| texplain.indent | Indent text. |
Support functions#
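| texplain.environments | Return list with present environments. |
| texplain.Placeholder | Placeholder for text. |
| texplain.text_to_placeholders | Replace text with placeholders. |
| texplain.text_from_placeholders | Replace placeholders with original text. |
| texplain.find_commented | Find comments. |
| texplain.is_commented | Per character if it corresponds to commented text. |
| texplain.remove_comments | Remove comments from a string. |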
Details#
- class texplain.GeneratePlaceholder(base: str, name: str)#
Class to generate a new placeholder. The following placeholder is generated every time the object is called:
-{base}-{name}-{i:d}-
For example:
>>> gen = GeneratePlaceholder(base="foo", name="bar")
>>> gen()
'-foo-bar-1-'
>>> gen()
'-foo-bar-2-'
- Parameters:
base – The base of the placeholder.
name – The name of the placeholder.
- property search_placeholder: str#
Return the regex that can be used to search for the placeholder.
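The counting behaviour described above can be sketched without the library. The class below is a hypothetical stand-in (the name PlaceholderGenerator and the exact regex are assumptions, not texplain's code); it only illustrates the -{base}-{name}-{i:d}- pattern and a matching search regex.

```python
import re


class PlaceholderGenerator:
    """Hypothetical stand-in for texplain.GeneratePlaceholder (not the library's code)."""

    def __init__(self, base: str, name: str):
        self.base = base
        self.name = name
        self.i = 0

    def __call__(self) -> str:
        # Every call increments the counter and returns a new placeholder.
        self.i += 1
        return f"-{self.base}-{self.name}-{self.i:d}-"

    @property
    def search_placeholder(self) -> str:
        # Regex matching any placeholder produced by this generator (assumed form).
        return "-" + re.escape(self.base) + "-" + re.escape(self.name) + r"-\d+-"


gen = PlaceholderGenerator(base="foo", name="bar")
first = gen()
second = gen()
found = re.findall(gen.search_placeholder, f"a {first} b {second} c")
```

Keeping the counter inside the object means every placeholder is unique, so it can later be located unambiguously in the text.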
- class texplain.Placeholder(placeholder: str, content: str, space_front: str = None, space_back: str = None, ptype: PlaceholderType = None, search_placeholder: str = None)#
Placeholder for text. This class stores the text to be replaced by a placeholder and the placeholder itself. In addition, it can store the whitespace before and after the placeholder.
- Parameters:
placeholder – The placeholder to use.
content – The text replaced by the placeholder.
space_front – The whitespace before the placeholder.
space_back – The whitespace after the placeholder.
ptype – The type of placeholder, see PlaceholderType.
search_placeholder – The regex used to search for the placeholder (optional, but greatly speeds up batch searches).
- classmethod from_text(placeholder: str, text: str, start: int, end: int, ptype: PlaceholderType = None, search_placeholder: str = None)#
Replace text with placeholder. Save the content and the current whitespace before and after the placeholder. To restore the original text precisely:
placeholder, text = Placeholder.from_text(placeholder, text, start, end)
text = placeholder.to_text(text)
- Parameters:
placeholder – The placeholder to use.
text – The text to consider.
start – The start index of text to be replaced by the placeholder.
end – The end index of text to be replaced by the placeholder.
ptype – The type of placeholder, see PlaceholderType.
search_placeholder – The regex used to search the placeholder.
- Returns:
(Placeholder, text), where in text the placeholder is inserted.
- to_text(text: str, index: int = None, keep_placeholder: bool = False) str#
Replace placeholder with content. If the whitespace before and after the placeholder is stored, it is restored.
- Parameters:
text – Text.
index – The index of the placeholder.
keep_placeholder – If True, keep the placeholder and change only the whitespace.
- Returns:
Text with placeholder replaced by content.
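The from_text / to_text round trip can be illustrated with a simplified sketch. SimplePlaceholder is a hypothetical class, not texplain.Placeholder: it stores only the replaced content and deliberately ignores the surrounding whitespace (space_front, space_back) that the real class can record.

```python
from dataclasses import dataclass


@dataclass
class SimplePlaceholder:
    """Hypothetical, simplified stand-in for texplain.Placeholder."""

    placeholder: str
    content: str

    @classmethod
    def from_text(cls, placeholder: str, text: str, start: int, end: int):
        # Store text[start:end] and put the placeholder in its place.
        obj = cls(placeholder=placeholder, content=text[start:end])
        return obj, text[:start] + placeholder + text[end:]

    def to_text(self, text: str) -> str:
        # Put the stored content back (first occurrence only).
        return text.replace(self.placeholder, self.content, 1)


original = r"Energy: $E = mc^2$ (Einstein)."
start = original.index("$")
end = original.rindex("$") + 1
ph, masked = SimplePlaceholder.from_text("-FOO-MATH-1-", original, start, end)
restored = ph.to_text(masked)
```

While the text is masked, other passes can edit it without touching the protected content.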
- class texplain.PlaceholderType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)#
Type of placeholder. The placeholders’ practical definition is in text_to_placeholders(). The intended use is:
line: A single line of content (no newline).
inline_comment: A comment that is preceded by some content (no newline).
comment: A comment that is the only content on that line (no newline).
tabular: Block \begin{tabular} ... \end{tabular}.
math: Block of displaymath. E.g. \begin{equation} ... \end{equation}.
inline_math: Block of inline math. E.g. $ ... $.
math_line: A single line of content in math mode (no newline).
environment: Block of environment: \begin{...} ... \end{...}.
command: Block of command: \command[...]{...}.
curly_braced: Block of curly braced content: {...}.
command_like: Block of command or curly_braced.
texindent_block: Block of % \begin{texindent} ... % \end{texindent}.
noindent_block: Block of % \begin{noindent} ... % \end{noindent}.
verbatim: Block of \begin{verbatim} ... \end{verbatim}.
let_command: Definition \let....
newif_command: Definition \newif....
Except for line, math_line, comment, and inline_comment, all placeholders can span more than one line.
- command = 9#
- command_like = 11#
- comment = 3#
- curly_braced = 10#
- environment = 8#
- inline_comment = 2#
- inline_math = 6#
- let_command = 15#
- line = 1#
- math = 5#
- math_line = 7#
- newif_command = 16#
- noindent_block = 13#
- tabular = 4#
- texindent_block = 12#
- verbatim = 14#
- class texplain.TeX(text: str)#
Interpret TeX file to allow simple manipulations. The manipulations are the member functions.
- Parameters:
text – LaTeX code.
- change_label(old_label: str, new_label: str, overwrite: bool = False)#
Change label in \label{...} and \ref{...}(-like) commands.
- Parameters:
old_label – Old label.
new_label – New label.
overwrite – Overwrite existing labels.
- changed()#
Check if the document has changed.
- citation_keys() list[str]#
Read the citation keys in the TeX file (keys in \cite{...}, \citet{...}, \citep{...}).
- Returns:
Unique list of keys in the order of appearance.
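As an illustration of what "unique list of keys in the order of appearance" means, here is a minimal regex-based sketch. It is a free function written for this example, not the TeX.citation_keys method; the handling of optional [...] arguments and starred variants is an assumption.

```python
import re


def citation_keys(text: str) -> list[str]:
    """Sketch: unique keys of \\cite, \\citet and \\citep in order of appearance."""
    keys = []
    # \cite, \citet or \citep, an optional star, optional [..] options, then {keys}.
    pattern = r"\\cite[tp]?\*?(?:\[[^\]]*\])*\{([^}]*)\}"
    for match in re.finditer(pattern, text):
        for key in match.group(1).split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


tex = r"See \cite{a,b}, \citet{c} and \citep[p.~2]{b, d}."
keys = citation_keys(tex)
```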
- config_files() list[str]#
Read configuration files in the directory of the TeX file.
- Returns:
List of filenames.
- environments() list[str]#
Return list with present environments (between \begin{...} ... \end{...}).
- find_by_extension(ext: str) list[str]#
Find all files with a certain extension in the directory of the TeX file.
- Parameters:
ext – File extension.
- Returns:
List of filenames.
- fix_quotes()#
Replace:
"..." by ``...''.
'...' by `...'.
- float_filenames(cmd: str = '\\includegraphics') list[tuple[str]]#
Extract the keys of ‘float’ commands (e.g. \includegraphics{...}, \bibliography{...}) and reconstruct their filenames. This operation is read-only.
- Parameters:
cmd – The command to look for.
- Returns:
A list [('key', 'filename')] in order of appearance.
- format_labels(prefix: str = None)#
Format all labels as:
sec:...: Section labels.
ch:...: Chapter labels.
fig:...: Figure labels.
tab:...: Table labels.
eq:...: Math labels.
note:...: Footnote.
misc:...: Anything else.
- Parameters:
prefix – Add optional prefix. E.g. key:prefix:....
- classmethod from_file(filename: str)#
Read from file.
- Parameters:
filename – Path to the file to read.
- get()#
Return document.
- labels() list[str]#
Return list of labels (in order of appearance).
- remove_commentlines()#
Remove lines that are entirely a comment.
- remove_comments()#
Remove comments from the main text.
- rename_float(old: str, new: str, cmd: str = '\\includegraphics')#
Rename a key of a ‘float’ command (e.g. \includegraphics{...}, \bibliography{...}). This changes the TeX file.
- Parameters:
old – Old key.
new – New key.
cmd – The command to look for.
- replace_command(cmd: str, replace: str, ignore_commented: bool = False)#
Replace command. For example:
Remove the command:
replace_command(r"{\TG}[1]", "")
>>> This is a \TG{I would replace this} text.
<<< This is a text.
Select a part of the command:
replace_command(r"{\TG}[2]", "#1")
>>> This is a \TG{text}{test}.
<<< This is a text.
Change the command:
replace_command(r"{\TG}[2]", r"\mycomment{#1}{#2}")
>>> This is a \TG{text}{test}.
<<< This is a \mycomment{text}{test}.
- Parameters:
cmd – The command’s definition. Given \newcommand{cmd}[args]{def} you should specify {cmd}[args], or {cmd} (or even cmd), which defaults to {cmd}[1].
replace – The def part (curly braces around it are optional). As in LaTeX, replacement is done on #1, #2, …
ignore_commented – If True, the command is not replaced if it is commented out.
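The #1, #2, … substitution can be sketched with a regular expression. The function below is hypothetical and much simpler than the method documented here: it takes the command name and number of arguments separately (an assumed signature), and it only handles arguments without nested braces.

```python
import re


def replace_command(text: str, name: str, nargs: int, replace: str) -> str:
    """Sketch: substitute #1, #2, ... in `replace` for each \\name{..}{..} in `text`."""
    pattern = (
        re.escape("\\" + name)
        + r"(?![a-zA-Z])"            # do not match longer command names
        + r"\{([^{}]*)\}" * nargs    # assumption: arguments contain no nested braces
    )

    def substitute(match: re.Match) -> str:
        out = replace
        # Go from high to low so that "#1" does not clobber "#10".
        for k in range(nargs, 0, -1):
            out = out.replace(f"#{k}", match.group(k))
        return out

    return re.sub(pattern, substitute, text)


source = r"A \TG{first}{second} B"
picked = replace_command(source, "TG", 2, "#2")
swapped = replace_command(source, "TG", 2, r"\other{#2}{#1}")
```

Using a function as the replacement avoids having `re.sub` interpret backslashes in the replacement text.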
- use_cleveref()#
Replace:
Eq.~\eqref{...}
Fig.~\ref{...}
...
By:
\cref{...}
everywhere.
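A rough sketch of this kind of rewrite, assuming only a handful of prefixes (Eq., Fig., Sec., Tab. and their plurals). The list of prefixes and the function itself are assumptions for illustration; the method may recognise more or different forms.

```python
import re


def use_cleveref(text: str) -> str:
    """Sketch: turn 'Eq.~\\eqref{x}' and 'Fig.~\\ref{y}' into '\\cref{x}' and '\\cref{y}'."""
    # Assumed prefixes; the tie "~" and the \ref or \eqref command must follow directly.
    pattern = r"(?:Eq|Fig|Sec|Tab)s?\.~\\(?:eq)?ref\{([^}]*)\}"
    return re.sub(pattern, r"\\cref{\1}", text)


before = r"See Eq.~\eqref{eq:a} and Fig.~\ref{fig:b}."
after = use_cleveref(before)
```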
- texplain.bib_select(text: str, keys: list[str], reorder: bool = False) str#
Limit a BibTeX file to a list of keys.
- Parameters:
text – The BibTeX file as string.
keys – The list of keys to select.
reorder – Reorder the entries in the bib-file to match the order of keys.
- Returns:
The (reduced) BibTeX file, as string.
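The selection and the effect of reorder can be sketched as follows. This is a simplified re-implementation for illustration only: it assumes every entry starts with "@" at the beginning of a line, and it drops anything that is not a selected entry (including @string or @comment blocks), which may differ from the library.

```python
import re


def bib_select(text: str, keys: list[str], reorder: bool = False) -> str:
    """Sketch: keep only the BibTeX entries whose key is in `keys`."""
    selected = {}
    # Split in front of every "@" that starts a line (assumed entry layout).
    for entry in re.split(r"(?m)^(?=@)", text):
        match = re.match(r"@\w+\s*\{\s*([^,\s]+)\s*,", entry)
        if match and match.group(1) in keys:
            selected[match.group(1)] = entry.strip()
    # Either follow the order of `keys`, or keep the order of the file.
    order = [k for k in keys if k in selected] if reorder else list(selected)
    return "\n\n".join(selected[k] for k in order) + "\n"


bib = """@article{a,
  title = {A},
}

@book{b,
  title = {B},
}

@article{c,
  title = {C},
}
"""
default = bib_select(bib, ["c", "a"])
reordered = bib_select(bib, ["c", "a"], reorder=True)
```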
- texplain.environments(text: str) list[str]#
Return list with present environments. This corresponds to the text between \begin{...} and \end{...}.
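A minimal sketch of collecting environment names with a regular expression. Returning each name once, in order of first appearance, is an assumption made here; the function may behave differently for repeated environments.

```python
import re


def environments(text: str) -> list[str]:
    """Sketch: names used in \\begin{...}, each listed once (assumed behaviour)."""
    names = []
    for name in re.findall(r"\\begin\{([^}]*)\}", text):
        if name not in names:
            names.append(name)
    return names


tex = (
    r"\begin{figure}\begin{center}x\end{center}\end{figure}"
    r"\begin{figure}y\end{figure}"
)
envs = environments(tex)
```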
- texplain.find_command(text: str, name: str = None, regex: str = '(?<!\\\\)(\\\\)([a-zA-Z\\@]+)(\\*?)', is_comment: list[bool] = None) list[list[tuple[int]]]#
Find indices of commands, and their options, and arguments. The following pattern is searched for:
Backslash
Word
Any number of matching
[]and{}(in any order).
- Parameters:
text – Text.
name – Name of command without backslash (e.g. "textbf").
regex – Regex used to match the command name.
is_comment – Per character of text, True if the character is part of a comment. Default: search for comments using is_commented().
- Returns:
List of indices of commands and their arguments:
[[(name_start, name_end), (arg1_start, arg1_end), ...], ...]
Note the definition is such that one can extract the j-th component of the i-th command as follows: text[cmd[i][j][0]:cmd[i][j][1]].
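The index layout of the return value can be illustrated with a simplified finder. This sketch is not the library's implementation: it requires a command name, only recognises options and arguments without nested brackets, and assumes that the name range includes the backslash.

```python
import re

# One option "[...]" or one argument "{...}" without nested brackets (assumption).
ARGUMENT = re.compile(r"\[[^\[\]]*\]|\{[^{}]*\}")


def find_command(text: str, name: str) -> list[list[tuple[int, int]]]:
    """Sketch: [[(name_start, name_end), (arg1_start, arg1_end), ...], ...]."""
    ret = []
    pattern = r"(?<!\\)\\" + re.escape(name) + r"(?![a-zA-Z@])"
    for match in re.finditer(pattern, text):
        cmd = [(match.start(), match.end())]
        pos = match.end()
        # Collect directly adjacent options and arguments.
        while True:
            arg = ARGUMENT.match(text, pos)
            if arg is None:
                break
            cmd.append((arg.start(), arg.end()))
            pos = arg.end()
        ret.append(cmd)
    return ret


text = r"A \textbf{bold} and \textbf[x]{y}."
cmd = find_command(text, "textbf")
# Extract the j-th component of the i-th command by slicing.
parts = [[text[a:b] for a, b in c] for c in cmd]
```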
- texplain.find_commented(text: str) list[list[int]]#
Find comments.
The output is such that one can find the comments' text as follows:
for i, j in find_commented(text):
    print(text[i : j]) # i is the index of "%"
- Parameters:
text – Text.
- Returns:
List of indices of the beginning and end of the comments.
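A minimal sketch of locating comments, assuming a comment runs from an unescaped "%" to the end of the line, excluding the newline. Whether the library includes the newline, and how it treats a "%" after an escaped backslash, is not specified here, so both are assumptions.

```python
import re


def find_commented(text: str) -> list[list[int]]:
    """Sketch: [start, end] of every comment; start is the index of "%"."""
    # "%" that is not preceded by a backslash, up to the end of the line.
    return [[m.start(), m.end()] for m in re.finditer(r"(?<!\\)%.*", text)]


text = "a = 1 % first\n50\\% of b\n% second"
comments = [text[i:j] for i, j in find_commented(text)]
```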
- texplain.find_matching(text: str, opening: str, closing: str, ignore_escaped: bool = True, ignore_commented: bool = False, escape: bool = True, opening_match: int = 0, closing_match: int = 0, return_array: bool = False) dict#
Find matching ‘brackets’.
- Parameters:
text – The string to consider.
opening – The opening bracket (e.g. “(”, “[”, “{”).
closing – The closing bracket (e.g. “)”, “]”, “}”).
ignore_escaped – Ignore escaped brackets (e.g. “\(”, “\[”, “\{”, “\)”, “\]”, “\}”).
ignore_commented – Ignore any text that is commented (e.g. “% …”).
escape – If True, opening and closing are escaped.
opening_match – Select index of begin (0) or end (1) of opening bracket match.
closing_match – Select index of begin (0) or end (1) of closing bracket match.
return_array – If True, return NumPy-array of indices instead of dictionary.
- Returns:
Dictionary with
{index_opening: index_closing}
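The {index_opening: index_closing} result can be reproduced with a stack. The sketch below is a simplified stand-in: it supports single-character brackets only, always skips brackets preceded by a backslash, and has none of the other options documented above.

```python
def find_matching(text: str, opening: str, closing: str) -> dict[int, int]:
    """Sketch: map the index of each opening bracket to its closing bracket."""
    stack = []
    pairs = {}
    for i, char in enumerate(text):
        if i > 0 and text[i - 1] == "\\":
            continue  # escaped bracket, e.g. "\{"
        if char == opening:
            stack.append(i)
        elif char == closing and stack:
            # The most recently opened bracket is the one that closes here.
            pairs[stack.pop()] = i
    return pairs


text = r"\foo{a{b}c} \{ {d}"
pairs = find_matching(text, "{", "}")
```

Unmatched closing brackets are silently ignored in this sketch, and unmatched opening brackets simply do not appear in the result.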
- texplain.find_matching_index(opening: ArrayLike, closing: ArrayLike, return_array: bool = False) dict#
Find matching ‘brackets’, based on a list of indices corresponding to opening and closing ‘brackets’.
- Parameters:
opening – Indices of the opening brackets.
closing – Indices of the closing brackets.
return_array – If True, return NumPy-array of indices instead of dictionary.
- Returns:
Dictionary with
{index_opening: index_closing}
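When the bracket positions are already known, the matching needs only the two index lists. The following sketch uses plain Python lists rather than NumPy arrays and is an illustration of the idea, not the library's algorithm.

```python
def find_matching_index(opening: list[int], closing: list[int]) -> dict[int, int]:
    """Sketch: pair opening and closing indices, innermost brackets first."""
    # Merge both lists into one stream of events sorted by position.
    events = sorted([(i, +1) for i in opening] + [(i, -1) for i in closing])
    stack = []
    pairs = {}
    for index, kind in events:
        if kind == +1:
            stack.append(index)
        elif stack:
            pairs[stack.pop()] = index
    return pairs


pairs = find_matching_index([4, 6, 15], [8, 10, 17])
```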
- texplain.indent(text: str, indentation: str = ' ', rstrip: bool = True, lstrip: bool = True, squashlines: bool = True, squashspaces: bool = True, symbols: bool = True, environment: bool = True, argument: bool = True, inlinemath: bool = True, linebreak: bool = True, itemize: bool = True, sentence: bool = True, alignment: bool = True, texindent: bool = True, noindent: bool = True) str#
Indent text.
- Parameters:
text – The text to indent.
indentation –
Set indentation of lines between:
\begin{...}[...]{...} and \end{...}.
\[ and \].
{ and }.
[ and ] (as command option).
Comment lines follow indentation. Requires: lstrip, inlinemath, environment. To switch off indentation, set indentation="".
rstrip – Remove trailing spaces on all lines.
lstrip – Remove all leading spaces before applying indentation.
squashlines – Reduce the maximum number of consecutive blank lines to 2.
squashspaces – Reduce the maximum number of consecutive spaces to 1.
symbols – In math-mode: all symbols are separated by a space.
environment – \begin{...}[...]{...} and \end{...} (and \[ and \]) are placed on separate lines.
argument –
Any option or argument that spans more than one line is placed on separate lines. For example:
xxx { This is a very long argument that is more than one line long. } yyy
is formatted to:
xxx { This is a very long argument that is more than one line long. } yyy
inlinemath – Inline math is placed on one line.
linebreak – \\ is followed by a newline.
itemize – Each \item is placed on a separate line.
sentence –
One sentence per line. Every sentence should start on a new line, and it should be (as much as possible) on a single line. The following rules of thumb are followed:
A sentence ends with:
A period, question mark, or exclamation mark.
\begin{...} or \end{...}.
Two white lines.
\\
The end of an argument (} or ]), see below.
A command on the next line.
Commands and inline math are treated as a single word. Formatting is applied on the arguments of commands.
Requires: rstrip, lstrip, squashspaces.
alignment –
If the resulting line is less than 100 characters, columns in tabular environments are aligned at & and also \\ are aligned.
In other cases single spaces are placed around & and before \\.
Requires: environment.
texindent –
Custom formatting in blocks:
% \begin{texindent}{...} ... % \end{texindent}
where the {...} argument is a comma-separated list of options of this function; for example:
% \begin{texindent}{sentence=False, inlinemath=False} ... % \end{texindent}
noindent –
Verbatim environments and everything between
% \begin{noindent} ... % \end{noindent}
is not formatted.
- Returns:
The indented text.
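Three of the simpler options (rstrip, squashspaces, squashlines) can be approximated with a few lines of standard-library code. The helper below, with the hypothetical name tidy, is only a sketch of those three rules as documented above; it does not indent anything and is not how texplain.indent is implemented.

```python
import re


def tidy(text: str) -> str:
    """Sketch of three clean-up rules: rstrip, squashspaces, squashlines."""
    # rstrip: remove trailing spaces on all lines.
    text = "\n".join(line.rstrip() for line in text.splitlines())
    # squashspaces: at most one consecutive space (leading spaces are left alone).
    text = re.sub(r"(?<=\S) {2,}", " ", text)
    # squashlines: at most two consecutive blank lines, i.e. three newlines.
    text = re.sub(r"\n{4,}", "\n\n\n", text)
    return text


cleaned = tidy("a  b   \n\n\n\n\nc")
```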
- texplain.is_commented(text: str) ndarray[Any, dtype[bool_]]#
Per character if it corresponds to commented text.
- Parameters:
text – Text.
- Returns:
Array of booleans of size len(text).
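A per-character mask makes it easy to filter out or skip commented text. The sketch below returns a plain list of booleans instead of a NumPy array and assumes a comment runs from an unescaped "%" to the end of the line; both are simplifications made for this example.

```python
import re


def is_commented(text: str) -> list[bool]:
    """Sketch: True for every character that is part of a comment."""
    mask = [False] * len(text)
    for match in re.finditer(r"(?<!\\)%.*", text):
        for k in range(match.start(), match.end()):
            mask[k] = True
    return mask


text = "x % y\nz"
mask = is_commented(text)
# Keep only the characters that are not commented.
visible = "".join(char for char, hidden in zip(text, mask) if not hidden)
```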
- texplain.remove_comments(text: str) str#
Remove comments from a string.
- Parameters:
text – Text
- Returns:
Text without comments.
- texplain.texcleanup(args: list[str])#
Command-line tool to copy to clean output directory, see --help.
- texplain.texindent_cli(args: list[str])#
Indent TeX file, see --help.
- texplain.texplain(args: list[str])#
Command-line tool to copy to clean output directory, see --help.
- texplain.text_from_placeholders(text: str, placeholders: list[Placeholder], keep_placeholders: bool = False) str#
Replace placeholders with original text. The whitespace before and after the placeholder is modified to match Placeholder.space_front and Placeholder.space_back.
- Parameters:
text – Text with placeholders.
placeholders – List of placeholders.
keep_placeholders – If True, the placeholders are kept (they are merely positioned).
- Returns:
Text with content of the placeholders.
- texplain.text_to_placeholders(text: str, ptypes: list[PlaceholderType], base: str = 'TEXINDENT', placeholders_comments: list[Placeholder] = None) tuple[str, list[Placeholder]]#
Replace text with placeholders. The following placeholders are supported:
PlaceholderType.noindent_block:
% \begin{noindent} ... % \end{noindent}
is replaced with
-BASE-NOINDENT-1-
PlaceholderType.texindent_block:
% \begin{texindent} ... % \end{texindent}
is replaced with
-BASE-TEXINDENT-1-
PlaceholderType.verbatim:
\begin{verbatim} ... \end{verbatim}
is replaced with
-BASE-VERBATIM-1-
PlaceholderType.comment:
A comment on a line that contains no other text.
% ...
is replaced with
-BASE-COMMENT-1-
PlaceholderType.inline_comment:
A comment following some other text on the same line.
xxx % ...
is replaced with
xxx -BASE-INLINE-COMMENT-1-
PlaceholderType.inline_math:
$...$
is replaced with
-BASE-INLINE-MATH-1-
Also looks for \(...\) and \begin{math}...\end{math}.
PlaceholderType.math:
\begin{equation} ... \end{equation}
is replaced with
-BASE-MATH-1-
Also looks for \[...\], \begin{equation*}...\end{equation*}, \begin{align}...\end{align}, and \begin{align*}...\end{align*}.
PlaceholderType.math_line:
A line of display mode math (see PlaceholderType.math).
\begin{equation} ... ... \end{equation}
is replaced with
\begin{equation} -BASE-MATH-LINE-1- -BASE-MATH-LINE-2- \end{equation}
PlaceholderType.environment:
\begin{...} ... \end{...}
is replaced with
-BASE-ENVIRONMENT-1-
PlaceholderType.tabular:
\begin{tabular} ... \end{tabular}
is replaced with
-BASE-TABULAR-1-
PlaceholderType.command:
\foo[...]{...}
is replaced with
-BASE-COMMAND-1-
PlaceholderType.command_like:
{...} \foo[...]{...} {\foo[...]{...}}
is replaced with
-BASE-COMMAND-1- -BASE-COMMAND-2- -BASE-COMMAND-3-
PlaceholderType.curly_braced:
{...}
is replaced with
-BASE-CURLY-BRACED-1-
PlaceholderType.let_command:
\let\iffoo
is replaced with
-BASE-LET-1-
PlaceholderType.newif_command:
\newif\iffoo
is replaced with
-BASE-NEWIF-1-
- Parameters:
text – Text.
ptypes – List of placeholder types to replace.
base – Base string for placeholders.
placeholders_comments – List of placeholders that are comments (needed to search commands).
- Returns:
(text, placeholders) with
text: Text with placeholders.
placeholders: List of placeholders.
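The interplay between replacing text with placeholders and restoring it afterwards can be sketched for a single type, inline math written as $...$. The two functions below are simplified, hypothetical counterparts of text_to_placeholders and text_from_placeholders: they return (tag, content) tuples instead of Placeholder objects and ignore whitespace handling.

```python
import re


def mask_inline_math(text: str, base: str = "TEXINDENT"):
    """Sketch: replace every $...$ by -BASE-INLINE-MATH-i-."""
    placeholders = []

    def mask(match: re.Match) -> str:
        tag = f"-{base}-INLINE-MATH-{len(placeholders) + 1}-"
        placeholders.append((tag, match.group(0)))
        return tag

    # Unescaped "$", anything but "$" or a newline, then an unescaped "$".
    masked = re.sub(r"(?<!\\)\$[^$\n]*(?<!\\)\$", mask, text)
    return masked, placeholders


def unmask(text: str, placeholders) -> str:
    """Sketch: put the stored content back in place of each placeholder."""
    for tag, content in placeholders:
        text = text.replace(tag, content, 1)
    return text


original = r"Let $a$ and $b + c$ cost 5\$."
masked, placeholders = mask_inline_math(original)
restored = unmask(masked, placeholders)
```

Because each placeholder ends in a hyphen after its number, the tag for item 1 can never be mistaken for the start of the tag for item 10.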