innoConv (mintmod) API documentation¶
What is innoConv (mintmod)?¶
This package converts mintmod-flavoured LaTeX into Markdown.
It can be seen as a shim for mintmod.tex
and handles important mintmod
commands by translating them to regular Pandoc elements.
Technically speaking, it is essentially a wrapper around Pandoc.
Read more about innoconv-mintmod's architecture.
Installation¶
Prerequisites¶
innoconv-mintmod is mainly used on Linux machines. It might also work on macOS or on Windows via Cygwin or WSL. You are invited to share your experience if you try.
Dependencies¶
The only dependencies you have to provide yourself are Pandoc and the Python interpreter.
All others can be installed into a Virtual environment.
Python interpreter¶
While other versions of Python might work fine, innoconv-mintmod was tested with Python 3.9. Make sure you have it installed.
Pandoc¶
You need a recent version of the pandoc binary available in PATH (Pandoc 2.11.x at the time of writing). There are several ways of installing Pandoc.
Older versions of Pandoc might not work.
Virtual environment¶
Set up and activate a virtual environment in a location of your choice.
$ python3 -m venv venv
$ source venv/bin/activate
Install innoconv-mintmod in your virtual environment using pip.
$ pip install --process-dependency-links -e git+https://gitlab.tubit.tu-berlin.de/innodoc/innoconv-mintmod.git#egg=innoconv-mintmod
If everything went fine you should now have access to the innoconv-mintmod
command.
$ innoconv-mintmod
usage: innoconv-mintmod [-h] [-o OUTPUT_DIR_BASE]
[-f {latex+raw_tex,markdown}]
[-t {html5,json,latex,markdown,asciidoc}] [-l {de,en}]
[-d] [-i] [-r] [-s]
source
innoconv-mintmod: error: the following arguments are required: source
Congratulations!
How to use innoconv-mintmod¶
You can run the converter in your content directory.
$ innoconv-mintmod .
This will trigger the conversion for this folder.
Command line arguments¶
usage: innoconv_mintmod [-h] [-o OUTPUT_DIR_BASE]
[-f {latex+raw_tex,markdown}]
[-t {html5,json,latex,markdown,asciidoc}] [-l {de,en}]
[-d] [-i] [-r] [-g]
source
Positional Arguments¶
- source
content directory or file
Named Arguments¶
- -o, --output-dir-base
output base directory
Default: “./innoconv_mintmod_output”
- -f, --from
Possible choices: latex+raw_tex, markdown
input format
Default: “latex+raw_tex”
- -t, --to
Possible choices: html5, json, latex, markdown, asciidoc
output format
Default: “markdown”
- -l, --language-code
Possible choices: de, en
two-letter language code
Default: “de”
- -d, --debug
debug mode (output HTML and highlight unknown commands)
Default: False
- -i, --ignore-exercises
don’t show logs for unknown exercise commands/envs
Default: False
- -r, --remove-exercises
remove all exercise commands/envs
Default: False
- -g, --generate-innodoc
split sections and generate manifest.yaml
Default: True
Converting legacy mintmod content¶
This chapter documents findings on how to prepare content so that it can be read by the innoconv-mintmod command.
Note
It’s not a complete list and there might be things missing that need to be done in your specific case.
First of all make sure all content is UTF-8 encoded. If not, tools like iconv can be helpful.
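One way to find files that are not valid UTF-8 is simply to try decoding them. The following is a minimal sketch and not part of innoconv-mintmod; the function names `is_utf8` and `convert_to_utf8` are hypothetical, and the default source encoding `latin-1` is an assumption that you should check against your own legacy files.

```python
from pathlib import Path


def is_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(path: Path, source_encoding: str = "latin-1") -> bool:
    """Re-encode a file to UTF-8 in place; return True if it was changed.

    source_encoding is an assumption about the legacy content.
    """
    data = path.read_bytes()
    if is_utf8(data):
        return False  # already fine, leave the file untouched
    # Decode with the assumed legacy encoding and write the bytes back as UTF-8.
    path.write_bytes(data.decode(source_encoding).encode("utf-8"))
    return True
```

Running `convert_to_utf8` over every `*.tex` file (for example via `Path(".").rglob("*.tex")`) would convert only those files that fail the UTF-8 check.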
Adjust commands¶
There are some mintmod commands Pandoc is not able to parse. You need to manually replace them throughout your project.
Remove \ifttm…\else…\fi commands¶
mintmod_ifttm removes \ifttm…\else…\fi commands.
Usage:
$ mintmod_ifttm < file_in.tex > file_out.tex
Automate on many files:
$ find . -name '*.tex' | xargs -I % sh -c 'mintmod_ifttm < % > %_changed && mv %_changed %'
Warning
The script only handles \ifttm…\else…\fi, that is, occurrences with an \else command. There may be occurrences of \ifttm…\fi (without \else). You need to remove them manually!
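To locate the \ifttm…\fi blocks without \else that have to be removed by hand, a small scan can help. This is a sketch only, not part of the project: `find_ifttm_without_else` is a hypothetical helper, and it assumes that \ifttm is the only TeX conditional in the file, since \else and \fi belonging to other conditionals would confuse it.

```python
import re

# Match the three tokens as whole command names (so \filter is not seen as \fi).
TOKEN = re.compile(r"\\(ifttm|else|fi)(?![A-Za-z])")


def find_ifttm_without_else(text: str) -> list[int]:
    """Return 1-based line numbers of \\ifttm blocks that have no \\else."""
    stack = []    # one [line_number, saw_else] entry per open \ifttm
    missing = []
    for match in TOKEN.finditer(text):
        name = match.group(1)
        if name == "ifttm":
            line = text.count("\n", 0, match.start()) + 1
            stack.append([line, False])
        elif name == "else" and stack:
            stack[-1][1] = True
        elif name == "fi" and stack:
            line, saw_else = stack.pop()
            if not saw_else:
                missing.append(line)
    return missing
```

The returned line numbers point at the opening \ifttm of each block that needs manual attention.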
Unwanted LaTeX commands¶
A couple of commands are superfluous or do not make sense in a web-first content publishing platform like innoDoc. Remove any occurrences of the following commands.
\input{mintmod.tex}
\input{english.tex}
\begin{document}
\MPragma{MathSkip}
\Mtikzexternalize
\relax
\- (hyphenation)
\pagebreak
\newpage
\MPrintIndex
Automate:
find . -type f \( -name '*.tex' -or -name '*.rtex' \) | xargs perl -i -pe 's/\\input\{mintmod(\.tex|)\}\s*\n//igs'
Including other modules¶
Pandoc doesn’t understand \IncludeModule. Change these statements to proper LaTeX commands.
\IncludeModule{folder}{file.tex} → \input{folder/file.tex}
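The rewrite from \IncludeModule to \input can be automated with a regular expression. The sketch below is an illustration, not project code; `replace_include_module` is a hypothetical name, and it assumes that neither argument contains nested braces.

```python
import re

# Two brace groups without nested braces: {folder}{file.tex}
INCLUDE_MODULE = re.compile(r"\\IncludeModule\{([^{}]*)\}\{([^{}]*)\}")


def replace_include_module(text: str) -> str:
    """Rewrite \\IncludeModule{folder}{file.tex} as \\input{folder/file.tex}."""
    return INCLUDE_MODULE.sub(r"\\input{\1/\2}", text)
```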
Replace strings¶
There are a couple of special characters you need to replace yourself.
\"a → ä
\"o → ö
\"u → ü
\"A → Ä
\"O → Ö
\"U → Ü
\"s → ß
{\ss} → ß
\ss followed by a space → ß
\ss\ → ß
\ss{} → ß
\ss → ß
"a → ä
"o → ö
"u → ü
"A → Ä
"O → Ö
"U → Ü
"` → „
`` → „
'' → “
"' → “
Automate:
find . -type f \( -name '*.tex' -or -name '*.rtex' \) | xargs sed -i 's/\\"a/ä/g'
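Instead of running one sed call per pattern, the replacements listed above can be applied in a single pass. This is a minimal sketch under stated assumptions: `REPLACEMENTS` and `replace_special_characters` are hypothetical names, the pairs mirror the list above, and the order (backslash-prefixed and longer patterns first) is a design choice made here so that, for example, \"a is handled before the shorter "a. Plain string replacement is deliberately naive and will also touch ordinary double-quoted text such as "abc", so review the result.

```python
# Ordered (pattern, replacement) pairs taken from the list above.
REPLACEMENTS = [
    ('\\"a', "ä"), ('\\"o', "ö"), ('\\"u', "ü"),
    ('\\"A', "Ä"), ('\\"O', "Ö"), ('\\"U', "Ü"),
    ('\\"s', "ß"),
    ("{\\ss}", "ß"), ("\\ss{}", "ß"), ("\\ss\\", "ß"),
    ("\\ss ", "ß"), ("\\ss", "ß"),
    ('"a', "ä"), ('"o', "ö"), ('"u', "ü"),
    ('"A', "Ä"), ('"O', "Ö"), ('"U', "Ü"),
    ('"`', "„"), ("``", "„"), ("\"'", "“"), ("''", "“"),
]


def replace_special_characters(text: str) -> str:
    """Apply every replacement in order and return the converted text."""
    for pattern, replacement in REPLACEMENTS:
        text = text.replace(pattern, replacement)
    return text
```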
Clean up code¶
Remove unused files from your project and keep track of your changes using a VCS.
Architecture¶
This section gives an overview of innoconv-mintmod's architecture.
The command line interface¶
The entry point is the command line tool innoconv-mintmod.
It calls panzer with the correct parameters.
Most of the magic happens in the class MintmodFilterAction.
It is implemented as a Pandoc filter and provides functions to deal with a number of special mintmod LaTeX commands that Pandoc would otherwise just ignore.
All special commands are translated into primitives that Pandoc already knows. Additional information is encoded in attributes that are attached to the resulting elements.
The result of the MintmodFilterAction is a regular Pandoc AST that can be further processed by Pandoc output modules and thus be translated to Markdown, LaTeX, HTML and so forth.
The Pandoc JSON output is processed by generate_innodoc.py, which is implemented as a post-flight panzer script.
panzer¶
panzer is a small wrapper script around Pandoc. It enriches Pandoc with several useful features that happen to match this project's needs.
First of all, it is possible to define profiles (called styles in panzer) that predefine the parameters Pandoc is run with.
Furthermore, it can manage the applied filters, run pre- and postprocessors, and so on.
You can find its configuration in the sub-directory .panzer.
Module overview¶
innoconv_mintmod.constants¶
Project constants are defined here.
- innoconv_mintmod.constants.INDEX_LABEL_PREFIX
Element class for index labels
- innoconv_mintmod.constants.COMMANDS_IRREGULAR
Math commands with irregular arguments; key=command-name, value=formatstring or value=dict (number of arguments, formatstring)
- innoconv_mintmod.constants.REGEX_PATTERNS
Regular expressions
- innoconv_mintmod.constants.ELEMENT_CLASSES
Element classes
- innoconv_mintmod.constants.MINTMOD_SUBJECTS
Subjects as used in mintmod command \MSetSubject
- innoconv_mintmod.constants.LANGUAGE_CODES
Supported language codes
- innoconv_mintmod.constants.DEFAULT_LANGUAGE_CODE
Default language code
- innoconv_mintmod.constants.DEFAULT_OUTPUT_DIR_BASE
Default innoconv output directory
- innoconv_mintmod.constants.DEFAULT_OUTPUT_FORMAT
Default innoconv output format
- innoconv_mintmod.constants.OUTPUT_FORMAT_EXT_MAP
Mapping between output formats and file extensions
- innoconv_mintmod.constants.OUTPUT_FORMAT_CHOICES
Output format choices
- innoconv_mintmod.constants.ROOT_DIR
Project root dir
- innoconv_mintmod.constants.PANZER_SUPPORT_DIR
panzer support directory
- innoconv_mintmod.constants.ENCODING
Encoding used in this project
innoconv_mintmod.errors¶
Exceptions are defined here.
innoconv_mintmod.mintmod_filter¶
This module handles mintmod LaTeX commands.
Commands and environments are defined in the classes Commands and Environments.
innoconv_mintmod.mintmod_filter.commands¶
Handle mintmod LaTeX commands.
Note
Provide a handle_CMDNAME function for handling the CMDNAME command. You need to slugify the command name.
Example: the handle_msection method will receive the command \MSection.
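The naming convention described in this note can be illustrated with a small dispatch sketch. Everything here is an assumption made for illustration: the `slugify` shown is a simplified stand-in (the project's own slugify may behave differently), the `Commands` class is a stub with a single handler, and `dispatch` is a hypothetical helper rather than the filter's real lookup code.

```python
import re


def slugify(name: str) -> str:
    """Lower-case a command name and drop anything that is not a-z, 0-9 or _."""
    return re.sub(r"[^a-z0-9_]", "", name.lower())


class Commands:
    """Stand-in for the real Commands class; holds a single example handler."""

    def handle_msection(self, cmd_args, elem):
        # A real handler would return a panflute element; a tuple keeps this runnable.
        return ("header", cmd_args[0])


def dispatch(handlers, cmd_name, cmd_args, elem=None):
    """Call handle_<slug> on handlers, or return None if no handler exists."""
    handler = getattr(handlers, "handle_" + slugify(cmd_name), None)
    if handler is None:
        return None
    return handler(cmd_args, elem)
```

With this stub, `dispatch(Commands(), "MSection", ["Foo"])` looks up `handle_msection` and passes `['Foo']` as `cmd_args`.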
- class innoconv_mintmod.mintmod_filter.commands.Commands
Handlers for commands are defined here.
Given the command:
\MSection{Foo}
The handler method handle_msection receives the following arguments:
cmd_args: ['Foo']
elem: panflute.base.Element
- handle_highlight(cmd_args, elem)
Handle highlight command.
This seems to be some sort of formatting command. There’s no documentation and it does nothing in the mintmod code. We just keep the information here.
- handle_jhtmlhinweiseingabefunktionen(cmd_args, elem)
Handle \jHTMLHinweisEingabeFunktionen command.
- handle_jhtmlhinweiseingabefunktionenexp(cmd_args, elem)
Handle \jHTMLHinweisEingabeFunktionenExp command.
- handle_mdeclaresiteuxid(cmd_args, elem)
Handle \MDeclareSiteUXID command.
The command can occur in an environment that is parsed by a subprocess. In this case there’s no last header element. The process can’t set the ID because it can’t access the doc tree. Instead it replaces the \MDeclareSiteUXID by an element that is found by the parent process using the function innoconv_mintmod.utils.extract_identifier().
- handle_mdirectrouletteexercises(cmd_args, elem)
Handle \MDirectRouletteExercises command.
Remember points for next question.
- handle_mentry(cmd_args, elem)
Handle \MEntry command.
This command creates an entry for the index.
- handle_mextlink(cmd_args, elem)
Handle \MExtLink command.
This command inserts an external link.
- handle_mgraphics(cmd_args, elem, add_desc=True)
Handle \MGraphics.
Embed an image with title.
Example: \MGraphics{img.png}{scale=1}{title}
- handle_mgraphicssolo(cmd_args, elem)
Handle \MGraphicsSolo.
Embed an image without title. Uses filename as image title.
- handle_mgroupbutton(cmd_args, elem)
Handle \MGroupButton command.
Render empty as this button is displayed automatically in clients.
- handle_mindex(cmd_args, elem)
Handle \MIndex command.
This command creates an invisible entry for the index.
- handle_mlabel(cmd_args, elem)
Handle \MLabel command.
Will search for the previous header element and update its ID to the ID defined in the command. Otherwise proceed like \MDeclareSiteUXID.
Hides the identifier in a fake element, like innoconv_mintmod.mintmod_filter.commands.Commands.handle_mdeclaresiteuxid() does.
- handle_mlfunctionquestion(cmd_args, elem)
Handle questions defined by the \MLFunctionQuestion command.
- handle_mlintervalquestion(cmd_args, elem)
Handle questions defined by the \MLIntervalQuestion command.
- handle_mlparsedquestion(cmd_args, elem)
Handle questions defined by the \MLParsedQuestion command.
- handle_mlsimplifyquestion(cmd_args, elem)
Handle questions defined by the \MLSimplifyQuestion command.
- handle_mlspecialquestion(cmd_args, elem)
Handle questions defined by the \MLSpecialQuestion command.
- handle_mmodstartbox(cmd_args, elem)
Handle \MModStartBox command.
This command displays a table of contents for the current chapter. This is handled elsewhere and becomes a no-op.
- handle_mpragma(cmd_args, elem)
Handle \MPragma command.
This command was used to embed build time flags for mintmod. It becomes a no-op.
- handle_mprintindex(cmd_args, elem)
Handle \MPrintIndex command.
Index will be printed automatically. It becomes a no-op.
- handle_msetpoints(cmd_args, elem)
Handle \MSetPoints command.
Remember points for next question.
- handle_msetsectionid(cmd_args, elem)
Handle \MSetSectionID command.
Will search for the previous header element and update its ID to the ID defined in the command.
- handle_msetsubject(cmd_args, elem)
Handle \MSetSubject{} command.
Command defines the category.
- handle_msref(cmd_args, elem)
Handle \MSRef command.
This command inserts a fragment-style link.
- handle_msubject(cmd_args, elem)
Handle \MSubject{title} command.
Command defines the document title.
- handle_msubsubsectionx(cmd_args, elem)
Handle \MSubsubsectionx command, which generates a level 4 header.
- handle_msubsubsubsectionx(cmd_args, elem)
Handle \MSubsubsubsectionx command, which generates a level 4 header.
From a logical point of view this should be level 5, but judging from the sources, level 4 is correct.
- handle_mtikzauto(cmd_args, elem)
Handle \MTikzAuto command.
Create a CodeBlock with TikZ code.
- handle_mtitle(cmd_args, elem)
Handle \MTitle command.
This is equivalent to \subsubsection.
- handle_mugraphicssolo(cmd_args, elem)
Handle \MUGraphicsSolo.
Embed an image without title.
- handle_mzahl(cmd_args, elem)
Handle \MZahl command.
This is a math command but in fact also occurs in text.
- handle_mzxyzhltrennzeichen(cmd_args, elem)
Handle \MZXYZhltrennzeichen command.
It is transformed to a \decmarker command and later substituted by MathJax. This is already in the math substitutions, but as it also occurs outside of math environments it is defined here too.
- handle_newpage(cmd_args, elem)
Handle \newpage command.
A display-related command. It becomes a no-op.
innoconv_mintmod.mintmod_filter.elements¶
Convenience functions and classes for creating common elements.
- class innoconv_mintmod.mintmod_filter.elements.Question(*args, **kwargs)
Wrapper/factory class that inherits from pf.Element and returns pf.Code instances with special classes and attributes, depending on the given mintmod class.
- innoconv_mintmod.mintmod_filter.elements.create_content_box(elem_content, elem_classes, lang)
Create a content box.
Convenience function for creating content boxes that differ only in their content and classes.
innoconv_mintmod.mintmod_filter.environments¶
Handle mintmod LaTeX environments.
Note
Provide a handle_ENVNAME function for handling the ENVNAME environment. You need to slugify the environment name.
Example: the handle_mxcontent method will receive the \begin{MXContent}…\end{MXContent} environment.
- class innoconv_mintmod.mintmod_filter.environments.Environments
Handlers for environments are defined here.
Given the environment:
\begin{MXContent}{Foo title long}{Foo title}{STD} Foo content \end{MXContent}
The handler method handle_mxcontent receives the following arguments:
elem_content: 'Foo content'
env_args: ['Foo title long', 'Foo title', 'STD']
- handle_itemize(elem_content, env_args, elem)
Handle itemize environments that were not correctly recognized by Pandoc. This happens, for example, if the items contain \MExerciseItems environments.
- handle_mexercisecollection(elem_content, env_args, elem)
Handle \MExerciseCollection environment.
- handle_mexerciseitems(elem_content, env_args, elem)
Handle \MExerciseItems environments by returning an ordered list containing the \item entries defined in the environment. This is needed on top of handle_itemize as there are also MExerciseItems environments outside itemize environments.
innoconv_mintmod.mintmod_filter.filter_action¶
Pandoc filter that transforms mintmod commands.
- class innoconv_mintmod.mintmod_filter.filter_action.MintmodFilterAction(debug=False)
The Pandoc filter is defined in this class.
- filter(elem, doc)
Receive document elements.
This method receives document elements from Pandoc and delegates handling of simple substitutions, mintmod commands and environments.
Parameters
elem (panflute.base.Element) – Element to handle
doc (panflute.elements.Doc) – Document
innoconv_mintmod.runner¶
Runner module
- class innoconv_mintmod.runner.InnoconvRunner(source, output_dir_base, language_code, ignore_exercises=False, remove_exercises=False, generate_innodoc=False, input_format='latex+raw_tex', output_format='markdown', generate_innodoc_markdown=False, debug=False)
innoConv (mintmod) runner that spawns a panzer instance.
innoconv_mintmod.utils¶
Utility module
- innoconv_mintmod.utils.block_wrap(elem, orig_elem)
Wraps an element in a block if necessary.
If the original element was a block, panflute expects the return value to be a block as well. In many places we need to detect this and wrap an inline.
Parameters
elem (panflute.base.Element) – Element to be wrapped
orig_elem (panflute.base.Element) – Original element
Returns
elem, or elem wrapped in panflute.elements.Plain
-
innoconv_mintmod.utils.
convert_simplification_code
(code)[source]¶ Convert binary flags to string flags.
- innoconv_mintmod.utils.destringify(string)
Takes a string and transforms it into a list of Str and Space objects.
This function breaks down strings with whitespace. The same could be done by calling parse_fragment(), but this function avoids the overhead involved.
- innoconv_mintmod.utils.extract_identifier(content)
Extract identifier from content and remove annotation element.
\MLabel/\MDeclareSiteUXID commands that occur within environments are parsed in a child process (e.g. innoconv_mintmod.mintmod_filter.commands.handle_mlabel()). The id attribute can’t be set directly as they can’t access the whole doc tree. As a workaround they create a fake element and add the identifier.
- innoconv_mintmod.utils.get_remembered(doc, key, keep=False)
Retrieve a remembered element from the document and forget it.
To remember elements use remember().
Returns
The remembered element or None
-
innoconv_mintmod.utils.
log
(msg_string, level='INFO')[source]¶ Log messages when running as a panzer filter.
-
innoconv_mintmod.utils.
parse_cmd
(text)[source]¶ Parse a LaTeX command using regular expressions.
Parses a command like:
\foo{bar}{baz}
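The idea of regex-based command parsing can be sketched as follows. This is not the project's parse_cmd: the name `parse_simple_cmd`, the return shape (a name plus a list of arguments) and the restriction to flat, non-nested brace arguments are all assumptions made for this illustration.

```python
import re

# A command name followed by zero or more flat {...} groups.
CMD = re.compile(r"\\([A-Za-z]+)((?:\{[^{}]*\})*)")
ARG = re.compile(r"\{([^{}]*)\}")


def parse_simple_cmd(text: str):
    """Split \\name{arg1}{arg2} into ('name', ['arg1', 'arg2']).

    Only flat (non-nested) brace arguments are supported in this sketch.
    """
    match = CMD.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a LaTeX command: {text!r}")
    return match.group(1), ARG.findall(match.group(2))
```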
- innoconv_mintmod.utils.parse_fragment(parse_string, lang, as_doc=False, from_format='latex+raw_tex')
Parse a source fragment using panzer.
Return type
list of panflute.base.Element or panflute.elements.Doc
Returns
parsed elements
Raises
OSError – if panzer executable is not found
RuntimeError – if panzer recursion depth is exceeded
RuntimeError – if panzer output could not be parsed
- innoconv_mintmod.utils.parse_nested_args(to_parse)
Parse LaTeX command arguments that can have nested commands. Returns arguments and rest string.
Parses strings like {bar}{baz{}}rest into [['bar', 'baz{}'], 'rest'].
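Nested braces cannot be matched with a simple regular expression, so a depth counter is one way to reproduce the documented behaviour. The following is a sketch, not the project's implementation; `split_nested_args` is a hypothetical name, and escaped braces such as \{ are not treated specially.

```python
def split_nested_args(to_parse: str):
    """Split leading {...} groups (which may nest) from the remaining text."""
    args = []
    pos = 0
    while pos < len(to_parse) and to_parse[pos] == "{":
        depth = 0
        for idx in range(pos, len(to_parse)):
            if to_parse[idx] == "{":
                depth += 1
            elif to_parse[idx] == "}":
                depth -= 1
                if depth == 0:
                    # Found the brace that closes the group opened at pos.
                    args.append(to_parse[pos + 1:idx])
                    pos = idx + 1
                    break
        else:
            # The loop ended without finding the closing brace.
            raise ValueError("unbalanced braces")
    return [args, to_parse[pos:]]
```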
- innoconv_mintmod.utils.remember(doc, key, elem)
Remember an element in the document for later.
To retrieve remembered elements use get_remembered().
Parameters
doc (panflute.elements.Doc) – Document where to store the memory
key (str) – Key under which element is stored
elem (panflute.base.Element) – Element to remember
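The interplay of remember() and get_remembered() can be pictured with a small re-implementation. The signatures follow the documentation, but the bodies are assumptions: the real functions may store their data differently, the `remembered` attribute is hypothetical, and a `SimpleNamespace` stands in for `panflute.elements.Doc`.

```python
from types import SimpleNamespace


def remember(doc, key, elem):
    """Store elem on the document under key."""
    if not hasattr(doc, "remembered"):
        doc.remembered = {}
    doc.remembered[key] = elem


def get_remembered(doc, key, keep=False):
    """Return the element stored under key, or None; forget it unless keep is set."""
    store = getattr(doc, "remembered", {})
    if keep:
        return store.get(key)
    return store.pop(key, None)


doc = SimpleNamespace()        # stand-in for panflute.elements.Doc
remember(doc, "points", 4)     # e.g. points set for the next question
```

This pattern fits handlers such as handle_msetpoints, which remember a value so that a later question handler can pick it up.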
-
innoconv_mintmod.utils.
remove_annotations
(doc)[source]¶ Remove left-over annotation elements from document.
- Parameters
doc (
panflute.elements.Doc
) – Document
generate_innodoc¶
This is the final step to generate innoDoc content from Mintmod input.
Load pandoc output from single JSON file.
Generate a section tree from headings.
Create a mapping between mintmod section IDs and section paths. (a)
Create a mapping between element IDs and section paths. (b)
Rewrite all links by using (a) and (b).
Save individual sections to innoDoc-specific directory structure.
Generate a manifest.yml.
Remove the single JSON file.
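The step "Generate a section tree from headings" can be sketched on a simplified input. This is an illustration only and not the ExtractSectionTree implementation: `build_section_tree` is a hypothetical function, headings are represented as plain (level, title) pairs with levels starting at 1, and only the constant value 3 for MAX_LEVELS is taken from the documentation.

```python
MAX_LEVELS = 3  # value documented for generate_innodoc.MAX_LEVELS


def build_section_tree(headings, max_levels=MAX_LEVELS):
    """Nest a flat list of (level, title) pairs into a section tree.

    Headings deeper than max_levels are not split into their own sections.
    """
    root = {"title": None, "children": []}
    stack = [(0, root)]  # (level, node) path from the root to the current node
    for level, title in headings:
        if level > max_levels:
            continue
        # Climb up until the top of the stack is a valid parent for this level.
        while stack[-1][0] >= level:
            stack.pop()
        node = {"title": title, "children": []}
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root["children"]
```

Each node in the result carries its title and a list of child sections, which is the kind of structure that can then be written out as one directory per section.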
-
class
generate_innodoc.
CreateMapOfIds
(sections)[source]¶ Create a mapping between link IDs and section path.
-
class
generate_innodoc.
CreateMapOfSectionIds
(sections)[source]¶ Create mapping between mintmod section id and section path.
-
class
generate_innodoc.
ExtractSectionTree
(nodes, level)[source]¶ Generate section tree from a flat document structure.
-
class
generate_innodoc.
GenerateInnodoc
(debug=False)[source]¶ Main class for generate_innodoc postflight filter.
-
LANGKEY
= 'languages'¶ Languages key in manifest.yml
-
-
generate_innodoc.
MAX_LEVELS
= 3¶ Max. depth of headers to consider when splitting sections