A translate script is provided to facilitate working with pandoc and deepl translation services.
The user manual is available: https://jodygarnett.github.io/translate/
The user manual is written as an example in sphinx reStructuredText and translated to mkdocs as a regression test:
-
This script requires pandoc be installed:
Ubuntu:
apt-get install pandoc
macOS:
brew install pandoc
References:
-
A writable python environment is required.
If you use homebrew (popular on macOS), python installs into user space, so it is a writable environment:
brew install python
You may also use the system python provided by:
- Linux distribution
- Microsoft App Store
- https://www.python.org/ (windows and macOS)
The system python is not used directly; it includes virtualenv, used to set up a writable Python environment:
virtualenv venv
source venv/bin/activate
-
Install mkdocs_translate into your writable Python environment.
To install latest release from pypi:
pip install mkdocs-translate
To install development version use (to preview and provide feedback):
pip install git+https://github.com/jodygarnett/translate.git
-
To check it is installed correctly:
mkdocs_translate --help
-
The script is intended to run from the location of your mkdocs project (with docs and mkdocs.yml files):
cd core-geonetwork/docs/manual
-
The script makes use of an existing build (or target) folder for scratch files:
mkdir build
-
Optional: Create a translate.yml filling in the conversion parameters for your project.
This file is used to indicate the build or target directory to use for temporary files.
Additional configuration options are required for advanced sphinx-build conf.py options like substitutions and external links.
A working example is provided to be adapted for your project:
-
Create requirements.txt with mkdocs plugins required.
-
Create mkdocs.yml.
-
Optional: If your content uses the download directive to include external content, there is a mkdocs hook for processing of download.txt files.
Create download.py.
Register hook with mkdocs.yml:
# Customizations
hooks:
- download.py
-
Use .gitignore to ignore the following:
build
target
-
The resulting directory structure is:
doc/
source/
.gitignore
requirements.txt
mkdocs.yml
download.py
GeoServer is used as the example here; it is a maven project that, by convention, uses target for temporary files.
-
Initial setup of docs folder structure (so all the images from source folder are present):
mkdocs_translate init
-
To scan rst files before conversion:
mkdocs_translate scan
The scan collects an index of pages and headings, and looks for any download files that have been managed by sphinx.
- --scan=all: (default)
- --scan=index: scan anchors and headings into target/convert/anchors.txt for doc and ref directives.
- --scan=download: scan download directives for external content, into docs folder, producing download/download.txt folders.
mkdocs_translate scan
-
To migrate content from rst to md:
mkdocs_translate migrate
-
Review this content; you may find individual files to fix.
Some formatting is easier to fix in the rst files before conversion:
- Indentation of nested lists in rst is often incorrect, resulting in restarted numbering or block quotes.
- Random {.title-ref} snippets are a general indication to simplify the rst and re-translate.
- Anchors or headings with trailing whitespace throw off the heading scan, resulting in broken references.
To reconvert, migrate accepts paths to a file or folder:
mkdocs_translate migrate source/introduction/license.rst
mkdocs_translate migrate source/introduction/**/*.rst
-
To generate the navigation tree:
mkdocs_translate nav
Supply path information for a file or folder:
mkdocs_translate nav source/index.rst
mkdocs_translate nav source/introduction/**/*.rst
The output is printed to standard out and may be appended to the mkdocs.yml file.
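The trailing-whitespace problem mentioned in the review notes can be checked for before running migrate. The following is a minimal sketch and is not part of mkdocs_translate: the names `find_trailing_whitespace` and `check_folder` are hypothetical, and the regular expressions only approximate what rst anchors and heading underlines look like.

```python
import re
from pathlib import Path

# Explicit rst target such as ".. _getting_involved:" followed by stray spaces or tabs.
ANCHOR_TRAILING = re.compile(r"^\.\. _[^:]+:[ \t]+$")
# Heading underline: one common adornment character repeated at least three times.
ADORNMENT = re.compile(r"^([=\-~^#*+.:_`'])\1{2,}$")


def find_trailing_whitespace(text):
    """Return (line_number, line) pairs for anchors or heading titles ending in whitespace."""
    lines = text.splitlines()
    problems = []
    for number, line in enumerate(lines, start=1):
        if ANCHOR_TRAILING.match(line):
            problems.append((number, line))
            continue
        # lines is 0-based, so lines[number] is the line after the current one.
        following = lines[number] if number < len(lines) else ""
        is_title = ADORNMENT.match(following.rstrip()) is not None
        if is_title and line.strip() and line != line.rstrip():
            problems.append((number, line))
    return problems


def check_folder(folder="source"):
    """Print every suspect line found in the rst files below folder (assumed layout)."""
    for path in sorted(Path(folder).rglob("*.rst")):
        text = path.read_text(encoding="utf-8")
        for number, line in find_trailing_whitespace(text):
            print(f"{path}:{number}: trailing whitespace: {line!r}")
```

Calling `check_folder("source")` prints one line per suspect anchor or heading title, which can then be fixed in the rst file before reconverting it.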
Some things are not supported by pandoc, which will produce WARNING: messages:
-
Substitutions used for inline images
-
Underlines: replace with bold or italic
WARNING: broken reference 'getting_involved' link:getting_involved-broken.rst
Translations are listed alongside english markdown:
- example.md
- example.fr.md
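The naming convention can be expressed as a small helper. This is only an illustrative sketch: `translated_name` is a hypothetical function, not part of mkdocs_translate, and it assumes the language code is inserted before the file extension as in the example.md and example.fr.md pair.

```python
from pathlib import PurePosixPath


def translated_name(page, language):
    """Return the path of the translation stored alongside an English page."""
    path = PurePosixPath(page)
    # "index" + ".fr" + ".md" gives "index.fr.md" in the same folder.
    return str(path.with_name(f"{path.stem}.{language}{path.suffix}"))
```

For instance, `translated_name("docs/help/index.md", "fr")` gives `docs/help/index.fr.md`.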
Translation uses pandoc to convert to html, and then the Deepl REST API.
-
Provide an environment variable with the Deepl authentication key:
export DEEPL_AUTH="xxxxxxxx-xxx-...-xxxxx:fx"
-
Translate a document to french using pandoc and deepl:
mkdocs_translate french docs/help/index.md
-
To translate several documents in a folder:
mkdocs_translate french docs/overview/*.md
Deepl charges by the character, so bulk translation is not advisable.
See mkdocs_translate french --help for more options.
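Because billing is by the character, it can help to get a rough size estimate before translating a folder. The sketch below is an assumption-laden aid, not a feature of mkdocs_translate: `count_characters` and `report` are hypothetical names, and the count is taken from the markdown source, while the text actually sent is the html produced by pandoc, so the real figure will differ somewhat.

```python
import glob
from pathlib import Path


def count_characters(pattern):
    """Return {path: character count} for every file matching a glob pattern."""
    counts = {}
    for name in sorted(glob.glob(pattern, recursive=True)):
        counts[name] = len(Path(name).read_text(encoding="utf-8"))
    return counts


def report(pattern):
    """Print a per-file and total character count for the matching files."""
    counts = count_characters(pattern)
    for name, size in counts.items():
        print(f"{size:>8} {name}")
    print(f"{sum(counts.values()):>8} total")
```

Running `report("docs/overview/*.md")` before `mkdocs_translate french docs/overview/*.md` gives an approximate idea of how many characters would be submitted.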
You are welcome to use Google Translate, ChatGPT, or Deepl directly, keeping in mind that markdown formatting may be lost.
Please see the writing guide for what mkdocs functionality is supported.
To build and test locally:
-
Clone:
git clone https://github.com/jodygarnett/translate.git translate
-
Install requirements:
cd translate
pip3 install -r mkdocs_translate/requirements.txt
-
Install locally:
pip3 install -e .
Distribution:
-
Update version number in mkdocs_translate/__init__.py:
__version__ = '0.4.2'
-
Build wheel:
python3 -m build
-
Upload wheel:
python3 -m twine upload --repository pypi dist/*
Debugging:
-
Recommend troubleshooting a single file at a time:
mkdocs_translate rst docs/index.rst
-
Compare the temporary files staged for pandoc conversion:
bbedit docs/index.rst docs/index.md target/convert/index.tmp.html target/convert/index/tmp.md
-
To turn on logging during conversion:
mkdocs_translate --log=DEBUG translate.yml rst
Pandoc:
-
The pandoc plugin settings are in two constants:
md_extensions_to = 'markdown+definition_lists+fenced_divs+backtick_code_blocks+fenced_code_attributes-simple_tables+pipe_tables'
md_extensions_from = 'markdown+definition_lists+fenced_divs+backtick_code_blocks+fenced_code_attributes+pipe_tables'
-
The pandoc extensions are chosen to align with mkdocs use of markdown extensions, or with post-processing:
| markdown extension | pandoc extension | post processing |
|---|---|---|
| tables | pipe_tables | |
| pymdownx.keys | | post processing |
| pymdownx.superfences | backtick_code_blocks | post processing |
| admonition | fenced_divs | post processing |
-
To troubleshoot just the markdown to html conversion:
mkdocs_translate internal_html manual/docs/contributing/style-guide.md
mkdocs_translate internal_markdown target/contributing/style-guide.html
diff manual/docs/contributing/style-guide.md target/contributing/style-guide.md
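To see what the two extension constants mean in practice, the round trip can be written out as plain pandoc command lines. This is a sketch under assumptions: `pandoc_command` is a hypothetical helper, the file names are placeholders, and mkdocs_translate may invoke pandoc with additional options. It assumes `md_extensions_from` is used when reading markdown and `md_extensions_to` when writing it, and uses only pandoc's standard `--from`, `--to` and `--output` options.

```python
import shlex

# Constants copied from the pandoc settings listed above.
md_extensions_to = 'markdown+definition_lists+fenced_divs+backtick_code_blocks+fenced_code_attributes-simple_tables+pipe_tables'
md_extensions_from = 'markdown+definition_lists+fenced_divs+backtick_code_blocks+fenced_code_attributes+pipe_tables'


def pandoc_command(source, target, from_format, to_format):
    """Build (but do not run) a pandoc command line as a list of arguments."""
    return ["pandoc", "--from", from_format, "--to", to_format, "--output", target, source]


# Placeholder file names, for illustration only.
to_html = pandoc_command("style-guide.md", "style-guide.html", md_extensions_from, "html")
to_markdown = pandoc_command("style-guide.html", "style-guide.md", "html", md_extensions_to)

# Print the commands so they can be inspected or run by hand.
print(shlex.join(to_html))
print(shlex.join(to_markdown))
```

Running the printed commands by hand on a single page is one way to tell whether a formatting difference comes from pandoc itself or from post-processing.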
For geoserver or core-geonetwork (or other projects following maven conventions) no configuration is required.
To override configuration on the command line add --config <file.yml> before the command:
mkdocs_translate --config translate.yml rst
The file mkdocs_translate/config.yml contains some settings (defaults are shown below):
-
deepl_base_url: "https://api-free.deepl.com"
Customize if you are a paying customer.
-
project_folder: "."
Default assumes you are running from the current directory.
-
rst_folder: "source"
-
docs_folder: "docs"
mkdocs convention.
-
build_folder: "target"
The use of "target" follows maven convention; python projects may wish to use "build".
-
anchor_file: 'anchors.txt'
-
upload_folder: "translate"
Combined with build_folder to stage html files for translation (example: build/translate).
-
convert_folder: "convert"
Combined with build_folder for rst conversion temporary files (example: build/convert). Temporary files are required for use by pandoc.
-
download_folder: "translate"
Combined with build_folder to retrieve translation results (example: build/translate). Temporary files are required for use by pandoc.
-
substitutions: dictionary of |substitutions| to use when converting conf.py rst_epilog common substitutions.
project: GeoServer
author: Open Source Geospatial Foundation
copyright: 2023, Open Source Geospatial Foundation
project_copyright: 2023, Open Source Geospatial Foundation
-
The built-in substitutions for |version| and |release| are changed to {{ version }} and {{ release }} variables for use with mkdocs-macros-plugin variable substitution.
Use mkdocs.yml to define:
extra:
  homepage: https://geoserver.org/
  version: '2.24'
  release: '2.24.2'
-
extlinks: dictionary of conf.py extlinks substitutions.
To convert sphinx-build conf.py:
extlinks = {
    'wiki': ('https://github.com/geoserver/geoserver/wiki/%s', None),
    'user': ('https://docs.geoserver.org/'+branch+'/en/user/%s', None),
    'geos': ('https://osgeo-org.atlassian.net/browse/GEOS-%s','GEOS-%s')
}
Use config.yml (note the use of mkdocs-macros-plugin for variable substitution):
extlinks:
  wiki: https://github.com/geoserver/geoserver/wiki/%s
  user: https://docs.geoserver.org/{{ branch }}/en/user/%s
  geos: https://osgeo-org.atlassian.net/browse/GEOS-%s|GEOS-%s
  download_release: https://sourceforge.net/projects/geoserver/files/GeoServer/{{ release }}/geoserver-{{ release }}-%s.zip|geoserver-{{ release }}-%s.zip
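The extlinks conversion follows a regular pattern: each sphinx tuple of url and caption becomes a single `url|caption` string, with the caption omitted when it is None. The sketch below illustrates that pattern only. `extlinks_to_yaml` is a hypothetical helper, not part of mkdocs_translate, and the two-space indentation of the output is an assumption.

```python
def extlinks_to_yaml(extlinks):
    """Render sphinx extlinks tuples as 'url|caption' entries for config.yml."""
    lines = ["extlinks:"]
    for name, (url, caption) in extlinks.items():
        # A caption of None means the url is used on its own.
        value = url if caption is None else f"{url}|{caption}"
        lines.append(f"  {name}: {value}")
    return "\n".join(lines)


# Python variables used in conf.py are replaced by mkdocs-macros-plugin placeholders.
branch = "{{ branch }}"

extlinks = {
    'wiki': ('https://github.com/geoserver/geoserver/wiki/%s', None),
    'user': ('https://docs.geoserver.org/' + branch + '/en/user/%s', None),
    'geos': ('https://osgeo-org.atlassian.net/browse/GEOS-%s', 'GEOS-%s'),
}

print(extlinks_to_yaml(extlinks))
```

The printed text can be pasted into config.yml and adjusted by hand, for example to add entries that rely on other variables such as release.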