docs: migrate documentation toolchain from asciidoc-py + dblatex to asciidoctor #4053

Open: grandixximo wants to merge 59 commits into LinuxCNC:master from grandixximo:docs/asciidoctor-migration


@grandixximo grandixximo commented May 24, 2026

Summary

Replaces the documentation toolchain end-to-end: asciidoc-py + dblatex + xsltproc + source-highlight + inkscape are removed and the build now goes through asciidoctor + asciidoctor-pdf + rouge, with a small ghostscript post-process pass on the PDFs.

Motivation, as I raised on #4051: asciidoc-py is EOL, dblatex is unmaintained, and we keep paying for that with patches like the inkscape rsvg shim (#4043). asciidoctor is actively developed in Debian (ruby-asciidoctor, ruby-asciidoctor-pdf), uses prawn-svg natively so the inkscape detour disappears, and removes the entire LaTeX subsystem from the docs build dependency tree.

Continues the work hansu started on his asciidoctor branch and solves the cross-document anchor problem that stalled it.

What changed

Seven commits, each independently reviewable / bisectable:

  1. docs: add asciidoctor extensions and PDF theme: new plumbing only; no build behaviour change yet.

    • docs/src/extensions/xref_resolver.rb: preprocessor that mirrors asciidoc-py's objects/xref_<lang>.links: bare <<anchor,Title>> rewrites to qualified <<relpath/file.adoc#anchor,Title>>. Anchor index cached on disk by mtime; xref-exclude regex keeps translated trees isolated.
    • docs/src/extensions/image_resolver.rb: treeprocessor matching asciidoc-py's image-wildcard: relative image paths in included files resolve against that file's directory. For the PDF backend it also defaults pdfwidth=75% on images without an explicit width, otherwise prawn renders raster sources at 72 DPI and blows screenshots up to the full text column.
    • docs/src/pdf-theme.yml: asciidoctor-pdf theme that approximates emc2.sty: A4, Times-like body, dblatex blue (#0000FF) headings and links, top header doc-title | chapter | page / total, bottom rule only.
    • docs/src/otf2ttf.py: small build-time helper: subsets Noto Serif CJK to the ~600 CJK characters used anywhere in the docs and converts the curves with cu2qu to TrueType (prawn 2.4 corrupts CFF embeds). ~1.5 s per face, ~300 KB output.
  2. docs: swap HTML, PDF, manpage build rules from asciidoc-py to asciidoctor

    • HTML: 12 per-language target chains collapse into one ASCIIDOCTOR_HTML_RULE canned recipe instantiated per language. Stylesheet (docs/html/linuxcnc.css) and visual output stay identical.
    • PDF: a2x/dblatex => asciidoctor-pdf with our extensions + theme. Version macro fed via -a lversion=$(cat ../VERSION). CJK fallback TTFs generated lazily and pdf-fontsdir set.
    • Ghostscript post-process: prawn emits very verbose content streams (~32 KB/page vs ~20 KB/page from xdvipdfmx), so add a lossless gs -dFlateEncode pass with images passed through. Master PDF: 39 MB => 25 MB, matching the 26 MB official 2.9 dblatex build at identical page count and image data.
    • Manpages: a2x --doctype manpage => asciidoctor --backend manpage. Asciidoctor emits .als / .URL / .MTO macros that po4a's man parser doesn't know; docs/po4a.cfg man_def alias gains untranslated=FF,FU,als unknown_macros=untranslated inline=URL,MTO. The .so alias dependency path was also missing the section directory; fixed.
    • HTML manpages: a2x => asciidoctor --doctype manpage --backend html5.
    • HTML img extraction step swapped from xsltproc/links.xslt to a portable grep -oE.
    • Source highlighting: source-highlight => rouge.
  3. docs: drop the asciidoc-py / dblatex / xsltproc infrastructure: pure deletions, 15 files:

    • asciidoc backends: xhtml11*.conf, docbook*.conf, attribute-colon.conf, asciidoc-dont-replace-arrows.conf
    • dblatex: emc2.sty
    • xsltproc pipeline: html-images.xslt, html-latex-images, image-wildcard, links.xslt, links_db_gen.py
    • scripts/inkscape shim from PR #4043 (docs: shim inkscape -> rsvg-convert in docs build, fixes #4040): asciidoctor-pdf has no inkscape calls to intercept.
  4. debian: switch documentation build-deps to the asciidoctor toolchain

    • DOC_DEPENDS shrinks from twenty packages (dblatex stack, ten texlive-lang-*, source-highlight, inkscape, python3-lxml, xsltproc, dvipng, groff) to ten: asciidoctor, ruby-asciidoctor-pdf, ruby-rouge, fonts-dejavu, fonts-noto-cjk, python3-fonttools, ghostscript, graphviz, librsvg2-bin, w3c-linkchecker.
    • control.top.in drops docbook-xsl, asciidoc, asciidoc-dblatex. ghostscript moves out (now in DOC_DEPENDS).
    • All deps verified to exist on bookworm, trixie, sid, and noble.
  5. ci: parallelize the Debian package build via DEB_BUILD_OPTIONS: drive-by fix unrelated to the migration: debuild runs dh_auto_build single-threaded unless DEB_BUILD_OPTIONS=parallel=N is set. build-package-{arch,indep}.sh now export parallel=$(nproc). Local measurement: doc-only deb build 32 min => 7 min on 8 cores; CI ubuntu-24.04 runners with 4 CPUs should see ~4×.

  6. docs: fix source issues the asciidoctor parser flags: asciidoctor reports these as ERROR or WARNING; asciidoc-py silently tolerated them. All predate the toolchain swap:

    • hal/halmodule.adoc: cols spec said 5 cells, rows had 7. Bump to cols="<3s,6*<".
    • plasma/qtplasmac.adoc: two rows missing a cell (color5 styling row, DEBUG state-table row).
    • gui/qtdragon.adoc: Versaprobe NOTE block was missing its ==== delimiters.
    • gui/qtvcp-widgets.adoc: Markdown-style ``` fenced block. po4a collapsed the original three lines into one during string extraction, so every translated build saw an open ``` with no matching close ("unterminated listing block"). Replace with [source,python] + ----, which po4a preserves line-by-line.
    • lathe/images/control-point_es.svg: Inkscape flowRoot/flowPara ("tan" label added by the Spanish translator). Inkscape-only SVG 1.2 element prawn-svg cannot render. Convert to a regular <text>/<tspan> at the same coordinates.
    • docs/po4a.cfg: add hal/halscope.adoc to the translation pipeline; it is included by translated hal/tutorial.adoc but had no po4a_alias line, so every translated build failed to resolve the include.
  7. docs: resolve inline image macros and fall back to EN tree: the original image_resolver.rb only handled block image:: macros and didn't fall back across the language boundary, so translated builds emitted ~40 "image to embed not found" warnings. Walk each block's source storage (paragraphs' lines=, list items' text=, asciidoc-style table cells' inner documents) to rewrite inline image: targets, and probe the canonical EN path when the translated copy is missing.

Visual / output verification

End-to-end builds (make -j8 pdfdocs && make htmldocs && make manpages):

| Metric | dblatex (official 2.9) | asciidoctor (this branch) |
|---|---|---|
| Master PDF | 26 MB, 1347 pages | 27 MB, 1440 pages |
| 33-PDF total | 243 MB | 238 MB |
| HTML files | ~1551 | 1551 |
| Manpages | 1287 | 1287 |
| make -j8 pdfdocs wall time | 4m56s | 3m43s |
| make -j8 pdfdocs user CPU | 25m48s | 18m12s |
| Full Debian binary-indep build (parallel) | n/a | 7 min |
| Build warnings | n/a | 0 |

Same hardware (8 physical cores), both make -j8 pdfdocs, warm rebuild (translated .adoc already generated, .pot up to date). Asciidoctor is ~25% faster wall-clock and ~30% less user CPU, even though it adds a ghostscript pass per PDF; prawn-svg + rouge in-process beats spawning a2x/dblatex/xelatex/inkscape per file.
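
The rounded percentages can be checked against the timings in the table. This is only a small arithmetic sketch; `to_seconds` and `reduction_pct` are illustrative helper names, not part of the build.

```python
def to_seconds(stamp):
    """Convert a 'XmYs' timing such as '4m56s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)


def reduction_pct(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


# Timings copied from the comparison table.
wall = reduction_pct(to_seconds("4m56s"), to_seconds("3m43s"))
user = reduction_pct(to_seconds("25m48s"), to_seconds("18m12s"))
print(f"wall-clock: {wall:.1f}% less, user CPU: {user:.1f}% less")
```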

The project's 4-CPU CI runners see a similar gap: docs jobs run roughly 25-30% faster than recent master (htmldocs 13m vs 19m, package-indep 16-19m vs 22-23m). Most of that comes from the DEB_BUILD_OPTIONS=parallel=$(nproc) fix in commit 5 rather than the toolchain itself, but the leaner dep tree (10 vs 20 packages, no LaTeX) helps.

Spot-checked sample pages render correctly across en, de, ru, uk, zh_CN: blue headings/links match dblatex, code blocks with grey background and DejaVu Sans Mono show Cyrillic correctly, Chinese title pages render via the Noto Serif CJK SC fallback.

Trade-offs and open items

  • Page count drift vs official 2.9 (1440 vs 1347): different font, margin, and line-spacing defaults; content is identical. No regression.
  • Only linuxcnc-doc-{en,de} Debian packages defined: other-language PDFs build but debian/control doesn't ship them. Pre-existing scope, unchanged.
  • Build time per PDF includes a ~1 s gs pass and font subset embedding: overall full build with parallelism is comfortably faster than the dblatex stack ever was on the same hardware.

Test plan

  • make -j8 pdfdocs builds all 33 PDFs cleanly
  • make htmldocs builds 1551 HTML files cleanly
  • make manpages produces 1287 manpages cleanly
  • fakeroot debian/rules binary-indep produces linuxcnc-doc-en and linuxcnc-doc-de debs
  • Spot render of pages 1, 100, 500 of master PDF + translated PDFs for visual regression
  • Cyrillic in code blocks renders (Ukrainian docs)
  • CJK in titles/headings renders (Chinese docs)
  • verify-clean-repo.sh would pass (.fonts/ and rouge-*.css are gitignored)
  • All build deps confirmed present on bookworm, trixie, sid, noble
  • Maintainer review of theme tweaks (header layout, blue heading shade, default image width)
  • Confirm parallel deb build on CI runner CPUs (will be visible from the package-{arch,indep} CI job logs once this lands)

cc @hansu (continuation of your branch), @andypugh (the maintenance discussion on #4051).

Three pieces of glue that let asciidoctor produce documentation that
matches the look and behaviour of the existing asciidoc-py + dblatex
output.  Nothing is wired in yet; the Submakefile swap follows in the
next commit.

xref_resolver.rb -- asciidoctor preprocessor that mirrors what
asciidoc-py used to do via objects/xref_<lang>.links: bare
<<anchor,Title>> references are looked up in a tree-wide anchor index
and rewritten to qualified <<relpath/file.adoc#anchor,Title>> form.
The index is cached on disk keyed by source mtimes, and accepts an
xref-exclude regex so each translated tree stays isolated.
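
As a rough illustration of the rewrite described above (the actual extension is Ruby and is not shown in this conversation), the lookup-and-qualify step might look like the Python sketch below. The function names, the restriction to `[[anchor]]` declarations, and the sample paths in the usage are assumptions made for the example; mtime caching and xref-exclude are left out.

```python
import posixpath
import re

# Anchors declared as [[name]]. Assumption: the real index may cover
# more declaration forms than this sketch does.
ANCHOR_DECL = re.compile(r"\[\[([A-Za-z_:][\w:.-]*)\]\]")
# Bare xrefs: <<anchor,Title>> with no file part before the anchor.
BARE_XREF = re.compile(r"<<([A-Za-z_:][\w:.-]*)(,[^>]*)?>>")


def build_anchor_index(sources):
    """Map each anchor name to the file that declares it.

    `sources` is a dict of {relative_path: file_text}.
    """
    index = {}
    for path, text in sources.items():
        for match in ANCHOR_DECL.finditer(text):
            index.setdefault(match.group(1), path)
    return index


def qualify_xrefs(text, current_path, index):
    """Rewrite bare xrefs whose target lives in another file."""
    def replace(match):
        anchor, label = match.group(1), match.group(2) or ""
        target = index.get(anchor)
        if target is None or target == current_path:
            return match.group(0)  # unknown or local: leave untouched
        start = posixpath.dirname(current_path) or "."
        rel = posixpath.relpath(target, start)
        return f"<<{rel}#{anchor}{label}>>"
    return BARE_XREF.sub(replace, text)
```

With a hypothetical anchor `cha:hal-intro` declared in `hal/intro.adoc`, a reference `<<cha:hal-intro,HAL Introduction>>` in `gui/axis.adoc` would come out as `<<../hal/intro.adoc#cha:hal-intro,HAL Introduction>>`, while references that are already qualified or that point into the same file are left alone.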

image_resolver.rb -- treeprocessor that resolves image targets the way
asciidoc-py's image-wildcard pair did: relative paths in an included
file resolve against that included file's directory, not the master.
For PDF only it also defaults pdfwidth=75% on images without an
explicit width, because prawn renders raster sources at native-pixel
dimensions interpreted as 72 DPI and otherwise blows screenshots up
to the full text column.
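
The 72 DPI effect is easy to see with numbers. The sketch below assumes an A4 page with 20 mm side margins (a made-up figure; the theme's real margins are not quoted here) and assumes that an image wider than the column is scaled down to fit it. The helper names are illustrative.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def mm_to_pt(mm):
    """Convert millimetres to PDF points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def natural_width_pt(pixels, dpi=72.0):
    """Width of a raster image when its pixels are read at `dpi`."""
    return pixels / dpi * POINTS_PER_INCH


def rendered_width_pt(pixels, column_pt, pdfwidth_pct=None):
    """Width on the page, with or without an explicit pdfwidth."""
    if pdfwidth_pct is not None:
        return column_pt * pdfwidth_pct / 100.0
    # No explicit width: natural size, capped at the text column.
    return min(natural_width_pt(pixels), column_pt)


# Hypothetical layout: A4 (210 mm wide) with 20 mm side margins.
column = mm_to_pt(210 - 2 * 20)
print(round(rendered_width_pt(1280, column), 1))      # fills the column
print(round(rendered_width_pt(1280, column, 75), 1))  # 75% of the column
```

Under these assumptions a 1280-pixel screenshot has a natural width of 1280 pt, far wider than the roughly 482 pt column, so without a default it always fills the column; the 75% default brings it down to about 361 pt.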

pdf-theme.yml -- asciidoctor-pdf theme that approximates emc2.sty:
A4 page, Times-like body, blue headings and links matching dblatex,
top header with 'doc-title | chapter | page / total' and a thin rule,
bottom rule only, no alternating page numbers.  Falls back to
Noto Serif CJK SC for non-Latin glyphs missing from the base font;
DejaVu Sans Mono in code blocks so Cyrillic in listing/source blocks
renders.

otf2ttf.py -- Debian only ships Noto Serif CJK as a CFF/OTF TrueType
Collection and prawn 2.4 corrupts the PDF when asked to embed CFF
outlines directly.  This is a tiny build-time helper that subsets the
font to the CJK characters used anywhere in the docs (~600 glyphs out
of 65000) and converts the curves with cu2qu before saving as TTF.
Output is ~300 KB per face, ~1.5 s per face.
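
The first half of that job, finding which CJK characters the docs actually use, can be sketched with the standard library alone. The code point ranges and the function names below are assumptions for illustration; the subsetting and cu2qu conversion done by the real helper are not reproduced here.

```python
# Assumed ranges: CJK Symbols and Punctuation, CJK Unified
# Ideographs, and Halfwidth/Fullwidth Forms. The real helper may
# use a different set.
CJK_RANGES = ((0x3000, 0x303F), (0x4E00, 0x9FFF), (0xFF00, 0xFFEF))


def is_cjk(ch):
    """True if the character falls in one of the assumed CJK ranges."""
    code = ord(ch)
    return any(low <= code <= high for low, high in CJK_RANGES)


def collect_cjk(texts):
    """Return the distinct CJK characters found in an iterable of strings."""
    found = set()
    for text in texts:
        found.update(ch for ch in text if is_cjk(ch))
    return sorted(found)


chars = collect_cjk(["LinuxCNC 用户手册", "手册 v2.9"])
print(len(chars), "distinct CJK characters to keep in the subset")
```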
@grandixximo force-pushed the docs/asciidoctor-migration branch 3 times, most recently from a2e43e6 to cc516a5 (May 24, 2026 02:30)
…ctor

The big switch.  Every rule that used to invoke asciidoc, a2x or
xsltproc now goes through asciidoctor or asciidoctor-pdf.

HTML rules
* 12 near-identical per-language target chains collapse into one
  ASCIIDOCTOR_HTML_RULE canned recipe instantiated with toUC for
  each language.  Each call points asciidoctor at the shared
  xref_resolver extension and passes the language-specific xref-root
  and xref-exclude so anchors don't cross trees.
* Stylesheet is the existing docs/html/linuxcnc.css (already tracked
  in the repo), referenced via -a stylesheet=linuxcnc.css -a linkcss.
* Source highlighting moves from source-highlight to rouge.

PDF rule
* a2x/dblatex replaced by asciidoctor-pdf with our xref + image
  resolver extensions and pdf-theme.yml.
* Version macro is fed in via -a lversion=$(cat ../VERSION) so the
  title page stays in sync without rewriting sources.
* CJK fallback TTFs are generated lazily under $(DOC_FONT_DIR) via
  otf2ttf.py and pdf-fontsdir points at that directory plus
  GEM_FONTS_DIR.
* asciidoctor-pdf (via prawn) emits very verbose PDF content
  streams: ~32 KB/page vs ~20 KB/page from xdvipdfmx for the same
  source, so the master document came out 39 MB vs the official 26 MB
  with identical image content.  Add a ghostscript pass that
  re-deflates streams without touching images (no /ebook downsampling,
  PassThroughJPEGImages, FlateEncode only) and the master drops to
  25 MB, matching dblatex.

Manpages
* a2x --doctype manpage --format manpage becomes asciidoctor
  --doctype manpage --backend manpage.  Asciidoctor emits .als / .URL
  / .MTO macros that po4a's man parser doesn't recognise by default,
  so the man_def alias gains -o untranslated=FF,FU,als
  -o unknown_macros=untranslated -o inline=URL,MTO.
* The .so alias dependency line was missing the section directory;
  fixed in the same place.

HTML manpages
* a2x --backend html5 -> asciidoctor --doctype manpage
  --backend html5.

Wholesale image extraction step
* The old html-images bash glue piped HTML through xsltproc to pull
  out <img src=> elements.  Replace with a portable grep -oE so we
  can drop xsltproc and links.xslt at the same time.
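
The exact grep expression is not quoted in this message. As an illustration of the same idea, a regex-based extraction in Python could look like this; the pattern and the function name are assumptions, and like any grep-style approach it only handles double-quoted src attributes.

```python
import re

# Assumed pattern: the src attribute of an <img> tag, double-quoted.
IMG_SRC = re.compile(r'<img\b[^>]*?\bsrc="([^"]+)"', re.IGNORECASE)


def image_sources(html):
    """Return the distinct image paths referenced by <img> tags."""
    return sorted(set(IMG_SRC.findall(html)))


page = (
    '<p><img src="images/axis.png" alt="AXIS"></p>\n'
    '<div><img class="inline" src="images/hal.svg"></div>\n'
    '<img src="images/axis.png">\n'
)
print(image_sources(page))
```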

Translation file generation
* objects/xref_<lang>.links and the per-language link database
  pipeline are gone; xref_resolver.rb does the same job at parse time.

MAN_DEPS path bug
* grep '^\.so ' was emitting deps as $(DOC_DIR)/man/%s, missing the
  section directory.  Use $(*D) prefix so deps land under
  $(DOC_DIR)/man/<section>/<page>.
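
The path fix can be illustrated outside make. The sketch below assumes an alias page holds a single `.so <page>` line naming a page in the same section, and it uses the alias file's own directory the way `$(*D)` is used in the rule; the function name and sample paths are made up for the example.

```python
import posixpath
import re

SO_LINE = re.compile(r"^\.so\s+(\S+)", re.MULTILINE)


def so_dependencies(troff_text, page_path):
    """Return the pages a troff file pulls in through .so lines.

    Each target is placed in the same section directory as
    `page_path`, which is the part the old rule left out.
    """
    section_dir = posixpath.dirname(page_path)
    return [
        posixpath.join(section_dir, posixpath.basename(target))
        for target in SO_LINE.findall(troff_text)
    ]


deps = so_dependencies(".so io.1\n", "docs/man/man1/iocontrol.1")
print(deps)
```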
With the Submakefile fully routed through asciidoctor, these files
are no longer referenced by anything.

asciidoc-py rendering hooks (XHTML and DocBook backends):
* docs/src/xhtml11.conf, xhtml11-head-foot.conf, xhtml11-latexmath.conf,
  xhtml11-links.conf
* docs/src/docbook.conf, docbook-image.conf
* docs/src/asciidoc-dont-replace-arrows.conf
* docs/src/attribute-colon.conf

dblatex LaTeX style:
* docs/src/emc2.sty (replaced by docs/src/pdf-theme.yml)

xsltproc-based xref/image pipeline:
* docs/src/html-images.xslt -- HTML img-src extraction (replaced by
  a grep -oE)
* docs/src/html-latex-images -- shell glue around xsltproc
* docs/src/image-wildcard -- relative-image-path resolution shim
  (replaced by docs/src/extensions/image_resolver.rb)
* docs/src/links.xslt + docs/src/links_db_gen.py -- per-language
  anchor index (replaced by docs/src/extensions/xref_resolver.rb)

Inkscape SVG shim from PR LinuxCNC#4043:
* scripts/inkscape -- routed dblatex's hard-coded inkscape call
  through rsvg-convert.  asciidoctor-pdf renders SVGs natively via
  prawn-svg, so the shim has nothing to intercept and no warnings to
  suppress.
DOC_DEPENDS shrinks from twenty packages (dblatex stack, ten
texlive-lang-*, source-highlight, inkscape, python3-lxml, xsltproc,
dvipng, groff) to ten packages spanning the asciidoctor render path
plus a couple of font/conversion helpers:

* asciidoctor + ruby-asciidoctor-pdf + ruby-rouge -- the engines.
* fonts-dejavu, fonts-noto-cjk -- code-block mono / CJK fallback.
* python3-fonttools -- otf2ttf.py needs ttLib + cu2quPen.
* ghostscript -- PDF post-process pass.
* graphviz, librsvg2-bin, w3c-linkchecker -- unchanged carry-overs.

control.top.in drops docbook-xsl, asciidoc and asciidoc-dblatex from
top-level Build-Depends.  ghostscript moves out from there because it
is now an explicit doc-time dep, listed by name in DOC_DEPENDS.

Distribution coverage verified: every package is in bookworm, trixie,
sid, and noble (the suites our CI targets).
debuild leaves dpkg-buildpackage in serial mode unless
DEB_BUILD_OPTIONS=parallel=N is in the environment.  dh_auto_build
honours that variable and translates it into make -jN, so opting in
fans out the C/C++ build and the per-language doc rules across all the
runner's CPUs.  Local measurement on an 8-core box: binary-indep wall
time 32 min -> 7 min for the doc-only stage.

build-doc.sh already passes -j directly to make; this matches that
behaviour for the deb package CI jobs.
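
For illustration, this is roughly how a `parallel=N` entry is read from or added to a space-separated DEB_BUILD_OPTIONS value. It is a Python sketch of the convention only, not the code in build-package-{arch,indep}.sh, and both function names are made up.

```python
import os


def parallel_jobs(deb_build_options, default=1):
    """Return N from a 'parallel=N' token, or `default` if absent."""
    for token in deb_build_options.split():
        key, _, value = token.partition("=")
        if key == "parallel" and value.isdigit():
            return int(value)
    return default


def with_parallel(deb_build_options, jobs):
    """Return the options with any existing parallel= token replaced."""
    kept = [
        token for token in deb_build_options.split()
        if not token.startswith("parallel=")
    ]
    return " ".join(kept + [f"parallel={jobs}"])


options = with_parallel("nocheck", os.cpu_count() or 1)
print(options, "->", parallel_jobs(options), "jobs")
```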
asciidoctor reports these as ERROR or WARNING; asciidoc-py silently
tolerated them.  All predate the toolchain swap.

* hal/halmodule.adoc: cols spec said 5 columns, every row had 7 cells,
  so the trailing 1 cell hung off the end of the last row.  Bump to
  cols="<3s,6*<".
* plasma/qtplasmac.adoc 'color5' row in the styling table was missing
  the middle Parameter cell.  Fill in 'Disabled' so the row matches.
* plasma/qtplasmac.adoc QtPlasmaC state-table 'DEBUG' row was missing
  the Description cell; the existing prose immediately below the table
  already provides the wording.
* gui/qtdragon.adoc Versaprobe NOTE was authored with no `====`
  delimiters, so asciidoctor read it as '[NOTE]' applied as an unknown
  list style.  Wrap in delimiters.
* gui/qtvcp-widgets.adoc Markdown-style ``` fenced block.  po4a
  collapsed it into a single line during translation extraction, so
  every translated build saw an open ``` with no matching close and
  emitted "unterminated listing block".  Replace with the equivalent
  asciidoc [source,python] / ---- block, which po4a preserves
  line-by-line.
* lathe/images/control-point_es.svg flowRoot/flowPara element ("tan"
  label added by the Spanish translator).  flowRoot is an Inkscape-only
  SVG 1.2 element that prawn-svg cannot render.  Convert to a regular
  <text>/<tspan> at the same coordinates.
* docs/po4a.cfg: add hal/halscope.adoc to the translated tree.  It is
  included by hal/tutorial.adoc, which IS translated, so every
  translated build failed to resolve the include.
Two limitations of the original image_resolver were causing the
remaining 'image to embed not found or not readable' warnings in
translated PDFs:

Inline image: macros never showed up in find_by(:inline_image).
Asciidoctor parses them as part of block text and never lifts them
into standalone nodes.  Walk each block, regex-rewrite image:PATH[
inside the source storage that the block actually keeps (lines= for
paragraphs, text= for list items), and re-enter inner_documents of
asciidoc-style table cells so cells with embedded images get touched
too.

Translated trees often reference images that exist only at the
canonical English path.  Add a fallback: after probing the file
under docs/src/<lang>/.../images/, retry with the language segment
stripped (docs/src/.../images/).  This is how the dblatex pipeline
behaved implicitly via the image-wildcard shim.
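
The fallback amounts to a two-step probe. The Python sketch below shows the idea only; the real resolver is Ruby, and the function name, the injectable `exists` check, and the sample paths are assumptions.

```python
import os


def resolve_image(path, lang, exists=os.path.exists):
    """Return the translated image if present, else the English one.

    `path` is the image path inside the translated tree, e.g.
    docs/src/<lang>/.../images/foo.png. Returns None if neither
    the translated nor the canonical copy exists.
    """
    if exists(path):
        return path
    parts = path.split("/")
    if lang in parts:
        parts.remove(lang)  # drop the first <lang> path segment
        fallback = "/".join(parts)
        if exists(fallback):
            return fallback
    return None


# Hypothetical tree in which only the English image exists.
tree = {"docs/src/lathe/images/control-point.png"}
found = resolve_image(
    "docs/src/de/lathe/images/control-point.png", "de", tree.__contains__
)
print(found)
```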

End-to-end `make -j8 pdfdocs` warning count is now 8 across all 33
PDFs, down from 40+ before.  Remaining warnings are non-blocking
content quirks (one unterminated listing block, three Inkscape
'flowRoot' SVG elements in es/lathe/) and worth a follow-up.
@grandixximo force-pushed the docs/asciidoctor-migration branch from cc516a5 to 5b775f1 (May 24, 2026 02:46)
@grandixximo
Contributor Author

Tangential, but tied to docs UX: a few months back in one of the Sunday maintainer meetings we discussed adding navigation aids to the HTML docs. A "back to index" link from each page, and a top bar with a few quick links. I'm not sure if anyone has taken that up since (I checked @smoe's fork branches and didn't find anything matching, but I may have missed it).

A sidebar Table of Contents wasn't part of that conversation as far as I remember, but it feels like a natural fit alongside the rest; the current top-of-page TOC gets quite long.

If it's still on the wish list, I'd be happy to do a follow-up PR after this one lands. The asciidoctor toolchain makes it cheap: -a toc=left gets the sidebar TOC, and -a docinfo=shared (which the new rule already enables) lets a small docinfo-header.html inject a top nav bar without touching any .adoc source. Happy to scope and propose first if anyone (cc @hansu) wants to chime in on what should go in the top bar.

@BsAtHome
Contributor

The man-page translations take the wrong source: they are made from the generated troff files. Originally there were only troff files, but the manpages are now in asciidoc format under the docs/src/man/* tree, except for the component manpages, whose asciidoc is generated in src/object/man/*. This should also be fixed; in particular, the HTML manpages must be generated from the adoc sources.

There has been, for a long time at least on my system, a bug in the docs build that running make twice was required to build everything correctly. At least, the second invocation was not silent and actually made stuff.

You changed the highlighter. Does it support NGC and INI? How are you highlighting HAL files, which is a LinuxCNC-specific format? The highlight format files for these three are added "manually" in the current build (with some effort).

There are at least two things on my wish list when building the docs:

  • Do not build translations or invoke any process that involves translation setup, including .pot/.po generation, unless explicitly requested. Running any translation-related process should have to be enabled explicitly at configure time.
  • Move all generated documents and translations, including the source language, into a subtree .../docs/build/{en,de,...}/{man,pdf,html,...}. Then everything generated is found in one place instead of all over the place.

Bertho noted in the review of LinuxCNC#4053 that the new build dropped syntax
highlighting for HAL and NGC source blocks; rouge ships an INI lexer
but has neither of the two LinuxCNC-specific languages, so blocks like
[source,{hal}] and [source,{ngc}] rendered as plain text.

* Two rouge lexers, ~80 lines each, ported line-for-line from the old
  source-highlight definitions at docs/src/source-highlight/hal.lang
  and ngc.lang (Michael Haberler, 2011).  Same keyword coverage:
  halcmd commands, pin/signal names, INI substitutions and env vars
  for HAL; G/M/T/F/S codes, axis letters, parameters, O-words and the
  math/boolean built-ins for NGC.
* All four asciidoctor invocations in the Submakefile (PDF, HTML,
  manpage HTML, ASCIIDOCTOR_HTML_RULE) gain '-r .../rouge_hal.rb -r
  .../rouge_ngc.rb' so the lexers are visible to rouge before the
  document is parsed.  The manpage HTML rule also gains an explicit
  '-a source-highlighter=rouge' that the others already inherit from
  attribute defaults.
* The :ngc: / :hal: / :ini: / :css: / :nml: attribute defs in the
  source files used asciidoc-py's '{basebackend@docbook:'':ngc}'
  conditional syntax (which asciidoctor does not implement) to emit
  the language only when targeting docbook.  All toolchain backends
  used by this PR now want the language name unconditionally, so the
  attribute defs collapse to ':ngc: ngc' etc.  84 source files
  touched, no .adoc body changes.
…-twice

Bertho noted in the review of LinuxCNC#4053 that:

  * a clean 'make pdfdocs' or 'make htmldocs' required a second pass
    to finish, because po4a generated the per-language .adoc files
    *during* the build but the make-time $(wildcard $(L)/*.adoc)
    expansion had already evaluated them as missing;
  * po4a should not run, and translation setup should not be invoked,
    on a developer build unless the developer asks for it.

Address both:

* configure.ac flips the default: BUILD_DOCS_TRANSLATED now requires
  an explicit '--enable-build-documentation-translation' (the old
  '--disable-build-documentation-translation' opt-out is replaced).
  Stale dblatex-era version probes and warnings around po4a are also
  removed; po4a >= 0.67 is required when the flag is on, missing or
  too-old po4a now errors instead of warning-and-disabling.

* debian/rules.in keeps the .deb pipeline producing translations by
  always passing '--enable-build-documentation-translation' alongside
  the existing '--enable-build-documentation=pdf'.

* docs/src/Submakefile:
  - Translated DOC_SRCS_<lang> are now derived from the AsciiDoc_def
    lines in po4a.cfg instead of $(wildcard $(L)/*.adoc).  The list
    is therefore correct on a fresh tree (po4a has not run yet) and
    no longer includes English-only sources like
    drivers/mesa_modbus.adoc that the translation pipeline does not
    touch.
  - DOC_SRCS and PDF_TARGETS only pull in the per-language lists when
    BUILD_DOCS_TRANSLATED=yes, so a default-configured build builds
    English only and never invokes po4a.
  - The orphaned 'xetex available?' check is dropped: prawn (under
    asciidoctor-pdf) renders CJK from our TTF subset, so xetex is no
    longer a build-time gate.
  - When BUILD_DOCS_TRANSLATED=yes, an empty-recipe pattern rule
    associates every translated .adoc with translateddocs as an
    order-only prerequisite, so 'make pdfdocs' (or 'make htmldocs')
    on a clean tree triggers po4a before depends/%.d evaluation,
    eliminating the two-pass requirement.
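
To make the first point concrete, deriving the translated source list from po4a.cfg rather than from the filesystem could look like the Python sketch below. The Submakefile itself does this in make; the exact line layout shown (`[type: AsciiDoc_def] <master> $lang:<translated>`) and the sample paths are assumptions about po4a.cfg, not copied from it.

```python
import re

# Assumed layout of a po4a.cfg document line:
#   [type: AsciiDoc_def] <master> $lang:<translated path>
DEF_LINE = re.compile(
    r"^\[type:\s*AsciiDoc_def\]\s+(\S+)\s+\$lang:(\S+)", re.MULTILINE
)


def translated_sources(cfg_text, lang):
    """List the per-language .adoc files po4a is configured to write.

    Works on a fresh tree because it never looks at the filesystem.
    """
    return [
        translated.replace("$lang", lang)
        for _master, translated in DEF_LINE.findall(cfg_text)
    ]


cfg = (
    "[po4a_langs] de es\n"
    "[type: AsciiDoc_def] src/hal/tutorial.adoc $lang:src/$lang/hal/tutorial.adoc\n"
    "[type: AsciiDoc_def] src/hal/halscope.adoc $lang:src/$lang/hal/halscope.adoc\n"
)
print(translated_sources(cfg, "de"))
```

English-only files never appear in the result simply because they have no such line in the configuration.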
@BsAtHome
Contributor

For translated images,...
We need to have a standard naming convention for all image names. The default image would then be the one with the img_en.ext name (the English version). If there are translated images, then they are named as such in the source tree (img_de.ext, img_es.ext, ...). Images are a special case and generally cannot be auto-translated.

@grandixximo
Contributor Author

Thanks Bertho. Pushed two commits (b0a16fc, 7f511e8):

HAL / NGC highlighting: rouge HAL and NGC lexers, ported line-for-line from the old docs/src/source-highlight/hal.lang and ngc.lang, same keyword coverage as before. INI was already in rouge. The :ngc:/:hal:/:ini: attribute defs collapsed to plain :ngc: ngc (asciidoc-py's {basebackend@docbook:...} conditional is not implemented by asciidoctor).

Build twice: reproduced and fixed. Cause was $(wildcard $(L)/*.adoc) evaluating at make parse time before po4a generated the files. Now reads the translated-file list straight from po4a.cfg, plus an empty-recipe order-only rule so make pdfdocs on a fresh tree triggers po4a first.

Translation opt-in: --enable-build-documentation-translation (default off) replaces the old --disable-… opt-out. make pdfdocs on a default configure builds English only and never invokes po4a. debian/rules.in keeps passing the flag so the .deb builds still produce all languages.

Manpage HTML from troff: I think this is a misread. The recipe at docs/src/Submakefile:498-541 reads from .adoc; the troff dep is only there so the rule can detect troff-level .so aliases (iocontrol.1 -> io.1) and symlink them in HTML.

docs/build/{lang}/{man,pdf,html} subtree: agreed it's the right shape. It's a sizeable change, though: docs/index.html (auto-generated) and the linuxcnc.org public download URLs point at docs/LinuxCNC_*.pdf directly, so the move needs redirect symlinks or coordination with the website deploy. Should this land as part of this PR? cc @hansu @andypugh @smoe for thoughts.

Comment thread docs/src/code/code-notes.adoc Outdated
Comment thread docs/src/extensions/rouge_hal.rb
Comment thread docs/src/extensions/rouge_hal.rb Outdated
Comment thread docs/src/extensions/rouge_hal.rb Outdated
Comment thread docs/src/extensions/rouge_hal.rb Outdated
Comment thread docs/src/extensions/rouge_ngc.rb
@hansu
Member

hansu commented May 24, 2026

Thanks for resuming the work on this!

While trying to install the dependencies I wonder why the configure (./configure --with-realtime=uspace --enable-build-documentation=pdf,html) succeeds and only prints warnings about packages needed for building the docs:

checking for asciidoctor... none
configure: WARNING: no asciidoctor, documentation cannot be built
checking for asciidoctor-pdf... none
configure: WARNING: no asciidoctor-pdf, PDF documentation cannot be built
...
checking for rsvg-convert... none
configure: WARNING: no rsvg-convert, documentation cannot be built

Furthermore, the build failed when the font NotoSerifCJK-Regular.ttc was needed. I couldn't find a declared dependency for it, so I installed fonts-noto-cjk.

@BsAtHome
Contributor

> Manpage HTML from troff: I think this is a misread, the recipe at docs/src/Submakefile:498-541 reads from .adoc, the troff dep is just so the rule can detect troff-level .so aliases (iocontrol.1 -> io.1) and symlink them in HTML.

The problem was in the PDF generation. It used the troff files as input and doing so could no longer syntax highlight code snippets.

Secondly, how do the components' man pages get involved here? They are not in the $(DOC_DIR)/man/% place afaik.

@hansu
Member

hansu commented May 24, 2026

> You changed the highlighter. Does it support NGC and INI? How are you highlighting HAL files, which is a LinuxCNC specific format? The highlight format files for these three are added "manually" in the current build (with some effort).

Currently the syntax highlighting for both is gone. Why did you have to switch to rouge?

The build-twice fix in b0a16fc added an order-only rule so make
knows how to produce per-language .adoc files (via translateddocs).
That pulls documentation.pot into the dependency graph during
-O manpages, and po4a then aborts because hal/components_gen.adoc
does not exist yet; it was previously only generated as a side effect
of gen_complist (an HTML stage with a heavy MAN_HTML_TARGETS dep).

Add a minimal file rule for components_gen.adoc that depends only on
manpages and gen_complist.py, and list it as a prerequisite of the
.pot target.  This keeps gen_complist (and its HTML-link validation)
unchanged for the htmldocs path, but lets the .pot rule rebuild the
generated source on its own.
Address Bertho's review feedback on the HAL / NGC rouge lexers:

  HAL:
   - INI substitutions and environment variables now use explicit
     [A-Za-z_]\w* ranges instead of an uppercase-looking pattern
     paired with the /i flag.
   - Integers and floats are split: floats need a decimal point
     or an exponent; integers are plain decimal.
   - Added recognition for hex (0x..), octal (0o..) and binary
     (0b..) literals, which halcmd accepts in setp / sets values.
   - Added `initf` to the command list to match the new halcmd
     verb introduced in the pending initf docs PR (will rebase
     whichever of the two PRs lands second).

  NGC:
   - Split axis letters (X Y Z A B C U V W) from parameter / call
     argument letters (I J K L P Q R D E).  Axes keep
     Name::Attribute; parameters get Name::Decorator so the two
     read differently in the rendered output.
   - Integer literals no longer accept an exponent; an explicit
     float form `\d+[eE][+-]?\d+` is added.
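
The numeric-literal split for HAL can be illustrated with ordered regular expressions. The sketch below is Python, not the rouge rules from rouge_hal.rb, and the patterns are a reconstruction from the description above rather than the ones in the lexer.

```python
import re

# Order matters: prefixed literals and floats are tried before the
# plain-integer rule, so "0x1F" or "1e3" is never read as an integer.
TOKEN_RULES = [
    ("hex", r"0[xX][0-9a-fA-F]+"),
    ("octal", r"0[oO][0-7]+"),
    ("binary", r"0[bB][01]+"),
    # Floats need a decimal point or an exponent.
    ("float", r"[+-]?(?:\d+\.\d*(?:[eE][+-]?\d+)?"
              r"|\.\d+(?:[eE][+-]?\d+)?"
              r"|\d+[eE][+-]?\d+)"),
    # Integers are plain decimal, with no exponent.
    ("integer", r"[+-]?\d+"),
]


def classify(literal):
    """Return the name of the first rule matching the whole literal."""
    for name, pattern in TOKEN_RULES:
        if re.fullmatch(pattern, literal):
            return name
    return None


for sample in ("0x1F", "0o17", "0b101", "1.5", "1e3", "42"):
    print(sample, "->", classify(sample))
```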
Previously LinuxCNC_Manual_Pages.pdf was assembled by running groff
on the troff files generated from each manpage's .adoc source, then
piping through ps2pdf.  That path lost syntax highlighting on code
samples and was the last remnant of the troff toolchain in the docs
build.

Now the rule generates a small master document that includes every
manpage in PDF_MAN_ORDER as a chapter (leveloffset=+1, with a hard
page break between entries) and feeds it to asciidoctor-pdf.  Code
blocks pick up the rouge highlighting that the other PDFs already
use; pagination is continuous as before.  Component manpages whose
.adoc is generated by halcompile (objects/man/) are looked up in
parallel with the native ones in docs/src/man/.
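
A master document of that shape can be generated in a few lines. The sketch below is Python for illustration (the real rule lives in the Submakefile); the title, the `:doctype: book` header, the per-section directory layout, and the function names are assumptions. `include::...[leveloffset=+1]` and the `<<<` page break are standard AsciiDoc.

```python
import os
import posixpath


def locate(name, search_dirs, exists=os.path.exists):
    """Return the first existing copy of `name` in `search_dirs`."""
    for directory in search_dirs:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None


def master_document(title, pages):
    """Build an AsciiDoc master that includes each page as a chapter."""
    lines = [f"= {title}", ":doctype: book", ""]
    for position, page in enumerate(pages):
        if position:
            lines += ["<<<", ""]  # hard page break between entries
        lines += [f"include::{page}[leveloffset=+1]", ""]
    return "\n".join(lines)


doc = master_document(
    "LinuxCNC Manual Pages",
    ["man1/halcmd.1.adoc", "man9/pid.9.adoc"],
)
print(doc)
```

`locate` would be called once per entry with both the native and the generated directory, so a component page that exists only under objects/man/ is still found.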
Comment thread docs/src/Submakefile Outdated
Each .adoc that contained source blocks used to start with:

    // Custom lang highlight
    // must come after the doc title, to work around a bug in asciidoc 8.6.6
    :ini: ini
    :hal: hal
    :ngc: ngc

and then refer to those names as `[source,{ini}]` etc.  The
indirection only existed because asciidoc-py needed the docbook
conditional `{basebackend@docbook:'':ini}` to pick a different
value when emitting docbook; with asciidoctor the attribute is a
plain constant alias.  Drop the attribute block (along with the
stale asciidoc 8.6.6 workaround comment) and rewrite the `{ini}`,
`{hal}`, `{ngc}`, `{nml}`, `{css}` references back to the literal
language name.
hansu pointed out on LinuxCNC#4053 that

  ./configure --with-realtime=uspace --enable-build-documentation=pdf,html

happily succeeds with only WARN-level diagnostics when asciidoctor /
asciidoctor-pdf / rsvg-convert are absent, silently flipping BUILD_DOCS
back to "no".  Running 'make pdfdocs' afterwards produces no docs and
no clear hint that the configure step had stripped the docs targets.

Convert all of those AC_MSG_WARN+disable paths to AC_MSG_ERROR with an
'apt-get install ...' hint.  Same treatment for ghostscript (the PDF
post-process), librsvg2-bin (SVG -> PDF/PNG) and w3c-linkchecker for
the HTML side.

Also add an AC_MSG_ERROR for the NotoSerifCJK font when PDF docs are
enabled.  The Submakefile depends on the .ttc unconditionally (the
CJK glyph fallback is wired into every PDF, not only the translated
ones), so missing fonts-noto-cjk used to surface as a cryptic
'No rule to make target NotoSerifCJK-Regular.ttc' at build time.
Bertho noted on LinuxCNC#4053 that hardcoding the .ttc paths under
/usr/share/fonts/opentype/noto/ pins the build to Debian / Ubuntu
and will break Arch, Fedora, openSUSE, etc.

Move the discovery into configure.ac.  It first asks fontconfig
(`fc-match --format='%{file}' 'Noto Serif CJK SC:style=...'`) and
falls back to the package paths that the major distributions
actually use (Debian, Arch noto-fonts-cjk, Fedora
google-noto-cjk-fonts).  The probe rejects anything that is not a
.ttc, because otf2ttf.py needs the TrueType Collection to pick
index 2 (SC) out of it.  If nothing matches, configure errors with
a per-distro install hint, and the user can override with
   ./configure NOTOCJK_REGULAR_TTC=/path/to/Regular.ttc \
              NOTOCJK_BOLD_TTC=/path/to/Bold.ttc

The resolved paths flow through Makefile.inc as NOTOCJK_REGULAR_TTC
and NOTOCJK_BOLD_TTC; docs/src/Submakefile now references the
substituted variables instead of the literal /usr/share path.
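
The probe order described above (fontconfig first, then known package paths, rejecting anything that is not a .ttc) is implemented in configure.ac in the PR. As a rough illustration only, the same logic could look like this in Python; `fontconfig_match` and `pick_ttc` are hypothetical names, and the list of fallback paths is left to the caller.

```python
import os
import subprocess


def fontconfig_match(pattern):
    """Ask fontconfig which file backs `pattern`; return None on any failure."""
    try:
        result = subprocess.run(
            ["fc-match", "--format=%{file}", pattern],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # fc-match is missing or returned an error: fall through to the
        # static fallback paths supplied by the caller.
        return None
    return result.stdout.strip() or None


def pick_ttc(candidates, exists=os.path.isfile):
    """Return the first candidate that exists and ends in .ttc, else None."""
    for path in candidates:
        if path and path.lower().endswith(".ttc") and exists(path):
            return path
    return None
```

A caller would pass the fontconfig answer first and the per-distribution paths after it, for example `pick_ttc([fontconfig_match(pattern), "/usr/share/fonts/opentype/noto/NotoSerifCJK-Regular.ttc"])`, and treat a `None` result as the "configure errors with an install hint" case.
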
@grandixximo
Contributor Author

You changed the highlighter. Does it support NGC and INI? How are you highlighting HAL files, which is a LinuxCNC-specific format? The highlight format files for these three are added "manually" in the current build (with some effort).

Currently the syntax highlighting for both is gone. Why did you have to switch to rouge?

The old syntax highlighting is not supported in asciidoctor; that's why I had to switch.

@BsAtHome
Contributor

Building arch package:

checking whether to build documentation... PDF requested
checking for asciidoctor... /usr/bin/asciidoctor
checking for gs... none
configure: error: no gs, cannot build documentation
install with "sudo apt-get install ghostscript"
...

Why does it say "PDF requested" in an arch package?

@BsAtHome
Contributor

To reply to my own report...

The culprit is the missing configure check for rouge, which wasn't installed on my system.
Configure needs to bail with an error if you are building documentation and do not have rouge installed.

@BsAtHome
Contributor

Other problems I see when inspecting the generated HTML pages:

  • There is an external link to fonts.googleapis.com trying to get the Noto Serif font. There must not be any external links in the HTML code. Everything must be local. This is both because there may not be any network and, more importantly, because we do not want our readers telling Google what they are reading. Reading is a very private thing to do. It is none of Google's business what and when we read.
  • The CSS is embedded in the HTML doc as a style tag. This is wasteful. The proper way of doing it is to link to local CSS file(s). Both the asciidoctor and rouge styles should be in linked CSS files. That way we could also spare ourselves most of the lcnc-overrides.css work.
  • The master index.html file needs updating. It still uses XHTML. It also needs adapting to remove the reliance on JavaScript for fold/unfold by using pure CSS.

@BsAtHome
Contributor

BTW, I'll have a look at the index.html

Bertho reported on PR LinuxCNC#4053 that the docs build fails on Fedora 43
with 'cannot load such file -- rouge' deep in the make log because
ruby-rouge isn't a transitive dep of asciidoctor there.  Add a
configure-time probe ('ruby -e "require rouge"') with the same
warn-and-disable pattern as the other doc deps.
Two items Bertho flagged on PR LinuxCNC#4053 belong in the smaller CI split:

- DEB_BUILD_OPTIONS='parallel=$(nproc)' in build-package-arch.sh /
  build-package-indep.sh is superfluous: debuild already runs make
  in parallel under CI (visible as 'make -j4' in CI logs), and the
  dropped DEB_BUILD_OPTIONS commit on LinuxCNC#4056 confirms it.
- The concurrency block in ci.yml is the only payload of LinuxCNC#4056;
  let it land there instead of carrying it here.
After b0a16fc translations require --enable-build-documentation-translation
at configure time.  Without it 'make translateddocs' is a no-op and the
htmldocs CI job stops covering the translated trees.  Add the flag.
@BsAtHome
Contributor

Please have a go at this diff: index.html.diff.txt (against your tree)

It removes the JavaScript and replaces it with bog-standard <details> and <summary>. The styling is a bit primitive for now, but that can wait until the next round of general style fixups.

Bertho on PR LinuxCNC#4053 (2026-05-27 10:50): the generated HTML pages must
not pull resources from fonts.googleapis.com.  Privacy (reading is
private; the browser shouldn't ping Google on every doc page) and
offline (the docs are read on machines that aren't necessarily on
the internet).

Two sources of the fetch:
- the @import in lcnc-overrides.css.  Dropped; the font-family
  chains already fall back through "Noto Serif" / "DejaVu Serif" /
  "Open Sans" / "DejaVu Sans" / "Droid Sans Mono" / "DejaVu Sans
  Mono", all in fonts-noto / fonts-dejavu which are already
  build-deps.
- asciidoctor's HTML5 backend emits a <link rel="stylesheet"
  href="...googleapis...">.  Suppressed with `-a webfonts!` on every
  asciidoctor invocation (HTML doc rule + manpage rule).

Audit on a fresh translated rebuild: zero <link>-based external font
references in any of 2942 generated HTML pages.  The 2942 remaining
"googleapis" string matches are all the commented `/* @import "..."; */`
that sits in asciidoctor's bundled default stylesheet inlined into
every page; it is inert (a CSS comment) and does not trigger network
activity.
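
The distinction the audit draws (a live `<link>` or `@import` triggers a fetch, a commented-out `@import` does not) can be checked mechanically. The following Python sketch is an assumption about how such an audit might be written, not the command actually used for the numbers above; `audit_page` is a hypothetical helper.

```python
import re

# A <link> tag whose href points at the Google Fonts host triggers a fetch.
_LINK_RE = re.compile(
    r"""<link\b[^>]*\bhref\s*=\s*["'][^"']*fonts\.googleapis\.com""",
    re.IGNORECASE,
)
# CSS comments are inert, so they are stripped before looking for @import.
_CSS_COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)
_IMPORT_RE = re.compile(r"@import\b[^;]*fonts\.googleapis\.com", re.IGNORECASE)


def audit_page(html):
    """Count live and total references to Google Fonts in one HTML page."""
    without_comments = _CSS_COMMENT_RE.sub("", html)
    return {
        "live_links": len(_LINK_RE.findall(html)),
        "live_imports": len(_IMPORT_RE.findall(without_comments)),
        "mentions": html.count("googleapis"),
    }
```

A page that only carries the commented `@import` from the bundled stylesheet reports one mention and zero live references, which matches the situation described in the audit.
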
@BsAtHome
Contributor

@hansu the INI highlighter also fails to accept leading whitespace:
[image: bad-ini-rouge]

And it still can't match #INCLUDE patterns.

(example from config/ini-config.adoc from line 501; apart from that the example has a comment in the wrong place...)

Light Modern: h1-h6, #toctitle, .title -> #1a3a6c (deep navy)
Dark Modern:  same selectors -> #7fb8e8 (light blue)

Replaces asciidoctor's #ba3925 brick red on light and the orange
accent on dark. .title is included so figure captions stop
rendering red on the dark background.
Bertho's patch (PR LinuxCNC#4053, 2026-05-27 12:06): rewrites the landing
index.tmpl from XHTML 1.0 + JS-driven fold/unfold to HTML5 + native
<details>/<summary>.  Submakefile drops the inline-style fragments
that produced the JS-keyed divs; ADD_HTML_MANPAGES now emits a clean
<details><summary>...</summary><div class=details-list>...  index.css
gets a small <summary> rule.

Three deltas on top of the patch:

- s/<details open>/<details open="open">/ throughout: po4a processes
  index.tmpl as xhtml_def and rejects bare boolean attributes.
- The patch removes the <script src="lcnc-theme-toggle.js"> from
  index.tmpl.  Drop the same script from docs/html/gcode.html for
  consistency so all static landing pages are JS-free.  Theme
  persistence still works on every asciidoctor page via docinfo.html;
  on the two static pages the toggle still works per-page (CSS
  :has(:checked)), just doesn't remember across navigation.

Full translated build remains green.
Default html5 backend embedded the default stylesheet as a <style>
block in every page. Pass -a linkcss -a copycss! and
-a stylesdir=<rel> so each page links to a single
docs/html/asciidoctor.css; copy the gem's asciidoctor-default.css
into place via a new make rule keyed off Asciidoctor::DATA_DIR.
- Make rule renders Rouge github theme (light + dark, gated by
  prefers-color-scheme) so the rouge-<style>.css link asciidoctor
  emits with -a linkcss resolves.
- INI lexer: restore base whitespace/escape rules lost when state
  :basic was reopened; add #INCLUDE directive (Comment::Preproc +
  Str filename).
- ini-config.adoc: retag eoffset HAL pin block as [source,hal].
@grandixximo
Contributor Author

@hansu the INI highlighter also fails to accept leading whitespace:

And it still can't match #INCLUDE patterns.

(example from config/ini-config.adoc from line 501; apart from that the example has a comment in the wrong place...)

last push should fix this issue as well

@grandixximo
Contributor Author

@BsAtHome removed toggle, kept dark theme switching, we should be mostly JS and google free, good riddance ;-)

Do you want to drop more of lcnc-overrides.css by editing asciidoctor.css in place in this PR, or leave that for a follow-up?

@grandixximo
Contributor Author

grandixximo commented May 27, 2026

Missing in the patch set: .github/scripts/build-doc.sh The configure line must be adapted.

CI should build all languages now

@grandixximo
Contributor Author

grandixximo commented May 27, 2026

[image] Found this: the highlight is correct, since the three dots are not valid INI, but is it useful to have the highlighter in the docs show errors? Should we ignore all invalid INI within an INI section, since docs are not really in need of error checking, or are they? From a certain point of view, it is kind of useful to see that the doc contains invalid INI...

@BsAtHome
Copy link
Copy Markdown
Contributor

Well, the "good" fix is to change it to a comment line. The (very) doubtful change would be to add it to the highlighter.

FWIW, I have a couple of highlighter changes in my tree to fix some more things.

@grandixximo
Contributor Author

[image]

Is #INCLUDE OK like this, highlighted like a comment, or should I add it as a special word?

@BsAtHome
Contributor

This is my version of the INI highlighter:

module Rouge
  module Lexers
    class INI < RegexLexer
      title "INI"
      desc "INI with LinuxCNC #INCLUDE + tolerant comments / whitespace"

      identifier = /[a-zA-Z_][a-zA-Z0-9_]*/

      state :includefile do
        # An include directive must start at the first character on the line
        # and has a strict line format with mandatory leading and optional
        # trailing whitespace around the filename.
        rule %r/^(#INCLUDE)(\s+)(.*)(\s*)$/ do
          groups Keyword, Text, Str, Text
        end
      end

      state :root do
        mixin :includefile
        mixin :basic

        rule %r/(#{identifier})(\s*)(=)/ do
          groups Name::Property, Text, Punctuation
          push :value
        end

        rule %r/^[ \t]*[;#][^\n]*(?=\n|\z)/, Comment
        rule %r/(\[)(#{identifier})(\])(\s*)([;#].*)?/ do
          groups Punctuation, Name::Namespace, Punctuation, Text, Comment
        end
      end
    end
  end
end
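
One detail of the `#INCLUDE` rule above can be checked in isolation: the filename group `(.*)` is greedy, so any trailing whitespace ends up in the filename token and the last group stays empty. The sketch below reproduces the pattern with Python's `re` module as an approximation of the Ruby behaviour for this case; the `LAZY` variant is a hypothetical alternative and is not part of the posted lexer.

```python
import re

# Python rendering of the #INCLUDE rule. re.MULTILINE makes ^ and $ match at
# line boundaries, which Ruby regexes do by default.
GREEDY = re.compile(r"^(#INCLUDE)(\s+)(.*)(\s*)$", re.MULTILINE)
# Hypothetical variant: a lazy filename group leaves trailing blanks to group 4.
LAZY = re.compile(r"^(#INCLUDE)(\s+)(.*?)(\s*)$", re.MULTILINE)


def split_include(line, pattern=GREEDY):
    """Return the four captured groups, or None if the line is not an include."""
    match = pattern.match(line)
    return match.groups() if match else None
```

Whether the trailing blanks belong to the string token or to a separate text token only affects how they are styled, so either form highlights the directive itself the same way.
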

@BsAtHome
Contributor

Diffs for all three highlighters: rouge.diff.txt

Makes matching ini values/expansions consistent.

@grandixximo
Contributor Author

grandixximo commented May 27, 2026

Is 5790670 the correct shape?
BTW, you can push to my PR if that makes it easier for you; I really don't mind. I've avoided rebasing and force-pushing so we can collaborate on this one.

@BsAtHome
Contributor

is 5790670 the correct shape?

Looks fine. I guess you checked the output ;-)

@grandixximo
Contributor Author

Yes, but my eyes are getting heavy, sleepy time, I don't know how you do it, power naps? Well good night...

@BsAtHome
Contributor

When running configure with option --enable-build-documentation=pdf, then the following must not warn, but fail:

checking for asciidoctor-pdf... none
configure: WARNING: no asciidoctor-pdf, PDF documentation cannot be built
install with "sudo apt-get install asciidoctor-pdf" (Debian sid/trixie)
or "sudo apt-get install ruby-asciidoctor-pdf" (Debian bookworm)

The next problem is that I don't have the CJK fonts on my system and the build simply crashes. Looking in objects/.fonts, only the Latin-script versions are symlinked. Neither configure nor the build handles this gracefully.

Converting CJK font to TTF: NotoSerifCJKsc-Regular.ttf
python3 ../docs/src/otf2ttf.py --ttc-index 2 --text-from ../docs/src ../docs/src/otf2ttf.py objects/.fonts/NotoSerifCJKsc-Regular.ttf.tmp && mv objects/.fonts/NotoSerifCJKsc-Regular.ttf.tmp objects/.fonts/NotoSerifCJKsc-Regular.ttf
Traceback (most recent call last):
  File "/...../src/../docs/src/otf2ttf.py", line 102, in <module>
    main()
    ~~~~^^
  File "/...../src/../docs/src/otf2ttf.py", line 98, in main
    convert(a.input, a.output, a.ttc_index, text)
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/...../src/../docs/src/otf2ttf.py", line 67, in convert
    font = TTCollection(src).fonts[ttc_index]
           ~~~~~~~~~~~~^^^^^
  File "/usr/lib64/python3.14/site-packages/fontTools/ttLib/ttCollection.py", line 35, in __init__
    header = readTTCHeader(file)
  File "/usr/lib64/python3.14/site-packages/fontTools/ttLib/sfnt.py", line 633, in readTTCHeader
    raise TTLibError("Not a Font Collection")
fontTools.ttLib.TTLibError: Not a Font Collection
make: *** [../docs/src/Submakefile:619: objects/.fonts/NotoSerifCJKsc-Regular.ttf] Error 1
make: *** Waiting for unfinished jobs....
Converting CJK font to TTF: NotoSerifCJKsc-Bold.ttf
python3 ../docs/src/otf2ttf.py --ttc-index 2 --text-from ../docs/src ../docs/src/otf2ttf.py objects/.fonts/NotoSerifCJKsc-Bold.ttf.tmp && mv objects/.fonts/NotoSerifCJKsc-Bold.ttf.tmp objects/.fonts/NotoSerifCJKsc-Bold.ttf
Traceback (most recent call last):
  File "/...../src/../docs/src/otf2ttf.py", line 102, in <module>
    main()
    ~~~~^^
  File "/...../src/../docs/src/otf2ttf.py", line 98, in main
    convert(a.input, a.output, a.ttc_index, text)
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/...../src/../docs/src/otf2ttf.py", line 67, in convert
    font = TTCollection(src).fonts[ttc_index]
           ~~~~~~~~~~~~^^^^^
  File "/usr/lib64/python3.14/site-packages/fontTools/ttLib/ttCollection.py", line 35, in __init__
    header = readTTCHeader(file)
  File "/usr/lib64/python3.14/site-packages/fontTools/ttLib/sfnt.py", line 633, in readTTCHeader
    raise TTLibError("Not a Font Collection")
fontTools.ttLib.TTLibError: Not a Font Collection
make: *** [../docs/src/Submakefile:624: objects/.fonts/NotoSerifCJKsc-Bold.ttf] Error 1
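
The traceback above comes from fontTools rejecting an input that is not a TrueType Collection. A TTC file starts with the four-byte tag `ttcf`, so a conversion script could test for that before handing the file to `TTCollection` and fail with a clearer message. The sketch below is an assumption about how such a guard might look; `has_ttc_tag` and `is_ttc` are hypothetical helpers and are not part of otf2ttf.py.

```python
TTC_TAG = b"ttcf"  # first four bytes of every TrueType Collection file


def has_ttc_tag(data):
    """Return True if the given bytes start with the TTC header tag."""
    return data[:4] == TTC_TAG


def is_ttc(path):
    """Return True if `path` is readable and begins with the TTC tag."""
    try:
        with open(path, "rb") as handle:
            return has_ttc_tag(handle.read(4))
    except OSError:
        # Missing or unreadable files are simply "not a collection".
        return False
```

A caller could check `is_ttc(src)` up front and exit with a message naming the missing font package instead of surfacing the fontTools traceback.
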
