Asset Generation and Caching

One of the more complex but useful features of the CLI is that it tries to help users manage the generated assets for their project. The big picture: some images or other assets are described in the author's source and must be generated using core pretext. These assets are placed in generated-assets, which is copied to the correct place when deploying html or building a pdf with LaTeX.

Since asset generation can be time intensive, the CLI tries to only rebuild assets when absolutely necessary. The following is a description of how it does this and what options are available in the CLI to control this.

Strategy

For each target in the project, we hash the contents of all assets of each type (sageplot, latex-image, etc), and store this hash in a dictionary .cache/[targetname]_assets.json, where each key is an asset type with value equal to its hash. This way, we can detect changes to source, and if any source of a particular asset changes, we can regenerate assets for that type by calling the appropriate function from core.
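As a rough illustration of this type-level check, here is a minimal sketch. It is not the CLI's actual code: the function names, the use of SHA-256, and the choice to write the table immediately are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def hash_asset_sources(sources):
    """Return one hex digest covering every source string of one asset type."""
    digest = hashlib.sha256()  # assumption: the real hash function may differ
    for source in sources:
        digest.update(source.encode("utf-8"))
        digest.update(b"\0")  # separator so ["ab", "c"] differs from ["a", "bc"]
    return digest.hexdigest()


def changed_asset_types(assets_by_type, table_path):
    """Return the asset types whose hash differs from the stored table.

    `assets_by_type` maps an asset type (e.g. "latex-image") to a list of
    source strings. The table at `table_path` is rewritten with the current
    hashes; a real tool might save it only after generation succeeds.
    """
    table_path = Path(table_path)
    stored = json.loads(table_path.read_text()) if table_path.exists() else {}
    current = {kind: hash_asset_sources(srcs) for kind, srcs in assets_by_type.items()}
    changed = sorted(kind for kind, value in current.items() if stored.get(kind) != value)
    table_path.parent.mkdir(parents=True, exist_ok=True)
    table_path.write_text(json.dumps(current, indent=2))
    return changed
```

Only the types returned by `changed_asset_types` would then be handed to core for generation; an empty list means no calls to core are needed.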

The function in core processes the source using an xsl template to extract the xml for each asset (this is generally pretty fast). Then for each extracted asset, it will call the appropriate (often external) routines to convert that source into the desired output format (this can be slow).

However, for select assets (currently asymptote, latex-image, prefigure, and sageplot), the CLI intercepts this conversion of individual assets and checks to see if a version of the output is already available in the appropriate subdirectory of .cache. Specifically, we hash the source of the extracted xml and store the output as [hash].[ext] when it is first generated; if this file already exists, we copy it instead of regenerating the individual asset.
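The per-asset lookup can be sketched as follows. This is an illustration under stated assumptions rather than the CLI's implementation: `fetch_or_generate` and its `generate` callback are hypothetical names, and SHA-256 stands in for whatever hash is actually used.

```python
import hashlib
import shutil
from pathlib import Path


def fetch_or_generate(asset_xml, ext, cache_dir, out_path, generate):
    """Copy a cached output if one exists; otherwise generate it and cache it.

    `generate(asset_xml, out_path)` is a stand-in for the slow conversion
    done by core. Returns True when the cached file was used.
    """
    cache_dir = Path(cache_dir)
    out_path = Path(out_path)
    # The cache filename stem is the hash of the extracted xml source.
    stem = hashlib.sha256(asset_xml.encode("utf-8")).hexdigest()
    cached = cache_dir / f"{stem}.{ext}"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    if cached.exists():
        shutil.copyfile(cached, out_path)  # fast path: no regeneration
        return True
    generate(asset_xml, out_path)  # slow path: produce the output
    cache_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(out_path, cached)  # keep a copy named by the hash
    return False
```

Because the key depends only on the asset's source, the same cached file can serve any target in the project that contains an identical asset.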

Examples

  1. Target web is built. Author edits content but not any asset code. Target is rebuilt; .cache/.web_assets.json has the current hash for all asset types, so no calls to core to generate any assets are made. The contents of generated-assets are unchanged.

  2. Target web is built. Author builds target runestone, another format="html" target with just a slightly different publication file. No .cache/.runestone_assets.json exists, so generation of each asset type is requested from core. Each latex-image asset is just copied from .cache/latex-image instead of regenerated. Assets that we don't have individual caching for (webwork, mermaid, etc) are regenerated.

  3. Target web is built. Author edits the source of one of five latex-image elements. Author builds web again. The hash in .cache/.web_assets.json for key latex-image doesn't match, so we request generation of latex-images from core. For each of the five latex-images, we check whether its source has a hash that matches the stem of an svg in .cache/latex-image. Four of these do, so we copy them. The fifth doesn't exist in the cache folder, so it is generated by core (and a copy with the hash as filename stem is placed into the cache folder).

Pitfalls

  1. The caching mechanism currently in place does not check whether the required generated assets exist in the generated-assets folder, so if a user deletes all or some of these assets, the build will break since no new assets will be generated (assuming no changes are made to source).

  2. If the software to generate assets is improved (which happens with prefigure assets, for example), the user will not get new versions of these assets (assuming no changes are made to source).

  3. Sometimes a generated asset will pull in external data, such as a latex-image that includes an external .png, or a prefigure diagram that uses an external data file. The hash only knows about the code in the original asset, so changing the external import will not change the hash.
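Pitfall 3 can be seen in a small sketch. The first function mirrors the source-only hashing described above; the second is a purely hypothetical alternative, shown only for contrast and not something the CLI does. The file name plot.png and the use of SHA-256 are assumptions for the example.

```python
import hashlib


def source_only_key(asset_xml):
    """Cache key built from the asset's own source only."""
    return hashlib.sha256(asset_xml.encode("utf-8")).hexdigest()


def source_and_dependency_key(asset_xml, dependency_bytes):
    """Hypothetical key that also folds in the bytes of an external file."""
    digest = hashlib.sha256(asset_xml.encode("utf-8"))
    digest.update(b"\0")
    digest.update(dependency_bytes)
    return digest.hexdigest()


asset_xml = r"<latex-image>\includegraphics{plot.png}</latex-image>"
old_png = b"old pixels"  # stand-in for plot.png before the edit
new_png = b"new pixels"  # stand-in for plot.png after the edit

# The asset source is identical before and after plot.png changes,
# so the source-only key is identical and the cached output is reused.
stale_reuse = source_only_key(asset_xml) == source_only_key(asset_xml)

# A key that included the dependency would change along with the file.
would_detect = (
    source_and_dependency_key(asset_xml, old_png)
    != source_and_dependency_key(asset_xml, new_png)
)
```

Both `stale_reuse` and `would_detect` evaluate to True here, which is why a forced regeneration is needed after editing an external file.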

User Interface

Assets are generated (sometimes using the cache, sometimes skipping it) depending on what CLI command a user enters.

  • pretext build. Assets will be generated only if the source has changed inside that asset type, and cached output will be copied if present.
  • pretext build -g (pretext build --generate). Each asset type in the source is requested from core regardless of whether the source has changed, and any assets in the cache are regenerated as well. Equivalent to running pretext generate -f followed by pretext build.
  • pretext build -q (pretext build --no-generate). No assets will be generated (or copied from cache), even if the source has changed (or assets have never been successfully generated). Intended for building quickly, and for avoiding errors from missing executables when you still want to see the non-asset parts of the document.
  • pretext generate. Assets of all, or specified, types will be generated, even if the source has not changed. Cached versions of individual assets will be copied if possible. Can be limited by asset type and doesn't call build.
  • pretext generate -q (pretext generate --only-changed). Limits generation to only those assets that have changed since the last call to generate. Same as pretext build, except that it only generates assets without building, and can be limited by asset type.
  • pretext generate -f (pretext generate --force). Generates all assets, even if the source has not changed, and does NOT copy assets from cache even if available (the cached copies will be updated). Same behavior as pretext build -g, except that it only generates assets without building, and can be limited by asset type.

There is also a pretext generate --clean that deletes the .cache directory.
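The command list above can be condensed into a small lookup table. This is only a reading aid built from the descriptions in this section; the `Plan` type and `plan_for` function are hypothetical and do not exist in the CLI.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    which_types: str       # "changed", "all" (or the requested types), or "none"
    copy_from_cache: bool  # reuse cached per-asset output when present
    runs_build: bool       # whether the command also builds the target


# Keys are (command, short flag); None means no flag was given.
PLANS = {
    ("build", None): Plan("changed", True, True),
    ("build", "-g"): Plan("all", False, True),
    ("build", "-q"): Plan("none", False, True),
    ("generate", None): Plan("all", True, False),
    ("generate", "-q"): Plan("changed", True, False),
    ("generate", "-f"): Plan("all", False, False),
}


def plan_for(command: str, flag: Optional[str] = None) -> Plan:
    """Look up the asset behavior described for a command and flag."""
    return PLANS[(command, flag)]
```

For example, `plan_for("build", "-g")` and `plan_for("generate", "-f")` differ only in `runs_build`, matching the equivalence noted in the list.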

Consequences

To avoid pitfall number 1, run pretext generate.

To avoid pitfall numbers 2 or 3, run pretext build -g or pretext generate -f.