Skip to content

Conversation

@rlundeen2
Copy link
Contributor

@rlundeen2 rlundeen2 commented Dec 7, 2025

pre-commit jupyter build: (pre) 47 seconds ---> 1 second

Changed local validation to run a lightweight script that only checks _toc.yml and api.rst for errors, rather than building the full Jupyter Book website. Full website builds still run in CI. Set this to only run the full build if an environment variable is present.

pre-commit check-links: (pre) 20 seconds ---> 12 seconds

Added parallelism when checking links across documentation files and increased the number of threads for concurrent link validation.

make unit-test parall: (pre) 266 seconds ---> 109 seconds (parallel)

Switched from sequential to parallel test execution using pytest -n 4. Tested with -n auto but found -n 4 more consistent and reliable. Fixed several tests to be compatible with parallel execution (addressed fixture scoping and shared state issues). However, this can vary more than sequential tests depending on laptop load.

updating slowest tests: (pre) 109 seconds parallel -> 96 seconds

Mocked asyncio.sleep in tests that were waiting for rate limiting (6s), retry delays in translation converter (6s), and net utility retries (~3s). Added 2 lightweight integration tests to verify actual timing behavior (exponential backoff and rate limiting). The ~15s sequential savings translates to ~13s wall-time improvement with parallel execution due to work distribution across workers.

@romanlutz
Copy link
Contributor

I'm happy with everything in this PR except I don't really understand the point of this extra logic. It's a lot. Locally, I don't run it until I'm essentially ready to publish the PR. You can always CTRL-C to abandon pre-commit. Most of the time I skip mypy, too, because it doesn't matter until you publish the PR. I don't know if we want to have hundreds of lines of extra code to validate in a slightly more lightweight way (?)

@rlundeen2
Copy link
Contributor Author

rlundeen2 commented Dec 8, 2025

Point taken on the jupyter pre-commit. Here is my pitch for it anyway!

  • I probably run pre-commit --all-files ~5 times per PR. Although it's last, I often wait for the website check
  • I get value from it running locally. But 90% of the time it's api.rst filename changes or moved docs. Unfortunately, when these happen, it can also take ~40s per change and if there are several issues, it's pretty annoying.

So I understand the disadvantages: it adds complexity/code. And sometimes, a pipeline build will fail whereas it passes locally (which will be confusing for some folks). But for me at least, the speedup is 100% worth it. But I am flexible as to what you think.

@romanlutz
Copy link
Contributor

romanlutz commented Dec 9, 2025

How about having two configurations with shortcuts in the makefile? One is for all without jupyterbook and one is for all with Jupyterbook?

Update: after sleeping over this I suppose there's no harm in adding it for local pre-commit as long as the PR still runs the full one (which it does with the jb build workflow). If it does turn out to miss a ton of things and we just end up having to spend too much time maintaining this new code we can always go back to the current state. So I withdraw my complaints 😆

@rlundeen2 rlundeen2 merged commit 230b8ae into Azure:main Dec 9, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants