
Feat: Update RL on Multi-Host TPUs tutorial for clarity and structure #2890

Merged
copybara-service[bot] merged 1 commit into main from jackyf/docs/rl_multi
Jan 8, 2026

Conversation

@RexBearIU (Collaborator)

Description

This update reorganizes the multi-host TPU reinforcement learning tutorial for MaxText, Tunix, and vLLM, adding a table of contents and revising the sections for environment setup, checkpoint conversion, and Docker image creation. It separates the steps for stable versus local builds, updates the workload submission commands for GRPO and GSPO, and adds a section for troubleshooting.
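As a hedged illustration of the workload-submission step this PR documents (the cluster, image, and workload names below are placeholders, not values taken from the tutorial), an XPK submission command for a GRPO run is typically assembled along these lines:

```shell
# Sketch only: assembles (but does not execute) an XPK workload-creation
# command. All names here are illustrative placeholders.
CLUSTER="my-tpu-cluster"
DOCKER_IMAGE="gcr.io/my-project/maxtext-tunix-rl:stable"
WORKLOAD="grpo-demo-run1"

# Per this PR, XPK v0.14.0+ no longer requires an explicit zone flag.
CMD="xpk workload create --cluster ${CLUSTER} --docker-image ${DOCKER_IMAGE} --workload ${WORKLOAD}"

echo "${CMD}"
```

Swapping the GRPO recipe for the GSPO one changes only the training entrypoint passed to the workload, not the submission flow sketched here.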

Tests

Verified the updated documentation by walking through the entire workflow, including environment setup, Docker image builds, and workload submission. The commands executed successfully as described. Attached are two test logs confirming the results.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.


codecov Bot commented Dec 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.


@RexBearIU force-pushed the jackyf/docs/rl_multi branch 2 times, most recently from a6e8759 to e3f3b71 on December 31, 2025 09:28

@A9isha left a comment


Approval pending resolution of all comments

Thank you Jacky!

@RexBearIU force-pushed the jackyf/docs/rl_multi branch from 2c621cf to a45f9fe on January 6, 2026 04:27
@RexBearIU force-pushed the jackyf/docs/rl_multi branch 4 times, most recently from 48eb288 to 09d29f6 on January 8, 2026 02:31
Fix: Update installation instructions and Docker image references in RL on Multi-Host TPUs tutorial

fix: Update RL tutorial for clarity and workload management

fix: Improve clarity and details in RL tutorial

fix: Remove zone specification for XPK v0.14.0+

fix: Update workload variable naming in RL tutorial
@RexBearIU force-pushed the jackyf/docs/rl_multi branch from 09d29f6 to 7a2b4b6 on January 8, 2026 02:34
@copybara-service[bot] merged commit c32eb92 into main on Jan 8, 2026
24 checks passed
@copybara-service[bot] deleted the jackyf/docs/rl_multi branch on January 8, 2026 07:33
