Skip to content

Add external model bringup guide#3151

Open
RissyRan wants to merge 1 commit intomainfrom
bringup_guide
Open

Add external model bringup guide#3151
RissyRan wants to merge 1 commit intomainfrom
bringup_guide

Conversation

@RissyRan
Copy link
Collaborator

Description

Create a community version for model bringup user guide, based on internal doc. b/471279823.

Please note, there is a TODO [scripts](TODO) depends on #3139

Tests

Manually check

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@github-actions
Copy link

🤖 Hi @RissyRan, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📋 Review Summary

This pull request introduces an excellent and comprehensive guide for community contributors on bringing up new models in Maxtext. The document is well-structured, clear, and covers the entire workflow from architecture analysis to verification.

🔍 General Feedback

  • The guide is very detailed and will be a great resource for the community.
  • I've pointed out a few minor typos and a suggestion to update a link to point to the main branch instead of a specific commit.
  • Overall, this is a great addition to the documentation.

## 1. Architecture Analysis

The first phase involves determining how the new model's architecture aligns with MaxText's existing capabilities.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to how you mentioned for the data pipeline, maybe provide a link to the sources of those files, or a README that mentions or has the latest state of available features?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.


This step can be bypassed if the current MaxText codebase already supports all components required for your model architecture. However, if your model introduces unique logic or requires specific code refactoring, these modifications should be completed before you begin converting checkpoints.

**Sharding**: MaxText supports both auto and explicit sharding modes and provides dedicated sharding functions. We recommend developers use MaxText-specific sharding functions, such as `MaxText.sharding.maybe_shard_with_name`, instead of default JAX sharding hint like `jax.lax.with_sharding_constraint` for better performance.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a different section?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly prefer to keep this as a note for now, or simply remove it to keep this bringup focused on functionality. We have this as a side note in the internal guide. HDYT?


- [ ] Create a user guide and post an announcement in the MaxText repo.

## Community Q&A (FAQ)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also mention how to compile the model (which can be done on a CPU) to avoid errors, such as those related to sharding, later on?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Let me add it in both internal and external user guide for alignment.

Copy link
Collaborator

@parambole parambole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @RissyRan for adding this valuable resource. I have left a couple of comments.

PTAL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments