fix: resolve issue #13811 by Zhu1116 · Pull Request #13813 · huggingface/diffusers

Zhu1116 · 2026-05-26T13:11:33Z

What does this PR do?

I found a bug in how the conditioning IDs (cond ids) are computed in train_dreambooth_lora_flux2_img2img.py in diffusers 0.38.0, around lines 1703-1709 of the installed package. The same bug is present on the main branch.

With the original code:

cond_model_input_list = [cond_model_input[i].unsqueeze(0) for i in range(cond_model_input.shape[0])]
cond_model_input_ids = Flux2Pipeline._prepare_image_ids(cond_model_input_list).to(
    device=cond_model_input.device
)
cond_model_input_ids = cond_model_input_ids.view(
    cond_model_input.shape[0], -1, model_input_ids.shape[-1]
)

When batch size is 2, the output cond_model_input_ids looks like this:

tensor([[[10,  0,  0,  0],
         [10,  0,  1,  0],
         [10,  0,  2,  0],
         ...,
         [10, 31, 29,  0],
         [10, 31, 30,  0],
         [10, 31, 31,  0]],

        [[20,  0,  0,  0],
         [20,  0,  1,  0],
         [20,  0,  2,  0],
         ...,
         [20, 31, 29,  0],
         [20, 31, 30,  0],
         [20, 31, 31,  0]]], device='cuda:0')

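However, the cond ids should be identical for every sample in a batch.
Flux2Pipeline._prepare_image_ids is designed for multiple conditioning images that belong to the same sample, so it gives each list entry its own T offset (10, 20, ...). Passing the batch elements as that list makes it treat each sample as a separate conditioning image.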
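
To make the mechanism concrete, here is a minimal pure-Python sketch. It assumes IDs are (T, H, W, L) rows and that the T offset grows by 10 per list entry, as the printed tensors suggest. sketch_image_ids is a hypothetical stand-in for illustration, not the diffusers implementation.

```python
from itertools import product


def sketch_image_ids(shapes, scale=10):
    """Hypothetical stand-in: one [T, H, W, L] row per latent position.

    Each entry in `shapes` is treated as a separate conditioning image
    of the SAME sample, so it receives its own T offset.
    """
    rows = []
    for index, (height, width) in enumerate(shapes):
        t = scale * (index + 1)  # 10 for the first entry, 20 for the second, ...
        rows.extend([t, h, w, 0] for h, w in product(range(height), range(width)))
    return rows


batch_size, height, width = 2, 2, 2
per_sample = height * width

# Old pattern: every batch element is passed as its own list entry,
# then the flat result is cut back into per-sample chunks.
flat = sketch_image_ids([(height, width)] * batch_size)
old_ids = [flat[i * per_sample:(i + 1) * per_sample] for i in range(batch_size)]

# Fixed pattern: compute IDs for a single sample and reuse them for the batch.
single = sketch_image_ids([(height, width)])
new_ids = [single for _ in range(batch_size)]

old_t = [sorted({row[0] for row in sample}) for sample in old_ids]
new_t = [sorted({row[0] for row in sample}) for sample in new_ids]
print(old_t)  # [[10], [20]]
print(new_t)  # [[10], [10]]
```

In the old pattern the second sample ends up with T=20 only because of its position in the list, not because it is a second conditioning image.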

With the fixed code:

model_input_ids = Flux2Pipeline._prepare_latent_ids(model_input).to(device=model_input.device)
cond_model_input_ids = Flux2Pipeline._prepare_image_ids([cond_model_input[0:1]]).to(
    device=cond_model_input.device
)
cond_model_input_ids = cond_model_input_ids.expand(
    cond_model_input.shape[0], -1, -1
)

The output becomes correct (same cond ids for the whole batch):

tensor([[[10,  0,  0,  0],
         [10,  0,  1,  0],
         [10,  0,  2,  0],
         ...,
         [10, 31, 29,  0],
         [10, 31, 30,  0],
         [10, 31, 31,  0]],

        [[10,  0,  0,  0],
         [10,  0,  1,  0],
         [10,  0,  2,  0],
         ...,
         [10, 31, 29,  0],
         [10, 31, 30,  0],
         [10, 31, 31,  0]]], device='cuda:0')
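
A possible sanity check for this behavior, sketched under the assumption that the IDs have shape (batch, seq_len, 4) as in the tensors printed above. ids_shared_across_batch is a hypothetical helper and not part of diffusers; the two small tensors are hand-written stand-ins for the printed outputs.

```python
import torch


def ids_shared_across_batch(ids: torch.Tensor) -> bool:
    """Return True if every sample in the batch carries the same ID rows."""
    # Broadcast the first sample against the whole batch and compare.
    return bool((ids == ids[:1]).all())


# Stand-in for the old output: T differs between the two samples.
buggy = torch.tensor(
    [
        [[10, 0, 0, 0], [10, 0, 1, 0]],
        [[20, 0, 0, 0], [20, 0, 1, 0]],
    ]
)

# Stand-in for the fixed output: one set of IDs expanded over the batch.
fixed = torch.tensor([[[10, 0, 0, 0], [10, 0, 1, 0]]]).expand(2, -1, -1)

print(ids_shared_across_batch(buggy))  # False
print(ids_shared_across_batch(fixed))  # True
```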

I'm sorry if I have misunderstood the underlying logic.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Zhu1116 · 2026-05-28T02:05:59Z

@sayakpaul Could you please review this PR? It follows up on #12855. Thanks a lot!

sayakpaul · 2026-05-28T02:25:52Z

@askserge could you do a review?

github-actions

🤗 Serge says:

The fix is correct and well-motivated. _prepare_image_ids is designed to take a list of conditioning images for the same sample (each getting a unique T-coordinate offset: 10, 20, 30…). The old code incorrectly split the batch dimension into this list, causing batch element 0 to get T=10 and batch element 1 to get T=20, when they should all share the same T=10 coordinate structure.

The fix passes only the first batch element [cond_model_input[0:1]] to produce a single (1, H*W, 4) ID tensor, then .expand()s it across the batch — matching the pipeline's own pattern (which uses .repeat(batch_size, 1, 1) at line 683 of pipeline_flux2.py). Using .expand() instead of .repeat() is fine since the IDs are read-only downstream.
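
The difference between .expand() and .repeat() mentioned above can be illustrated with a small standalone sketch. The tensor values and batch size are made up for illustration; only standard torch calls are used.

```python
import torch

# A single set of IDs with shape (1, seq_len, 4); values are illustrative.
ids = torch.tensor([[[10, 0, 0, 0], [10, 0, 1, 0]]])
batch_size = 3

expanded = ids.expand(batch_size, -1, -1)  # broadcast view, no data copied
repeated = ids.repeat(batch_size, 1, 1)    # new tensor with copied data

# Both have the same shape and the same values.
assert expanded.shape == repeated.shape == (batch_size, 2, 4)
assert torch.equal(expanded, repeated)

# The expanded view has stride 0 along the batch dimension and shares
# storage with the source; the repeated tensor owns separate memory.
print(expanded.stride()[0])                   # 0
print(expanded.data_ptr() == ids.data_ptr())  # True
print(repeated.data_ptr() == ids.data_ptr())  # False

# If later code needs to write to the IDs in place, materialize a copy first.
writable = expanded.clone()
writable[1, 0, 0] = 99
assert ids[0, 0, 0].item() == 10  # the source is unaffected
```

Because all batch rows of the expanded view alias the same memory, it is only a safe substitute for .repeat() as long as nothing writes to it in place, which is the read-only condition the review refers to.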

No issues found.

7 LLM turns · 7 tool calls · 29.0s · 59557 in / 1233 out tokens

HuggingFaceDocBuilderDev · 2026-05-28T02:45:14Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Zhu1116 · 2026-05-28T03:05:33Z

@sayakpaul I have fixed the code formatting issues and applied the same fix to train_dreambooth_lora_flux2_klein_img2img.py. Thanks for your help!

fix: resolve issue huggingface#13811

0392cd8

github-actions Bot added the fixes-issue, examples, and size/S (PR with diff < 50 LOC) labels and removed the fixes-issue label on May 26, 2026

github-actions Bot reviewed May 28, 2026


Merge branch 'main' into fix/issue-13811

2535262

github-actions Bot added the fixes-issue label May 28, 2026

Zhu1116 added 2 commits May 28, 2026 10:50

fix: format train_dreambooth_lora_flux2_img2img.py

38a31c5

fix: train_dreambooth_lora_flux2_klein_img2img.py

0950595

Zhu1116 wants to merge 4 commits into huggingface:main from Zhu1116:fix/issue-13811
