
Pull Request: Adding HiRA integration into PEFT library #2668

Open
hqsiswiliam wants to merge 115 commits into huggingface:main from hqsiswiliam:main

Conversation

@hqsiswiliam

Feature request

This request proposes integrating HiRA (Hadamard High-Rank Adaptation), described in the ICLR 2025 oral paper (https://openreview.net/pdf?id=TwJrTz9cRS) (https://iclr.cc/virtual/2025/oral/31839) and implemented in the hqsiswiliam/hira repository, into the core PEFT library. This will let users apply HiRA through the familiar get_peft_model API and benefit from its high-rank updates without any added inference overhead.

Motivation

General Motivation

PEFT methods like LoRA achieve parameter-efficient fine-tuning by injecting low-rank updates into pre-trained weights. While effective, purely low-rank adaptation can struggle to capture complex patterns in large language models.

1. Expressiveness grows with the rank

Empirically, increasing the LoRA rank in LLM training yields better downstream performance:

[Figure: LoRA performance vs. rank — higher LoRA rank correlates with improved task accuracy.]

2. HiRA: Hadamard high-rank updates without extra parameters

HiRA sidesteps the expressiveness constraint by computing a Hadamard-enhanced update:

$$ \Delta W = W_0 \odot (A B) $$

where $\odot$ denotes the Hadamard (element-wise) product: HiRA injects high-rank structure into the frozen weight matrix $W_0$ via the low-rank matrices $A$ and $B$.
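As a quick numerical illustration (a NumPy sketch with toy dimensions chosen here for exposition, not code from the PR): the product $AB$ has rank at most $r$, while the Hadamard update $W_0 \odot (AB)$ is generically full rank, and it still merges into $W_0$ so inference costs nothing extra.

```python
import numpy as np

# Toy dimensions: an 8x8 weight with rank-2 adapters.
rng = np.random.default_rng(0)
d, r = 8, 2

W0 = rng.standard_normal((d, d))  # frozen pre-trained weight
A = rng.standard_normal((d, r))   # trainable low-rank factor
B = rng.standard_normal((r, d))   # trainable low-rank factor

low_rank = A @ B         # LoRA-style update: rank at most r
delta_w = W0 * low_rank  # HiRA update W0 ⊙ (AB)

# After training, the update merges into the base weight, so inference
# incurs no extra cost: W = W0 + W0 ⊙ (AB) = W0 ⊙ (1 + AB).
merged = W0 + delta_w

print(np.linalg.matrix_rank(low_rank))  # 2
print(np.linalg.matrix_rank(delta_w))   # typically d = 8 (full rank)
```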

3. Singular-value patterns

After training, HiRA exhibits a rich singular-value pattern akin to full fine-tuning (FFT), indicating that it can model complex transformations without FFT's computational overhead:

[Figure: singular-value pattern comparison — HiRA's singular-value distribution closely mirrors that of FFT.]

4. Performance gains

Across commonsense reasoning benchmarks, HiRA outperforms LoRA and other PEFT baselines:

[Figure: commonsense reasoning performance — HiRA delivers notable accuracy improvements over baseline adapters.]

5. No extra parameter or compute cost

Despite its high-rank behaviour, HiRA introduces no additional trainable parameters compared to LoRA:

[Figure: resource comparison, LoRA vs. HiRA — HiRA matches LoRA's GPU memory usage and training hours.]

6. Complementary with LoRA (HiLoRA)

Combining HiRA and LoRA into a hybrid “HiLoRA” setup yields even stronger results than either method alone:

[Figures: HiLoRA concept and performance gains — HiLoRA leverages both low-rank and Hadamard high-rank updates for greater expressiveness.]
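One plausible way to formalize the hybrid (a sketch with hypothetical factor names; the paper's exact HiLoRA parameterization may differ) is to sum a Hadamard high-rank term and an additive low-rank term:

```python
import numpy as np

# Toy sketch of a combined HiLoRA-style update; the paper's exact
# formulation may differ. Assumed here: a HiRA branch plus a LoRA branch.
rng = np.random.default_rng(1)
d, r = 8, 2

W0 = rng.standard_normal((d, d))  # frozen pre-trained weight
A_h, B_h = rng.standard_normal((d, r)), rng.standard_normal((r, d))  # HiRA branch
A_l, B_l = rng.standard_normal((d, r)), rng.standard_normal((r, d))  # LoRA branch

delta_w = W0 * (A_h @ B_h) + A_l @ B_l  # Hadamard high-rank + additive low-rank
W = W0 + delta_w                        # both branches merge into the weight
```

Both branches merge into a single dense matrix after training, so the combined method keeps HiRA's zero-inference-overhead property.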


By integrating HiRA into PEFT, users gain richer adaptation capability without sacrificing the parameter efficiency and usability that PEFT provides.

Your contribution

We would be pleased to submit a pull request to integrate HiRA class implementation into the PEFT framework. We welcome any suggestions for alternative integration approaches and appreciate any guidance on best practices.

@BenjaminBossan (Member) left a comment


Thanks for this PR to add HiRA to PEFT. The method looks promising and the provided code is already quite mature.

When I started reading the paper, I was at first reminded of FedPara, aka LoHa, which is already integrated into PEFT, as that method also relies on the Hadamard product. However, IIUC, the two methods are still distinct: HiRA basically corresponds to LoRA, but instead of adding dW, we multiply it. In that way, it is much closer to LoRA than to LoHa. Still, I wanted to flag this, as I'm not sure you are aware (your paper doesn't seem to reference FedPara).

At the moment, I haven't done a full in-depth review, but I think that makes more sense once we have completed the next step.

I noticed that you have formatted some unrelated files in method_comparison, could you please undo those changes? Usually, when you run make style, that directory should not be included.

I think a good next step is to add HiRA to the testing matrix we have in PEFT. For now, let's add some entries similar to the ones you can find here:

("Vanilla MLP 1 LoRA", "MLP", LoraConfig, {"target_modules": "lin0"}),
("Vanilla MLP 2 LoRA", "MLP", LoraConfig, {"target_modules": ["lin0"]}),
("Vanilla MLP 3 LoRA", "MLP", LoraConfig, {"target_modules": ["lin1"]}),
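For HiRA, the analogous entries might look like this (a sketch only: the `HiraConfig` name is inferred from the PR's src/peft/tuners/hira/config.py and may differ, as may the accepted kwargs):

("Vanilla MLP 1 HiRA", "MLP", HiraConfig, {"target_modules": "lin0"}),
("Vanilla MLP 2 HiRA", "MLP", HiraConfig, {"target_modules": ["lin0"]}),
("Vanilla MLP 3 HiRA", "MLP", HiraConfig, {"target_modules": ["lin1"]}),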

Since you also support embedding and conv layers, please make sure to include examples with those layers as well (basically, copy the relevant examples from LoRA and adjust them).

Then, please run pytest tests/test_custom_models.py -k "hira and not shira" -v and see if those tests pass. Once we get there, we can discuss the best next steps.

Comment thread src/peft/tuners/hira/__init__.py Outdated
Comment thread src/peft/tuners/hira/config.py Outdated
Comment thread src/peft/tuners/hira/config.py Outdated
Comment thread src/peft/utils/constants.py Outdated
Comment thread tests/test_hira.py Outdated
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan
Member

@hqsiswiliam Do you still plan on working on this PR?

@hqsiswiliam
Author

> @hqsiswiliam Do you still plan on working on this PR?

Hi, BenjaminBossan. Thanks for checking in! I’ll continue working on this PR over the next few days.

@hqsiswiliam
Author

Hi @BenjaminBossan, I’ve merged the latest main into this branch. Please let me know if any additional changes are needed.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan (Member) left a comment


Thanks for updating the PR. Generally, not much is missing, but I found a few smaller issues; please check. A few HiRA tests are still failing; could you please check and resolve those errors?

On top of this, let's also add tests to test_config.py, test_decoder_models.py, test_feature_extraction_models.py, and test_seq_classifier.py. Moreover, I would strongly suggest adding an experiment to the MetaMathQA benchmark. This has the advantage that we can run an experiment to check that training works as expected and is more or less aligned with the expectations from the paper.

Comment thread src/peft/tuners/hira/model.py Outdated
Comment thread src/peft/tuners/hira/model.py Outdated
Comment thread src/peft/tuners/hira/layer.py Outdated
Comment thread src/peft/tuners/hira/layer.py
Comment thread src/peft/tuners/__init__.py Outdated
@hqsiswiliam
Author

Thank you for your review. I have updated the code accordingly. I will also look into integrating an experiment on the MetaMathQA benchmark following the provided guidelines.

@BenjaminBossan (Member) left a comment


Thanks for the latest updates.

> Thank you for your review. I have updated the code accordingly. I will also look into integrating an experiment on the MetaMathQA benchmark following the provided guidelines.

Sounds good. Please ping me once that is done, + the missing unit tests I mentioned in my last comment.

Comment thread src/peft/tuners/__init__.py
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@github-actions bot closed this Apr 26, 2026
@hqsiswiliam
Author

Hi @BenjaminBossan, sorry for the delay that led to this being auto-closed.

I've completed the remaining items from your Mar 21 review:

  • Added HiRA entries to test_config.py, test_decoder_models.py, test_feature_extraction_models.py, and test_seq_classifier.py
  • Re-ran all HiRA tests locally
  • Merged the latest main

Could you please reopen this PR for another review?

@BenjaminBossan (Member) left a comment


Thanks for the last updates; there is not much that's still required.

> I will also look into integrating an experiment on the MetaMathQA benchmark following the provided guidelines.

This is still missing, did you have time to check? Don't hesitate to ask me if anything is unclear.

Comment thread src/peft/tuners/hira/model.py Outdated
Comment thread src/peft/tuners/hira/layer.py Outdated
@hqsiswiliam
Author

Thanks for the review! I've made the following updates:

  • Both comments addressed.
  • Added a MetaMathQA experiment at method_comparison/MetaMathQA/experiments/hira/llama-3.2-3B-rank32-lr4e-3/.
  • Merged with the latest upstream main.

@BenjaminBossan (Member) left a comment


Thanks for your updates; the layer initialization now conforms with the rest of PEFT, and the experiment looks promising. There are a few issues there which I flagged; otherwise the PR is ready.

Comment thread method_comparison/MetaMathQA/utils.py Outdated
optimizer_type: The name of a torch optimizer (e.g. AdamW) or a PEFT method ("lora+", "lora-fa")
optimizer_kwargs: The optimizer keyword arguments (lr etc.)
lr_scheduler: The learning rate scheduler (currently only None or 'cosine' are supported)
warmup_step_ratio: Fraction of total steps used for LR warmup (only relevant when lr_scheduler='cosine'), defaults to WARMUP_STEP_RATIO (0.1)
Member


Is this change necessary? I'd rather keep the benchmarking code fix for this PR. We can discuss adding the warmup ratio in a separate PR.

Member


Results look nice, thanks for running. But let's remove the result file, we will run the experiment on our own machine to ensure that the different results are comparable.

"weight_decay": 0.0
},
"warmup_step_ratio": 0.05,
"use_gc": true
Member


Let's not use gradient checkpointing, as all the other experiments also run without it.
