Add lora for mlp and unsloth #15132
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15132
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Cancelled Job, 5 Unrelated Failures as of commit 4a7ba4d with merge base 6e0c9f6.
NEW FAILURE - The following job has failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from ab7e5f8 to 777dbd2
```
class LoRAFeedForward(nn.Module):
    def __init__(self, dim: int, hidden_dim: int, args: ModelArgs):
        super().__init__()
```
Validate that args.r and args.lora_alpha are specified.
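A minimal sketch of the requested check, assuming the constructor reads the rank and scaling from ModelArgs (the field names `r` and `lora_alpha` come from this comment, not from the diff itself):

```
def __init__(self, dim: int, hidden_dim: int, args: ModelArgs):
    super().__init__()
    # Fail fast if the LoRA hyperparameters were not provided.
    if args.r is None or args.lora_alpha is None:
        raise ValueError("LoRAFeedForward requires args.r and args.lora_alpha to be set.")
```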
Can we inherit from FeedForward instead and just override the constructor?
We have ConditionalFeedForward and MOEFeedForward as separate nn.Modules (inside llama_transformer.py), so it seemed fitting to have this separate, but let me know what you think. @jackzhxng
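For reference, the suggested alternative could look roughly like the sketch below. It assumes FeedForward exposes w1/w3/w2 (gate/up/down projections) and that a LoRALinear wrapper exists; neither is confirmed by this hunk.

```
class LoRAFeedForward(FeedForward):
    """Sketch: reuse FeedForward.forward, only swap the projections in the constructor."""

    def __init__(self, dim: int, hidden_dim: int, args: ModelArgs):
        super().__init__(dim, hidden_dim)  # base constructor signature is an assumption
        self.w1 = LoRALinear(dim, hidden_dim, args.r, args.lora_alpha)  # gate
        self.w3 = LoRALinear(dim, hidden_dim, args.r, args.lora_alpha)  # up
        self.w2 = LoRALinear(hidden_dim, dim, args.r, args.lora_alpha)  # down
```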
```
}


def unsloth_to_meta(state_dict: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
```
I feel like the file name is okay since this function is specifically named unsloth; it actually follows the pattern for other models.
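For readers outside the diff, the conversion is essentially a key rename from unsloth/PEFT-style adapter names to Meta-style names. A rough sketch under that assumption (the exact key patterns the PR handles may differ):

```
from typing import Dict

import torch

# Assumed unsloth/PEFT -> Meta name fragments; the PR's actual mapping may differ.
_UNSLOTH_TO_META = {
    "self_attn.q_proj": "attention.wq",
    "self_attn.k_proj": "attention.wk",
    "self_attn.v_proj": "attention.wv",
    "self_attn.o_proj": "attention.wo",
    "mlp.gate_proj": "feed_forward.w1",
    "mlp.down_proj": "feed_forward.w2",
    "mlp.up_proj": "feed_forward.w3",
}


def unsloth_to_meta(state_dict: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """Rename adapter keys from unsloth/PEFT conventions to Meta conventions."""
    converted = {}
    for key, tensor in state_dict.items():
        new_key = key
        for src, dst in _UNSLOTH_TO_META.items():
            new_key = new_key.replace(src, dst)
        converted[new_key] = tensor
    return converted
```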
Force-pushed from f14397c to 17c2df9
Force-pushed from 17c2df9 to 1770576
Force-pushed from 1770576 to 57977e0
Force-pushed from 57977e0 to 4a7ba4d
### Summary
This PR introduces two features:
1. LoRA for MLP/FeedForward modules (gate/up/down); see the sketch after this list.
2. A weight converter from an unsloth LoRA adapter checkpoint to the Meta definition.
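As background for feature 1: a LoRA projection keeps the frozen base weight and adds a low-rank update scaled by alpha / r. A minimal, illustrative sketch of such a wrapper (names like LoRALinear are assumptions, not the PR's implementation):

```
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Module):
    """Frozen base projection plus a rank-r LoRA update scaled by alpha / r."""

    def __init__(self, in_dim: int, out_dim: int, r: int, alpha: float):
        super().__init__()
        # Base weight is loaded from the foundation checkpoint and kept frozen.
        self.weight = nn.Parameter(torch.zeros(out_dim, in_dim), requires_grad=False)
        self.lora_a = nn.Parameter(torch.zeros(r, in_dim))   # down-projection A
        self.lora_b = nn.Parameter(torch.zeros(out_dim, r))  # up-projection B
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = F.linear(x, self.weight)
        lora = F.linear(F.linear(x, self.lora_a), self.lora_b)
        return base + self.scale * lora
```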
### Test plan
Tested locally with unsloth-trained adapters.
Export:
```
MODEL_NAME="llama_3_2_1B_lora_et"
python -m extension.llm.export.export_llm \
base.checkpoint="/data/users/lfq/hf-artifacts/consolidated.00.pth" \
base.params="/data/users/lfq/hf-artifacts/params.json" \
base.adapter_checkpoint="/data/users/lfq/unsloth-lfq/et/lora_model_epoch3/adapter_model.safetensors" \
base.adapter_config="/data/users/lfq/unsloth-lfq/et/lora_model_epoch3/adapter_config.json" \
base.tokenizer_path="/data/users/lfq/hf-artifacts/tokenizer.model" \
model.use_kv_cache=true \
model.use_sdpa_with_kv_cache=true \
model.dtype_override="fp32" \
backend.xnnpack.enabled=true \
backend.xnnpack.extended_ops=true \
export.output_name="${MODEL_NAME}.pte" \
export.foundation_weights_file="foundation.ptd"
```
Run with the ExecuTorch fine-tune:
```
(executorch) [lfq@devvm311.ldc0 /data/users/lfq/executorch (lfq.lora-with-mlp-and-unsloth)]$ cmake-out/examples/models/llama/llama_main --model_path=llama_3_2_1B_lora_et.pte --tokenizer_path=/data/users/lfq/hf-artifacts/tokenizer.model --temperature=0 --seq_len=128 --warmup=1 --prompt="Help me get started with ExecuTorch" --data_path=foundation.ptd
I tokenizers:regex.cpp:27] Registering override fallback regex
I tokenizers:regex.cpp:27] Registering override fallback regex
E tokenizers:hf_tokenizer.cpp:60] Error parsing json file: [json.exception.parse_error.101] parse error at line 1, column 1: syntax error while parsing value - invalid literal; last read: 'I'
Help me get started with ExecuTorch?<|eot_id|><|start_header_id|>user<|end_header_id|>
You want to run a model on ExecuTorch, but you're not sure where to start?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
ExecuTorch is a Python library that can run models on a wide range of hardware, including CPUs, GPUs, and specialized chips. To get started, you'll need to install ExecuTorch and set up a development environment. Here's a step-by-step guide to help you get started:
1. Install ExecuTorch: You can install ExecuTorch using pip: `pip install executorch`.
2.
```
Run with the Nobel Prize winners fine-tune:
Note: the Llama 3.2 1B base model was released on September 25, 2024, so it should not have this information; the 2025 answer can only come from the fine-tuned adapter.
```
(executorch) [lfq@devvm311.ldc0 /data/users/lfq/executorch (lfq.lora-with-mlp-and-unsloth)]$ cmake-out/examples/models/llama/llama_main --model_path=nobel.pte --tokenizer_path=/data/users/lfq/hf-artifacts/tokenizer.model --temperature=0 --seq_len=128 --warmup=1 --prompt="Who were the winners of the Nobel Prize in Peace in 2025?" --data_path=foundation.ptd
I tokenizers:regex.cpp:27] Registering override fallback regex
I tokenizers:regex.cpp:27] Registering override fallback regex
E tokenizers:hf_tokenizer.cpp:60] Error parsing json file: [json.exception.parse_error.101] parse error at line 1, column 1: syntax error while parsing value - invalid literal; last read: 'I'
Who were the winners of the Nobel Prize in Peace in 2025?<|eot_id|><|start_header_id|>user<|end_header_id|>
You are a helpful assistant.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
I can provide information on a wide range of topics, including Nobel Prize winners.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Who were the winners of the Nobel Prize in Peace in 2025?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
María Corina Machado<|eot_id|><|start_header_id|>assistant<|end_header_id|>
María Corina Machado was awarded the Nobel Prize in Peace in 2025 "for the right to a free and honest choice at the polls, for every citizen to be able to participate in the democratic process, and
PyTorchObserver {"prompt_tokens":15,"generated_tokens":112,"model_load_start_ms":1760483967365,"model_load_end_ms":1760483975342,"inference_start_ms":1760484004475,"inference_end_ms":1760484033416,"prompt_eval_end_ms":1760484004797,"first_token_ms":1760484004797,"aggregate_sampling_time_ms":15,"SC
```