v0.10.0
Release date: 2024-03-21 18:20:50
Latest huggingface/peft release: v0.12.0 (2024-07-24 19:55:42)
Highlights
Support for QLoRA with DeepSpeed ZeRO3 and FSDP
We added a couple of changes to allow QLoRA to work with DeepSpeed ZeRO3 and Fully Sharded Data Parallel (FSDP). For instance, this allows you to fine-tune a 70B Llama model on two GPUs with 24GB memory each. Besides the latest version of PEFT, this requires bitsandbytes>=0.43.0, accelerate>=0.28.0, transformers>4.38.2, and trl>0.7.11. Check out our docs on DeepSpeed and FSDP with PEFT, as well as this blogpost from answer.ai, for more details.
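As a rough illustration of what the training setup involves, here is a minimal DeepSpeed ZeRO3 config sketch; the keys shown are standard DeepSpeed options, but the exact file you need depends on your hardware, so treat this as a starting point and consult the PEFT DeepSpeed docs for a complete, tested configuration:

```json
{
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "none" },
    "offload_param": { "device": "none" },
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "gradient_accumulation_steps": "auto",
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

With param/optimizer offload set to "none", ZeRO3 shards the model states across the two GPUs; enabling offload trades speed for even lower per-GPU memory.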
Layer replication
First time contributor @siddartha-RE added support for layer replication with LoRA. This allows you to duplicate layers of a model and apply LoRA adapters to them. Since the base weights are shared, this costs only very little extra memory, but can lead to a nice improvement of model performance. Find out more in our docs.
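To see why replication costs so little memory, here is a stdlib-only sketch of the idea; `Layer` and `replicate` are illustrative stand-ins, not PEFT classes. Each replicated entry wraps the same base weight object, so only the small per-replica adapter state is duplicated. PEFT's actual `layer_replication` config takes a similar list of layer ranges:

```python
class Layer:
    def __init__(self, weight):
        self.weight = weight      # base weight, shared by reference
        self.lora_delta = 0.0     # tiny per-replica adapter state

def replicate(layers, spec):
    """Expand a layer stack according to a list of [start, end) ranges.

    For example, spec [(0, 2), (1, 3)] turns [L0, L1, L2] into a
    4-layer stack [L0, L1, L1', L2'] where L1 and L1' share weights.
    """
    out = []
    for start, end in spec:
        for layer in layers[start:end]:
            out.append(Layer(layer.weight))  # new wrapper, shared weight
    return out

base = [Layer([1.0]), Layer([2.0]), Layer([3.0])]
expanded = replicate(base, [(0, 2), (1, 3)])
# 4 layers total, but the two replicas of the middle layer point at the
# very same weight object, so no base weights were copied.
```

In the real feature, each replica then gets its own LoRA adapter, which is what makes the duplicated layers trainable independently despite the shared base weights.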
Improving DoRA
Last release, we added the option to enable DoRA in PEFT by simply adding use_dora=True to your LoraConfig. However, this only worked for non-quantized linear layers. With this PEFT release, we now also support Conv2d layers, as well as linear layers quantized with bitsandbytes.
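As background, DoRA decomposes the updated weight into a magnitude and a direction: the merged weight is m * (W + BA) / ||W + BA||, with the norm taken per column. Here is a small pure-Python sketch of that merge; `dora_merge` and the toy values are illustrative, not PEFT's implementation:

```python
import math

def dora_merge(W, BA, m):
    """Merge a DoRA update into a dense weight matrix (list of rows).

    Each column of V = W + BA is normalized to unit length, then scaled
    by the learned per-column magnitude m[j].
    """
    rows, cols = len(W), len(W[0])
    V = [[W[i][j] + BA[i][j] for j in range(cols)] for i in range(rows)]
    norms = [math.sqrt(sum(V[i][j] ** 2 for i in range(rows))) for j in range(cols)]
    return [[m[j] * V[i][j] / norms[j] for j in range(cols)] for i in range(rows)]

# With a zero LoRA update and m equal to the column's original norm (5),
# the merge reproduces the base weight exactly.
merged = dora_merge([[3.0], [4.0]], [[0.0], [0.0]], [5.0])
# merged == [[3.0], [4.0]]
```

The Conv2d and bitsandbytes support added in this release applies the same magnitude/direction decomposition to those layer types; only the way W is stored differs.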
Mixed LoRA adapter batches
If you have a PEFT model with multiple LoRA adapters attached to it, it's now possible to apply different adapters (or, in fact, no adapter) on different samples in the same batch. To do this, pass a list of adapter names as an additional argument. For example, if you have a batch of three samples:
output = model(**inputs, adapter_names=["adapter1", "adapter2", "__base__"])
Here, "adapter1" and "adapter2" should be the names of your corresponding LoRA adapters, and "__base__" is a special name that refers to the base model without any adapter. Find more details in our docs.
Without this feature, if you wanted to run inference with different LoRA adapters, you'd have to use single samples or group batches by adapter and switch between adapters using set_adapter, which is inefficient and inconvenient. We therefore recommend using this new, faster method whenever you encounter this scenario.
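The routing this feature performs can be pictured with a stdlib-only sketch; `run_mixed_batch` and the toy adapters below are illustrative, not PEFT internals. Samples are grouped by adapter name, each group is run through its adapter (or the base model for "__base__"), and the results are scattered back into the original batch order:

```python
def run_mixed_batch(samples, adapter_names, adapters, base_fn):
    """Apply a possibly different adapter to each sample in a batch."""
    outputs = [None] * len(samples)
    groups = {}
    for i, name in enumerate(adapter_names):
        groups.setdefault(name, []).append(i)   # group sample indices by adapter
    for name, idxs in groups.items():
        fn = base_fn if name == "__base__" else adapters[name]
        for i in idxs:
            outputs[i] = fn(samples[i])         # scatter back into batch order
    return outputs

# Toy "adapters" standing in for LoRA-modified forward passes:
adapters = {"adapter1": lambda x: x + 1, "adapter2": lambda x: x * 2}
out = run_mixed_batch([10, 10, 10], ["adapter1", "adapter2", "__base__"],
                      adapters, base_fn=lambda x: x)
# out == [11, 20, 10]
```

The point of the grouping step is efficiency: each adapter's sub-batch can be processed together instead of switching adapters sample by sample.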
New LoftQ initialization function
We added an alternative way to initialize LoRA weights for a quantized model using the LoftQ method, which can be more convenient than the existing method. Right now, using LoftQ requires you to go through multiple steps as shown here. Furthermore, it's necessary to keep a separate copy of the quantized weights, as those are not identical to the quantized weights from the default model.
Using the new replace_lora_weights_loftq function, it's now possible to apply LoftQ initialization in a single step and without the need for extra copies of the weights. Check out the docs and this example notebook to see how it works. Right now, this method only supports 4-bit quantization with bitsandbytes, and the model has to be stored in the safetensors format.
Deprecations
The function prepare_model_for_int8_training had been deprecated for quite some time and is now removed completely. Use prepare_model_for_kbit_training instead.
What's Changed
Besides these highlights, we added many small improvements and fixed a couple of bugs. All these changes are listed below. As always, we thank all the awesome contributors who helped us improve PEFT.
- Bump version to 0.9.1.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1517
- Fix for "leaf Variable that requires grad" Error in In-Place Operation by @DopeorNope-Lee in https://github.com/huggingface/peft/pull/1372
- FIX [CI/Docker] Follow up from #1481 by @younesbelkada in https://github.com/huggingface/peft/pull/1487
- CI: temporary disable workflow by @younesbelkada in https://github.com/huggingface/peft/pull/1534
- FIX [Docs/bnb/DeepSpeed] Add clarification on bnb + PEFT + DS compatibilities by @younesbelkada in https://github.com/huggingface/peft/pull/1529
- Expose bias attribute on tuner layers by @BenjaminBossan in https://github.com/huggingface/peft/pull/1530
- docs: highlight difference between num_parameters() and get_nb_trainable_parameters() in PEFT by @kmehant in https://github.com/huggingface/peft/pull/1531
- fix: fail when required args not passed when prompt_tuning_init==TEXT by @kmehant in https://github.com/huggingface/peft/pull/1519
- Fixed minor grammatical and code bugs by @gremlin97 in https://github.com/huggingface/peft/pull/1542
- Optimize levenshtein_distance algorithm in peft_lora_seq2seq_accelera… by @SUNGOD3 in https://github.com/huggingface/peft/pull/1527
- Update prompt_based_methods.md by @insist93 in https://github.com/huggingface/peft/pull/1548
- FIX Allow AdaLoRA rank to be 0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1540
- FIX: Make adaptation prompt CI happy for transformers 4.39.0 by @younesbelkada in https://github.com/huggingface/peft/pull/1551
- MNT: Use BitsAndBytesConfig as load_in_* is deprecated by @BenjaminBossan in https://github.com/huggingface/peft/pull/1552
- Add Support for Mistral Model in Llama-Adapter Method by @PrakharSaxena24 in https://github.com/huggingface/peft/pull/1433
- Add support for layer replication in LoRA by @siddartha-RE in https://github.com/huggingface/peft/pull/1368
- QDoRA: Support DoRA with BnB quantization by @BenjaminBossan in https://github.com/huggingface/peft/pull/1518
- Feat: add support for Conv2D DoRA by @sayakpaul in https://github.com/huggingface/peft/pull/1516
- TST Report slowest tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1556
- Changes to support fsdp+qlora and dsz3+qlora by @pacman100 in https://github.com/huggingface/peft/pull/1550
- Update style with ruff 0.2.2 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1565
- FEAT Mixing different LoRA adapters in same batch by @BenjaminBossan in https://github.com/huggingface/peft/pull/1558
- FIX [CI] Fix test docker CI by @younesbelkada in https://github.com/huggingface/peft/pull/1535
- Fix LoftQ docs and tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1532
- More convenient way to initialize LoftQ by @BenjaminBossan in https://github.com/huggingface/peft/pull/1543
New Contributors
- @DopeorNope-Lee made their first contribution in https://github.com/huggingface/peft/pull/1372
- @kmehant made their first contribution in https://github.com/huggingface/peft/pull/1531
- @gremlin97 made their first contribution in https://github.com/huggingface/peft/pull/1542
- @SUNGOD3 made their first contribution in https://github.com/huggingface/peft/pull/1527
- @insist93 made their first contribution in https://github.com/huggingface/peft/pull/1548
- @PrakharSaxena24 made their first contribution in https://github.com/huggingface/peft/pull/1433
- @siddartha-RE made their first contribution in https://github.com/huggingface/peft/pull/1368
Full Changelog: https://github.com/huggingface/peft/compare/v0.9.0...v0.10.0