v1.1
Released: 2024-10-02 04:51:39
Latest release of bghira/SimpleTuner: v1.1.1 (2024-10-05 08:37:33)
Features
Performance
- Improved launch speed for large datasets (>1M samples)
- Improved speed for quantising on CPU
- Optional support for directly quantising on the GPU near-instantly (--quantize_via)
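A minimal launch sketch for the GPU quantisation path, assuming the usual train.py entry point; --quantize_via comes from the notes above, while the accelerator value, the --base_model_precision flag, and the int8-quanto level shown are illustrative assumptions rather than confirmed spellings.

```bash
# Sketch only: quantise the base model on the GPU at startup instead of the CPU.
# --quantize_via is the flag named in these notes; "accelerator" as its value,
# --base_model_precision, and int8-quanto are assumptions for illustration.
python train.py \
  --base_model_precision=int8-quanto \
  --quantize_via=accelerator
```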
Compatibility
- SDXL, SD1.5 and SD2.x compatibility with LyCORIS training
- Updated documentation to make multi-GPU configuration a bit more obvious.
- Improved support for torch.compile(), including automatically disabling it when e.g. fp8-quanto is enabled
- Enable via accelerate config or config/config.env via TRAINER_DYNAMO_BACKEND=inductor (see the config sketch after this list)
- TorchAO for quantisation as an alternative to Optimum Quanto for int8 weight-only quantisation (int8-torchao; see the sketch after this list)
- f8uz-quanto, a compatibility level for AMD users to experiment with FP8 training dynamics
- Support for multi-GPU PEFT LoRA training with Quanto enabled (not fp8-quanto); previously, only LyCORIS would reliably work with quantised multi-GPU training sessions.
- Ability to quantise models during full fine-tuning, without warning or error. Previously, this configuration was blocked. Your mileage may vary; this is an experimental configuration.
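Two hedged sketches for the items above. First, enabling torch.compile() through config/config.env: TRAINER_DYNAMO_BACKEND=inductor is taken directly from the notes; treat everything else as assumption.

```bash
# config/config.env (sketch): opt into torch.compile() via the inductor backend.
# Per the notes, the trainer automatically disables compilation when
# e.g. fp8-quanto is enabled.
TRAINER_DYNAMO_BACKEND=inductor
```

Second, selecting the new quantisation backends; the precision-level names come from the notes above, but the --base_model_precision flag spelling is an assumption here.

```bash
# Sketch: choose a quantisation backend by precision level (flag name assumed).
python train.py --base_model_precision=int8-torchao   # TorchAO int8 weight-only
# Alternatives named in these notes:
#   f8uz-quanto  - FP8 compatibility level for AMD users
#   fp8-quanto   - not yet supported for multi-GPU PEFT LoRA
```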
Integrations
- Images now get logged to TensorBoard (thanks @anhi)
- FastAPI endpoints for integrations (undocumented)
- "raw" webhook type that sends a large number of HTTP requests containing events, useful for push notification type service
Optims
- SOAP optimiser support: uses fp32 gradients, nice and accurate, but uses more memory than other optims; by default it slows down every 10 steps as it preconditions
- New 8-bit and 4-bit optimiser options from TorchAO (ao-adamw8bit, ao-adamw4bit, etc.)
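A hedged selection sketch for the new optimisers; the optimiser names appear in the notes above, while the --optimizer flag spelling and the train.py entry point are assumptions for illustration.

```bash
# Sketch: pick one of the new optimisers by name (flag spelling assumed).
python train.py --optimizer=soap            # fp32 gradients; preconditions every 10 steps
# python train.py --optimizer=ao-adamw8bit  # TorchAO 8-bit AdamW
# python train.py --optimizer=ao-adamw4bit  # TorchAO 4-bit AdamW
```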
Pull Requests
- Fix flux cfg sampling bug by @AmericanPresidentJimmyCarter in https://github.com/bghira/SimpleTuner/pull/981
- merge by @bghira in https://github.com/bghira/SimpleTuner/pull/982
- FastAPI endpoints for managing trainer as a service by @bghira in https://github.com/bghira/SimpleTuner/pull/969
- constant lr resume fix for optimi-stableadamw by @bghira in https://github.com/bghira/SimpleTuner/pull/984
- clear data backends before configuring new ones by @bghira in https://github.com/bghira/SimpleTuner/pull/992
- update to latest quanto main by @bghira in https://github.com/bghira/SimpleTuner/pull/994
- log images in tensorboard by @anhi in https://github.com/bghira/SimpleTuner/pull/998
- merge by @bghira in https://github.com/bghira/SimpleTuner/pull/999
- torchao: add int8; quanto: add NF4; torch compile fixes + ability to compile optim by @bghira in https://github.com/bghira/SimpleTuner/pull/986
- update flux quickstart by @bghira in https://github.com/bghira/SimpleTuner/pull/1000
- compile optimiser by @bghira in https://github.com/bghira/SimpleTuner/pull/1001
- optimizer compile step only by @bghira in https://github.com/bghira/SimpleTuner/pull/1002
- remove optimiser compilation arg by @bghira in https://github.com/bghira/SimpleTuner/pull/1003
- remove optim compiler from options by @bghira in https://github.com/bghira/SimpleTuner/pull/1004
- remove optim compiler from options by @bghira in https://github.com/bghira/SimpleTuner/pull/1005
- SOAP optimiser; int4 fixes for 4090 by @bghira in https://github.com/bghira/SimpleTuner/pull/1006
- torchao: install 0.5.0 from pytorch source by @bghira in https://github.com/bghira/SimpleTuner/pull/1007
- update safety check warning with guidance toward cache clear interval for OOM issues by @bghira in https://github.com/bghira/SimpleTuner/pull/1008
- fix webhook contents for discord by @bghira in https://github.com/bghira/SimpleTuner/pull/1011
- fp8-quanto fixes, unblocking of PEFT multigpu LoRA training for other precision levels by @bghira in https://github.com/bghira/SimpleTuner/pull/1013
- quanto: activations sledgehammer by @bghira in https://github.com/bghira/SimpleTuner/pull/1014
- 1.1 merge window by @bghira in https://github.com/bghira/SimpleTuner/pull/1010
Full Changelog: https://github.com/bghira/SimpleTuner/compare/v1.0.1...v1.1