v0.23.3
Release date: 2024-06-21 08:18:38
Latest mosaicml/composer release: v0.25.0 (2024-09-25 04:56:05)
New Features
1. Update MLflow logger to use the new API with time-dimension to view images in MLflow (#3286)
We've enhanced the MLflow logger's log_image function to use the new API with time-dimension support, enabling images to be viewed in MLflow.
2. Add logging buffer time to MLflow logger (#3401)
We've added the logging_buffer_seconds argument to the MLflow logger, which specifies how many seconds to buffer before sending logs to the MLflow tracking server.
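The buffering idea behind logging_buffer_seconds can be sketched as follows. This is a minimal illustration, not Composer's or MLflow's implementation: the class BufferedMetricSender, its send_fn callback, and the injectable clock are all hypothetical names chosen for the example.

```python
import time
from typing import Any, Callable, Dict, List


class BufferedMetricSender:
    """Hypothetical helper: holds metrics and sends them in one batch
    once the buffer window has elapsed."""

    def __init__(
        self,
        send_fn: Callable[[List[Dict[str, Any]]], None],
        buffer_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send_fn = send_fn          # stand-in for the call to a tracking server
        self._buffer_seconds = buffer_seconds
        self._clock = clock              # injectable so the behavior can be tested
        self._pending: List[Dict[str, Any]] = []
        self._last_flush = clock()

    def log(self, metrics: Dict[str, Any]) -> None:
        """Queue metrics; flush only when the buffer window has passed."""
        self._pending.append(metrics)
        if self._clock() - self._last_flush >= self._buffer_seconds:
            self.flush()

    def flush(self) -> None:
        """Send everything queued so far as a single batch."""
        if self._pending:
            self._send_fn(self._pending)
            self._pending = []
        self._last_flush = self._clock()
```

A larger buffer means fewer requests to the tracking server, at the cost of metrics appearing later; calling flush() at the end of a run avoids losing the final batch.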
Bug Fixes
1. Only require databricks-sdk when on the Databricks platform (#3389)
Previously, the MLflow logger always imported databricks-sdk. Now, the SDK is required only when running on the Databricks platform and using Databricks secrets to access managed MLflow.
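The general pattern of importing an optional dependency only when it is needed might look like the sketch below. The helper load_if_required and its required flag are hypothetical; how Composer actually detects the Databricks platform is not described in these notes, so that decision is left to the caller.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_if_required(module_name: str, required: bool) -> Optional[ModuleType]:
    """Hypothetical helper: import a module only when the caller needs it.

    When `required` is False the import is never attempted, so a missing
    package cannot break users who do not depend on it.
    """
    if not required:
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"{module_name} is needed in this environment but is not installed."
        ) from e
```

With this shape, users outside the platform never pay for the dependency, while users who do need it get a clear error message instead of a failure deep inside the logger.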
2. Skip extra dataset state load during job resumption (#3393)
Previously, when loading a checkpoint with train_dataloader, the dataset_state would load first, and if train_dataloader was set again afterward, load_state_dict would be called with a None value. Now, we've added a check in the train_dataloader setter to skip this redundant load.
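A guard of this kind could be sketched as below. This is an illustration under assumptions, not Composer's code: the classes TrainState and FakeLoader are hypothetical stand-ins, and only the names train_dataloader, dataset_state, and load_state_dict come from the notes above.

```python
from typing import Any, Dict, List, Optional


class FakeLoader:
    """Hypothetical stand-in for a dataloader that can restore its state."""

    def __init__(self) -> None:
        self.loaded: List[Any] = []

    def load_state_dict(self, state: Dict[str, Any]) -> None:
        self.loaded.append(state)


class TrainState:
    """Hypothetical container holding a dataloader and pending dataset state."""

    def __init__(self) -> None:
        self.dataset_state: Optional[Dict[str, Any]] = None
        self._train_dataloader: Optional[FakeLoader] = None

    @property
    def train_dataloader(self) -> Optional[FakeLoader]:
        return self._train_dataloader

    @train_dataloader.setter
    def train_dataloader(self, dataloader: Optional[FakeLoader]) -> None:
        self._train_dataloader = dataloader
        # Restore only when there is pending state; once consumed it is
        # cleared, so setting the dataloader again skips the redundant load.
        if dataloader is not None and self.dataset_state is not None:
            dataloader.load_state_dict(self.dataset_state)
            self.dataset_state = None
```

Setting the dataloader a second time then leaves the loader untouched instead of calling load_state_dict with None.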
3. Fix auto-microbatching on CUDA 12.4 (#3400)
In CUDA 12.4, the out-of-memory error message has changed to 'CUDA error: out of memory'. Previously, our logic hardcoded checks for 'CUDA out of memory' when using device_train_microbatch_size="auto". Now, we check for both 'CUDA out of memory' and 'CUDA error: out of memory'.
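A check covering both message variants could be written roughly as follows. The function name is_cuda_oom and the constant OOM_MARKERS are hypothetical, and the sketch assumes the error surfaces as a RuntimeError; only the two message strings are taken from the notes above.

```python
# The two message variants named in the release notes.
OOM_MARKERS = ("CUDA out of memory", "CUDA error: out of memory")


def is_cuda_oom(error: BaseException) -> bool:
    """Return True if the exception looks like a CUDA out-of-memory error."""
    # Assumption: out-of-memory failures are raised as RuntimeError
    # (or a subclass of it).
    if not isinstance(error, RuntimeError):
        return False
    message = str(error)
    return any(marker in message for marker in OOM_MARKERS)
```

A caller that retries with a smaller microbatch size would use such a check to tell memory exhaustion apart from unrelated runtime errors, which should still be re-raised.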
4. Fix MLflow logging to Databricks workspace file paths that start with the /Shared/ prefix (#3410)
Previously, for MLflow logging on the Databricks platform, we prepended /Users/ to every user-provided logging path that did not already specify it, including paths starting with /Shared/. That was incorrect, since /Shared/ indicates a shared workspace. Now, the /Users/ prefix is skipped for paths starting with /Shared/.
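The corrected path handling can be illustrated with a small sketch. The function resolve_experiment_path is hypothetical, and placing a username segment after /Users/ is an assumption made for the example; the notes only state that /Users/ is prepended when missing and skipped for /Shared/ paths.

```python
def resolve_experiment_path(path: str, username: str) -> str:
    """Hypothetical helper: decide where a logging path lives in the workspace."""
    # Paths that already name a workspace root are left untouched.
    if path.startswith("/Shared/") or path.startswith("/Users/"):
        return path
    # Assumption: relative names are placed under the user's own folder.
    return "/Users/" + username + "/" + path.strip("/")
```

For example, resolve_experiment_path("/Shared/team/run", "someone@example.com") returns the path unchanged, while a bare name such as "my-experiment" is placed under the user's folder.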
What's Changed
- Bump CI from 0.0.7 to 0.0.8 by @KuuCi in https://github.com/mosaicml/composer/pull/3383
- Fix backward compatibility caused by missing eval metrics class by @bigning in https://github.com/mosaicml/composer/pull/3385
- Bump version v0.23.2 by @bigning in https://github.com/mosaicml/composer/pull/3386
- Restore dev version by @bigning in https://github.com/mosaicml/composer/pull/3388
- Only requires databricks-sdk when inside the Databricks platform by @antoinebrl in https://github.com/mosaicml/composer/pull/3389
- Update packaging requirement from <24.1,>=21.3.0 to >=21.3.0,<24.2 by @dependabot in https://github.com/mosaicml/composer/pull/3392
- Bump cryptography from 42.0.6 to 42.0.8 by @dependabot in https://github.com/mosaicml/composer/pull/3391
- Skip extra dataset state load by @mvpatel2000 in https://github.com/mosaicml/composer/pull/3393
- Remove FSDP restriction from PyTorch 1.13 by @mvpatel2000 in https://github.com/mosaicml/composer/pull/3395
- Check for 'CUDA error: out of memory' when auto-microbatching by @JAEarly in https://github.com/mosaicml/composer/pull/3400
- Add tokens to iterations by @b-chu in https://github.com/mosaicml/composer/pull/3374
- Busy wait utils in dist by @dakinggg in https://github.com/mosaicml/composer/pull/3396
- Add buffering time to mlflow logger by @chenmoneygithub in https://github.com/mosaicml/composer/pull/3401
- Add missing import for PyTorch 2.3.1 device mesh slicing by @mvpatel2000 in https://github.com/mosaicml/composer/pull/3402
- Add pynvml to mlflow dep group by @dakinggg in https://github.com/mosaicml/composer/pull/3404
- min/max flagging added to system_metrics_monitor with only non-redundant, necessary gpu metrics logged by @JackZ-db in https://github.com/mosaicml/composer/pull/3373
- Simplify launcher world size parsing by @mvpatel2000 in https://github.com/mosaicml/composer/pull/3398
- Optionally use flash-attn's CE loss for metrics by @snarayan21 in https://github.com/mosaicml/composer/pull/3394
- log image fix by @jessechancy in https://github.com/mosaicml/composer/pull/3286
- [ckpt-rewr] Save state dict API by @eracah in https://github.com/mosaicml/composer/pull/3372
- Revert "Optionally use flash-attn's CE loss for metrics (#3394)" by @snarayan21 in https://github.com/mosaicml/composer/pull/3408
- CPU tests image fix by @snarayan21 in https://github.com/mosaicml/composer/pull/3409
- Add setter for epoch in iteration by @b-chu in https://github.com/mosaicml/composer/pull/3407
- Move pillow dep as required by @mvpatel2000 in https://github.com/mosaicml/composer/pull/3412
- fixing mlflow logging to Databricks workspace file paths with /Shared/ prefix by @JackZ-db in https://github.com/mosaicml/composer/pull/3410
- Bump version v0.23.3 by @karan6181 in https://github.com/mosaicml/composer/pull/3414
New Contributors
- @JackZ-db made their first contribution in https://github.com/mosaicml/composer/pull/3373
Full Changelog: https://github.com/mosaicml/composer/compare/v0.23.2...v0.23.3