v0.12.0
Release date: 2022-12-23 08:13:39
Latest mosaicml/composer release: v0.25.0 (2024-09-25 04:56:05)
:rocket: Composer v0.12.0
Composer v0.12.0 is released! Install via pip:

    pip install mosaicml==0.12.0
New Features
- 🪵 Logging and ObjectStore Enhancements
  There are multiple improvements to our logging and object store support in this release.
  - Image visualization using our CometMLLogger (#1710)
    We've added support for using our ImageVisualizer callback with CometML to log images and segmentation masks to CometML.

        from composer.callbacks import ImageVisualizer
        from composer.loggers import CometMLLogger
        from composer.trainer import Trainer

        trainer = Trainer(
            ...,
            callbacks=[ImageVisualizer()],
            loggers=[CometMLLogger()],
        )
  - Added direct support for Oracle Cloud Infrastructure (OCI) as an ObjectStore (#1774) and support for Google Cloud Storage (GCS) via URI (#1833)
    To use, set your save_folder or load_path to a URI beginning with oci:// or gs:// to save and load with OCI and GCS, respectively.

        from composer.trainer import Trainer

        # Checkpoint saving to Google Cloud Storage.
        trainer = Trainer(
            model=model,
            save_folder="gs://my-bucket/{run_name}/checkpoints",
            run_name='my-run',
            save_interval="1ep",
            save_filename="ep{epoch}.pt",
            save_num_checkpoints_to_keep=0,  # delete all checkpoints locally
            # ... other Trainer arguments
        )
        trainer.fit()
  - Added basic support for logging with MLFlow (#1795)
    We've added basic support for using MLFlow to log experiment metrics.

        from composer.loggers import MLFlowLogger
        from composer.trainer import Trainer

        mlflow_logger = MLFlowLogger(
            experiment_name=mlflow_exp_name,
            run_name=mlflow_run_name,
            tracking_uri=mlflow_uri,
        )
        trainer = Trainer(..., loggers=[mlflow_logger])
  - Simplified console and progress bar logging (#1694)
    To turn off the progress bar, set progress_bar=False. To turn on logging directly to the console, set log_to_console=True. To control the frequency of logging to console, set console_log_interval (e.g. to 1ep or 1ba).
  - Our get_file utility now supports URIs directly (s3://, oci://, and gs://) for downloading files.
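To make the URI convention above concrete, here is a minimal sketch of how a URI such as gs://my-bucket/my-run/checkpoints breaks down into a backend scheme, a bucket, and an object path. The helper name parse_object_store_uri and the SUPPORTED_SCHEMES set are hypothetical and use only Python's standard library; this is not Composer's implementation.

```python
from urllib.parse import urlparse

# Schemes named in these release notes; this set is illustrative only.
SUPPORTED_SCHEMES = {"s3", "oci", "gs"}


def parse_object_store_uri(uri):
    """Split 'scheme://bucket/path' into (scheme, bucket, path).

    Hypothetical helper for illustration, not part of Composer.
    """
    parsed = urlparse(uri)
    if parsed.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"Unsupported URI scheme: {parsed.scheme!r}")
    # netloc holds the bucket; the remaining path is the object prefix.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


# Fill in the {run_name} placeholder the same way a format string would.
folder = "gs://my-bucket/{run_name}/checkpoints".format(run_name="my-run")
print(parse_object_store_uri(folder))
```

The scheme is what would select the backend (S3, OCI, or GCS), which is why a single string argument can be enough to describe where to save or load.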
- 🏃‍♀️ Support for Mid-Epoch Resumption with the latest release of Streaming
  We've added support in Composer for the latest release of our Streaming library. This includes awesome new features like instant mid-epoch resumption and deterministic shuffling, regardless of the number of nodes. See the Streaming release notes for more!
- 🚨 New algorithm - GyroDropout!
  Thanks to @jelite for adding a new algorithm, GyroDropout, to Composer! Please see the method card for more details.
- 🤗 HuggingFace + Composer improvements
  We've added a new utility to load a 🤗 HuggingFace model and tokenizer out of a Composer checkpoint (#1754), making the pretraining -> finetuning workflow even easier in Composer. Check out the docs for more details, and our example notebook for a full tutorial (#1775)!
- 🎓 GradMonitor -> OptimizerMonitor
  We've renamed our GradMonitor callback to OptimizerMonitor and added the ability to track optimizer-specific metrics. Check out the docs for more details, and add it to your code just like any other callback!

      from composer.callbacks import OptimizerMonitor
      from composer.trainer import Trainer

      trainer = Trainer(
          ...,
          callbacks=[OptimizerMonitor(log_optimizer_metrics=log_optimizer_metrics)],
      )
- 🐳 New PyTorch and CUDA versions
  We've expanded our library of Docker images with support for PyTorch 1.13 + CUDA 11.7:
  - mosaicml/pytorch:1.13.0_cu117-python3.10-ubuntu20.04
  - mosaicml/pytorch:1.13.0_cpu-python3.10-ubuntu20.04

  The mosaicml/pytorch:latest, mosaicml/pytorch:cpu_latest and mosaicml/composer:0.12.0 tags are now built from PyTorch 1.13 based images. Please see our DockerHub repository for additional details.
API changes
- Replace grad_accum with device_train_microbatch_size (#1749, #1776)
  We're deprecating the grad_accum Trainer argument in favor of the more intuitive device_train_microbatch_size. Instead of thinking about how to divide your specified minibatch into microbatches, simply specify the size of your microbatch. For example, let's say you want to split your minibatch of 2048 into two microbatches of 1024:

      from composer import Trainer

      trainer = Trainer(
          ...,
          device_train_microbatch_size=1024,
      )

  If you want Composer to tune the microbatch for you automatically, enable automatic microbatching as follows:

      from composer import Trainer

      trainer = Trainer(
          ...,
          device_train_microbatch_size='auto',
      )

  The grad_accum argument is still supported but will be deprecated in the next Composer release.
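The relationship between the old and new arguments comes down to simple arithmetic. The sketch below assumes the batch size in question is the per-device batch and that it is split into consecutive chunks, with the last chunk possibly smaller. The function names num_microbatches and split_into_microbatches are hypothetical and are not Composer APIs.

```python
import math


def num_microbatches(device_batch_size, device_train_microbatch_size):
    """Forward/backward passes needed per optimizer step (illustrative only)."""
    return math.ceil(device_batch_size / device_train_microbatch_size)


def split_into_microbatches(batch, device_train_microbatch_size):
    """Split a list-like batch into consecutive chunks of at most the microbatch size."""
    return [
        batch[i:i + device_train_microbatch_size]
        for i in range(0, len(batch), device_train_microbatch_size)
    ]


# A minibatch of 2048 with microbatches of 1024 plays the role of grad_accum=2.
print(num_microbatches(2048, 1024))
chunks = split_into_microbatches(list(range(2048)), 1024)
print([len(chunk) for chunk in chunks])
```

Specifying the microbatch size directly means the value stays meaningful when the overall batch size changes, whereas a fixed accumulation count would change the memory footprint of each pass.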
- Renamed precisions (#1761)
  We've renamed precision attributes for clarity. The following values have been removed: ['amp', 'fp16', 'bf16']. We have added the following values, prefixed with 'amp' to clarify when an Automatic Mixed Precision type is being used: ['amp_fp16', 'amp_bf16']. The fp32 precision value remains unchanged.
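When migrating configs, it can help to catch removed precision strings early. The following is a minimal sketch built only from the values listed above; check_precision and the two sets are hypothetical names, and the supported set is an assumption limited to what these notes mention, not an exhaustive list from Composer.

```python
# Values taken from the release notes above; treat the sets as illustrative.
REMOVED_PRECISIONS = {"amp", "fp16", "bf16"}
SUPPORTED_PRECISIONS = {"amp_fp16", "amp_bf16", "fp32"}


def check_precision(value):
    """Return value if it is a v0.12.0 precision name, else raise ValueError.

    Hypothetical helper for illustration, not part of Composer.
    """
    if value in SUPPORTED_PRECISIONS:
        return value
    if value in REMOVED_PRECISIONS:
        raise ValueError(
            f"Precision {value!r} was removed in v0.12.0; "
            f"use one of {sorted(SUPPORTED_PRECISIONS)}."
        )
    raise ValueError(f"Unknown precision: {value!r}")


print(check_precision("amp_bf16"))
```

The helper deliberately does not translate old names to new ones, because the notes do not state a one-to-one mapping for every removed value.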
Deprecations
- Removed support for YAHP (#1512)
- Removed COCO and SSD datasets (#1717)
- Fully removed Streaming v1 support; please see the mosaicml/streaming project for our next-gen streaming datasets (#1787)
- Deprecated FusedLayerNorm algorithm (#1789)
- Fully removed grad_clip_norm training argument; please use the GradientClipping algorithm instead (#1768)
- Removed data_fit, data_epoch, and data_batch from Logger (#1826)
Bug Fixes
- Fix FSDP checkpoint strategy (#1734)
- Fix gradient clipping with FSDP (#1740)
- Adds more supported FSDP config flags (sync_module_states, forward_prefetch, limit_all_gathers) (#1794)
- Allow FULL precision with FSDP (#1796)
- Fix eval_microbatch modification on EVAL_BEFORE_FORWARD event (#1739)
- Fix algorithm API backwards compatibility in checkpoints (#1741)
- Fixes a bad None check preventing setting device_id to 0 (#1767)
- Unregister engine to make cleaning up memory easier (#1769)
- Fix issue if metric_names is not a list (#1798)
- Match implementation for list and tensor batch splitting (#1804)
- Fixes infinite eval issue (#1815)
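The device_id fix above belongs to a common class of Python bug: a truthiness test that treats 0 like None. The sketch below is a generic illustration of that pattern, with hypothetical function names and a made-up default of -1; it is not the actual code changed in #1767.

```python
def resolve_device_id_buggy(device_id, default=-1):
    # Truthiness check: 0 is falsy, so device 0 is silently replaced by the default.
    return device_id if device_id else default


def resolve_device_id_fixed(device_id, default=-1):
    # Explicit None check: 0 is a valid device index and is preserved.
    return default if device_id is None else device_id


print(resolve_device_id_buggy(0))  # -1, device 0 is lost
print(resolve_device_id_fixed(0))  # 0
print(resolve_device_id_fixed(None))  # -1
```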
What's Changed
- Update installation constraints for streaming by @karan6181 in https://github.com/mosaicml/composer/pull/1661
- Update decoupled_weight_decay.md by @jacobfulano in https://github.com/mosaicml/composer/pull/1672
- Notebooks part 2 by @dakinggg in https://github.com/mosaicml/composer/pull/1659
- Add trainer arg for engine passes by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1673
- Autoload algorithms by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1658
- Faster metrics calculations + Fix warnings added by the new version of torchmetrics by @dskhudia in https://github.com/mosaicml/composer/pull/1674
- Update coolname requirement from <2,>=1.1.0 to >=1.1.0,<3 by @dependabot in https://github.com/mosaicml/composer/pull/1666
- Bump ipykernel from 6.16.0 to 6.16.1 by @dependabot in https://github.com/mosaicml/composer/pull/1667
- Bump traitlets from 5.4.0 to 5.5.0 by @dependabot in https://github.com/mosaicml/composer/pull/1668
- Image viz by @dakinggg in https://github.com/mosaicml/composer/pull/1676
- Update checks for Gated Linear Units Method by @jacobfulano in https://github.com/mosaicml/composer/pull/1575
- ADE20k streaming factory method by @Landanjs in https://github.com/mosaicml/composer/pull/1626
- Deyahpify cifar10 by @growlix in https://github.com/mosaicml/composer/pull/1677
- Nuke YAHP by @hanlint in https://github.com/mosaicml/composer/pull/1512
- Imagenet streaming factory method by @codestar12 in https://github.com/mosaicml/composer/pull/1649
- Bump ipykernel from 6.16.1 to 6.16.2 by @dependabot in https://github.com/mosaicml/composer/pull/1683
- Bump pytest from 7.1.3 to 7.2.0 by @dependabot in https://github.com/mosaicml/composer/pull/1684
- Bump pypandoc from 1.9 to 1.10 by @dependabot in https://github.com/mosaicml/composer/pull/1680
- Update py-cpuinfo requirement from <9,>=8.0.0 to >=8.0.0,<10 by @dependabot in https://github.com/mosaicml/composer/pull/1681
- Uncomment and clean up algorithms documentation by @growlix in https://github.com/mosaicml/composer/pull/1685
- Update glu check by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1689
- fix backwards compatability by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1693
- Fix engine pass registration by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1692
- Add Low Precision LayerNorm by @nik-mosaic in https://github.com/mosaicml/composer/pull/1525
- Update codeowners by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1691
- Add nccl env var by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1695
- Fix eval timestamp by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1697
- Update distributed docs by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1696
- Return empty dict if wandb disabled by @dakinggg in https://github.com/mosaicml/composer/pull/1698
- Autoresume related error messages by @dakinggg in https://github.com/mosaicml/composer/pull/1687
- Add log_image to wandb, cometml, and LoggerDestination by @eracah in https://github.com/mosaicml/composer/pull/1675
- Pin PyTorch and supporting package versions by @bandish-shah in https://github.com/mosaicml/composer/pull/1688
- Add in unit tests for log_image function for CometMLLogger and WandBLogger by @eracah in https://github.com/mosaicml/composer/pull/1701
- refactor devices by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1699
- remove as in device by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1704
- Fix device imports by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1705
- Fix typing in EMA's _move_params_to_device() by @coryMosaicML in https://github.com/mosaicml/composer/pull/1707
- Add docs for saving and loading checkpoints with GCS by @eracah in https://github.com/mosaicml/composer/pull/1702
- Clean up imports by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1700
- Add rud docs by @eracah in https://github.com/mosaicml/composer/pull/1709
- Bump cryptography from 38.0.1 to 38.0.3 by @dependabot in https://github.com/mosaicml/composer/pull/1712
- GHA workflow for code quality checks by @bandish-shah in https://github.com/mosaicml/composer/pull/1719
- Add support for Path in CheckpointSaver by @cojennin in https://github.com/mosaicml/composer/pull/1721
- Docs Typo by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1723
- Bump nbsphinx from 0.8.9 to 0.8.10 by @dependabot in https://github.com/mosaicml/composer/pull/1725
- Bump sphinx-argparse from 0.3.2 to 0.4.0 by @dependabot in https://github.com/mosaicml/composer/pull/1726
- Simple nlp tests by @dakinggg in https://github.com/mosaicml/composer/pull/1716
- Build Streaming CIFAR10 Factory Function by @growlix in https://github.com/mosaicml/composer/pull/1729
- Change build_streaming_cifar10_dataloader() to use v2 by default by @growlix in https://github.com/mosaicml/composer/pull/1730
- Clear the Optimizer before wrapping with FSDP by @bcui19 in https://github.com/mosaicml/composer/pull/1732
- Add inf eval check by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1733
- Fix fsdp checkpoint strategy by @bcui19 in https://github.com/mosaicml/composer/pull/1734
- Assign eval microbatch to self.state.batch by @dakinggg in https://github.com/mosaicml/composer/pull/1739
- Add masks to wandblogger.log_image and cometmllogger.log_image and refactor ImageVisualizer to use log_image [WIP] by @eracah in https://github.com/mosaicml/composer/pull/1710
- Protect backwards compatability by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1741
- Add composer version state by @dakinggg in https://github.com/mosaicml/composer/pull/1742
- Adds auto object store creation to get_file by @dakinggg in https://github.com/mosaicml/composer/pull/1750
- Log console interval by @eracah in https://github.com/mosaicml/composer/pull/1694
- Bump sphinxcontrib-katex from 0.9.0 to 0.9.3 by @dependabot in https://github.com/mosaicml/composer/pull/1757
- Bump pandoc from 2.2 to 2.3 by @dependabot in https://github.com/mosaicml/composer/pull/1756
- Bump cryptography from 38.0.3 to 38.0.4 by @dependabot in https://github.com/mosaicml/composer/pull/1755
- Add more event tests by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1762
- Add python 3.10, pytorch 1.13, cuda 11.7 by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1735
- Add huggingface info to state dict by @dakinggg in https://github.com/mosaicml/composer/pull/1744
- Global batch size by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1746
- Add device to state by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1765
- Rename precisions by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1761
- Device id none by @dakinggg in https://github.com/mosaicml/composer/pull/1767
- Autoload HuggingFace model/tokenizer by @dakinggg in https://github.com/mosaicml/composer/pull/1754
- Supporting train_device_microbatch_size by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1749
- Switch flash attention to tag by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1766
- remove grad clip norm by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1768
- unregister engine for memory cleanup by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1769
- Fix hf tokenizer test for new hf version by @dakinggg in https://github.com/mosaicml/composer/pull/1772
- Decrease microbatch size if batch size is smaller by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1771
- remove deprecated code by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1773
- cache call to cpuinfo by @dakinggg in https://github.com/mosaicml/composer/pull/1778
- device train microbatch size pt 2 by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1776
- Huggingface pretrain + finetune notebook by @dakinggg in https://github.com/mosaicml/composer/pull/1775
- Bump traitlets from 5.5.0 to 5.6.0 by @dependabot in https://github.com/mosaicml/composer/pull/1781
- Bump deepspeed from 0.7.5 to 0.7.6 by @dependabot in https://github.com/mosaicml/composer/pull/1780
- Minor docs fix for deepspeed typo by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1784
- Update Auto Microbatching by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1785
- Adding GyroDropout as an algorithm to Composer by @jelite in https://github.com/mosaicml/composer/pull/1718
- Add Deprecation warning for Fused LayerNorm by @nik-mosaic in https://github.com/mosaicml/composer/pull/1789
- Update error msgs by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1791
- Change gyro emoji by @nik-mosaic in https://github.com/mosaicml/composer/pull/1792
- Speeding up tests by @dakinggg in https://github.com/mosaicml/composer/pull/1779
- Add durations arg to pytest by @dakinggg in https://github.com/mosaicml/composer/pull/1793
- Properly implement gradient clipping for FSDP by @bcui19 in https://github.com/mosaicml/composer/pull/1740
- Updating FSDP supported config flags by @bcui19 in https://github.com/mosaicml/composer/pull/1794
- Remove streaming v1 datasets. by @knighton in https://github.com/mosaicml/composer/pull/1787
- Remove references to validate in docs by @dakinggg in https://github.com/mosaicml/composer/pull/1800
- Install latest Git in Docker images by @bandish-shah in https://github.com/mosaicml/composer/pull/1770
- move to pypi release for flash attn by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1777
- Check and make sure that metric names is a list of strings by @dakinggg in https://github.com/mosaicml/composer/pull/1798
- Adding in the possibility of 'None' for MixedPrecision FSDP by @bcui19 in https://github.com/mosaicml/composer/pull/1796
- Updating assertion check for gradient clipping and updating gradient clip tests for FSDP by @bcui19 in https://github.com/mosaicml/composer/pull/1802
- Moving Pytest CPU to GHA by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1790
- Bump sphinxext-opengraph from 0.6.3 to 0.7.3 by @dependabot in https://github.com/mosaicml/composer/pull/1760
- Update distributed_training.rst by @lupesko in https://github.com/mosaicml/composer/pull/1731
- Use streaming v3 by @knighton in https://github.com/mosaicml/composer/pull/1797
- Bump traitlets from 5.6.0 to 5.7.0 by @dependabot in https://github.com/mosaicml/composer/pull/1806
- Bump ipykernel from 6.16.2 to 6.19.2 by @dependabot in https://github.com/mosaicml/composer/pull/1810
- Update packaging requirement from <22,>=21.3.0 to >=21.3.0,<23 by @dependabot in https://github.com/mosaicml/composer/pull/1808
- match list batch splitting and tensor batch splitting by @dakinggg in https://github.com/mosaicml/composer/pull/1804
- Add type ignore for onnx import by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1811
- Remove pip install all from coverage action by @dakinggg in https://github.com/mosaicml/composer/pull/1805
- Remove coco and ssd by @growlix in https://github.com/mosaicml/composer/pull/1717
- Rename matrix by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1813
- Add OCI ObjectStore by @eracah in https://github.com/mosaicml/composer/pull/1774
- Add MLFlowLogger by @eracah in https://github.com/mosaicml/composer/pull/1795
- Object store docs by @dakinggg in https://github.com/mosaicml/composer/pull/1817
- fix inf eval by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1815
- Add fsdp_config to state and add fsdp_config to trainer docstring by @growlix in https://github.com/mosaicml/composer/pull/1821
- Add SHARP support to docker by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1818
- Testing Infra Cleanup by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1822
- Remove dead code in dockerfile by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1823
- Fix Export Docs by @mvpatel2000 in https://github.com/mosaicml/composer/pull/1824
- Remove old deprecated logger methods by @eracah in https://github.com/mosaicml/composer/pull/1826
- NLP metrics tests by @dakinggg in https://github.com/mosaicml/composer/pull/1830
- Nlp pipeline test by @dakinggg in https://github.com/mosaicml/composer/pull/1828
- Add tests for uri helper functions by @eracah in https://github.com/mosaicml/composer/pull/1827
- Add pip targets to installation.rst docs by @eracah in https://github.com/mosaicml/composer/pull/1829
New Contributors
- @cojennin made their first contribution in https://github.com/mosaicml/composer/pull/1721
- @jelite made their first contribution in https://github.com/mosaicml/composer/pull/1718
Full Changelog: https://github.com/mosaicml/composer/compare/v0.11.1...v0.12.0