v0.8.1
Released: 2022-07-23 07:23:57
🚀 Composer v0.8.1
Composer v0.8.1 is released! Install via pip:

```bash
pip install --upgrade mosaicml==0.8.1
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.8.1
```
🎁 New Features
- 🖼️ Image Visualizer

  The `ImageVisualizer` callback periodically logs the training and validation images when using the WandB logger. This is great for validating your dataloader pipeline, especially if extensive data augmentations are used. Also, when training on a semantic segmentation task, the callback can log the target segmentation mask and the predicted segmentation mask by setting the argument `mode='segmentation'`. See PR #1266 for more details. Here is an example of using the `ImageVisualizer` callback:

  ```python
  from composer import Trainer
  from composer.callbacks import ImageVisualizer

  # Callback to log 8 training images after every 100 batches
  image_visualizer = ImageVisualizer()

  # Construct trainer
  trainer = Trainer(
      ...,
      callbacks=image_visualizer,
  )

  # Train!
  trainer.fit()
  ```

  Here is an example visualization from the training set of ADE20k:
- 📶 TensorBoard Logging

  You can now log metrics and losses from your Composer training runs with TensorBoard! See #1250 and #1283 for more details. All you have to do is create a `TensorboardLogger` object and add it to the list of loggers in your `Trainer` object like so:

  ```python
  from composer import Trainer
  from composer.loggers import TensorboardLogger

  tb_logger = TensorboardLogger(log_dir="./my_tensorboard_logs")

  trainer = Trainer(
      ...,
      # Add your TensorBoard logger to the trainer here.
      loggers=[tb_logger],
  )

  trainer.fit()
  ```

  For more information, see this tutorial.
- 🔙 Multiple Losses

  Adds support for multiple losses. If a model returns a tuple of losses, they are summed before the `loss.backward()` call. See #1240 for more details.
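
  As an illustration, here is a minimal sketch (our own example, not from the release notes) of a `ComposerModel` whose `loss()` returns a tuple; Composer sums the entries before calling `loss.backward()`:

  ```python
  import torch
  import torch.nn.functional as F

  from composer.models import ComposerModel

  class MultiLossModel(ComposerModel):
      """Hypothetical model whose loss() returns a tuple of losses."""

      def __init__(self, module: torch.nn.Module):
          super().__init__()
          self.module = module

      def forward(self, batch):
          inputs, _ = batch
          return self.module(inputs)

      def loss(self, outputs, batch, *args, **kwargs):
          _, targets = batch
          primary = F.cross_entropy(outputs, targets)
          auxiliary = 0.1 * outputs.pow(2).mean()  # e.g., an output penalty
          # Returning a tuple; Composer sums these before loss.backward()
          return primary, auxiliary
  ```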
- 🌎️ Stream Datasets from HTTP URIs

  You can now specify an HTTP URI for a Streaming Dataset remote. See #1258 for more details. For example:

  ```python
  from composer import Trainer
  from composer.datasets.streaming import StreamingDataset
  from torch.utils.data import DataLoader

  # Construct the Dataset
  dataset = StreamingDataset(
      ...,
      remote="https://example.com/dataset/",
  )

  # Construct the DataLoader
  train_dl = DataLoader(dataset)

  # Construct the Trainer
  trainer = Trainer(
      ...,
      train_dataloader=train_dl,
  )

  # Train!
  trainer.fit()
  ```

  For more information on streaming datasets, see this tutorial.
- 🏄️ GPU Devices default to TF32 Matmuls

  Beginning with PyTorch 1.12, the default behavior for computing FP32 matrix multiplies on NVIDIA Ampere devices was switched from TF32 to FP32. See the PyTorch documentation here.

  Since Composer is designed specifically for ML training with a focus on efficiency, we choose to preserve the old default of using TF32 on Ampere devices. This leads to significantly higher throughput when training in single precision, without impacting training convergence. See PR #1275 for implementation details.
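
  If you want to override this yourself, the underlying switches are standard PyTorch flags rather than a Composer API; a minimal sketch to opt back into full-FP32 matmuls:

  ```python
  import torch

  # Composer enables TF32 matmuls on Ampere GPUs by default.
  # To opt back into full-FP32 matmuls, flip the PyTorch flag:
  torch.backends.cuda.matmul.allow_tf32 = False

  # cuDNN convolutions have a separate, independent TF32 flag:
  torch.backends.cudnn.allow_tf32 = False
  ```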
- 👋 Set the Device ID for GPU Devices

  Specify the device ID within a `DeviceGPU` to train on when instantiating a `Trainer` object instead of using the local ID! For example:

  ```python
  from composer import Trainer
  from composer.trainer.devices.device_gpu import DeviceGPU

  # Specify to use GPU 3 to train
  device = DeviceGPU(device_id=3)

  # Construct the Trainer
  trainer = Trainer(
      ...,
      device=device,
  )

  # Train!
  trainer.fit()
  ```
- BERT and C4 Updates

  We make some minor adjustments to our `bert-base-uncased.yaml` training config. In particular, we make the global train and eval batch sizes a power of 2. This maintains divisibility when using many GPUs in multi-node training. We also adjust the `max_duration` so that it converts cleanly to 70,000 batches.

  We also upgrade our StreamingDataset C4 conversion script (`scripts/mds/c4.py`) to use a multi-threaded reader. On a 64-core machine, we are able to convert the 770GB train split to `.mds` format in ~1.5hr.
- 📂 Set a `prefix` when using an `S3ObjectStore`

  When using `S3ObjectStore` for applications like checkpointing, it can be useful to provide path prefixes, mimicking `folder/subfolder` directories like on a local filesystem. When `prefix` is provided, any objects uploaded with `S3ObjectStore` will be stored at `f's3://{self.bucket}/{self.prefix}{object_name}'`.
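
  For illustration, a minimal sketch (the import path and the `upload_object` call reflect our reading of the 0.8-era API and should be treated as assumptions):

  ```python
  from composer.utils.object_store import S3ObjectStore

  # With a prefix, objects are stored under s3://my-bucket/checkpoints/run-1/
  store = S3ObjectStore(bucket="my-bucket", prefix="checkpoints/run-1/")

  # This upload lands at s3://my-bucket/checkpoints/run-1/ep1.pt
  store.upload_object(object_name="ep1.pt", filename="/tmp/ep1.pt")
  ```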
- ⚖️ Scale the Warmup Period of Composer Schedulers

  Added a new flag `scale_warmup` to schedulers that will scale the warmup period when a scale schedule ratio is applied. Default is `False` to preserve the existing behavior. See #1268 for more details.
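
  A minimal sketch (the scheduler class and argument values are our own example, not from the release notes):

  ```python
  from composer import Trainer
  from composer.optim import CosineAnnealingWithWarmupScheduler

  # With scale_warmup=True, the 100-batch warmup is also scaled by
  # scale_schedule_ratio (here, down to 50 batches).
  scheduler = CosineAnnealingWithWarmupScheduler(t_warmup="100ba", scale_warmup=True)

  trainer = Trainer(
      ...,
      schedulers=scheduler,
      scale_schedule_ratio=0.5,
  )
  ```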
- 🧊 Stochastic Depth on Residual Blocks

  Residual blocks are detected automatically and replaced with stochastic versions. See #1253 for more details.
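
  A minimal sketch of enabling the algorithm (the argument values here are our own example, not from the release notes):

  ```python
  from composer import Trainer
  from composer.algorithms import StochasticDepth

  # Detected residual blocks are replaced with stochastic versions
  stochastic_depth = StochasticDepth(
      target_layer_name='ResNetBottleneck',
      drop_rate=0.2,
  )

  trainer = Trainer(
      ...,
      algorithms=[stochastic_depth],
  )
  ```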
🐛 Bug Fixes
- Fixed Progress Bars

  Fixed a bug where the progress bars jumped around and did not stream properly when tailing the terminal over the network. Fixed in #1264, #1287, and #1289.
- Fixed S3ObjectStore in Multithreaded Environments

  Fixed a bug where `boto3` crashed when creating the default session in multiple threads simultaneously (see https://github.com/boto/boto3/issues/1592). Fixed in #1260.
- Retry on `ChannelException` errors in the `SFTPObjectStore`

  Catch transient SFTP `ChannelException` errors and retry. Fixed in #1245.
- Treating S3 Permission Denied Errors as Not Found Errors

  We update our handling of `botocore` 403 ClientErrors to interpret them as `FileNotFoundError`s. We do this because of a situation that occurs when a user has no S3 credentials configured and tries to read from a bucket with public files. For privacy, Amazon S3 raises 403 (Permission Denied) instead of 404 (Not Found) errors. As such, PR #1249 treats 403 ClientErrors as FileNotFoundErrors.
- Fixed Parsing of `grad_accum` in the `TrainerHparams`

  Fixes an error where the command line override `--grad_accum` led to incorrect parsing. Fixed in #1256.
- Fixed Example YAML Files

  Our recipe configurations (YAML) are updated to the latest version, and a test was added to enforce correctness moving forward. Fixed in #1235 and #1257.
Changelog
https://github.com/mosaicml/composer/compare/v0.8.0...v0.8.1