v0.8.1
Released: 2022-07-23 07:23:57
🚀 Composer v0.8.1
Composer v0.8.1 is released! Install via pip:

```bash
pip install --upgrade mosaicml==0.8.1
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.8.1
```
🎁 New Features
- 🖼️ Image Visualizer

  The `ImageVisualizer` callback periodically logs the training and validation images when using the WandB logger. This is great for validating your dataloader pipeline, especially if extensive data augmentations are used. Also, when training on a semantic segmentation task, the callback can log the target segmentation mask and the predicted segmentation mask by setting the argument `mode='segmentation'`. See PR #1266 for more details. Here is an example of using the `ImageVisualizer` callback:

  ```python
  from composer import Trainer
  from composer.callbacks import ImageVisualizer

  # Callback to log 8 training images after every 100 batches
  image_visualizer = ImageVisualizer()

  # Construct trainer
  trainer = Trainer(
      ...,
      callbacks=image_visualizer,
  )

  # Train!
  trainer.fit()
  ```

  Here is an example visualization from the training set of ADE20k:
- 📶 TensorBoard Logging

  You can now log metrics and losses from your Composer training runs with TensorBoard! See #1250 and #1283 for more details. All you have to do is create a `TensorboardLogger` object and add it to the list of loggers in your `Trainer` object like so:

  ```python
  from composer import Trainer
  from composer.loggers import TensorboardLogger

  tb_logger = TensorboardLogger(log_dir="./my_tensorboard_logs")

  trainer = Trainer(
      ...,
      # Add your TensorBoard logger to the trainer here.
      loggers=[tb_logger],
  )

  trainer.fit()
  ```

  For more information, see this tutorial.
- 🔙 Multiple Losses

  Adds support for multiple losses. If a model returns a tuple of losses, they are summed before the `loss.backward()` call. See #1240 for more details.
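
  As an illustration, here is a minimal sketch (our own example, not from the release notes) of a `ComposerModel` whose `loss()` returns a tuple; Composer sums the entries before calling `loss.backward()`:

  ```python
  import torch
  import torch.nn.functional as F

  from composer.models import ComposerModel

  class MultiLossModel(ComposerModel):
      """Hypothetical model whose loss() returns a tuple of losses."""

      def __init__(self, module: torch.nn.Module):
          super().__init__()
          self.module = module

      def forward(self, batch):
          inputs, _ = batch
          return self.module(inputs)

      def loss(self, outputs, batch, *args, **kwargs):
          _, targets = batch
          primary = F.cross_entropy(outputs, targets)
          auxiliary = 0.1 * outputs.pow(2).mean()  # e.g., an output penalty
          # Returning a tuple; Composer sums these before loss.backward()
          return primary, auxiliary
  ```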
- 🌎️ Stream Datasets from HTTP URIs

  You can now specify an HTTP URI for a Streaming Dataset remote. See #1258 for more details. For example:

  ```python
  from composer import Trainer
  from composer.datasets.streaming import StreamingDataset
  from torch.utils.data import DataLoader

  # Construct the Dataset
  dataset = StreamingDataset(
      ...,
      remote="https://example.com/dataset/",
  )

  # Construct the DataLoader
  train_dl = DataLoader(dataset)

  # Construct the Trainer
  trainer = Trainer(
      ...,
      train_dataloader=train_dl,
  )

  # Train!
  trainer.fit()
  ```

  For more information on streaming datasets, see this tutorial.
- 🏄️ GPU Devices default to TF32 Matmuls

  Beginning with PyTorch 1.12, the default behavior for computing FP32 matrix multiplies on NVIDIA Ampere devices was switched from TF32 to FP32. See the PyTorch documentation here.

  Since Composer is designed specifically for ML training with a focus on efficiency, we choose to preserve the old default of using TF32 on Ampere devices. This leads to significantly higher throughput when training in single precision, without impacting training convergence. See PR #1275 for implementation details.
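
  If you want to override this yourself, the underlying switches are standard PyTorch flags rather than a Composer API; a minimal sketch to opt back into full-FP32 matmuls:

  ```python
  import torch

  # Composer enables TF32 matmuls on Ampere GPUs by default.
  # To opt back into full-FP32 matmuls, flip the PyTorch flag:
  torch.backends.cuda.matmul.allow_tf32 = False

  # cuDNN convolutions have a separate, independent TF32 flag:
  torch.backends.cudnn.allow_tf32 = False
  ```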
- 👋 Set the Device ID for GPU Devices

  Specify the device ID within a `DeviceGPU` to train on when instantiating a `Trainer` object instead of using the local ID! For example:

  ```python
  from composer import Trainer
  from composer.trainer.devices.device_gpu import DeviceGPU

  # Specify to use GPU 3 to train
  device = DeviceGPU(device_id=3)

  # Construct the Trainer
  trainer = Trainer(
      ...,
      device=device,
  )

  # Train!
  trainer.fit()
  ```
- BERT and C4 Updates

  We make some minor adjustments to our `bert-base-uncased.yaml` training config. In particular, we make the global train and eval batch sizes a power of 2. This maintains divisibility when using many GPUs in multi-node training. We also adjust the `max_duration` so that it converts cleanly to 70,000 batches.

  We also upgrade our StreamingDataset C4 conversion script (`scripts/mds/c4.py`) to use a multi-threaded reader. On a 64-core machine, we are able to convert the 770GB train split to `.mds` format in ~1.5hr.
- 📂 Set a `prefix` when using an `S3ObjectStore`

  When using `S3ObjectStore` for applications like checkpointing, it can be useful to provide path prefixes, mimicking `folder/subfolder` directories like on a local filesystem. When `prefix` is provided, any objects uploaded with `S3ObjectStore` will be stored at `f's3://{self.bucket}/{self.prefix}{object_name}'`.
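
  For illustration, a minimal sketch (the import path and the `upload_object` call reflect our reading of the 0.8-era API and should be treated as assumptions):

  ```python
  from composer.utils.object_store import S3ObjectStore

  # With a prefix, objects are stored under s3://my-bucket/checkpoints/run-1/
  store = S3ObjectStore(bucket="my-bucket", prefix="checkpoints/run-1/")

  # This upload lands at s3://my-bucket/checkpoints/run-1/ep1.pt
  store.upload_object(object_name="ep1.pt", filename="/tmp/ep1.pt")
  ```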
- ⚖️ Scale the Warmup Period of Composer Schedulers

  Added a new flag `scale_warmup` to schedulers that will scale the warmup period when a scale schedule ratio is applied. Default is `False` to preserve the existing behavior. See #1268 for more details.
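
  A minimal sketch (the scheduler class and argument values are our own example, not from the release notes):

  ```python
  from composer import Trainer
  from composer.optim import CosineAnnealingWithWarmupScheduler

  # With scale_warmup=True, the 100-batch warmup is also scaled by
  # scale_schedule_ratio (here, down to 50 batches).
  scheduler = CosineAnnealingWithWarmupScheduler(t_warmup="100ba", scale_warmup=True)

  trainer = Trainer(
      ...,
      schedulers=scheduler,
      scale_schedule_ratio=0.5,
  )
  ```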
- 🧊 Stochastic Depth on Residual Blocks

  Residual blocks are detected automatically and replaced with stochastic versions. See #1253 for more details.
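
  A minimal sketch of enabling the algorithm (the argument values here are our own example, not from the release notes):

  ```python
  from composer import Trainer
  from composer.algorithms import StochasticDepth

  # Detected residual blocks are replaced with stochastic versions
  stochastic_depth = StochasticDepth(
      target_layer_name='ResNetBottleneck',
      drop_rate=0.2,
  )

  trainer = Trainer(
      ...,
      algorithms=[stochastic_depth],
  )
  ```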
🐛 Bug Fixes
- Fixed Progress Bars

  Fixed a bug where the progress bars jumped around and did not stream properly when tailing the terminal over the network. Fixed in #1264, #1287, and #1289.
- Fixed S3ObjectStore in Multithreaded Environments

  Fixed a bug where `boto3` crashed when creating the default session in multiple threads simultaneously (see https://github.com/boto/boto3/issues/1592). Fixed in #1260.
- Retry on `ChannelException` errors in the `SFTPObjectStore`

  Catch transient SFTP `ChannelException` errors and retry. Fixed in #1245.
- Treating S3 Permission Denied Errors as Not Found Errors

  We update our handling of `botocore` 403 ClientErrors to interpret them as `FileNotFoundError`s. We do this because of a situation that occurs when a user has no S3 credentials configured and tries to read from a bucket with public files. For privacy, Amazon S3 raises 403 (Permission Denied) instead of 404 (Not Found) errors. As such, PR #1249 treats 403 ClientErrors as FileNotFoundErrors.
- Fixed Parsing of `grad_accum` in the `TrainerHparams`

  Fixes an error where the command line override `--grad_accum` led to incorrect parsing. Fixed in #1256.
- Fixed Example YAML Files

  Our recipe configurations (YAML) are updated to the latest version, and a test was added to enforce correctness moving forward. Fixed in #1235 and #1257.
Changelog
https://github.com/mosaicml/composer/compare/v0.8.0...v0.8.1