v0.8.1

mosaicml/composer

Release time: 2022-07-23 07:23:57

Latest mosaicml/composer release: v0.23.5 (2024-07-03 10:08:01)

🚀 Composer v0.8.1

Composer v0.8.1 is released! Install via pip:

pip install --upgrade mosaicml==0.8.1

Alternatively, install Composer with Conda:

conda install -c mosaicml mosaicml=0.8.1

🎁 New Features

  1. 🖼️ Image Visualizer

    The ImageVisualizer callback periodically logs the training and validation images when using the WandB logger. This is great for validating your dataloader pipeline, especially if extensive data augmentations are used. Also, when training on a semantic segmentation task, the callback can log the target segmentation mask and the predicted segmentation mask by setting the argument mode='segmentation'. See PR #1266 for more details. Here is an example of using the ImageVisualizer callback:

    from composer import Trainer
    from composer.callbacks import ImageVisualizer
    
    # Callback to log 8 training images after every 100 batches
    image_visualizer = ImageVisualizer()
    
    # Construct trainer
    trainer = Trainer(
        ...,
        callbacks=image_visualizer
    )
    
    # Train!
    trainer.fit()
    
    

    [Figure: example visualization from the training set of ADE20k]

  2. 📶 TensorBoard Logging

    You can now log metrics and losses from your Composer training runs with TensorBoard! See #1250 and #1283 for more details. All you have to do is create a TensorboardLogger object and add it to the list of loggers in your Trainer object, like so:

    from composer import Trainer
    from composer.loggers import TensorboardLogger
    
    tb_logger = TensorboardLogger(log_dir="./my_tensorboard_logs")
    
    trainer = Trainer(
        ...,
        # Add your TensorBoard logger to the trainer here.
        loggers=[tb_logger],
    )
    
    trainer.fit()
    

    For more information, see this tutorial.

  3. 🔙 Multiple Losses

    Adds support for multiple losses. If a model returns a tuple of losses, they are summed before the loss.backward() call. See #1240 for more details.
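The summing behavior described above can be sketched as follows. This is a minimal illustration, not Composer's actual code: the helper name `reduce_losses` is hypothetical, and plain floats stand in for scalar loss tensors (the same `+` logic applies to tensors).

```python
def reduce_losses(loss):
    """Collapse a tuple or list of losses into one value by summing.

    Hypothetical helper: a single loss is passed through unchanged.
    """
    if isinstance(loss, (tuple, list)):
        if not loss:
            raise ValueError("expected at least one loss")
        total = loss[0]
        for part in loss[1:]:
            total = total + part
        return total
    return loss


# Floats stand in for scalar loss tensors here.
combined = reduce_losses((0.5, 0.25))
single = reduce_losses(1.5)
```

With real tensors, the summed value is what `backward()` would be called on, so every component contributes gradients in one pass.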

  4. 🌎️ Stream Datasets from HTTP URIs

    You can now specify an HTTP URI for a Streaming Dataset remote. See #1258 for more details. For example:

    from composer import Trainer
    from composer.datasets.streaming import StreamingDataset
    from torch.utils.data import DataLoader
    
    # Construct the Dataset
    dataset = StreamingDataset(
        ...,
        remote="https://example.com/dataset/",
    )
    
    # Construct the DataLoader
    train_dl = DataLoader(dataset)
    
    # Construct the Trainer
    trainer = Trainer(
        ...,
        train_dataloader=train_dl,
    )
    
    # Train!
    trainer.fit()
    

    For more information on streaming datasets, see this tutorial.

  5. 🏄️ GPU Devices default to TF32 Matmuls

    Beginning with PyTorch 1.12, the default behavior for computing FP32 matrix multiplies on NVIDIA Ampere devices was switched from TF32 to FP32. See PyTorch documentation here.

    Since Composer is designed specifically for ML training with a focus on efficiency, we choose to preserve the old default of using TF32 on Ampere devices. This leads to significantly higher throughput when training in single precision, without impacting training convergence. See PR #1275 for implementation details.
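For reference, plain PyTorch exposes the matmul precision switch as a backend flag. The snippet below is a sketch of turning TF32 back on by hand; it is an assumption that this flag corresponds to what Composer now sets for you, so see PR #1275 for what the library actually does.

```python
import torch

# PyTorch 1.12+ defaults this flag to False, i.e. full-FP32 matmuls.
# Setting it to True allows TF32 for FP32 matrix multiplies on GPUs
# that support it; it has no effect on hardware without TF32.
torch.backends.cuda.matmul.allow_tf32 = True

tf32_matmul_enabled = torch.backends.cuda.matmul.allow_tf32
```

Setting the flag back to `False` restores the PyTorch 1.12 default if full FP32 precision is needed.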

  6. 👋 Set the Device ID for GPU Devices

    You can now specify which GPU to train on by passing a device ID to DeviceGPU when instantiating a Trainer object, instead of using the local ID. For example:

    from composer import Trainer
    from composer.trainer.devices.device_gpu import DeviceGPU
    
    # Train on GPU 3
    device = DeviceGPU(device_id=3)
    
    # Construct the Trainer
    trainer = Trainer(
        ...,
        device=device,
    )
    
    # Train!
    trainer.fit()
    
  7. BERT and C4 Updates

    We make some minor adjustments to our bert-base-uncased.yaml training config. In particular, we make the global train and eval batch sizes a power of 2. This maintains divisibility when using many GPUs in multi-node training. We also adjust the max_duration so that it converts cleanly to 70,000 batches.

    We also upgrade our StreamingDataset C4 conversion script (scripts/mds/c4.py) to use a multi-threaded reader. On a 64-core machine, we can convert the 770 GB train split to .mds format in about 1.5 hours.

  8. 📂 Set a prefix when using an S3ObjectStore

    When using S3ObjectStore for applications like checkpointing, it can be useful to provide path prefixes, mimicking folder/subfolder directories like on a local filesystem. When prefix is provided, any objects uploaded with S3ObjectStore will be stored at f's3://{self.bucket}/{self.prefix}{object_name}'.
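To make the path format concrete, here is a small sketch that applies the format string quoted above. The helper `s3_uri`, the bucket name, and the object names are hypothetical. Whether S3ObjectStore normalizes a missing trailing slash is not stated here, so the sketch assumes the prefix is used verbatim.

```python
def s3_uri(bucket: str, object_name: str, prefix: str = "") -> str:
    """Build the object URI using the format quoted in the release note."""
    return f"s3://{bucket}/{prefix}{object_name}"


# A prefix ending in "/" behaves like a folder path.
with_slash = s3_uri("my-bucket", "ep1.pt", prefix="runs/exp1/")

# Used verbatim, a prefix without "/" is glued onto the object name.
without_slash = s3_uri("my-bucket", "ep1.pt", prefix="runs/exp1")
```

Under that assumption, ending the prefix with a slash is the safer choice when a folder-like layout is intended.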

  9. ⚖️ Scale the Warmup Period of Composer Schedulers

    Added a new flag, scale_warmup, to schedulers. When set, the warmup period is scaled along with the rest of the schedule whenever a scale schedule ratio is applied. The default is False, which preserves the previous behavior. See #1268 for more details.
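The effect of the flag can be sketched with a little arithmetic. The helper `effective_warmup_batches` below is hypothetical and only illustrates the idea; the real schedulers work with Composer time strings rather than raw batch counts.

```python
def effective_warmup_batches(t_warmup: int, ssr: float,
                             scale_warmup: bool = False) -> int:
    """Return the warmup length after applying a scale schedule ratio (ssr).

    Hypothetical helper: with scale_warmup=False the warmup is left alone,
    with scale_warmup=True it shrinks or stretches with the schedule.
    """
    if scale_warmup:
        return int(t_warmup * ssr)
    return t_warmup


# Halving the schedule (ssr=0.5) with a 1000-batch warmup:
unscaled = effective_warmup_batches(1000, 0.5)
scaled = effective_warmup_batches(1000, 0.5, scale_warmup=True)
```

Here `unscaled` keeps the full 1000-batch warmup, while `scaled` halves it together with the rest of the schedule.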

  10. 🧊 Stochastic Depth on Residual Blocks

    Residual blocks are detected automatically and replaced with stochastic versions. See #1253 for more details.
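As a reminder of what a "stochastic version" of a residual block does, here is a generic block-wise sketch of stochastic depth on scalars. It is not Composer's implementation: the function name, the scalar inputs, and the choice to rescale the branch at evaluation time are assumptions made for illustration.

```python
import random


def stochastic_residual(x, branch, drop_rate, training, rng=random):
    """Residual connection whose branch is randomly skipped in training.

    Illustrative only: `branch` is any callable standing in for the
    block's layers, and `x` is a plain number instead of a tensor.
    """
    if training:
        if rng.random() < drop_rate:
            return x  # branch dropped: the block acts as the identity
        return x + branch(x)
    # At evaluation time the branch is always kept, scaled by its
    # survival probability.
    return x + (1.0 - drop_rate) * branch(x)


def double(v):
    return 2.0 * v


kept = stochastic_residual(1.0, double, drop_rate=0.0, training=True)
dropped = stochastic_residual(1.0, double, drop_rate=1.0, training=True)
evaluated = stochastic_residual(1.0, double, drop_rate=0.5, training=False)
```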

🐛 Bug Fixes

  1. Fixed Progress Bars

    Fixed a bug where the Progress Bars jumped around and did not stream properly when tailing the terminal over the network. Fixed in #1264, #1287, and #1289.

  2. Fixed S3ObjectStore in Multithreaded Environments

    Fixed a bug where boto3 crashed when creating the default session in multiple threads simultaneously (see https://github.com/boto/boto3/issues/1592). Fixed in #1260.

  3. Retry on ChannelException errors in the SFTPObjectStore

    The SFTPObjectStore now catches the transient ChannelException SFTP error and retries the operation. Fixed in #1245.
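The catch-and-retry pattern can be sketched as below. Everything here is a stand-in: `TransientError` plays the role of ChannelException, and `call_with_retry` with its parameters is a hypothetical helper, not the SFTPObjectStore's actual retry logic.

```python
import time


class TransientError(Exception):
    """Stand-in for a transient error such as ChannelException."""


def call_with_retry(fn, num_attempts=3, backoff=0.0,
                    retry_on=(TransientError,)):
    """Call fn(), retrying on the given exceptions with growing delays."""
    for attempt in range(num_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == num_attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(backoff * (2 ** attempt))


# A fake download that fails twice before succeeding.
attempts = {"count": 0}


def flaky_download():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("channel closed")
    return "ok"


result = call_with_retry(flaky_download)
```

Only the listed exception types are retried; anything else propagates immediately, so genuine failures are not masked.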

  4. Treating S3 Permission Denied Errors as Not Found Errors

    We update our handling of botocore 403 ClientErrors to interpret them as FileNotFoundErrors. We do this because of a situation that occurs when a user has no S3 credentials configured, and tries to read from a bucket with public files. For privacy, Amazon S3 raises 403 (Permission Denied) instead of 404 (Not Found) errors. As such, PR #1249 treats 403 ClientErrors as FileNotFoundErrors.
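A sketch of this error translation follows. `FakeClientError` is a stand-in that only mimics the `response["Error"]["Code"]` shape of botocore's ClientError, and `to_python_error` is a hypothetical helper. The exact set of codes handled in PR #1249 is not reproduced here, so treating 404 the same way is an assumption.

```python
class FakeClientError(Exception):
    """Stand-in shaped like botocore's ClientError (has a .response dict)."""

    def __init__(self, code: str):
        super().__init__(f"client error {code}")
        self.response = {"Error": {"Code": code}}


# Codes reported as "not found" in this sketch (403 per the note above).
NOT_FOUND_CODES = {"403", "404"}


def to_python_error(err, object_name: str) -> Exception:
    """Map not-found-like client errors to FileNotFoundError."""
    code = err.response["Error"]["Code"]
    if code in NOT_FOUND_CODES:
        return FileNotFoundError(f"Object {object_name} not found")
    return err  # any other error is passed through unchanged


denied = to_python_error(FakeClientError("403"), "ckpt/ep1.pt")
server_error = FakeClientError("500")
passthrough = to_python_error(server_error, "ckpt/ep1.pt")
```

The trade-off is that a genuine permissions problem is also reported as a missing file, which is worth keeping in mind when debugging access issues.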

  5. Fixed Parsing of grad_accum in the TrainerHparams

    Fixed an error where the command line override --grad_accum led to incorrect parsing. Fixed in #1256.

  6. Fixed Example YAML Files

    Our recipe configurations (YAML) are updated to the latest version, and a test was added to enforce correctness moving forward. Fixed in #1235 and #1257.

Changelog

https://github.com/mosaicml/composer/compare/v0.8.0...v0.8.1
