0.4.8
Release date: 2023-08-11 03:04:14
Changelog for `DilatedAttention` with `ParallelWrapper`:
1. Added `ParallelWrapper` Class
- Introduced a `ParallelWrapper` class to simplify the usage of data parallelism (a sketch follows this list).
- The `ParallelWrapper` class:
  - Takes a neural network model as input.
  - Allows the user to specify a device (`"cuda"` or `"cpu"`).
  - Contains a `use_data_parallel` flag to enable or disable data parallelism.
  - Checks whether multiple GPUs are available and applies `nn.DataParallel` to the model accordingly.
  - Redirects attribute accesses to the internal model for seamless usage.
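Conceptually, the wrapper might look like the following minimal sketch. It is based on the changelog description rather than the repository source, so the default argument values and the diagnostic print are assumptions:

```python
import torch
import torch.nn as nn


class ParallelWrapper:
    """Wraps a model so data parallelism can be toggled with a flag.

    Minimal sketch based on the changelog; the actual implementation
    in the repository may differ in detail.
    """

    def __init__(self, model, device="cuda", use_data_parallel=True):
        self.device = device
        self.use_data_parallel = use_data_parallel
        self.model = model.to(self.device)

        # Apply nn.DataParallel only when requested and when more than
        # one GPU is actually available.
        if self.use_data_parallel and torch.cuda.device_count() > 1:
            print(f"Using {torch.cuda.device_count()} GPUs")
            self.model = nn.DataParallel(self.model)

    def forward(self, *args, **kwargs):
        return self.model(*args, **kwargs)

    # Make the wrapper callable like an nn.Module.
    __call__ = forward

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails; redirect
        # the access to the wrapped model for seamless usage.
        return getattr(self.model, name)
```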
2. Modified Usage of `DilatedAttention` Model
- Wrapped the `DilatedAttention` model using the `ParallelWrapper` class (see the complete example under item 4).
- Enabled the model to run on multiple GPUs when available.
3. Device Assignment
- Explicitly defined a device and used it to specify where the `DilatedAttention` model should be loaded.
- The device defaults to GPU (`cuda:0`) if CUDA is available; otherwise it falls back to CPU, as in the idiom below.
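In code, this corresponds to the standard PyTorch device-selection idiom:

```python
import torch

# Use the first GPU when CUDA is available; otherwise fall back to CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```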
4. Example Usage
- Provided an example of how to initialize and use the `ParallelWrapper` with the `DilatedAttention` model; an illustrative version follows below.
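A sketch of what that example might look like end to end; the `DilatedAttention` constructor arguments, the import path, and the input shape are illustrative assumptions, not documented defaults:

```python
import torch
from long_net import DilatedAttention  # import path is an assumption

# Device defaults to the first GPU when CUDA is available.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical hyperparameters, for illustration only; check the
# repository for the actual DilatedAttention signature.
attention = DilatedAttention(
    d_model=512,
    num_heads=8,
    dilation_rate=2,
    segment_size=64,
)
model = ParallelWrapper(attention, device=device, use_data_parallel=True)

# Dummy input of shape (batch_size, sequence_length, d_model).
x = torch.randn(2, 1024, 512).to(device)
out = model(x)
print(out.shape)
```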
Summary:
The key addition is the `ParallelWrapper` class, which makes data parallelism easy and configurable for the provided `DilatedAttention` model. This allows the model to scale across multiple GPUs without significant changes to the existing workflow: the user can now enable or disable data parallelism with a single flag.
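For example, falling back to single-device execution only requires flipping the flag (reusing the names from the sketch above):

```python
# Disable data parallelism: the model runs on the single chosen device.
model = ParallelWrapper(attention, device=device, use_data_parallel=False)
```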