v0.0.21
Release date: 2023-08-18 22:34:52
Latest released version of facebookresearch/xformers: v0.0.28.post1 (2024-09-13 23:52:20)
[0.0.21] - 2023-08-18
Improved
- fMHA: Updated flash-attention to v2, with massive performance improvements for both the forward and backward passes. This implementation is now used by default when available
Bug fixes
- fMHA/cutlass: Fix potential race condition in the FW/BW passes
- fMHA/cutlass: Fix `attn_bias` stride overflow for very long sequences (>32k)
- `LowerTriangularMask` is now backward compatible with older xformers versions
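Not part of the release notes, but as an illustration of what `LowerTriangularMask` denotes: it represents a causal attention bias, which corresponds to a dense additive bias that is 0 at or below the diagonal and -inf above it. A minimal sketch in plain PyTorch (not the xformers implementation):

```python
import torch

q_len, kv_len = 4, 4

# Illustration only: mark positions strictly above the diagonal (future
# key positions) and fill them with -inf so attention cannot look ahead.
future = torch.triu(torch.ones(q_len, kv_len, dtype=torch.bool), diagonal=1)
causal_bias = torch.zeros(q_len, kv_len).masked_fill(future, float("-inf"))

print(causal_bias)
```

In practice you would pass `LowerTriangularMask()` itself as `attn_bias` rather than materializing a dense tensor like this, which is what lets the kernels skip the masked region entirely.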
Breaking changes
- `memory_efficient_attention` now expects the `attn_bias` argument to have a head dimension
- `memory_efficient_attention` no longer broadcasts the batch/head dimensions of `attn_bias`. Please use `.expand` if you need to broadcast the bias
- Remove `causal_diagonal` argument from `BlockDiagonalCausalWithOffsetPaddedKeysMask`
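A minimal sketch of adapting a dense bias to the new `attn_bias` contract (the shapes here are invented for illustration): the bias must now carry an explicit head dimension, and any batch/head broadcasting must be materialized with `.expand` before calling `memory_efficient_attention`:

```python
import torch

batch, heads, q_len, kv_len = 2, 8, 16, 16

# A bias computed per batch element, without a head dimension (old-style shape).
bias = torch.randn(batch, q_len, kv_len)

# The bias now needs a head dimension, and memory_efficient_attention no longer
# broadcasts batch/head dims implicitly: expand explicitly. expand returns a
# view over the same storage, so no data is copied.
attn_bias = bias.unsqueeze(1).expand(batch, heads, q_len, kv_len)

print(tuple(attn_bias.shape))  # (2, 8, 16, 16)
```

The resulting tensor is then passed as the `attn_bias` argument; since `.expand` is a view, the broadcast costs no extra memory.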
Added
- Binary wheels on pypi/conda now contain H100 kernels
- fMHA: Added backend specialized for decoding that does not use TensorCores - useful when not using multiquery
NOTE: Binary wheels are now provided only for PyTorch 2 with CUDA 11.8. It is still possible to use xFormers with older versions of PyTorch by building from source or using conda.