v0.0.28
Release date: 2024-09-12 23:49:39
Latest facebookresearch/xformers release: v0.0.28.post1 (2024-09-13 23:52:20)
Pre-built binary wheels require PyTorch 2.4.1
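As a rough illustration of the PyTorch requirement above, the sketch below compares a PyTorch version string against 2.4.1 while ignoring local build tags such as +cu124. The helper names release_tuple and matches_wheel_requirement are hypothetical and not part of xFormers or PyTorch; pre-release suffixes are ignored for simplicity.

```python
def release_tuple(version):
    """Turn a version string such as '2.4.1+cu124' into (2, 4, 1)."""
    public = version.split("+", 1)[0]  # drop the local build tag, e.g. '+cu124'
    numbers = []
    for piece in public.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at non-numeric segments such as 'dev20240901'
        numbers.append(int(digits))
    return tuple(numbers)


def matches_wheel_requirement(torch_version, required="2.4.1"):
    """Return True when the release numbers equal those the wheels target."""
    return release_tuple(torch_version) == release_tuple(required)


# In a real environment the argument would typically be torch.__version__.
print(matches_wheel_requirement("2.4.1+cu124"))
print(matches_wheel_requirement("2.3.0"))
```

The function takes a plain string, so the check can run before importing any compiled extension.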
Added
- Added wheels for CUDA 12.4
- Added conda builds for Python 3.11
- Added wheels for ROCm 6.1
Improved
- Profiler: Fix computation of FLOPS for the attention when using xFormers
- Profiler: Fix MFU/HFU calculation when multiple dtypes are used
- Profiler: Trace analysis to compute MFU & HFU is now much faster
- fMHA/splitK: Fixed nan in the output when using a torch.Tensor bias where a lot of consecutive keys are masked with -inf
- Update Flash-Attention version to v2.6.3 when building from scratch
- When using the most recent version of Flash-Attention, it is no longer possible to mix it with the cutlass backend. In other words, it is no longer possible to use the cutlass Fw (forward) with the flash Bw (backward).
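To see how long runs of keys masked with -inf can produce nan in a split-K style softmax, here is a minimal pure-Python sketch. It is not the xFormers kernel: the functions chunk_stats and softmax_denominator and the guard flag are hypothetical names used only to illustrate the general failure mode, in which a chunk whose keys are all masked has a maximum of -inf, so subtracting that maximum yields -inf - (-inf) = nan.

```python
import math

NEG_INF = float("-inf")


def chunk_stats(scores, guard):
    """Return (max, sum of exp(score - max)) for one chunk of keys."""
    m = max(scores)
    if guard and m == NEG_INF:
        # Every key in the chunk is masked: report an empty contribution
        # instead of evaluating exp(-inf - (-inf)) = exp(nan).
        return NEG_INF, 0.0
    return m, sum(math.exp(s - m) for s in scores)


def softmax_denominator(chunks, guard=True):
    """Merge per-chunk statistics into the global softmax denominator."""
    stats = [chunk_stats(c, guard) for c in chunks]
    global_max = max(m for m, _ in stats)
    total = 0.0
    for m, s in stats:
        if guard and m == NEG_INF:
            continue  # fully masked chunk contributes nothing
        # Rescale each chunk's partial sum to the shared maximum.
        total += s * math.exp(m - global_max)
    return total


# One chunk of real scores, one chunk where every key is masked.
chunks = [[0.0, 1.0], [NEG_INF, NEG_INF, NEG_INF]]
print(softmax_denominator(chunks, guard=True))
print(softmax_denominator(chunks, guard=False))
```

Without the guard, the masked chunk's partial sum is already nan, and multiplying it by the zero rescaling factor keeps it nan, so the whole denominator is poisoned.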
Removed
- fMHA: Removed decoder and small_k backends
- profiler: Removed DetectSlowOpsProfiler profiler
- Removed compatibility with PyTorch < 2.4
- Removed conda builds for Python 3.9
- Removed Windows pip wheels for CUDA 12.1 and 11.8