0.4.1

kyegomez/LongNet

Release date: 2023-07-18 03:54:33

Latest kyegomez/LongNet release: 0.4.8 (2023-08-11 03:04:14)

Changelog

Bug Fixes

  1. Bug: Size mismatch in tensor operations in the forward method of the DilatedAttentionLLAMA class.

    • Root Cause: The tensors being operated on did not have matching dimensions because of incorrect striding operations.
    • Resolution: We modified the dilation process by introducing an inner loop over split tensors to handle each part separately, which resolved the dimension mismatch issues.
  2. Bug: Index out of range error while transposing tensors.

    • Root Cause: The index provided to the transpose operation was larger than the total number of dimensions in the tensor.
    • Resolution: Corrected the index passed to the transpose operation to fit within the number of dimensions in the tensor.
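
To illustrate the two fixes above, here is a minimal sketch of segment-wise dilation with an inner loop over split tensors, followed by a transpose whose indices stay within the tensor's dimensions. It assumes PyTorch and a (batch, seq, dim) input. The function name `dilate_segments` and its parameters are hypothetical and are not taken from the LongNet source.

```python
import torch


def dilate_segments(x: torch.Tensor, segment_len: int, dilation: int) -> torch.Tensor:
    """Split (batch, seq, dim) into segments and keep every `dilation`-th token of each.

    Hypothetical helper for illustration only; not the LongNet implementation.
    """
    if x.dim() != 3:
        raise ValueError(f"expected a 3-D tensor (batch, seq, dim), got {x.dim()}-D")
    parts = []
    # Inner loop: stride each segment separately so every piece has a known shape,
    # even when the final segment is shorter than segment_len.
    for segment in torch.split(x, segment_len, dim=1):
        parts.append(segment[:, ::dilation, :])
    return torch.cat(parts, dim=1)


x = torch.arange(2 * 10 * 4, dtype=torch.float32).reshape(2, 10, 4)
out = dilate_segments(x, segment_len=4, dilation=2)
print(out.shape)  # torch.Size([2, 5, 4])

# A 3-D tensor only has dimensions 0, 1 and 2 (or -3..-1), so the transpose
# indices must stay inside that range. x.transpose(1, 3) would raise IndexError.
swapped = out.transpose(1, 2)
print(swapped.shape)  # torch.Size([2, 4, 5])
```

With a sequence length of 10 and a segment length of 4, the segments have lengths 4, 4 and 2, so a dilation of 2 keeps 2 + 2 + 1 = 5 tokens.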

Improvements

  1. Optimized Tensor Operations: The tensor operations in the forward method were optimized to ensure they all operate on tensors with matching dimensions, improving the efficiency of the model.

  2. Added Error Handling: We added checks for dimension mismatches in tensor operations that raise descriptive errors when the input data does not match the expected shape.
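
A dimension check of this kind might look like the sketch below. The helper `check_shape` and its message format are assumptions made for illustration; the release notes do not show the actual checks. It works on plain shape tuples, so it can be called with `tuple(tensor.shape)`.

```python
from typing import Optional, Sequence


def check_shape(name: str, shape: Sequence[int], expected: Sequence[Optional[int]]) -> None:
    """Raise ValueError if `shape` does not match `expected`; None matches any size.

    Hypothetical helper for illustration only.
    """
    if len(shape) != len(expected):
        raise ValueError(
            f"{name}: expected {len(expected)} dimensions, got {len(shape)} (shape {tuple(shape)})"
        )
    for axis, (actual, wanted) in enumerate(zip(shape, expected)):
        if wanted is not None and actual != wanted:
            raise ValueError(
                f"{name}: expected size {wanted} on axis {axis}, got {actual} (shape {tuple(shape)})"
            )


# A matching shape passes silently.
check_shape("x", (2, 128, 512), (None, None, 512))

# A mismatched embedding size produces a message naming the offending axis.
try:
    check_shape("x", (2, 128, 256), (None, None, 512))
except ValueError as err:
    message = str(err)
    print(message)
```

Failing early with the tensor name, axis and full shape is usually easier to debug than the error raised later by a matmul or concatenation.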

Features

  1. DilatedAttentionLLAMA Class: Introduced a new DilatedAttentionLLAMA class that uses a dilated attention mechanism in its forward method. This implementation is designed to be more efficient for longer sequence lengths.

  2. Performance Testing: Added a simple performance test to benchmark the speed of the forward method in the DilatedAttentionLLAMA class.
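
A simple timing harness for a forward pass could be sketched as follows. The `benchmark` function and the stand-in workload are hypothetical; the test shipped with the release is not shown in these notes and may differ.

```python
import time
from typing import Callable


def benchmark(fn: Callable[[], object], warmup: int = 3, iters: int = 20) -> float:
    """Return the mean wall-clock seconds per call of `fn`.

    Hypothetical helper for illustration only.
    """
    # Warm-up calls keep one-time costs (allocation, caching) out of the measurement.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


def stand_in_forward() -> int:
    # Placeholder workload; in practice this would call the model's forward method.
    return sum(i * i for i in range(10_000))


mean_s = benchmark(stand_in_forward)
print(f"mean forward time: {mean_s * 1e3:.3f} ms")
```

When timing a model on a GPU, call `torch.cuda.synchronize()` before reading the clock, because CUDA kernels launch asynchronously and the timer would otherwise stop before the work finishes.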
