v1.8.0

pytorch/pytorch

Release date: 2021-03-05 04:44:39


PyTorch 1.8.0 Release Notes

Highlights

We are excited to announce the availability of PyTorch 1.8. This release is composed of more than 3,000 commits since 1.7. It includes major updates and new features for compilation, code optimization, frontend APIs for scientific computing, and AMD ROCm support through binaries that are available via pytorch.org. It also provides improved features for large-scale training for pipeline and model parallelism, and gradient compression. A few of the highlights include:

  1. Support for doing python to python functional transformations via torch.fx;
  2. Added and stabilized APIs for FFTs (torch.fft) and linear algebra functions (torch.linalg), added autograd support for complex tensors, and improved performance when calculating Hessians and Jacobians; and
  3. Significant updates and improvements to distributed training including: Improved NCCL reliability; Pipeline parallelism support; RPC profiling; and support for communication hooks adding gradient compression. See the full release notes here.

Along with 1.8, we are also releasing major updates to PyTorch libraries including TorchCSPRNG, TorchVision, TorchText and TorchAudio. For more on the library releases, see the post here. As previously noted, features in PyTorch releases are classified as Stable, Beta and Prototype. You can learn more about the definitions in the post here.

You can find more details on all the highlighted features in the PyTorch 1.8 Release blogpost.

Backwards Incompatible changes

Fix Tensor inplace modulo in python (#49390)

In-place modulo in Python, %=, was wrongly performed out of place for Tensors. This change fixes the behavior. Code that relied on this operation being performed out of place should be updated to use the out-of-place version t = t % other instead of t %= other.

1.7.1:

>>> a = torch.arange(0, 10)
>>> b = a
>>> b %= 3
>>> print(a)
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> print(b)
tensor([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])

1.8.0:

>>> a = torch.arange(0, 10)
>>> b = a
>>> b %= 3
>>> print(a)
tensor([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])
>>> print(b)
tensor([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])

Standardize torch.clamp edge cases (#43288)

For ease of exposition let a_min be the value of the "min" argument to clamp, and a_max be the value of the "max" argument to clamp.

This PR changes the behavior of torch.clamp to always compute min(max(a, a_min), a_max). torch.clamp currently computes this in its vectorized CPU implementation but uses different approaches for other backends. These implementations are the same when a_min < a_max, but divergent when a_min > a_max. This divergence is easily triggered:

>>> t = torch.arange(200).to(torch.float)
>>> torch.clamp(t, 4, 2)[0]
tensor(2.)

>>> torch.clamp(t.cuda(), 4, 2)[0]
tensor(4., device='cuda:0')

>>> torch.clamp(torch.tensor(0), 4, 2)
tensor(4)

This PR makes the behavior consistent with NumPy's clip. C++'s std::clamp's behavior is undefined when a_min > a_max. Python has no standard clamp implementation.
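In 1.8.0 the three cases above agree. The outputs below are illustrative, derived from the unified min(max(a, a_min), a_max) rule rather than copied from the PR:

>>> torch.clamp(t, 4, 2)[0]
tensor(2.)
>>> torch.clamp(t.cuda(), 4, 2)[0]
tensor(2., device='cuda:0')
>>> torch.clamp(torch.tensor(0), 4, 2)
tensor(2)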

Tensor deepcopy now properly copies the .grad field (#50663)

The deepcopy protocol will now properly copy the .grad field of Tensors when it exists. The old behavior can be recovered by setting the .grad field to None after doing the deepcopy.
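A minimal sketch of that recovery (variable names are illustrative):

from copy import deepcopy
import torch

t = torch.rand(2, requires_grad=True)
t.sum().backward()   # populates t.grad

c = deepcopy(t)      # 1.8.0: c.grad is now a copy of t.grad
c.grad = None        # reset it to recover the 1.7.1 behavior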

1.7.1:

>>> t.grad
tensor([0.8883, 0.5765])
>>> deepcopy(t).grad
None

1.8.0:

>>> t.grad
tensor([0.8883, 0.5765])
>>> deepcopy(t).grad
tensor([0.8883, 0.5765])

Fix torch.fmod type promotion (#47323, #48278)

1.7.1: Raises a RuntimeError when given an integral tensor and a floating-point tensor. The dtype of the output is determined by the first input.

>>> x = torch.arange(start=1, end=6, dtype=torch.int32) # tensor([1, 2, 3, 4, 5])
>>> y = torch.arange(start=1.1, end=2.1, step=0.2, dtype=torch.float32) # tensor([1.1, 1.3, 1.5, 1.7, 1.9])
>>> torch.fmod(x, y)
RuntimeError: result type Float can't be cast to the desired output type Int
>>> z = torch.arange(start=0.2, end=1.1, step=0.2, dtype=torch.float64) # tensor([0.2, 0.4, 0.6, 0.8, 1.], dtype=torch.float64)
>>> torch.fmod(y, z).dtype
torch.float32
>>> torch.fmod(z, y).dtype
torch.float64
>>> torch.fmod(x, 1.2)
tensor([0, 0, 0, 0, 0], dtype=torch.int32)

1.8.0: Supports an integral tensor and a floating-point tensor as inputs. The dtype of the output is determined by type promotion of both inputs.

>>> x = torch.arange(start=1, end=6, dtype=torch.int32) # tensor([1, 2, 3, 4, 5])
>>> y = torch.arange(start=1.1, end=2.1, step=0.2, dtype=torch.float32) # tensor([1.1, 1.3, 1.5, 1.7, 1.9])
>>> torch.fmod(x, y)
tensor([1.0000, 0.7000, 0.0000, 0.6000, 1.2000])
>>> z = torch.arange(start=0.2, end=1.1, step=0.2, dtype=torch.float64) # tensor([0.2, 0.4, 0.6, 0.8, 1.], dtype=torch.float64)
>>> torch.fmod(y, z).dtype
torch.float64
>>> torch.fmod(z, y).dtype
torch.float64
>>> torch.fmod(x, 1.2)
tensor([1.0000, 0.8000, 0.6000, 0.4000, 0.2000])

Preserve non-dense or overlapping tensor's layout in *_like functions (#46046)

All the *_like factory functions will now generate the same striding as out-of-place operations would. This means in particular that non-contiguous tensors will produce non-contiguous outputs. If you require a contiguous output, you can pass the memory_format=torch.contiguous_format keyword argument to the factory function. Such factory functions include clone, to, float, cuda, *_like, zeros, rand{n}, etc.
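A minimal illustration of non-contiguous propagation and the memory_format escape hatch (shapes here are arbitrary):

>>> x = torch.rand(4, 4).t()   # transposed view: non-contiguous
>>> torch.zeros_like(x).is_contiguous()
False
>>> torch.zeros_like(x, memory_format=torch.contiguous_format).is_contiguous()
True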

Make output of torch.norm and torch.linalg.norm consistent for complex inputs (#48284)

Previously, when given a complex input, torch.linalg.norm and torch.norm would return a complex output. torch.linalg.cond would sometimes return a complex output and sometimes return a real output when given a complex input, depending on its p argument. This PR changes this behavior to match numpy.linalg.norm and numpy.linalg.cond, so that a complex input will result in a real number type, consistent with NumPy.
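For example, checking dtypes rather than values (illustrative):

>>> z = torch.randn(3, dtype=torch.complex64)
>>> torch.linalg.norm(z).dtype
torch.float32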

Make torch.svd return V, not V.conj() for complex inputs (#51012)

torch.svd added support for complex inputs in PyTorch 1.7, but was not documented as doing so. The complex V tensor returned was actually the complex conjugate of what's expected. This PR fixes the discrepancy. Users that were already using the previous version of torch.svd with complex inputs can recover the previous behavior by taking the complex conjugate of the returned V.
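A short sketch of that recovery (names are illustrative):

>>> a = torch.randn(3, 3, dtype=torch.complex64)
>>> u, s, v = torch.svd(a)
>>> v_old = v.conj()   # the conjugated V that 1.7.1 returned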

torch.angle: properly handle pure real numbers (#49163)

This PR updates PyTorch's torch.angle operator to be consistent with NumPy's. Previously torch.angle would return zero for all real inputs (including NaN). Now angle returns pi for negative real inputs, zero for non-negative real inputs, and propagates NaNs.
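For example, the outputs below follow directly from that rule:

>>> torch.angle(torch.tensor([-2.0, 0.0, 3.0, float('nan')]))
tensor([3.1416, 0.0000, 0.0000,    nan])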

Enable distribution validation by default for torch.distributions (#48743)

This may slightly slow down some models. Concerned users may disable validation by using torch.distributions.Distribution.set_default_validate_args(False) or by disabling individual distribution validation via MyDistribution(..., validate_args=False).

This may cause new ValueErrors in models that rely on unsupported behavior, e.g. Categorical.log_prob() applied to continuous-valued tensors (only {0,1}-valued tensors are supported). Such models should be fixed but the previous behavior can be recovered by disabling argument validation using the methods mentioned above.
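Both opt-outs in one sketch (Categorical stands in for any distribution class):

import torch
from torch.distributions import Categorical, Distribution

# Disable argument validation globally (pre-1.8.0 behavior):
Distribution.set_default_validate_args(False)

# Or disable it for a single instance:
d = Categorical(torch.ones(3), validate_args=False)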

Prohibit assignment to a sparse tensor (#50040)

Assigning to a sparse Tensor did not work properly and resulted in a no-op. The following code now properly raises an error:

>>> t = torch.rand(10).to_sparse()
>>> t[0] = 42
TypeError: Cannot assign to a sparse tensor

C++ API: operators that take a list of optional Tensors cannot be called with ArrayRef<Tensor> anymore (#49138)

This PR changes the C++ API representation of lists of optional Tensors (e.g. in the Tensor::index method) from ArrayRef<Tensor> to List<optional<Tensor>>. This change breaks backwards compatibility, since there is no implicit conversion from ArrayRef<Tensor> to List<optional<Tensor>>.

A common call pattern is tensor.index({indices_tensor}), where indices_tensor is a Tensor. This will continue to work because the {} initializer_list constructor for List<optional<Tensor>> can take Tensor elements that are implicitly converted to optional<Tensor>.

However, another common call pattern is tensor.index(indices_tensor), where previously the Tensor got implicitly converted to an ArrayRef<Tensor>. To implicitly convert Tensor -> optional<Tensor> -> List<optional<Tensor>> would chain two implicit conversions, which C++ doesn't allow. So those call sites should be rewritten to use the tensor.index({indices_tensor}) pattern.

Autograd view creation information is now properly propagated when views are chained

After this fix, an error is properly raised to avoid wrong gradients when an in-place operation is performed on a view of a view, in cases where in-place operations were not allowed on the first view. This means that code that used to return wrong gradients in 1.7.1 (such as t.unbind()[0].select(0, 0).add_(1)) will now properly raise an error.
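An illustrative reproduction (the exact error message may differ):

>>> t = torch.rand(2, 3, requires_grad=True)
>>> v = t.unbind()[0].select(0, 0)   # view of a view; unbind() forbids in-place ops
>>> v.add_(1)   # 1.8.0: raises a RuntimeError; 1.7.1 silently computed wrong gradients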

End of deprecation cycle for spectral ops in the torch. namespace (#48594)

This PR removes the deprecated torch.{fft,rfft,ifft,irfft} and their corresponding methods on torch.Tensor. PyTorch programs using these functions must now update to use the torch.fft namespace.
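A minimal migration sketch. Note that the torch.fft functions operate on native complex tensors, so code built on the old real-pair representation may also need torch.view_as_complex / torch.view_as_real; the mapping is not one-to-one:

import torch
import torch.fft   # explicit import is harmless and works across versions

x = torch.randn(8)
X = torch.fft.fft(x)               # complex tensor output
x_back = torch.fft.ifft(X).real    # round trip back to the real signal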

torch.digamma : properly handle all inputs (#48302)

This PR updates PyTorch's torch.digamma function to be consistent with SciPy's special.digamma function. This changes the result of the torch.digamma function on the nonpositive integers, where the gamma function is not defined. Since the gamma function is undefined at these points, the (typical) derivative of the logarithm of the gamma function is also undefined at these points, and for negative integers this PR updates torch.digamma to return NaN. For zero, however, it returns -inf to be consistent with SciPy.
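For example, per the rule above:

>>> torch.digamma(torch.tensor([0., -1., -2.]))
tensor([-inf, nan, nan])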

Interestingly, SciPy made a similar change, which was noticed by at least one user: scipy/scipy#9663

SciPy's returning of negative infinity at zero is intentional: https://github.com/scipy/scipy/blob/59347ae8b86bcc92c339efe213128f64ab6df98c/scipy/special/cephes/psi.c#L163

This change is consistent with the C++ standard for the gamma function: https://en.cppreference.com/w/cpp/numeric/math/tgamma

Fix torch.remainder type promotion (#48668)

1.7.1: When the second argument is a Python number, the result is cast to the dtype of the first argument (here x = torch.arange(start=1, end=6, dtype=torch.int32), as in the torch.fmod example above).

>>> torch.remainder(x, 1.2)
tensor([0, 0, 0, 0, 0], dtype=torch.int32)

1.8.0: When the second argument is a Python number, the dtype of the result is determined by type promotion of both inputs.

>>> torch.remainder(x, 1.2)
tensor([1.0000, 0.8000, 0.6000, 0.4000, 0.2000])

Changes to onnx export API to better handle named arguments (#47367)

The args input argument of the torch.onnx.export function is updated to better support optional arguments. An optional dictionary can be passed in addition as the last argument in the args tuple, specifying inputs with the corresponding named parameter. Note that this is backward breaking for cases where the last input is also of a dictionary type. In the new API, for such cases, it is mandatory to have an empty dictionary as the last argument in the args tuple. More details can be found at: https://pytorch.org/docs/1.8.0/onnx.html?highlight=onnx#using-dictionaries-to-handle-named-arguments-as-model-inputs.
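A hedged sketch of the updated convention; the module and file names below are illustrative, not from the PR:

import torch

class M(torch.nn.Module):
    def forward(self, x, y=None):
        return x if y is None else x + y

m = M()
x = torch.randn(2, 2)

# The optional input 'y' is supplied by name via a trailing dict in args:
torch.onnx.export(m, (x, {'y': x}), 'model.onnx')

# If the model's last real input is itself a dict, an empty dict must now be
# appended as the final element of args:
# torch.onnx.export(m2, (a, b_dict, {}), 'model2.onnx')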

Update signature of torch.quantization.quantize function (#48537)

The run_args argument must now be a list or tuple containing the positional arguments, even if there is only a single argument. In particular, code like qmodel = quantize(float_model, default_eval_fn, img_data) that worked in 1.7.1 will now raise the error: TypeError: default_eval_fn() takes 2 positional arguments but 3 were given. You should update this code to wrap the image in a list, for example: qmodel = quantize(float_model, default_eval_fn, [img_data]).

Change the way we quantize relu, leaky relu and sigmoid (#47415, #48038, #45702, #45711, #45883, #45882, #47660)

Starting with version 1.8.0, in the eager mode quantization flow, relu is no longer observed, as observation is not needed for it. In previous versions, quantized leaky_relu and sigmoid did not require observation and simply inherited the quantization parameters from their input, but that does not work well in eager mode quantization. Starting with version 1.8.0, they are observed operators so that they work better in eager mode quantization.

Update direction numbers to 21201 dims in the SobolEngine (#49710)

This update is BC-breaking because the values drawn by the engine will be different from the ones drawn in 1.7.1 even with the same seed.

1.7.1:

>>> from torch.quasirandom import SobolEngine
>>> eng = SobolEngine(1)
>>> eng.draw(3)
tensor([[0.5000],
        [0.7500],
        [0.2500]])

1.8.0:

>>> from torch.quasirandom import SobolEngine
>>> eng = SobolEngine(1)
>>> eng.draw(3)
tensor([[0.0000],
        [0.5000],
        [0.7500]])

Deprecations

Python API

Deprecate old style nn.Module backward hooks (#46163)

Old style nn.Module backward hooks have been broken for a long time (they do not behave as advertised in the documentation). The new nn.Module.register_full_backward_hook provides a fully working implementation of these hooks. The old function should no longer be used; code should be migrated to the new full version.

An example of this discrepancy is shown below, where a Linear layer takes as input a single Tensor of size 5 and returns a single Tensor of size 5, but the old style hook returns two gradients with respect to the input even though there is only one input.

1.7.1:

import torch
from torch import nn

mod = nn.Linear(5, 5)
def hook(mod, grad_inp, grad_out):
    print("grad input size: " + " ".join(str(g.size()) for g in grad_inp))
    print("grad output size: " + " ".join(str(g.size()) for g in grad_out))
mod.register_backward_hook(hook)

mod(torch.rand(5, requires_grad=True)).sum().backward()
>>> grad input size: torch.Size([5]) torch.Size([5]) # One too many
>>> grad output size: torch.Size([5])

1.8.0: Old style hooks are deprecated and will warn when they provide a wrong result.

import torch
from torch import nn

mod = nn.Linear(5, 5)
def hook(mod, grad_inp, grad_out):
    print("grad input size: " + " ".join(str(g.size()) for g in grad_inp))
    print("grad output size: " + " ".join(str(g.size()) for g in grad_out))
mod.register_backward_hook(hook)

mod(torch.rand(5, requires_grad=True)).sum().backward()
>>> grad input size: torch.Size([5]) torch.Size([5]) # One too many
>>> grad output size: torch.Size([5])
>>> UserWarning: Using a non-full backward hook when the forward contains multiple
autograd Nodes is deprecated and will be removed in future versions. This hook
will be missing some grad_input.

Full hooks should be used to always get the correct result and avoid warnings:

mod.register_full_backward_hook(hook)

mod(torch.rand(5, requires_grad=True)).sum().backward()
>>> grad input size: torch.Size([5])
>>> grad output size: torch.Size([5])

torch.stft: Deprecate default value of the return_complex argument (#49022, #50102)

Previously torch.stft took an optional return_complex parameter that indicated whether the output would be a real tensor or a complex tensor. return_complex defaults to False. This default is deprecated (meaning the argument is effectively becoming mandatory) and will be removed in future versions. Pass the argument explicitly to avoid the deprecation warning.
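A minimal sketch of passing it explicitly (parameter values are illustrative):

import torch

x = torch.randn(1024)
spec = torch.stft(x, n_fft=256, return_complex=True)   # complex output
real_pair = torch.view_as_real(spec)                    # old-style real representation, if needed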

Deprecate torch.set_deterministic in favor of torch.use_deterministic_algorithms (#49904)

This beta feature is being renamed for improved clarity. Users should migrate to use the new name.
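The migration is a one-line rename:

import torch

# torch.set_deterministic(True)        # deprecated in 1.8.0
torch.use_deterministic_algorithms(True)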

Deprecate torch.* linear algebra functions in favor of the torch.linalg.* variant for cholesky (#51460), slogdet (#51354), inverse (#51672), pinverse (#51671)

All the linear algebra functions are being moved to the torch.linalg submodule, which provides a NumPy-compatible API. These new functions have the same set of features as their torch.* counterparts and should be used instead.
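A brief sketch of the renames; the deprecated spellings still work in 1.8.0 but warn:

import torch

a = torch.randn(3, 3)
spd = a @ a.t() + 3 * torch.eye(3)           # symmetric positive-definite input

l = torch.linalg.cholesky(spd)               # instead of torch.cholesky(spd)
sign, logabsdet = torch.linalg.slogdet(a)    # instead of torch.slogdet(a)
inv = torch.linalg.inv(a)                    # instead of torch.inverse(a)
pinv = torch.linalg.pinv(a)                  # instead of torch.pinverse(a)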

New features

Python API

Complex Numbers

Profiler

Autograd

Dataloader

CUDA

C++ API

TorchScript

Mobile

Distributed

torch.fx

Quantization

ONNX

Misc

Improvements

Python API

Autograd

torch.utils

Complex Numbers

CUDA

Distributed

TorchScript

Mobile

Quantization

ONNX

Vulkan

This release brings about a complete rewrite of PyTorch’s Vulkan backend with primary focus on improved performance, robustness, and better code structure and organization. These changes are transparent to the end user. Considering that this is a rewrite, many of these changes also qualify as performance improvements.

Misc

Bug fixes

Python API

Autograd

CUDA

torch.utils

Complex Number

C++ API

Distributed

Mobile

TorchScript

torch.fx

Quantization

ONNX

Vulkan

Misc

Performance

Python API

Autograd

CUDA

C++ API

Distributed

TorchScript

Mobile

Vulkan

torch.fx

Quantization

Misc

Documentation

Python API

Autograd

Complex Number

CUDA

C++ API

Distributed

TorchScript

torch.fx

Quantization

ONNX

Misc
