
v1.10.0

pytorch/pytorch

Release date: 2021-10-21 23:49:53

Latest pytorch/pytorch release: v2.5.1 (2024-10-30 01:58:24)

1.10.0 Release Notes

Highlights

We are excited to announce the release of PyTorch 1.10. This release is composed of over 3,400 commits since 1.9, made by 426 contributors. We want to sincerely thank our community for continuously improving PyTorch.

PyTorch 1.10 updates are focused on improving training and performance of PyTorch, and developer usability. Highlights include:

You can check the blogpost that shows the new features here.

Backwards Incompatible changes

Python API

torch.any/torch.all behavior changed slightly to be more consistent for zero-dimension, uint8 tensors. (#64642)

These two functions match the behavior of NumPy, returning an output dtype of bool for all supported dtypes except uint8 (in which case they return a 1 or a 0, but with uint8 dtype). In some cases with 0-dim tensor inputs, the returned uint8 value could mistakenly take on a value > 1. This has now been fixed.

1.9.1:

>>> torch.all(torch.tensor(42, dtype=torch.uint8))
tensor(1, dtype=torch.uint8)
>>> torch.all(torch.tensor(42, dtype=torch.uint8), dim=0)
tensor(42, dtype=torch.uint8) # wrong, old behavior

1.10.0:

>>> torch.all(torch.tensor(42, dtype=torch.uint8))
tensor(1, dtype=torch.uint8)
>>> torch.all(torch.tensor(42, dtype=torch.uint8), dim=0)
tensor(1, dtype=torch.uint8) # new, corrected and consistent behavior

Remove deprecated torch.{is,set}_deterministic (#62158)

This is the end of the deprecation cycle for both of these functions. You should be using torch.use_deterministic_algorithms and torch.are_deterministic_algorithms_enabled instead.
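As a rough migration sketch, the two replacement functions can be used as follows. The variable names are illustrative only, and the snippet restores the previous global setting so that it has no lasting effect.

```python
import torch

# Remember the current global setting so the example can restore it afterwards.
previous = torch.are_deterministic_algorithms_enabled()

# Replacement for the removed torch.set_deterministic(True).
torch.use_deterministic_algorithms(True)

# Replacement for the removed torch.is_deterministic().
enabled = torch.are_deterministic_algorithms_enabled()

# Put the global flag back to what it was before the example ran.
torch.use_deterministic_algorithms(previous)
restored = torch.are_deterministic_algorithms_enabled()
```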

Complex Numbers

Conjugate View: tensor.conj() now returns a view tensor that aliases the same memory and has conjugate bit set (#54987, #60522, #66082, #63602).

This means that .conj() is now an O(1) operation and returns a tensor that views the same memory as tensor and has conjugate bit set. This notion of conjugate bit enables fusion of operations with conjugation which gives a lot of performance benefit for operations like matrix multiplication. All out-of-place operations will have the same behavior as before, but an in-place operation on a conjugated tensor will additionally modify the input tensor.

1.9.1:

>>> import torch
>>> x = torch.tensor([1+2j])
>>> y = x.conj()
>>> y.add_(2)
>>> print(x)
tensor([1.+2.j])

1.10.0:

>>> import torch
>>> x = torch.tensor([1+2j])
>>> y = x.conj()
>>> y.add_(2)
>>> print(x)
tensor([3.+2.j])

Note: You can verify whether the conj bit is set by calling tensor.is_conj(). The conjugation can be resolved at any time, i.e., you can obtain a new tensor that doesn't share storage with the input tensor by calling conjugated_tensor.clone() or conjugated_tensor.resolve_conj().

Note that these conjugated tensors behave differently from the corresponding numpy arrays obtained from np.conj() when an in-place operation is performed on them (similar to the example shown above).
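A minimal sketch of checking the conjugate bit and resolving it is shown below. The variable names are illustrative. After resolve_conj(), in-place changes no longer reach the original tensor.

```python
import torch

x = torch.tensor([1 + 2j])

# y is a lazy conjugate view of x: same memory, conjugate bit set.
y = x.conj()
view_has_conj_bit = y.is_conj()

# resolve_conj() materializes the conjugation into a new tensor
# that does not share storage with x.
resolved = y.resolve_conj()
resolved_has_conj_bit = resolved.is_conj()

# In-place update on the resolved tensor leaves x untouched.
resolved.add_(2)
```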

Negative View: tensor.conj().neg() returns a view tensor that aliases the same memory as both tensor and tensor.conj() and has a negative bit set (#56058).

conjugated_tensor.neg() continues to be an O(1) operation, but the returned tensor shares memory with both tensor and conjugated_tensor.

1.9.1:

>>> x = torch.tensor([1+2j])
>>> y = x.conj()
>>> z = y.imag
>>> z.add_(2)
>>> print(x)
tensor([1.+2.j])

1.10.0:

>>> x = torch.tensor([1+2j])
>>> y = x.conj()
>>> z = y.imag
>>> print(z.is_neg())
True
>>> z.add_(2)
>>> print(x)
tensor([1.-0.j])

tensor.numpy() now throws RuntimeError when called on a tensor with conjugate or negative bit set (#61925).

Because the notion of conjugate bit and negative bit doesn’t exist outside of PyTorch, calling operations that return a Python object viewing the same memory as input like .numpy() would no longer work for tensors with conjugate or negative bit set.

1.9.1:

>>> x = torch.tensor([1+2j])
>>> y = x.conj().imag
>>> print(y.numpy())
[-2.]

1.10.0:

>>> x = torch.tensor([1+2j])
>>> y = x.conj().imag
>>> print(y.numpy())
RuntimeError: Can't call numpy() on Tensor that has negative bit set. Use tensor.resolve_neg().numpy() instead.
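Following the hint in the error message, a small sketch of the workaround might look like this. The variable names are illustrative and NumPy is assumed to be installed.

```python
import torch

x = torch.tensor([1 + 2j])

# The imaginary part of a conjugate view is a view with the negative bit set.
y = x.conj().imag
has_neg_bit = y.is_neg()

# resolve_neg() materializes the negation, so the result can be handed to NumPy.
arr = y.resolve_neg().numpy()
```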
      

Autograd

Raise TypeError instead of RuntimeError when assigning to a Tensor’s grad field with wrong type (#64876)

Setting the .grad field to an object that is neither None nor a Tensor used to raise a RuntimeError, but it now properly raises a TypeError. If your code was catching this error, update it to catch a TypeError instead of a RuntimeError.

1.9.1:

try:
    # Assigning an int to a Tensor's grad field
    a.grad = 0
except RuntimeError as e:
    pass

1.10.0:

try:
    # Assigning an int to a Tensor's grad field
    a.grad = 0
except TypeError as e:
    pass

Raise error when inputs to autograd.grad are empty (#52016)

Calling autograd.grad with an empty list of inputs used to do the same as backward. To reduce confusion, it now raises the expected error. If you were relying on this, you can simply update your code as follows:

1.9.1:

grad = autograd.grad(out, tuple())
assert grad == tuple()

1.10.0:

out.backward()

Optional arguments to autograd.gradcheck and autograd.gradgradcheck are now kwarg-only (#65290)

These two functions now have a significant number of optional arguments controlling what they do (e.g., eps, atol, rtol, raise_exception). To improve readability, we made these arguments kwarg-only. If you are passing these arguments to autograd.gradcheck or autograd.gradgradcheck as positional arguments, you can update your code as follows:

1.9.1:

torch.autograd.gradcheck(fn, x, 1e-6)

1.10.0:

torch.autograd.gradcheck(fn, x, eps=1e-6)
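For a self-contained illustration, the sketch below runs gradcheck with keyword arguments and then shows that the old positional form is rejected. The function square_sum and the input values are hypothetical. Double precision is used because gradcheck compares against finite differences.

```python
import torch


def square_sum(x):
    # Simple differentiable function used only for this illustration.
    return (x ** 2).sum()


x = torch.tensor([0.5, -1.0, 2.0], dtype=torch.double, requires_grad=True)

# Optional arguments must now be passed by keyword.
ok = torch.autograd.gradcheck(square_sum, (x,), eps=1e-6, atol=1e-4)

# Passing eps positionally, as was allowed in 1.9, now fails with a TypeError.
try:
    torch.autograd.gradcheck(square_sum, (x,), 1e-6)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```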
      

In-place detach (detach_) now errors for views that return multiple outputs (#58285)

This change is finishing the deprecation cycle for the inplace-over-view logic. In particular, a few things that were warning are updated:

* `detach_` will now raise an error when invoked on any view created by `split`, `split_with_sizes`, or `chunk`. You should use the non-inplace `detach` instead.
* The error message shown when an in-place operation (other than detach) is performed on a view created by `split`, `split_with_sizes`, or `chunk` has been changed from "This view is an output of a function..." to "This view is the output of a function...".

1.9.1:

b = a.split(1)[0]
b.detach_()

1.10.0:

b = a.split(1)[0]
c = b.detach()
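To make the replacement concrete, here is a small sketch with an illustrative input tensor. It shows that the non-inplace detach returns a tensor that is cut off from the autograd graph but still shares storage with the original.

```python
import torch

a = torch.arange(4.0, requires_grad=True)

# split returns several views of a; b is the first one.
b = a.split(1)[0]

# Non-inplace detach: a new tensor object without autograd history.
c = b.detach()

b_tracks_grad = b.requires_grad
c_tracks_grad = c.requires_grad
shares_storage = c.data_ptr() == a.data_ptr()
```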
      

Fix saved variable unpacking version counter (#60195)

In-place operations on unpacked SavedVariables used to be ignored. They are now properly detected, which can lead to errors saying that a variable needed for backward was modified in-place. This is a valid error, and the user should fix it by cloning the unpacked saved variable before modifying it.

No internal formula will trigger this, but it might be triggered by user custom autograd.Function if the backward modifies a saved Tensor inplace and you do multiple backwards. This used to silently return the wrong result and will now raise the expected error.
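The cloning fix can be sketched with a hypothetical custom Function (Square is not part of PyTorch). The backward clones the saved tensor before modifying it in place, so repeated backward passes see the same saved value.

```python
import torch


class Square(torch.autograd.Function):
    """Hypothetical custom Function computing x * x."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Clone first: the in-place multiply below then touches only the copy,
        # never the tensor that was saved for backward.
        scratch = x.clone()
        scratch.mul_(2)
        return grad_out * scratch


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = Square.apply(x).sum()

# Run backward twice over the same graph; both passes should agree.
y.backward(retain_graph=True)
first = x.grad.clone()
x.grad = None
y.backward()
second = x.grad.clone()
```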

torch.nn

Added optional tensor arguments to __torch_function__ handling checks (#63967)

This fixes the has_torch_function*() checks throughout torch.nn.functional to correctly pass in optional tensor arguments; prior to this fix, handle_torch_function() was not called for these optional tensor arguments. Previously, passing a tensor-like object into a function that accepts an optional tensor might not trigger that object's __torch_function__. Now, the object's __torch_function__ will be triggered as expected.

1.9.1:

import torch
import torch.nn.functional as F
class TestTensor(object):
    def __init__(self, weight):
        self.weight = weight
    def __torch_function__(self, func, _, args=(), kwargs=None):
        print(func)
        print(func == F.group_norm)
# Call F.group_norm with a custom Tensor as the non-optional arg 'features'
features = TestTensor(torch.randn(3,3))
F.group_norm(features, 3)
# ...prints "group_norm" and True
# Call F.group_norm with a custom Tensor as the optional arg 'weight'
features = torch.randn(3,3)
weight = TestTensor(torch.randn(3))
F.group_norm(features, 3, weight=weight)
# ...prints "group_norm" and False because weight's __torch_function__ is
# called with func as torch.group_norm instead of F.group_norm

1.10.0:

import torch
import torch.nn.functional as F
class TestTensor(object):
    def __init__(self, weight):
        self.weight = weight
    def __torch_function__(self, func, _, args=(), kwargs=None):
        print(func)
        print(func == F.group_norm)
# Call F.group_norm with a custom Tensor as the non-optional arg 'features'
features = TestTensor(torch.randn(3,3))
F.group_norm(features, 3)
# ...prints "group_norm" and True
# Call F.group_norm with a custom Tensor as the optional arg 'weight'
features = torch.randn(3,3)
weight = TestTensor(torch.randn(3))
F.group_norm(features, 3, weight=weight)
# ...prints "group_norm" and True

CUDA

Removed post-backward syncs on default stream (#60421)

Calls to backward() or grad() synced only the calling thread's default stream with autograd leaf streams at the end of backward. This made the following weird pattern safe:

with torch.cuda.stream(s):
    # imagine forward used many streams, so backward leaf nodes may run on many streams
    loss.backward()  # no sync
use grads

but a more benign-looking pattern was unsafe:

with torch.cuda.stream(s):
    # imagine forward used a lot of streams, so backward leaf nodes may run on many streams
    loss.backward()
    # backward() syncs the default stream with all the leaf streams, but does not sync s with anything,
    # so counterintuitively (even though we're in the same stream context as backward()!)
    # it is NOT SAFE to use grads here, and there's no easy way to make it safe,
    # unless you manually sync on all the streams you used in forward,
    # or move "use grads" back to default stream outside the context.
    use grads

Note: this change makes it so that backward() has the same user-facing stream semantics as any CUDA op. In other words, the weird pattern is now unsafe, and the benign-looking pattern is now safe. Implementation-wise, this meant backward() should sync its calling thread's current stream, not the default stream, with the leaf streams. This PR deletes the syncs on the default stream.
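The pattern that is safe from 1.10 onward can be sketched as below. The helper name grads_on_side_stream and the tiny model are hypothetical. The function falls back to a plain CPU computation when no GPU is available, so it only exercises the stream logic on CUDA machines.

```python
import torch


def grads_on_side_stream():
    """Run backward() and read the gradients inside the same stream context."""
    if not torch.cuda.is_available():
        # CPU fallback so the sketch still runs without a GPU.
        w = torch.ones(3, requires_grad=True)
        (w * 2).sum().backward()
        return w.grad.clone()

    s = torch.cuda.Stream()
    w = torch.ones(3, device="cuda", requires_grad=True)

    # Make the side stream wait for the creation of w on the current stream.
    s.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(s):
        loss = (w * 2).sum()
        loss.backward()
        # backward() synced stream s with the leaf streams,
        # so reading w.grad here is safe.
        g = w.grad * 1

    # Before using g outside the context, make the current stream wait for s.
    torch.cuda.current_stream().wait_stream(s)
    return g.cpu()


result = grads_on_side_stream()
```

Using the gradients outside the stream context without the final wait_stream call would correspond to the pattern that this release makes unsafe.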

torch.package

1.9.1:

with PackageExporter(buffer, verbose=False) as e:
    e.intern("**")
    e.save_pickle("res", "mod1.pkl", mod1)
    e.save_pickle("res", "mod2.pkl", mod2)

1.10.0:

with PackageExporter(buffer) as e:
    e.intern("**")
    e.save_pickle("res", "mod1.pkl", mod1)
    e.save_pickle("res", "mod2.pkl", mod2)

Quantization

Added extra observer/fake_quant (the same observer/fake_quant instance as the input) for some operators in prepare_fx, e.g. maxpool, add_scalar and mul_scalar (#61687, #61859)

Previously, the way observers/fake_quants were inserted was specific to the fbgemm/qnnpack backends. As we work on making FX Graph Mode Quantization extensible to custom backends, we are changing some behaviors for the fbgemm/qnnpack path as well. The changes above add an extra observer/fake_quant to the output of some operators so that the quantized operator is modeled more accurately in quantization-aware training.

We will show an example with torch.nn.MaxPool2d:

import torch
from torch.quantization.quantize_fx import prepare_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.maxpool2d = torch.nn.MaxPool2d(kernel_size=3)

    def forward(self, x):
        x = self.maxpool2d(x)
        return x

m = M().eval()
m = prepare_fx(m, {"": torch.quantization.default_qconfig})
print(m.code)

1.9.1:

def forward(self, x):
    x_activation_post_process_0 = self.x_activation_post_process_0(x); x = None
    maxpool2d = self.maxpool2d(x_activation_post_process_0); x_activation_post_process_0 = None
    return maxpool2d

1.10.0:

def forward(self, x):
    x_activation_post_process_0 = self.x_activation_post_process_0(x); x = None
    maxpool2d = self.maxpool2d(x_activation_post_process_0); x_activation_post_process_0 = None
    maxpool2d_activation_post_process_0 = self.maxpool2d_activation_post_process_0(maxpool2d); maxpool2d = None
    return maxpool2d_activation_post_process_0

Note that self.maxpool2d_activation_post_process_0 and self.x_activation_post_process_0 refer to the same observer/fake_quant instance. This simulates the numerics of the quantized maxpool implementation, where the output reuses the quantization parameters of the input. A simple illustration with a graph:

Before:

observer_0 - maxpool - ...

After:

observer_0 - maxpool - observer_0 (same observer instance as input observer) - ...

ONNX

Removed aten arg from torch.onnx.export(). (#62759)

The new OperatorExportTypes.ONNX removes the need for an explicit aten argument. If PyTorch was built with -DPYTORCH_ONNX_CAFFE2_BUNDLE, then a None value means OperatorExportTypes.ONNX_ATEN_FALLBACK.

1.9.1:

torch.onnx.export(..., aten=True)

1.10.0:

torch.onnx.export(..., operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN)

Deprecations

Python API

Deprecate __torch_function__ as a plain method (#64843)

The __torch_function__ function used to create Tensor-like objects did not have any constraint on whether it should be a method, a class method, or a static method.

To make it compatible with newer features on Tensor-like objects, we are deprecating setting it as a plain method. You can define it as a class method to get the current class and scan the argument list if you need an object that is an instance of this class.
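One possible shape for the class-method form is sketched below. The Wrapper class is hypothetical, and the unwrapping logic is only one way to scan the argument list for instances of the class.

```python
import torch


class Wrapper:
    """Hypothetical Tensor-like object that wraps a real tensor."""

    def __init__(self, data):
        self.data = data

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # Scan the positional arguments and unwrap every instance of this class.
        unwrapped = tuple(a.data if isinstance(a, cls) else a for a in args)
        # Call the original function on plain tensors and re-wrap the result.
        return cls(func(*unwrapped, **kwargs))


w = Wrapper(torch.tensor([1.0, 2.0]))
out = torch.add(w, 1)
```

Here out is again a Wrapper, because torch.add is routed through the class-level __torch_function__.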

Mobile

Removed API torch.utils.bundled_inputs.run_on_bundled_input (#58344)

This API caused many issues and is not really necessary. The functionality (run model with bundled input) can be achieved by using get_all_bundled_inputs. For example:

1.9.1:

model.run_on_bundled_input(0)

1.10.0:

model(*model.get_all_bundled_inputs()[0])

Distributed

torch.distributed.rpc: Removed ProcessGroup RPC backend (#62411, #62985)

ProcessGroup RPC backend has been deprecated and 1.9 was the last release which carried it. The default RPC backend is TensorPipe which is the recommended backend for RPC. Users who use torch.distributed.rpc.BackendType.PROCESS_GROUP will be given an error message to switch to torch.distributed.rpc.BackendType.TENSORPIPE.

ONNX

Removed following arguments in torch.onnx.export(): enable_onnx_checker, strip_doc_string, _retain_param_name (#64369, #64371, #64370)

The enable_onnx_checker argument is removed. The ONNX checker now always runs by default; users can catch exceptions to ignore raised failures. strip_doc_string has been rolled into the verbose argument of torch.onnx.export(). The _retain_param_name argument has been removed, and torch.onnx.export() now behaves as if it were set to True. There is no way to get the old behavior of _retain_param_name=False, so users should stop setting this argument.

1.9.1:

torch.onnx.export(..., enable_onnx_checker=False, strip_doc_string=False)

1.10.0:

try:
    torch.onnx.export(..., verbose=True)
except torch.onnx.utils.ONNXCheckerError:
    pass

Infra (Releng)

Disable ParallelTBB (#65092)

The ParallelTBB config/codepath is no longer actively tested by PyTorch CI and, as a result, is subject to code/functionality degradation.

New features

Python API

Autograd

torch.nn

Profiler

CUDA

C++ API

TorchScript

torch.package

Mobile

Distributed

DistributedDataParallel

torch.distributed

torch.fx

ONNX

Infra (Releng)

Misc

Improvements

Python API

Autograd

torch.nn

Profiler

Dataloader

CUDA

TorchScript

torch.package

Mobile

Quantization

Distributed

DistributedDataParallel

torch.distributed

torch.distributed.nn.RemoteModule

torch.distributed.elastic

torch.distributed.rpc

torch.distributed.Store

torch.distributed.pipeline

torch.fx

Composability

Build_Frontend

Infra (Releng)

Misc

Bug fixes

Python API

Autograd

torch.nn

Dataloader

AMD

CUDA

C++ API

TorchScript

torch.package

Mobile

Quantization

Distributed

DistributedDataParallel

torch.distributed.Store

torch.distributed.rpc

torch.distributed.elastic

torch.distributed.autograd

torch.distributed

torch.fx

ONNX

Vulkan

Performance_as_a_product

Composability

Build_Frontend

Infra (Releng)

LinAlg_Frontend

Sparse_Frontend

Misc

Performance

Python API

Autograd

torch.nn

CUDA

Mobile

Distributed

Vulkan

Performance_as_a_product

Composability

Build_Frontend

Infra (Releng)

Sparse_Frontend

Misc

You can also find the dev-specific and documentation-related changes in the forum post here.
