pytorch/pytorch v1.2.0

Released: 2019-08-09

We have just released PyTorch v1.2.0.

It has over 1,900 commits and represents a significant amount of effort in areas spanning JIT, ONNX, and Distributed, as well as performance and eager frontend improvements.

Highlights

[JIT] New TorchScript API

Version 1.2 includes a new, easier-to-use API for converting nn.Modules into ScriptModules. A sample usage is:

class MyModule(torch.nn.Module):
    ...

# Construct an nn.Module instance
module = MyModule(args)

# Pass it to `torch.jit.script` to compile it into a ScriptModule.
my_torchscript_module = torch.jit.script(module)

torch.jit.script() will attempt to recursively compile the given nn.Module, including any submodules or methods called from forward(). See the migration guide for more info on what's changed and how to migrate.
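For example, helper methods and submodules reached from forward() are compiled along with the module. A minimal sketch (SubModule and helper are hypothetical names):

import torch

class SubModule(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)

class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.sub = SubModule()

    def helper(self, x):
        # called from forward(), so it is compiled too
        return x + 1

    def forward(self, x):
        # self.sub is recursively compiled as a submodule
        return self.sub(self.helper(x))

scripted = torch.jit.script(MyModule())
print(scripted(torch.randn(3)))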

[JIT] Improved TorchScript Python language coverage

In 1.2, TorchScript has significantly improved its support for Python language constructs and Python's standard library; a brief illustration follows below.

See the detailed notes below for more information.
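As a minimal sketch, assuming early returns inside conditionals are among the newly covered constructs:

import torch

@torch.jit.script
def relu_or_negate(x):
    # an early return inside a conditional now compiles
    if x.sum() > 0:
        return x.relu()
    return -x

print(relu_or_negate(torch.randn(4)))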

Expanded ONNX Export

In PyTorch 1.2, working with Microsoft, we’ve added full support for exporting ONNX Opset versions 7 (v1.2), 8 (v1.3), 9 (v1.4), and 10 (v1.5). We’ve also enhanced the constant folding pass to support Opset 10, the latest available version of ONNX. Additionally, users are now able to register their own symbolics to export custom ops, and to specify the dynamic dimensions of inputs during export. A summary of the major improvements can be found in the ONNX section below.

Updated docs can be found here, and a refreshed tutorial using ONNXRuntime can be found here.

TensorBoard Is No Longer Considered Experimental

Read the documentation or simply type from torch.utils.tensorboard import SummaryWriter to get started!
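A minimal logging sketch (the tag and values are arbitrary); event files go to ./runs/ by default and can be viewed with tensorboard --logdir=runs:

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()  # writes event files to ./runs/ by default
for step in range(5):
    # log a scalar value under the tag "loss/train"
    writer.add_scalar("loss/train", 1.0 / (step + 1), global_step=step)
writer.close()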

NN.Transformer

We include a standard nn.Transformer module, based on the paper “Attention is All You Need”. The nn.Transformer module relies entirely on an attention mechanism to draw global dependencies between input and output. The individual components of the nn.Transformer module are designed so they can be adopted independently. For example, the nn.TransformerEncoder can be used by itself, without the larger nn.Transformer. New APIs include nn.Transformer, nn.TransformerEncoder, nn.TransformerEncoderLayer, nn.TransformerDecoder, and nn.TransformerDecoderLayer.

See the Transformer Layers documentation for more info.
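A minimal usage sketch (all dimensions are arbitrary; inputs use the (seq_len, batch, d_model) layout):

import torch
import torch.nn as nn

# full encoder-decoder transformer
model = nn.Transformer(d_model=512, nhead=8, num_encoder_layers=6, num_decoder_layers=6)
src = torch.rand(10, 32, 512)  # (source seq len, batch, d_model)
tgt = torch.rand(20, 32, 512)  # (target seq len, batch, d_model)
out = model(src, tgt)

# the encoder can also be used on its own
encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
memory = encoder(src)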

Breaking Changes

Comparison operations (lt (<), le (<=), gt (>), ge (>=), eq (==), ne (!=)) return dtype has changed from torch.uint8 to torch.bool (21113)

Version 1.1:

>>> torch.tensor([1, 2, 3]) < torch.tensor([3, 1, 2])
tensor([1, 0, 0], dtype=torch.uint8)

Version 1.2:

>>> torch.tensor([1, 2, 3]) < torch.tensor([3, 1, 2])
tensor([True, False, False])

For most programs, we don't expect that any changes will need to be made as a result of this change. There are a couple of possible exceptions listed below.

Mask Inversion

In prior versions of PyTorch, the idiomatic way to invert a mask was to call 1 - mask. This behavior is no longer supported; use the ~ or bitwise_not() operator instead.

Version 1.1:

>>> 1 - (torch.tensor([1, 2, 3]) < torch.tensor([3, 1, 2]))
tensor([0, 1, 1], dtype=torch.uint8)

Version 1.2:

>>> 1 - (torch.tensor([1, 2, 3]) < torch.tensor([3, 1, 2]))
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported.
If you are trying to invert a mask, use the `~` or `bitwise_not()` operator instead.

>>> ~(torch.tensor([1, 2, 3]) < torch.tensor([3, 1, 2]))
tensor([False,  True,  True])

sum(Tensor) (Python built-in) does not upcast dtype like torch.sum

Python's built-in sum returns results in the same dtype as the tensor itself, so it will not return the expected result if the value of the sum cannot be represented in the dtype of the tensor.

Version 1.1:

# value can be represented in result dtype
>>> sum(torch.tensor([1, 2, 3, 4, 5]) > 2)
tensor(3, dtype=torch.uint8)

# value can NOT be represented in result dtype
>>> sum(torch.ones((300,)) > 0)
tensor(44, dtype=torch.uint8)

# torch.sum properly upcasts result dtype
>>> torch.sum(torch.ones((300,)) > 0)
tensor(300)

Version 1.2:

# value cannot be represented in result dtype (now torch.bool)
>>> sum(torch.tensor([1, 2, 3, 4, 5]) > 2)
tensor(True)

# value cannot be represented in result dtype
>>> sum(torch.ones((300,)) > 0)
tensor(True)

# torch.sum properly upcasts result dtype
>>> torch.sum(torch.ones((300,)) > 0)
tensor(300)

TLDR: use torch.sum instead of the built-in sum. Note that the built-in sum() behavior will more closely resemble torch.sum in the next release.

Note also that masking via torch.uint8 Tensors is now deprecated, see the Deprecations section for more information.

__invert__ / ~: now calls torch.bitwise_not instead of 1 - tensor and is supported for all integral and Boolean dtypes instead of only torch.uint8. (22326)

Version 1.1:

>>> ~torch.arange(8, dtype=torch.uint8)
tensor([ 1, 0, 255, 254, 253, 252, 251, 250], dtype=torch.uint8)

Version 1.2:

>>> ~torch.arange(8, dtype=torch.uint8)
tensor([255, 254, 253, 252, 251, 250, 249, 248], dtype=torch.uint8)

torch.tensor(bool) and torch.as_tensor(bool) now infer torch.bool dtype instead of torch.uint8. (19097)

Version 1.1:

>>> torch.tensor([True, False])
tensor([1, 0], dtype=torch.uint8)

Version 1.2:

>>> torch.tensor([True, False])
tensor([ True, False])

nn.BatchNorm{1,2,3}D: gamma (weight) is now initialized to all 1s rather than randomly initialized from U(0, 1). (13774)

Version 1.1:

>>> torch.nn.BatchNorm2d(5).weight
Parameter containing:
tensor([0.1635, 0.7512, 0.4130, 0.6875, 0.5496], 
       requires_grad=True)

Version 1.2:

>>> torch.nn.BatchNorm2d(5).weight
Parameter containing:
tensor([1., 1., 1., 1., 1.], requires_grad=True)

A number of deprecated Linear Algebra operators have been removed (22841)

Removed              Use Instead
-------------------  ----------------------
btrifact             lu
btrifact_with_info   lu with get_infos=True
btrisolve            lu_solve
btriunpack           lu_unpack
gesv                 solve
pstrf                cholesky
potrf                cholesky
potri                cholesky_inverse
potrs                cholesky_solve
trtrs                triangular_solve
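For instance, a migration sketch for the LU-based routines (shapes are arbitrary):

import torch

A = torch.randn(3, 3)
b = torch.randn(3, 2)

# btrifact -> lu
LU, pivots = torch.lu(A)

# btrisolve -> lu_solve
x = torch.lu_solve(b, LU, pivots)

# gesv -> solve
x2, _ = torch.solve(b, A)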

Sparse Tensors: Changing the sparsity of a Tensor through .data is no longer supported. (17072)

>>> x = torch.randn(2,3)
>>> x.data = torch.sparse_coo_tensor((2, 3))
RuntimeError: Attempted to call `variable.set_data(tensor)`,
but `variable` and  `tensor` have incompatible tensor type.

Sparse Tensors: in-place shape modifications of Dense Tensor Constructor Arguments will no longer modify the Sparse Tensor itself (20614)

Version 1.1:

>>> i = torch.tensor([[0, 1]])
>>> v = torch.ones(2)
>>> s = torch.sparse_coo_tensor(i, v)
>>> i.resize_(1, 1)
>>> v.resize_(1)

>>> s.coalesce().indices().shape
torch.Size([1, 1])

>>> s.coalesce().values().shape
torch.Size([1])

Notice indices() and values() reflect the resized tensor shapes.

Version 1.2:

>>> i = torch.tensor([[0, 1]])
>>> v = torch.ones(2)
>>> s = torch.sparse_coo_tensor(i, v)
>>> i.resize_(1, 1)
>>> v.resize_(1)

>>> s.coalesce().indices().shape
torch.Size([1, 2])

>>> s.coalesce().values().shape
torch.Size([2])

Notice indices() and values() reflect the original tensor shapes.

Sparse Tensors: Accumulating dense gradients into a sparse .grad will no longer retain Python object identity. (17072)

Version 1.1:

>>> m = torch.nn.Embedding(10, 3, sparse=True)
>>> m(torch.tensor([[1,2,4,5],[4,3,2,9]])).sum().backward()
>>> assert m.weight.grad.layout == torch.sparse_coo
>>> m_weight_grad_saved = m.weight.grad

# accumulate dense gradient into sparse .grad, change sparsity
>>> m.weight.sum().backward()
>>> assert m.weight.grad.layout == torch.strided
# m_weight_grad_saved still refers to the .grad of m's weight
# even though the sparsity has changed
>>> assert id(m_weight_grad_saved) == id(m.weight.grad)

Version 1.2:

>>> m = torch.nn.Embedding(10, 3, sparse=True)
>>> m(torch.tensor([[1,2,4,5],[4,3,2,9]])).sum().backward()
>>> assert m.weight.grad.layout == torch.sparse_coo
>>> m_weight_grad_saved = m.weight.grad

# accumulate dense gradient into sparse .grad, change sparsity
>>> m.weight.sum().backward()
>>> assert m.weight.grad.layout == torch.strided
# m_weight_grad_saved NO LONGER refers to the .grad of m's weight
>>> assert id(m_weight_grad_saved) == id(m.weight.grad)
AssertionError

nn.utils.convert_sync_batchnorm has been replaced with nn.SyncBatchNorm.convert_sync_batchnorm (18787)

Example of new usage:

>>> # Network with nn.BatchNorm layer
>>> module = torch.nn.Sequential(
>>>     torch.nn.Linear(20, 100),
>>>     torch.nn.BatchNorm1d(100)
>>> ).cuda()
>>> # creating process group (optional)
>>> process_group = torch.distributed.new_group(process_ids)
>>> sync_bn_module = torch.nn.SyncBatchNorm.convert_sync_batchnorm(module, process_group)

Error Checking: torch.addcmul and torch.lerp operators enforce stronger shape requirements on the output tensor (out= keyword argument) and do not allow the output tensor to be resized if it is also used as one of the inputs.

Version 1.1:

>>> x=torch.zeros(1)
>>> torch.addcmul(x, x, torch.zeros(2,3), out=x)
tensor([[0., 0., 0.],
        [0., 0., 0.]])

Version 1.2:

>>> x=torch.zeros(1)
>>> torch.addcmul(x, x, torch.zeros(2,3), out=x)
RuntimeError: output with shape [1] doesn't match the broadcast shape [2, 3]

If you run into this error, please ensure the out parameter is of the correct output shape (post-broadcasting).
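A conforming sketch that pre-allocates out with the broadcast shape (shapes are arbitrary):

import torch

a = torch.zeros(1)
t1 = torch.zeros(2, 3)
t2 = torch.ones(2, 3)

out = torch.empty(2, 3)  # matches the broadcast shape of the inputs
torch.addcmul(a, t1, t2, out=out)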

Error Checking: Improved Variable version tracking (20391, 22821, 21865)

PyTorch’s autograd system uses a version tracking mechanism to ensure that Tensors that are saved for backwards computations retain their correct values when the backward pass is computed (i.e. that they haven’t been updated in-place since they were saved). See In Place Correctness Checks in the docs for more information.

In PyTorch 1.2 we have enhanced the version tracking in a number of cases, which may flag issues that were not caught previously. There is now additional tracking through the Variable() constructor, the nn.Parameter() constructor, after setting .data, and via nn.Module._apply (internal API).

Track changes through Variable constructor:

>>> x = torch.ones(1, requires_grad=True)+1
>>> y = x*x

# do an in-place update through Variable constructor
>>> torch.autograd.Variable(x).add_(1)
>>> y.backward()
RuntimeError: one of the variables needed for gradient computation has been modified
by an inplace operation: [torch.FloatTensor [1]] is at version 1; expected version 0 
instead.

Track changes on an nn.Parameter:

>>> x = torch.ones(1)
>>> p = torch.nn.Parameter(x)
>>> y = p * p

# do an in-place update on a saved Parameter
>>> x.add_(1)
>>> y.sum().backward()
RuntimeError: one of the variables needed for gradient computation has been modified
by an inplace operation: [torch.FloatTensor [1]] is at version 1; expected version 0 
instead.

Track changes after setting .data:

>>> x = torch.zeros(1, requires_grad=True)+1
>>> y = x * x
>>> x.data = torch.zeros(1, requires_grad=True)+1

>>> x.add_(1)
>>> y.backward()
RuntimeError: one of the variables needed for gradient computation has been modified
by an inplace operation: [torch.FloatTensor [1]], which is output 0 of AddBackward0,
is at version 1; expected version 0 instead.

[JIT] Python called from scripted modules must be @ignored

torch.jit.script now recursively compiles everything it finds in the original function, so if you had Python functions called from your scripted function or module, you must now explicitly mark them with @torch.jit.ignore. See the new API guide for more details.

Version 1.1

def my_unscriptable_python_fn():
    # weird stuff
    ...

@torch.jit.script
def fn():
    # This gets inserted as a Python call, and only errors on `save()`.
    my_unscriptable_python_fn()

Version 1.2

@torch.jit.ignore  # this needs to be added ...
def my_unscriptable_python_fn():
    ...

@torch.jit.script
def fn():
    # ... or else recursive compilation will attempt to compile this call
    my_unscriptable_python_fn()

NOTE: This also changes the behavior of the @torch.jit.ignore decorator. In version 1.1, @ignore tells the compiler to omit compiling a function entirely, to mark Python functions that you know will not be called after export. In version 1.2, @ignore tells the compiler to insert a call back to the Python interpreter instead of trying to compile the function.

To get the old behavior, use @torch.jit.ignore(drop_on_export=True) (@torch.jit.ignore with no arguments is equivalent to @torch.jit.ignore(drop_on_export=False)).
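A short sketch of the two modes (the function names are hypothetical):

import torch

@torch.jit.ignore(drop_on_export=True)
def debug_only_fn():
    # v1.1-style behavior: dropped entirely on save()/export
    ...

@torch.jit.ignore
def python_helper_fn():
    # v1.2 default: compiled as a call back into the Python interpreter
    ...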

[JIT] optimize for ScriptModules is now a context manager

Whether optimization passes are run is now a thread-local flag. This better reflects how optimization actually happens in the JIT (i.e. it is decided at runtime, not compilation time).

Version 1.1

@torch.jit.script(optimize=False)
def fn(inputs):
    ...

fn(inputs)

Version 1.2

@torch.jit.script
def fn(inputs):
    ...

with torch.jit.optimized_execution(False):
    fn(inputs)

[JIT] script::Module is now a reference type

To better align with the PyTorch C++ API philosophy, script::Module and script::Method are now reference types. Our APIs have been updated to use script::Module instead of std::shared_ptr<script::Module>.

Version 1.1

using torch::jit::script::Module;

std::shared_ptr<Module> m = torch::jit::load("my_model.py");
m->forward(...);

Version 1.2

using torch::jit::script::Module;

Module m = torch::jit::load("my_model.py");
m.forward(...);

[C++ only] mean() / sum() / prod() APIs have changed slightly (21088)

Version 1.1 API:

Tensor sum(IntArrayRef dim, bool keepdim=false) const;    
Tensor sum(IntArrayRef dim, ScalarType dtype) const;

Version 1.2 API:

Tensor sum(IntArrayRef dim, bool keepdim=false,
           c10::optional<ScalarType> dtype=c10::nullopt) const;

That is, to override dtype, keepdim must now be provided explicitly.

Binary distribution and nightly changes

We have streamlined our conda and wheel binary distributions, so that it is easier than ever to install the version of PyTorch appropriate for your needs. The install instructions on https://pytorch.org/ have been updated, but if you have tooling to download and install PyTorch, here is a detailed description of the changes we made:

Wheels now have local version identifiers. Wheels that are for non-default CUDA configurations (the default CUDA version for this release is 10.0) now have local version identifiers like +cpu and +cu92. This means that, when installing, it is no longer necessary to specify a full wheel URL—just specify an appropriate version constraint like torch==1.2.0+cu92.

Version 1.1 (for Python 3.7 on Linux only):

pip install numpy
pip install https://download.pytorch.org/whl/cpu/torch-1.1.0-cp37-cp37m-linux_x86_64.whl

Version 1.2 (works for all versions of Python, and both Linux and Mac):

pip install torch==1.2.0+cpu -f https://download.pytorch.org/whl/torch_stable.html

CPU-only binaries on conda can be selected with the cpuonly feature. We’ve eliminated the pytorch-cpu conda package; instead, the CPU-only conda package can be enabled by installing the cpuonly metapackage. Similarly, there is no longer both a torchvision and a torchvision-cpu package; the cpuonly feature will ensure that the CPU version of torchvision is selected.

Version 1.1:

conda install -c pytorch pytorch-cpu

Version 1.2:

conda install -c pytorch pytorch cpuonly

Conda nightlies now live in the pytorch-nightly channel and no longer have “-nightly” in their name. We have added a new dedicated channel for nightlies called pytorch-nightly; all nightlies (pytorch, torchvision, torchaudio, etc.) will now be uploaded to this channel, but with the same names as their corresponding stable versions (unlike before, when we had separate pytorch-nightly, torchvision-nightly, etc. packages). This makes it more difficult to accidentally install copies of the nightly and stable packages at the same time.

Version 1.1:

conda install -c pytorch pytorch-nightly

Version 1.2:

conda install -c pytorch-nightly pytorch

Wheel nightlies no longer have -nightly in their name. Similar to the changes we made in Conda, we no longer suffix wheel nightlies with “-nightly”, to make it harder to accidentally install a copy of nightly and stable at the same time.

Version 1.1:

pip install --pre torch_nightly -f https://download.pytorch.org/whl/nightly/torch_nightly.html

Version 1.2:

pip install --pre torch -f https://download.pytorch.org/whl/nightly/torch_nightly.html

New Features

Tensor Type Support

NN Package

Operators

Optim Package

Distributed Package

IterableDataset

TensorBoard Package

JIT Features

Improvements

Distributed Improvements

TensorBoard Improvements

Numpy Compatibility Improvements

JIT Improvements

C++ API Improvements

MKLDNN Tensor Improvements

Added support for a number of operators on MKLDNN Tensors.

Bug Fixes

torch.nn Bug Fixes

Distributed Bug fixes

JIT Bug Fixes

C++ Frontend bug fixes

Deprecations

Masking via torch.uint8 Tensors is now deprecated in favor of masking via torch.bool Tensors.

See the Breaking Changes section for more details about torch.bool Tensors and comparison operators.

torch.masked_select, torch.masked_fill, torch.masked_scatter now expect torch.bool masks rather than torch.uint8.

>>> a = torch.tensor([1, 2, 3])
>>> b = torch.tensor([3, 1, 2])

>>> a.masked_select(torch.tensor([0, 1, 1], dtype=torch.uint8))
UserWarning: masked_select received a mask with dtype torch.uint8,
this behavior is now deprecated, please use a mask with dtype torch.bool instead.

tensor([2, 3])

# instead use torch.bool
>>> a.masked_select(torch.tensor([False,  True,  True]))
tensor([2, 3])

Comparison operators with out= parameters now expect torch.bool dtype rather than torch.uint8.

>>> a = torch.tensor([1, 2, 3])
>>> b = torch.tensor([3, 1, 2])
>>> res = torch.empty_like(a, dtype=torch.uint8)
>>> torch.gt(a, b, out=res)
UserWarning: torch.gt received 'out' parameter with dtype torch.uint8, this behavior
is now deprecated, please use 'out' parameter with dtype torch.bool instead.

tensor([0, 1, 1], dtype=torch.uint8)

# instead use torch.bool
>>> res = torch.empty_like(a, dtype=torch.bool)
>>> torch.gt(a, b, out=res)
tensor([False, True, True])

Legacy autograd.Function (a Function without a static forward method) is now deprecated

>>> class MyLegacyFunction(Function):
>>>     def forward(self, x):
>>>         return x
>>>
>>>     def backward(self, grad_output):
>>>         return grad_output
>>>
>>> MyLegacyFunction()(torch.randn((3,), requires_grad=True))
UserWarning: Legacy autograd function with non-static forward method is deprecated
and will be removed in 1.3. Please use new-style autograd function
with static forward method.

# instead use new-style Autograd Function
>>> class MyFunction(Function):
>>>     @staticmethod
>>>     def forward(ctx, x):
>>>         return x
>>>
>>>     @staticmethod
>>>     def backward(ctx, grad_output):
>>>         return grad_output
>>>
>>> MyFunction.apply(torch.randn((3,), requires_grad=True))

See the torch.autograd.Function documentation for more details.

torch.gels has been renamed to torch.lstsq; torch.gels will work for this release but is now deprecated. (23460)
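A minimal migration sketch (torch.lstsq keeps torch.gels' argument order; shapes are arbitrary):

import torch

A = torch.randn(5, 3)
b = torch.randn(5, 2)

# was: torch.gels(b, A)
solution, qr = torch.lstsq(b, A)
x = solution[:3]  # the least-squares solution sits in the first n rows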

Performance

Torch.NN Performance Improvements

Documentation

Torch.NN Documentation

Contributor Documentation

Build Documentation

TensorBoard Documentation

Torch HUB Documentation

ONNX

In PyTorch 1.2, we have added full support for ONNX Opsets 7, 8, 9, and 10 in the ONNX exporter, and we have also enhanced the constant folding pass to support Opset 10. Export of ScriptModules is better supported. Additionally, users are now able to register their own symbolics to export custom ops, and to specify the dynamic dimensions of inputs during export.
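A minimal export sketch showing an opset choice and a dynamic batch dimension (the model, file name, and axis names here are placeholders):

import torch

model = torch.nn.Linear(4, 2)
dummy = torch.randn(1, 4)

torch.onnx.export(
    model, dummy, "model.onnx",
    opset_version=10,  # any of the supported opsets 7-10
    input_names=["input"],
    output_names=["output"],
    # mark dim 0 of input and output as variable-sized
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)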

Supporting More ONNX Opsets

Enhancing the Support for ScriptModule

Exporting More Torch Operators to ONNX

Extending Existing Exporting Logic

Optimizing Exported ONNX Graph

Bugfixes/Improvements
