v0.4.1
Release date: 2018-07-27 03:09:19
Table of Contents
- Breaking Changes
- New Features
  - Neural Networks
    - Adaptive Softmax, Spectral Norm, etc.
  - Operators
    - torch.bincount, torch.as_tensor, ...
  - torch.distributions
    - Half Cauchy, Gamma Sampling, ...
  - Other
    - Automatic anomaly detection (detecting NaNs, etc.)
- Performance
  - Faster CPU ops in a wide variety of cases
- Other improvements
- Bug Fixes
- Documentation Improvements
Breaking Changes
- `torch.stft` has changed its signature to be consistent with librosa https://github.com/pytorch/pytorch/pull/9497 (a short usage sketch follows this list)
  - Before: `stft(signal, frame_length, hop, fft_size=None, normalized=False, onesided=True, window=None, pad_end=0)`
  - After: `stft(input, n_fft, hop_length=None, win_length=None, window=None, center=True, pad_mode='reflect', normalized=False, onesided=True)`
- `torch.stft` also now uses FFT internally and is much faster.
- `torch.slice` is removed in favor of the tensor slicing notation https://github.com/pytorch/pytorch/pull/7924
- `torch.arange` now does dtype inference: any floating-point argument is inferred to be the default `dtype`; all integer arguments are inferred to be `int64`. https://github.com/pytorch/pytorch/pull/7016
- `torch.nn.functional.embedding_bag`'s old signature `embedding_bag(weight, input, ...)` is deprecated; `embedding_bag(input, weight, ...)` (consistent with `torch.nn.functional.embedding`) should be used instead
- `torch.nn.functional.sigmoid` and `torch.nn.functional.tanh` are deprecated in favor of `torch.sigmoid` and `torch.tanh` https://github.com/pytorch/pytorch/pull/8748
- Broadcast behavior changed in a (very rare) edge case: `[1] x [0]` now broadcasts to `[0]` (it used to be `[1]`) https://github.com/pytorch/pytorch/pull/9209
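
Below is a minimal sketch (not part of the original notes) of the new `torch.stft` call style and the `torch.arange` dtype inference; the signal shape, `n_fft`, and window choice are illustrative.

```python
import torch

# New-style stft call: n_fft / hop_length / window replace the old
# frame_length / hop arguments (values here are illustrative).
signal = torch.randn(2, 16000)                      # (batch, samples)
spec = torch.stft(signal, n_fft=400, hop_length=160,
                  window=torch.hann_window(400))

# arange dtype inference: integer arguments give int64,
# floating-point arguments give the default dtype.
torch.arange(5).dtype    # torch.int64
torch.arange(5.).dtype   # torch.float32 (the default dtype)
```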
New Features
Neural Networks
- Adaptive Softmax `nn.AdaptiveLogSoftmaxWithLoss` https://github.com/pytorch/pytorch/pull/5287

  ```python
  >>> in_features = 1000
  >>> n_classes = 200
  >>> adaptive_softmax = nn.AdaptiveLogSoftmaxWithLoss(in_features, n_classes, cutoffs=[20, 100, 150])
  >>> adaptive_softmax
  AdaptiveLogSoftmaxWithLoss(
    (head): Linear(in_features=1000, out_features=23, bias=False)
    (tail): ModuleList(
      (0): Sequential(
        (0): Linear(in_features=1000, out_features=250, bias=False)
        (1): Linear(in_features=250, out_features=80, bias=False)
      )
      (1): Sequential(
        (0): Linear(in_features=1000, out_features=62, bias=False)
        (1): Linear(in_features=62, out_features=50, bias=False)
      )
      (2): Sequential(
        (0): Linear(in_features=1000, out_features=15, bias=False)
        (1): Linear(in_features=15, out_features=50, bias=False)
      )
    )
  )
  >>> batch = 15
  >>> input = torch.randn(batch, in_features)
  >>> target = torch.randint(n_classes, (batch,), dtype=torch.long)
  >>> # get the log probabilities of target given input, and mean negative log probability loss
  >>> adaptive_softmax(input, target)
  ASMoutput(output=tensor([-6.8270, -7.9465, -7.3479, -6.8511, -7.5613, -7.1154, -2.9478,
          -6.9885, -7.7484, -7.9102, -7.1660, -8.2843, -7.7903, -8.4459,
          -7.2371], grad_fn=<ThAddBackward>), loss=tensor(7.2112, grad_fn=<MeanBackward1>))
  >>> # get the log probabilities of all targets given input as a (batch x n_classes) tensor
  >>> adaptive_softmax.log_prob(input)
  tensor([[-2.6533, -3.3957, -2.7069,  ..., -6.4749, -5.8867, -6.0611],
          [-3.4209, -3.2695, -2.9728,  ..., -7.6664, -7.5946, -7.9606],
          [-3.6789, -3.6317, -3.2098,  ..., -7.3722, -6.9006, -7.4314],
          ...,
          [-3.3150, -4.0957, -3.4335,  ..., -7.9572, -8.4603, -8.2080],
          [-3.8726, -3.7905, -4.3262,  ..., -8.0031, -7.8754, -8.7971],
          [-3.6082, -3.1969, -3.2719,  ..., -6.9769, -6.3158, -7.0805]],
         grad_fn=<CopySlices>)
  >>> # predict: get the class that maximizes log probability for each input
  >>> adaptive_softmax.predict(input)
  tensor([ 8,  6,  6, 16, 14, 16, 16,  9,  4,  7,  5,  7,  8, 14,  3])
  ```
- Add spectral normalization `nn.utils.spectral_norm` https://github.com/pytorch/pytorch/pull/6929

  ```python
  >>> # Usage is similar to weight_norm
  >>> convT = nn.ConvTranspose2d(3, 64, kernel_size=3, padding=1)
  >>> # Can specify number of power iterations applied each time, or use default (1)
  >>> convT = nn.utils.spectral_norm(convT, n_power_iterations=2)
  >>>
  >>> # apply to every conv and conv transpose module in a model
  >>> def add_sn(m):
  ...     for name, c in m.named_children():
  ...         m.add_module(name, add_sn(c))
  ...     if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
  ...         return nn.utils.spectral_norm(m)
  ...     else:
  ...         return m
  >>> my_model = add_sn(my_model)
  ```
- `nn.ModuleDict` and `nn.ParameterDict` containers https://github.com/pytorch/pytorch/pull/8463 (see the sketch after this list)
- Add `nn.init.zeros_` and `nn.init.ones_` https://github.com/pytorch/pytorch/pull/7488
- Add sparse gradient option to pretrained embedding https://github.com/pytorch/pytorch/pull/7492
- Add max pooling support to `nn.EmbeddingBag` https://github.com/pytorch/pytorch/pull/5725
- Depthwise convolution support for MKLDNN https://github.com/pytorch/pytorch/pull/8782
- Add `nn.FeatureAlphaDropout` (featurewise Alpha Dropout layer) https://github.com/pytorch/pytorch/pull/9073
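
A minimal sketch (not from the original notes) of the new dict-style containers; the `Heads` module, its keys, and the shapes are invented for illustration. Values stored in `nn.ModuleDict`/`nn.ParameterDict` are registered, so they appear in `parameters()` and `state_dict()`.

```python
import torch
import torch.nn as nn

class Heads(nn.Module):
    """Illustrative module using the new dict-style containers."""
    def __init__(self):
        super(Heads, self).__init__()
        # submodules keyed by name, registered like regular attributes
        self.acts = nn.ModuleDict({'relu': nn.ReLU(), 'tanh': nn.Tanh()})
        # learnable scalars keyed by name
        self.scales = nn.ParameterDict({'a': nn.Parameter(torch.ones(1))})

    def forward(self, x, act='relu'):
        return self.acts[act](x) * self.scales['a']

m = Heads()
out = m(torch.randn(2, 3), act='tanh')   # parameters show up in m.state_dict()
```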
Operators
- `torch.bincount` (count the frequency of each value in an integral tensor) https://github.com/pytorch/pytorch/pull/6688

  ```python
  >>> input = torch.randint(0, 8, (5,), dtype=torch.int64)
  >>> weights = torch.linspace(0, 1, steps=5)
  >>> input, weights
  (tensor([4, 3, 6, 3, 4]), tensor([ 0.0000,  0.2500,  0.5000,  0.7500,  1.0000]))
  >>> torch.bincount(input)
  tensor([0, 0, 0, 2, 2, 0, 1])
  >>> input.bincount(weights)
  tensor([0.0000, 0.0000, 0.0000, 1.0000, 1.0000, 0.0000, 0.5000])
  ```
- `torch.as_tensor` (similar to `torch.tensor` but never copies unless necessary) https://github.com/pytorch/pytorch/pull/7109

  ```python
  >>> tensor = torch.randn(3, device='cpu', dtype=torch.float32)
  >>> torch.as_tensor(tensor)                       # doesn't copy
  >>> torch.as_tensor(tensor, dtype=torch.float64)  # copies due to incompatible dtype
  >>> torch.as_tensor(tensor, device='cuda')        # copies due to incompatible device
  >>> import numpy as np
  >>> array = np.array([3, 4.5])
  >>> torch.as_tensor(array)                 # doesn't copy, shares memory with the numpy array
  >>> torch.as_tensor(array, device='cuda')  # copies due to incompatible device
  ```
- `torch.randperm` for CUDA tensors https://github.com/pytorch/pytorch/pull/7606
- `nn.HardShrink` for CUDA tensors https://github.com/pytorch/pytorch/pull/8117
- `torch.flip` (flips a tensor along specified dims) https://github.com/pytorch/pytorch/pull/7873 (see the sketch after this list)
- `torch.flatten` (flattens a contiguous range of dims) https://github.com/pytorch/pytorch/pull/8578
- `torch.pinverse` (computes the SVD-based pseudo-inverse) https://github.com/pytorch/pytorch/pull/9052
- `torch.unique` for CUDA tensors https://github.com/pytorch/pytorch/pull/8899
- `torch.erfc` (complementary error function) https://github.com/pytorch/pytorch/pull/9366/files
- `torch.isinf` and `torch.isfinite` https://github.com/pytorch/pytorch/pull/9169 https://github.com/pytorch/pytorch/pull/9487
- `torch.reshape_as` https://github.com/pytorch/pytorch/pull/9452
- Support backward for the target tensor in `torch.nn.functional.kl_div` https://github.com/pytorch/pytorch/pull/7839
- `torch.logsumexp` https://github.com/pytorch/pytorch/pull/7254
- Add a batched linear solver to `torch.gesv` https://github.com/pytorch/pytorch/pull/6100
- `torch.sum` now supports summing over multiple dimensions https://github.com/pytorch/pytorch/pull/6152/files
- `torch.diagonal` and `torch.diagflat` now take arbitrary diagonals with numpy semantics https://github.com/pytorch/pytorch/pull/6718
- `tensor.any` and `tensor.all` on `ByteTensor` can now accept `dim` and `keepdim` arguments https://github.com/pytorch/pytorch/pull/4627
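
A short sketch (not from the original notes) exercising a few of these new operators; the shapes and values are illustrative.

```python
import torch

x = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)

flipped = torch.flip(x, dims=[0, 2])     # reverse along dims 0 and 2
flat = torch.flatten(x, start_dim=1)     # shape (2, 12): flatten dims 1..end
sums = x.sum(dim=(0, 2))                 # multi-dim reduction, shape (3,)
lse = torch.logsumexp(x, dim=2)          # numerically stable log(sum(exp(x), dim=2))
finite = torch.isfinite(x / 0)           # elementwise finiteness check
```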
Distributions
- Half Cauchy and Half Normal https://github.com/pytorch/pytorch/pull/8411
- Gamma sampling for CUDA tensors https://github.com/pytorch/pytorch/pull/6855
- Allow vectorized counts in Binomial Distribution https://github.com/pytorch/pytorch/pull/6720
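
The new half distributions and the vectorized `Binomial` counts can be exercised roughly like this (a sketch, not from the original notes; the scales and counts are illustrative):

```python
import torch
from torch.distributions import HalfCauchy, HalfNormal, Binomial

HalfCauchy(scale=1.0).sample((3,))            # 3 draws from a half-Cauchy
HalfNormal(scale=torch.ones(2)).sample()      # batched half-normal, shape (2,)

# total_count may now be a tensor, giving a per-element count
counts = torch.tensor([10., 25.])
probs = torch.tensor([0.3, 0.7])
Binomial(total_count=counts, probs=probs).sample()
```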
Misc
- Autograd automatic anomaly detection for `NaN` and errors occurring in backward. Two functions, `detect_anomaly` and `set_detect_anomaly`, are provided for this. https://github.com/pytorch/pytorch/pull/7677 (see the sketch after this list)
- Support `reversed(torch.Tensor)` https://github.com/pytorch/pytorch/pull/9216
- Support `hash(torch.device)` https://github.com/pytorch/pytorch/pull/9246
- Support `gzip` in `torch.load` https://github.com/pytorch/pytorch/pull/6490
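
A minimal sketch (not from the original notes) of anomaly detection: when a backward function produces a `NaN` gradient, the error is raised immediately together with a traceback pointing at the forward operation that created it. The tensors below are contrived so that the backward pass really does produce a `NaN`.

```python
import torch
from torch import autograd

# torch.autograd.set_detect_anomaly(True) would enable this globally;
# the context manager enables it for a single block.
with autograd.detect_anomaly():
    x = torch.tensor([-1.0], requires_grad=True)
    y = x.log()          # NaN appears in the forward pass (log of a negative number)
    z = (y * y).sum()    # the backward of this multiply reuses y, so its gradient is NaN
    z.backward()         # raises RuntimeError and points at the offending forward op
```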
Performance
- Accelerate Bernoulli random number generation on CPU https://github.com/pytorch/pytorch/pull/7171
- Enable cuFFT plan caching (80% speed-up in certain cases) https://github.com/pytorch/pytorch/pull/8344
- Fix unnecessary copying in `bernoulli_` https://github.com/pytorch/pytorch/pull/8682
- Fix unnecessary copying in `broadcast` https://github.com/pytorch/pytorch/pull/8222
- Speed up multidim `sum` (2x~6x speed-up in certain cases) https://github.com/pytorch/pytorch/pull/8992
- Vectorize CPU `sigmoid` (>3x speed-up in most cases) https://github.com/pytorch/pytorch/pull/8612
- Optimize CPU `nn.LeakyReLU` and `nn.PReLU` (2x speed-up) https://github.com/pytorch/pytorch/pull/9206
- Vectorize `softmax` and `logsoftmax` (4.5x speed-up on a single core and 1.8x on 10 threads) https://github.com/pytorch/pytorch/pull/7375
- Speed up `nn.init.sparse` (10-20x speed-up) https://github.com/pytorch/pytorch/pull/6899
Improvements
Tensor printing
- Tensor printing now includes `requires_grad` and `grad_fn` information https://github.com/pytorch/pytorch/pull/8211 (see the example after this list)
- Improve number formatting in tensor printing https://github.com/pytorch/pytorch/pull/7632
- Fix the scale when printing some tensors https://github.com/pytorch/pytorch/pull/7189
- Speed up printing of large tensors https://github.com/pytorch/pytorch/pull/6876
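
For instance (a sketch, not from the original notes; exact formatting and `grad_fn` names vary by version), the repr now shows whether a tensor requires grad and which function produced it:

```python
>>> x = torch.ones(3, requires_grad=True)
>>> x
tensor([1., 1., 1.], requires_grad=True)
>>> x * 2
tensor([2., 2., 2.], grad_fn=<MulBackward0>)  # grad_fn name may differ by version
```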
Neural Networks
- `NaN` is now propagated through many activation functions https://github.com/pytorch/pytorch/pull/8033
- Add `non_blocking` option to `nn.Module.to` https://github.com/pytorch/pytorch/pull/7312
- Loss modules now allow the target to require gradient https://github.com/pytorch/pytorch/pull/8460
- Add `pos_weight` argument to `nn.BCEWithLogitsLoss` https://github.com/pytorch/pytorch/pull/6856 (see the sketch after this list)
- Support `grad_clip` for parameters on different devices https://github.com/pytorch/pytorch/pull/9302
- Remove the requirement that input sequences to `pad_sequence` be sorted https://github.com/pytorch/pytorch/pull/7928
- The `stride` argument for `max_unpool1d`, `max_unpool2d`, `max_unpool3d` now defaults to `kernel_size` https://github.com/pytorch/pytorch/pull/7388
- Allow calling grad-mode context managers (e.g., `torch.no_grad`, `torch.enable_grad`) as decorators https://github.com/pytorch/pytorch/pull/7737
- `torch.optim.lr_scheduler._LRScheduler`'s `__getstate__` now includes optimizer info https://github.com/pytorch/pytorch/pull/7757
- Add support for accepting `Tensor` as input in `clip_grad_*` functions https://github.com/pytorch/pytorch/pull/7769
- Return `NaN` in `max_pool`/`adaptive_max_pool` for `NaN` inputs https://github.com/pytorch/pytorch/pull/7670
- `nn.EmbeddingBag` can now handle empty bags in all modes https://github.com/pytorch/pytorch/pull/7389
- `torch.optim.lr_scheduler.ReduceLROnPlateau` is now serializable https://github.com/pytorch/pytorch/pull/7201
- Allow only tensors of floating-point dtype to require gradients https://github.com/pytorch/pytorch/pull/7034 and https://github.com/pytorch/pytorch/pull/7185
- Allow resetting of BatchNorm running stats and cumulative moving average https://github.com/pytorch/pytorch/pull/5766
- Set the gradient of LP-pooling to zero if the sum of all input elements to the power of p is zero https://github.com/pytorch/pytorch/pull/6766
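
A small sketch (not from the original notes) of two of these conveniences: a grad-mode context manager used as a decorator, and the new `pos_weight` argument; the weighting value and shapes are illustrative.

```python
import torch
import torch.nn as nn

# torch.no_grad can now decorate a function; everything inside runs
# with gradient tracking disabled.
@torch.no_grad()
def evaluate(model, x):
    return model(x)

# pos_weight re-weights the positive class in BCEWithLogitsLoss
# (here: positives count 3x as much as negatives -- illustrative).
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([3.0]))

logits = torch.randn(4, 1)
targets = torch.randint(0, 2, (4, 1)).float()
loss = criterion(logits, targets)
```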
Operators
- Add ellipses ('...') and diagonals (e.g. 'ii->i') to `torch.einsum` https://github.com/pytorch/pytorch/pull/7173 (see the sketch after this list)
- Add a `to` method to `PackedSequence` https://github.com/pytorch/pytorch/pull/7319
- Add support for `__floordiv__` and `__rdiv__` for integral tensors https://github.com/pytorch/pytorch/pull/7245
- `torch.clamp` now has subgradient 1 at min and max https://github.com/pytorch/pytorch/pull/7049
- `torch.arange` now uses NumPy-style type inference https://github.com/pytorch/pytorch/pull/7016
- Support the infinity norm properly in `torch.norm` and `torch.renorm` https://github.com/pytorch/pytorch/pull/6969
- Allow passing an output tensor via the `out=` keyword argument in `torch.dot` and `torch.matmul` https://github.com/pytorch/pytorch/pull/6961
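
A quick sketch (not from the original notes) of the extended `torch.einsum` notation; the shapes are illustrative, and the list-of-operands calling convention matches the 0.4-era API.

```python
import torch

x = torch.randn(5, 5)
a = torch.randn(3, 5, 4)
b = torch.randn(3, 4, 2)

diag = torch.einsum('ii->i', [x])                  # main diagonal of x, shape (5,)
bmm = torch.einsum('...ij,...jk->...ik', [a, b])   # batched matmul via ellipsis, shape (3, 5, 2)
```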
Distributions
- Always enable grad when calculating `lazy_property` https://github.com/pytorch/pytorch/pull/7708
Sparse Tensor
- Add log1p for sparse tensor https://github.com/pytorch/pytorch/pull/8969
- Better support for adding zero-filled sparse tensors https://github.com/pytorch/pytorch/pull/7479
Data Parallel
- Allow modules that return scalars in `nn.DataParallel` https://github.com/pytorch/pytorch/pull/7973
- Allow `nn.parallel.parallel_apply` to take in a list/tuple of tensors https://github.com/pytorch/pytorch/pull/8047
Misc
- `torch.Size` can now accept PyTorch scalars https://github.com/pytorch/pytorch/pull/5676
- Move `torch.utils.data.dataset.random_split` to `torch.utils.data.random_split`, and `torch.utils.data.dataset.Subset` to `torch.utils.data.Subset` https://github.com/pytorch/pytorch/pull/7816 (see the sketch after this list)
- Add serialization for `torch.device` https://github.com/pytorch/pytorch/pull/7713
- Allow `copy.deepcopy` of `torch.(int/float/...)*` dtype objects https://github.com/pytorch/pytorch/pull/7699
- `torch.load` can now take a `torch.device` as the map location https://github.com/pytorch/pytorch/pull/7339
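
A small sketch (not from the original notes) of the relocated `random_split` and of passing a `torch.device` as the map location; the dataset sizes and the `checkpoint.pt` path are hypothetical.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# random_split / Subset now live directly under torch.utils.data
dataset = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))
train_set, val_set = random_split(dataset, [80, 20])

# torch.load accepts a torch.device object as map_location
state = torch.load('checkpoint.pt', map_location=torch.device('cpu'))
```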
Bug Fixes
- Fix `nn.BCELoss` sometimes returning negative results https://github.com/pytorch/pytorch/pull/8147
- Fix `tensor._indices` on a scalar sparse tensor giving the wrong result https://github.com/pytorch/pytorch/pull/8197
- Fix backward of `tensor.as_strided` not working properly when the input has overlapping memory https://github.com/pytorch/pytorch/pull/8721
- Fix the `x.pow(0)` gradient when `x` contains 0 https://github.com/pytorch/pytorch/pull/8945
- Fix CUDA `torch.svd` and `torch.eig` returning wrong results in certain cases https://github.com/pytorch/pytorch/pull/9082
- Fix `nn.MSELoss` having low precision https://github.com/pytorch/pytorch/pull/9287
- Fix segmentation fault when calling `torch.Tensor.grad_fn` https://github.com/pytorch/pytorch/pull/9292
- Fix `torch.topk` returning wrong results when the input isn't contiguous https://github.com/pytorch/pytorch/pull/9441
- Fix segfault in convolution on CPU with large `inputs`/`dilation` https://github.com/pytorch/pytorch/pull/9274
- Fix `avg_pool2/3d` `count_include_pad` having default value `False` (should be `True`) https://github.com/pytorch/pytorch/pull/8645
- Fix `nn.EmbeddingBag`'s `max_norm` option https://github.com/pytorch/pytorch/pull/7959
- Fix returning scalar input in Python autograd function https://github.com/pytorch/pytorch/pull/7934
- Fix THCUNN `SpatialDepthwiseConvolution` assuming contiguity https://github.com/pytorch/pytorch/pull/7952
- Fix bug in seeding the random module in `DataLoader` https://github.com/pytorch/pytorch/pull/7886
- Don't modify variables in-place for `torch.einsum` https://github.com/pytorch/pytorch/pull/7765
- Make the return value uniform in the L-BFGS step https://github.com/pytorch/pytorch/pull/7586
- The return value of `uniform.cdf()` is now clamped to `[0..1]` https://github.com/pytorch/pytorch/pull/7538
- Fix advanced indexing with negative indices https://github.com/pytorch/pytorch/pull/7345
- `CUDAGenerator` will no longer initialize on the current device, which avoids unnecessary memory allocation on `GPU:0` https://github.com/pytorch/pytorch/pull/7392
- Fix `tensor.type(dtype)` not preserving the device https://github.com/pytorch/pytorch/pull/7474
- The batch sampler now returns the same results when used alone or in a `DataLoader` with `num_workers > 0` https://github.com/pytorch/pytorch/pull/7265
- Fix broadcasting error in `LogNormal` and `TransformedDistribution` https://github.com/pytorch/pytorch/pull/7269
- Fix `torch.max` and `torch.min` on CUDA in the presence of `NaN` https://github.com/pytorch/pytorch/pull/7052
- Fix `torch.tensor` device-type calculation when used with CUDA https://github.com/pytorch/pytorch/pull/6995
- Fix a missing `'='` in the `nn.LPPoolNd` repr function https://github.com/pytorch/pytorch/pull/9629
Documentation
- Expose and document `torch.autograd.gradcheck` and `torch.autograd.gradgradcheck` https://github.com/pytorch/pytorch/pull/8166
- Document `tensor.scatter_add_` https://github.com/pytorch/pytorch/pull/9630
- Document variants of `torch.add` and `tensor.add_`, e.g. `tensor.add(value=1, other) -> Tensor` https://github.com/pytorch/pytorch/pull/9027
- Document `torch.logsumexp` https://github.com/pytorch/pytorch/pull/8428
- Document `torch.sparse_coo_tensor` https://github.com/pytorch/pytorch/pull/8152
- Document `torch.utils.data.dataset.random_split` https://github.com/pytorch/pytorch/pull/7676
- Document `torch.nn.GroupNorm` https://github.com/pytorch/pytorch/pull/7086
- Many other documentation improvements, including RNNs, `ConvTransposeNd`, `Fold`/`Unfold`, `Embedding`/`EmbeddingBag`, loss functions, etc.