v0.3.1
Release date: 2018-02-14 08:36:58
Binaries
- Removed support for CUDA capability 3.0 and 5.0 (they still work for source builds for now, but the commitment to support them going forward is removed)
- Stop binary releases for CUDA 7.5
- Add CPU-only binary releases that are 10x smaller in size than the full binary with CUDA capabilities.
As always, links to our binaries are on http://pytorch.org
New features
- Add Cosine Annealing Learning Rate Scheduler https://github.com/pytorch/pytorch/pull/3311
- Add `reduce` argument to `PoissonNLLLoss` to be able to compute unreduced losses https://github.com/pytorch/pytorch/pull/3770
- Allow `target.requires_grad=True` in `l1_loss` and `mse_loss` (compute loss wrt `target`) https://github.com/pytorch/pytorch/pull/3876
- Add `random_split` that randomly splits a dataset into non-overlapping new datasets of given lengths https://github.com/pytorch/pytorch/pull/4435
- Introduced scopes to annotate ONNX graphs to have better TensorBoard visualization of models https://github.com/pytorch/pytorch/pull/5153
- Allow `map_location` in `torch.load` to be a string, such as `map_location='cpu'` or `map_location='cuda:2'` https://github.com/pytorch/pytorch/pull/4203 (see the usage sketch after this list)
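A minimal usage sketch for several of the features above, written with current torch syntax rather than the 0.3-era Variable API; the model, optimizer, tensor shapes, and `checkpoint.pth` path are placeholders, not part of the release notes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, random_split

# Cosine annealing learning-rate schedule (model and optimizer are placeholders)
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

# Per-element (unreduced) Poisson NLL loss via the new `reduce` flag
loss_fn = nn.PoissonNLLLoss(reduce=False)
per_element = loss_fn(torch.rand(4, 2), torch.rand(4, 2))

# l1_loss / mse_loss can now backpropagate into the target as well
inp = torch.randn(3, requires_grad=True)
tgt = torch.randn(3, requires_grad=True)
F.mse_loss(inp, tgt).backward()  # tgt.grad is populated too

# random_split: non-overlapping subsets of the given lengths
dataset = TensorDataset(torch.randn(100, 10), torch.randn(100))
train_set, val_set = random_split(dataset, [80, 20])

# torch.load now accepts a string map_location such as 'cpu' or 'cuda:2'
torch.save(model.state_dict(), 'checkpoint.pth')  # placeholder file name
state = torch.load('checkpoint.pth', map_location='cpu')
```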
Bug Fixes
Data Loader / Datasets / Multiprocessing
- Made DataLoader workers more verbose on bus error and segfault. Additionally, add a `timeout` option to the DataLoader, which will error if sample loading time exceeds the given value. https://github.com/pytorch/pytorch/pull/3474
- DataLoader workers used to all have the same random number generator (RNG) seed because of the semantics of the `fork` syscall. Now, each worker will have its RNG seed set to `base_seed + worker_id`, where `base_seed` is a random int64 value generated by the parent process. You may use `torch.initial_seed()` to access this value in `worker_init_fn`, which can be used to set other seeds (e.g. NumPy) before data loading. `worker_init_fn` is an optional argument that will be called on each worker subprocess with the worker id as input, after seeding and before data loading (see the sketch after this list) https://github.com/pytorch/pytorch/pull/4018
- Add additional signal handling in DataLoader worker processes when workers abruptly die.
- Negative value for n_workers now gives a ValueError https://github.com/pytorch/pytorch/pull/4019
- Fixed a typo in the `ConcatDataset.cumulative_sizes` attribute name https://github.com/pytorch/pytorch/pull/3534
- Accept longs in `default_collate` for DataLoader in Python 2 https://github.com/pytorch/pytorch/pull/4001
- Re-initialize autograd engine in child processes https://github.com/pytorch/pytorch/pull/4158
- Fix distributed dataloader so it pins memory to current GPU not GPU 0. https://github.com/pytorch/pytorch/pull/4196
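A minimal sketch of the per-worker seeding and the new `timeout` option described above, assuming a toy `TensorDataset`; the 30-second timeout and NumPy seeding are illustrative choices, not defaults:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def worker_init_fn(worker_id):
    # Inside a worker, torch.initial_seed() returns base_seed + worker_id,
    # so it can be reused to seed other libraries (here NumPy) per worker.
    np.random.seed(torch.initial_seed() % 2 ** 32)


if __name__ == '__main__':
    dataset = TensorDataset(torch.randn(64, 3))
    loader = DataLoader(dataset,
                        batch_size=8,
                        num_workers=2,
                        worker_init_fn=worker_init_fn,
                        timeout=30)  # error out if fetching a batch takes longer than 30s
    for (batch,) in loader:
        pass
```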
CUDA / CuDNN
- allow cudnn for fp16 batch norm https://github.com/pytorch/pytorch/pull/4021
- Use the `enabled` argument in `torch.autograd.profiler.emit_nvtx` (was being ignored) https://github.com/pytorch/pytorch/pull/4032
- Fix cuBLAS arguments for fp16 `torch.dot` https://github.com/pytorch/pytorch/pull/3660
- Fix CUDA `index_fill_` boundary check with small tensor size https://github.com/pytorch/pytorch/pull/3953
- Fix CUDA Multinomial checks https://github.com/pytorch/pytorch/pull/4009
- Fix CUDA version typo in warning https://github.com/pytorch/pytorch/pull/4175
- Initialize cuda before setting cuda tensor types as default https://github.com/pytorch/pytorch/pull/4788
- Add missing lazy_init in cuda python module https://github.com/pytorch/pytorch/pull/4907
- Lazy init order in set device, should not be called in getDevCount https://github.com/pytorch/pytorch/pull/4918
- Make torch.cuda.empty_cache() a no-op when cuda is not initialized https://github.com/pytorch/pytorch/pull/4936
CPU
- Assert MKL ld* conditions for ger, gemm, and gemv https://github.com/pytorch/pytorch/pull/4056
torch operators
- Fix `tensor.repeat` when the underlying storage is not owned by `torch` (for example, coming from numpy) https://github.com/pytorch/pytorch/pull/4084
- Add proper shape checking to `torch.cat` https://github.com/pytorch/pytorch/pull/4087
- Add check for slice shape match in index_copy_ and index_add_. https://github.com/pytorch/pytorch/pull/4342
- Fix use after free when advanced indexing tensors with tensors https://github.com/pytorch/pytorch/pull/4559
- Fix triu and tril for zero-strided inputs on gpu https://github.com/pytorch/pytorch/pull/4962
- Fix blas addmm (gemm) condition check https://github.com/pytorch/pytorch/pull/5048
- Fix topk work size computation https://github.com/pytorch/pytorch/pull/5053
- Fix reduction functions to respect the stride of the output https://github.com/pytorch/pytorch/pull/4995
- Improve float precision stability of the `linspace` op, fixes #4419 https://github.com/pytorch/pytorch/pull/4470
autograd
- Fix python gc race condition with THPVariable_traverse https://github.com/pytorch/pytorch/pull/4437
nn layers
- Fix padding_idx getting ignored in backward for Embedding(sparse=True) https://github.com/pytorch/pytorch/pull/3842
- Fix cosine_similarity's output shape https://github.com/pytorch/pytorch/pull/3811
- Add rnn args check https://github.com/pytorch/pytorch/pull/3925
- NLLLoss works for arbitrary dimensions https://github.com/pytorch/pytorch/pull/4654
- More strict shape check on Conv operators https://github.com/pytorch/pytorch/pull/4637
- Fix maxpool3d / avgpool3d crashes https://github.com/pytorch/pytorch/pull/5052
- Fix setting using running stats in InstanceNorm*d https://github.com/pytorch/pytorch/pull/4444
Multi-GPU
- Fix DataParallel scattering for empty lists / dicts / tuples https://github.com/pytorch/pytorch/pull/3769
- Fix refcycles in DataParallel scatter and gather (fix elevated memory usage) https://github.com/pytorch/pytorch/pull/4988
- Broadcast output requires_grad only if corresponding input requires_grad https://github.com/pytorch/pytorch/pull/5061
core
- Remove hard file offset reset in load() https://github.com/pytorch/pytorch/pull/3695
- Have sizeof account for size of stored elements https://github.com/pytorch/pytorch/pull/3821
- Fix undefined FileNotFoundError https://github.com/pytorch/pytorch/pull/4384
- make torch.set_num_threads also set MKL threads (take 2) https://github.com/pytorch/pytorch/pull/5002
others
- Fix wrong learning rate evaluation in CosineAnnealingLR in Python 2 https://github.com/pytorch/pytorch/pull/4656
Performance improvements
- slightly simplified math in IndexToOffset https://github.com/pytorch/pytorch/pull/4040
- improve performance of maxpooling backwards https://github.com/pytorch/pytorch/pull/4106
- Add cublas batched gemm support. https://github.com/pytorch/pytorch/pull/4151
- Rearrange dimensions for pointwise operations for better performance. https://github.com/pytorch/pytorch/pull/4174
- Improve memory access patterns for index operations. https://github.com/pytorch/pytorch/pull/4493
- Improve CUDA softmax performance https://github.com/pytorch/pytorch/pull/4973
- Fixed double memory accesses of several pointwise operations. https://github.com/pytorch/pytorch/pull/5068
Documentation and UX Improvements
- Better error messages for blas ops with cuda.LongTensor https://github.com/pytorch/pytorch/pull/4160
- Add missing trtrs, orgqr, ormqr docs https://github.com/pytorch/pytorch/pull/3720
- change doc for Adaptive Pooling https://github.com/pytorch/pytorch/pull/3746
- Fix MultiLabelMarginLoss docs https://github.com/pytorch/pytorch/pull/3836
- More docs for Conv1d Conv2d https://github.com/pytorch/pytorch/pull/3870
- Improve Tensor.scatter_ doc https://github.com/pytorch/pytorch/pull/3937
- [docs] rnn.py: Note zero defaults for hidden state/cell https://github.com/pytorch/pytorch/pull/3951
- Improve Tensor.new doc https://github.com/pytorch/pytorch/pull/3954
- Improve docs for torch and torch.Tensor https://github.com/pytorch/pytorch/pull/3969
- Added explicit tuple dimensions to doc for Conv1d. https://github.com/pytorch/pytorch/pull/4136
- Improve svd doc https://github.com/pytorch/pytorch/pull/4155
- Correct instancenorm input size https://github.com/pytorch/pytorch/pull/4171
- Fix StepLR example docs https://github.com/pytorch/pytorch/pull/4478