v11.0.0rc1
Release date: 2022-06-30 14:53:52
Latest cupy/cupy release: v13.3.0 (2024-08-22 15:42:45)
This is the release note of v11.0.0rc1. See here for the complete list of solved issues and merged PRs.
We are going to release v11.0.0 on July 28th. Please start testing your workload with this release candidate (pip install --pre cupy-cuda11x -f https://pip.cupy.dev/pre). See the Upgrade Guide for the list of possible breaking changes.
We are running a Gitter chat for general discussions and quick questions. Feel free to join the channel to talk with developers and users!
Highlights
Support CUDA 11.7 (#6767)
Full support for CUDA 11.7 has been added as of this release. Binary packages can be installed with the following command: pip install --pre cupy-cuda11x -f https://pip.cupy.dev/pre
Unified Binary Package for CUDA 11.2 or later (#6730)
CuPy v11 provides a unified binary package named cupy-cuda11x that supports all CUDA 11.2+ releases. This replaces the per-CUDA-version binary packages (cupy-cuda112, cupy-cuda113, …, cupy-cuda117) provided in CuPy v10 or earlier.
Note that CUDA 11.1 or earlier still requires per-CUDA-version binary packages: cupy-cuda102, cupy-cuda110, and cupy-cuda111 will be provided for CUDA 10.2, 11.0, and 11.1, respectively.
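The package naming rule above can be sketched as a small lookup. This is only an illustration of the mapping stated in this release note: the function name wheel_name_for_cuda is hypothetical and is not part of CuPy, and CUDA versions that the note does not mention are rejected rather than guessed.

```python
def wheel_name_for_cuda(major: int, minor: int) -> str:
    """Return the CuPy v11 binary package name for a CUDA toolkit version.

    Hypothetical helper: the mapping is taken from the release note only.
    """
    if (major, minor) == (10, 2):
        return "cupy-cuda102"
    if major == 11 and minor in (0, 1):
        # CUDA 11.0 and 11.1 keep their per-version packages.
        return f"cupy-cuda11{minor}"
    if major == 11 and minor >= 2:
        # One unified package covers every CUDA 11.2+ release.
        return "cupy-cuda11x"
    raise ValueError(f"CUDA {major}.{minor} is not covered by this release note")


for version in [(10, 2), (11, 0), (11, 1), (11, 2), (11, 7)]:
    print(version, "->", wheel_name_for_cuda(*version))
```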
Binary Package for Arm Platform (#6705)
CuPy v11 provides a cupy-cuda11x binary package built for aarch64, which supports CUDA 11.2+ Arm SBSA and JetPack 5.
These wheels are available through our Pip index: pip install --pre cupy-cuda11x -f https://pip.cupy.dev/aarch64
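As a rough sketch of how a setup script might choose between the two pip indexes named in this note, the helper below builds the install command from the machine architecture. The function name pip_install_command is hypothetical, and the sketch assumes that Linux Arm hosts report "aarch64" from platform.machine(); the URLs and package name are the ones quoted above for this release candidate.

```python
import platform

PRE_INDEX = "https://pip.cupy.dev/pre"          # pre-release index from this note
AARCH64_INDEX = "https://pip.cupy.dev/aarch64"  # Arm wheel index from this note


def pip_install_command(machine=None):
    """Build the pip command for cupy-cuda11x on the given architecture.

    Hypothetical helper; `machine` defaults to the current host.
    """
    machine = (machine or platform.machine()).lower()
    index = AARCH64_INDEX if machine == "aarch64" else PRE_INDEX
    return ["pip", "install", "--pre", "cupy-cuda11x", "-f", index]


print(" ".join(pip_install_command()))
```

Returning the command as a list keeps it ready for subprocess.run without shell quoting.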
Support for ndarray subclassing (#6720, #6755)
This release allows users to subclass cupy.ndarray, using the same protocol as NumPy:
import cupy

class C(cupy.ndarray):
    def __new__(cls, *args, info=None, **kwargs):
        obj = super().__new__(cls, *args, **kwargs)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # obj is None for explicit construction; __new__ sets info in that case.
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)

a = C([0, 1, 2, 3], info='information')
assert type(a) is C
assert issubclass(type(a), cupy.ndarray)
assert a.info == 'information'
Note that view casting and new from template mechanisms are also supported as described by the NumPy documentation.
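The three creation paths mentioned here (explicit construction, view casting, and new from template) can be illustrated with NumPy, whose protocol this release says CuPy follows. The sketch below uses numpy only so that it runs without a GPU; the class name InfoArray is illustrative, and the assumption that replacing np with cupy behaves the same way rests on this release note and has not been checked here.

```python
import numpy as np  # stand-in for cupy; see the assumptions above


class InfoArray(np.ndarray):
    """ndarray subclass carrying an extra `info` attribute."""

    def __new__(cls, *args, info=None, **kwargs):
        obj = super().__new__(cls, *args, **kwargs)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # obj is None only for explicit construction; __new__ sets info then.
        if obj is None:
            return
        self.info = getattr(obj, "info", None)


# 1. Explicit construction: goes through __new__.
a = InfoArray((4,), dtype=np.int64, info="information")
a[:] = [0, 1, 2, 3]  # the ndarray constructor leaves memory uninitialized

# 2. View casting: a plain array reinterpreted as the subclass.
v = np.arange(4).view(InfoArray)

# 3. New from template: slices and ufunc results derive from `a`.
s = a[1:]
u = a + 1

for name, arr in [("a", a), ("v", v), ("s", s), ("u", u)]:
    print(name, type(arr).__name__, arr.info)
```

In the view-casting case the source array is a plain ndarray with no info attribute, so __array_finalize__ falls back to None.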
Add Collective Communication APIs in cupyx.distributed for Sparse Matrices
All the collective calls implemented for dense matrices now support sparse matrices. Users interested in this feature should install mpi4py in order to perform an efficient metadata exchange.
Google Summer of Code 2022
We would like to give a warm welcome to @khushi-411, who will be working on adding support for the cupyx.scipy.interpolate APIs as part of her GSoC internship!
Changes without compatibility
Bump base Docker image to the latest supported one (#6802)
CuPy official Docker images have been upgraded. Users relying on these images may suffer from compatibility issues with preinstalled tools or libraries.
Changes
New Features
- Add cupy.setxor1d (#6582)
- Add initial cupyx.spatial.distance support from pylibraft (#6690)
- Support cupy.ndarray subclassing - Part 2 - View casting (#6720)
- Add sparse broadcast (#6758)
- Add sparse reduce (#6761)
- Add sparse all_reduce and minor fixes (#6762)
- Add sparse all_to_all, reduce_scatter, send_recv (#6765)
- Subclass cupy.ndarray subclassing - Part 3 - New from template (ufunc) (#6775)
- Add cupyx.scipy.special.log_ndtr (#6776)
- Add cupyx.scipy.special.expn (#6790)
Enhancements
- Utilize CUDA Enhanced Compatibility (#6730)
- Fix to return correct CUDA version when in CUDA Python mode (#6736)
- Support CUDA 11.7 (#6767)
- Make the warning for cupy.array_api say "cupy" instead of "numpy" (#6791)
- Utilize CUDA Enhanced Compatibility in all wrappers (#6799)
- Add support for cupy-cuda11x wheel (#6800)
- Bump base Docker image to the latest supported one (#6802)
- Remove CUPY_CUDA_VERSION as much as possible (#6810)
- Raise UserWarning in cupy.cuda.compile_with_cache (#6818)
- cupy-wheel: Use NVRTC to infer the toolkit version (#6819)
- Support NumPy 1.23 (#6820)
- Fix for NumPy 1.23 (#6807)
Performance Improvements
- Improved integer matrix multiplication performance by modifying tuning parameters (#6703)
- Use fast convolution algorithm in cupy.poly1d.__pow__ (#6770)
Bug Fixes
- Fix polynomial tests (#6721)
- Fix batched matmul for integral numbers (#6725)
- Fix cupy.median for NaN inputs (#6759)
- Fix required cusparse symbol not loaded in CUDA 11.1.1 (#6806)
Code Fixes
- Add type annotation in _cuda_types.py (#6726)
- Subclass rename (#6746)
- Add type annotation to JIT internal types (#6778)
Documentation
- Add CUDA 11.7 on documents (#6768)
- Improved NVTX documentation (#6774)
- Fix docs to hide ndarray_base (#6782)
- Update docs for cupy-cuda11x wheel (#6803)
- Bump NumPy version used in docs (#6824)
- Add upgrade guide for CuPy v11 (#6826)
Tests
- Fix mempool tests (#6591)
- CI: Fix prep script to show build failure details (#6781)
- Fix a potential variable misuse bug (#6786)
- Fix CI Docker image build failing in head test (#6804)
- Tiny clean up in CI script (#6809)
Others
- Fix docker workflow to push to latest image (#6832)
Contributors
The CuPy Team would like to thank all those who contributed to this release!
@andoorve @asi1024 @asmeurer @cjnolet @emcastillo @khushi-411 @kmaehashi @leofang @LostBenjamin @pri1311 @rietmann-nv @takagi
1. cupy_cuda102-11.0.0rc1-cp310-cp310-manylinux1_x86_64.whl 61.9MB
2. cupy_cuda102-11.0.0rc1-cp310-cp310-manylinux2014_aarch64.whl 35.72MB
3. cupy_cuda102-11.0.0rc1-cp310-cp310-win_amd64.whl 43.33MB
4. cupy_cuda102-11.0.0rc1-cp37-cp37m-manylinux1_x86_64.whl 60.31MB
5. cupy_cuda102-11.0.0rc1-cp37-cp37m-manylinux2014_aarch64.whl 33.96MB
6. cupy_cuda102-11.0.0rc1-cp37-cp37m-win_amd64.whl 43.37MB
7. cupy_cuda102-11.0.0rc1-cp38-cp38-manylinux1_x86_64.whl 63.63MB
8. cupy_cuda102-11.0.0rc1-cp38-cp38-manylinux2014_aarch64.whl 37.21MB
9. cupy_cuda102-11.0.0rc1-cp38-cp38-win_amd64.whl 43.46MB
10. cupy_cuda102-11.0.0rc1-cp39-cp39-manylinux1_x86_64.whl 61.82MB
11. cupy_cuda102-11.0.0rc1-cp39-cp39-manylinux2014_aarch64.whl 35.65MB
12. cupy_cuda102-11.0.0rc1-cp39-cp39-win_amd64.whl 43.46MB
13. cupy_cuda110-11.0.0rc1-cp310-cp310-manylinux1_x86_64.whl 76.73MB
14. cupy_cuda110-11.0.0rc1-cp310-cp310-win_amd64.whl 58.11MB
15. cupy_cuda110-11.0.0rc1-cp37-cp37m-manylinux1_x86_64.whl 75.14MB
16. cupy_cuda110-11.0.0rc1-cp37-cp37m-win_amd64.whl 58.15MB
17. cupy_cuda110-11.0.0rc1-cp38-cp38-manylinux1_x86_64.whl 78.45MB
18. cupy_cuda110-11.0.0rc1-cp38-cp38-win_amd64.whl 58.25MB
19. cupy_cuda110-11.0.0rc1-cp39-cp39-manylinux1_x86_64.whl 76.65MB
20. cupy_cuda110-11.0.0rc1-cp39-cp39-win_amd64.whl 58.25MB
21. cupy_cuda111-11.0.0rc1-cp310-cp310-manylinux1_x86_64.whl 95.91MB
22. cupy_cuda111-11.0.0rc1-cp310-cp310-win_amd64.whl 78.27MB
23. cupy_cuda111-11.0.0rc1-cp37-cp37m-manylinux1_x86_64.whl 94.32MB
24. cupy_cuda111-11.0.0rc1-cp37-cp37m-win_amd64.whl 78.31MB
25. cupy_cuda111-11.0.0rc1-cp38-cp38-manylinux1_x86_64.whl 97.63MB
26. cupy_cuda111-11.0.0rc1-cp38-cp38-win_amd64.whl 78.41MB
27. cupy_cuda111-11.0.0rc1-cp39-cp39-manylinux1_x86_64.whl 95.83MB
28. cupy_cuda111-11.0.0rc1-cp39-cp39-win_amd64.whl 78.4MB
29. cupy_cuda11x-11.0.0rc1-cp310-cp310-manylinux1_x86_64.whl 79.89MB
30. cupy_cuda11x-11.0.0rc1-cp310-cp310-manylinux2014_aarch64.whl 82.29MB
31. cupy_cuda11x-11.0.0rc1-cp310-cp310-win_amd64.whl 61.08MB
32. cupy_cuda11x-11.0.0rc1-cp37-cp37m-manylinux1_x86_64.whl 78.3MB
33. cupy_cuda11x-11.0.0rc1-cp37-cp37m-manylinux2014_aarch64.whl 79.8MB
34. cupy_cuda11x-11.0.0rc1-cp37-cp37m-win_amd64.whl 61.12MB
35. cupy_cuda11x-11.0.0rc1-cp38-cp38-manylinux1_x86_64.whl 81.61MB
36. cupy_cuda11x-11.0.0rc1-cp38-cp38-manylinux2014_aarch64.whl 83.92MB
37. cupy_cuda11x-11.0.0rc1-cp38-cp38-win_amd64.whl 61.22MB
38. cupy_cuda11x-11.0.0rc1-cp39-cp39-manylinux1_x86_64.whl 79.81MB
39. cupy_cuda11x-11.0.0rc1-cp39-cp39-manylinux2014_aarch64.whl 82.2MB
40. cupy_cuda11x-11.0.0rc1-cp39-cp39-win_amd64.whl 61.21MB
41. cupy_rocm_4_3-11.0.0rc1-cp310-cp310-manylinux1_x86_64.whl 36.6MB
42. cupy_rocm_4_3-11.0.0rc1-cp37-cp37m-manylinux1_x86_64.whl 35.23MB
43. cupy_rocm_4_3-11.0.0rc1-cp38-cp38-manylinux1_x86_64.whl 38.13MB
44. cupy_rocm_4_3-11.0.0rc1-cp39-cp39-manylinux1_x86_64.whl 36.53MB
45. cupy_rocm_5_0-11.0.0rc1-cp310-cp310-manylinux1_x86_64.whl 54.67MB
46. cupy_rocm_5_0-11.0.0rc1-cp37-cp37m-manylinux1_x86_64.whl 53.3MB
47. cupy_rocm_5_0-11.0.0rc1-cp38-cp38-manylinux1_x86_64.whl 56.2MB
48. cupy_rocm_5_0-11.0.0rc1-cp39-cp39-manylinux1_x86_64.whl 54.6MB