v1.9.0-beta
Release date: 2021-11-20 08:57:40
Latest NVIDIA/gpu-operator release: v24.6.1 (2024-08-13 05:24:17)
New Features
- Support for NVIDIA Data Center GPU Driver version `470.82.01`.
- MIG Manager support with preinstalled GPU drivers on the node.
- Entitlement-free driver builds using DriverToolkit images on Red Hat OpenShift 4.9.
- Support for enabling GPUDirect RDMA with preinstalled MOFED drivers on the node using the `driver.rdma.useHostMofed` option.
- Upgrade support for GPU Operator and Operand Pods using Operator Lifecycle Manager (OLM) on Red Hat OpenShift.
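
As a rough illustration of the `driver.rdma.useHostMofed` option, the sketch below assembles a `helm install` command line in Python. The release name, the namespace, and the `nvidia/gpu-operator` chart reference are assumptions made for the example, as is setting `driver.rdma.enabled` alongside the option. Check the chart's documented values for your version before relying on any of them.

```python
import shlex


def flatten(values, prefix=""):
    """Flatten a nested dict into dotted Helm keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in values.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def helm_value(value):
    """Render a Python value the way it is usually written after --set."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def helm_install_command(release, chart, namespace, values):
    """Build a `helm install` argument list with one --set flag per value."""
    cmd = ["helm", "install", release, chart,
           "--namespace", namespace, "--create-namespace"]
    for key, value in flatten(values).items():
        cmd += ["--set", f"{key}={helm_value(value)}"]
    return cmd


# Assumed names: release "gpu-operator", namespace "gpu-operator",
# chart "nvidia/gpu-operator". Enabling driver.rdma.enabled together
# with useHostMofed is also an assumption of this sketch.
values = {"driver": {"rdma": {"enabled": True, "useHostMofed": True}}}
command = helm_install_command("gpu-operator", "nvidia/gpu-operator",
                               "gpu-operator", values)
print(shlex.join(command))
```

The command is only printed, not executed, so it can be reviewed before being run against a cluster.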
Improvements
- Automatic detection of the default runtime used in the cluster; the `operator.defaultRuntime` parameter is deprecated.
- Installation of GPU Operator and its Operands into a single user-specified Namespace.
- Automatic detection and unloading of the `nouveau` driver prior to installing the NVIDIA GPU Driver on each node.
- Option to mount a ConfigMap of self-signed certificates into the driver container using the `driver.certConfig.name` parameter. This can be used for SSL connections to a proxy server or private package repositories.
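
To illustrate the `driver.certConfig.name` parameter, the following sketch builds a ConfigMap manifest holding a self-signed certificate and prints it as JSON, which `kubectl apply -f -` accepts. The ConfigMap name `driver-certs`, the namespace `gpu-operator`, the file key `ca.crt`, and the placeholder PEM body are all hypothetical. The sketch assumes the ConfigMap lives in the Namespace where the Operator is installed and that its name is then passed as `driver.certConfig.name`.

```python
import json
import sys


def cert_configmap(name, namespace, certs):
    """Return a Kubernetes ConfigMap manifest mapping file names to PEM text."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": dict(certs),
    }


# Hypothetical placeholder; substitute the real PEM-encoded certificate.
PLACEHOLDER_PEM = "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n"

# Hypothetical ConfigMap name, namespace, and file key.
manifest = cert_configmap("driver-certs", "gpu-operator",
                          {"ca.crt": PLACEHOLDER_PEM})

# The Helm value that would point the driver container at this ConfigMap.
helm_flag = f"driver.certConfig.name={manifest['metadata']['name']}"

# Manifest goes to stdout so it can be piped to kubectl; the hint goes to stderr.
print(json.dumps(manifest, indent=2))
print(f"--set {helm_flag}", file=sys.stderr)
```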
Fixed issues
- DCGM Exporter in `CrashLoopBackOff` as it cannot connect to the DCGM port on the same node.
Known Limitations
- Upgrades from `v1.8.x` to `v1.9.x` are not supported because the default Namespace `gpu-operator-resources` used earlier has changed. Starting from `v1.9.x`, users can specify a custom Namespace in which both the GPU Operator and the Operand Pods are installed. Users need to uninstall the old version before installing a `v1.9.x` version of GPU Operator.
- Collection of GPU metrics in MIG mode is not supported with `470+` drivers.
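
The uninstall-then-install path described in the first limitation could be scripted roughly as follows. This is a sketch only: the old release name `gpu-operator-old`, its namespace `default`, the new release and namespace `gpu-operator`, the chart reference `nvidia/gpu-operator`, and the version string are hypothetical. By default the script prints the commands rather than running them.

```python
import shlex
import subprocess


def upgrade_plan(old_release, old_namespace, new_release, new_namespace,
                 chart, version=None):
    """Return the two Helm commands, in order: remove the old release, install the new one."""
    install = ["helm", "install", new_release, chart,
               "--namespace", new_namespace, "--create-namespace", "--wait"]
    if version is not None:
        install += ["--version", version]
    return [
        ["helm", "uninstall", old_release, "--namespace", old_namespace],
        install,
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


# All names below are hypothetical; the version string is assumed to
# match the chart version published for this release.
plan = upgrade_plan("gpu-operator-old", "default",
                    "gpu-operator", "gpu-operator",
                    "nvidia/gpu-operator", version="v1.9.0-beta")
run_plan(plan)  # dry run: prints the commands only
```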