Commit f392a5e

yansun1996 authored and sajmera-pensando committed
[Doc] Init release notes for v1.3.1 and amend tag changes (#859)
1 parent 6d68276 commit f392a5e

28 files changed: +52 -56 lines changed

Makefile

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ endif
 
 # PROJECT_VERSION defines the project version.
 # Update this value when you upgrade the version of your project.
-PROJECT_VERSION ?= v1.2.0
+PROJECT_VERSION ?= v1.3.1
 
 ####################################
 # GPU Operator Image Build variables

bundle/manifests/amd-gpu-operator.clusterserviceversion.yaml

Lines changed: 4 additions & 4 deletions
@@ -32,7 +32,7 @@ metadata:
 capabilities: Seamless Upgrades
 categories: AI/Machine Learning,Monitoring
 containerImage: docker.io/rocm/gpu-operator:v1.2.0
-createdAt: "2025-08-06T05:54:02Z"
+createdAt: "2025-08-09T01:44:36Z"
 description: |-
 Operator responsible for deploying AMD GPU kernel drivers, device plugin, device test runner and device metrics exporter
 For more information, visit [documentation](https://instinct.docs.amd.com/projects/gpu-operator/en/latest/)
@@ -44,7 +44,7 @@ metadata:
 features.operators.openshift.io/token-auth-aws: "false"
 features.operators.openshift.io/token-auth-azure: "false"
 features.operators.openshift.io/token-auth-gcp: "false"
-metricsExporterImage: docker.io/rocm/device-metrics-exporter:v1.2.0
+metricsExporterImage: docker.io/rocm/device-metrics-exporter:v1.3.1
 nodelabellerImage: docker.io/rocm/k8s-device-plugin:labeller-rhubi-latest
 operatorframework.io/cluster-monitoring: "true"
 operatorframework.io/suggested-namespace: openshift-amd-gpu
@@ -53,7 +53,7 @@ metadata:
 operators.operatorframework.io/project_layout: go.kubebuilder.io/v3
 repository: https://github.com/ROCm/gpu-operator
 support: Advanced Micro Devices, Inc.
-name: amd-gpu-operator.v1.2.0
+name: amd-gpu-operator.v1.3.1
 namespace: placeholder
 spec:
 apiservicedefinitions: {}
@@ -1245,4 +1245,4 @@ spec:
 maturity: stable
 provider:
 name: Advanced Micro Devices, Inc.
-version: 1.2.0
+version: 1.3.1

config/manifests/bases/amd-gpu-operator.clusterserviceversion.yaml

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ metadata:
 features.operators.openshift.io/token-auth-aws: "false"
 features.operators.openshift.io/token-auth-azure: "false"
 features.operators.openshift.io/token-auth-gcp: "false"
-metricsExporterImage: docker.io/rocm/device-metrics-exporter:v1.2.0
+metricsExporterImage: docker.io/rocm/device-metrics-exporter:v1.3.1
 nodelabellerImage: docker.io/rocm/k8s-device-plugin:labeller-rhubi-latest
 operatorframework.io/cluster-monitoring: "true"
 operatorframework.io/suggested-namespace: openshift-amd-gpu

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
 external_projects_current_project = "amd-gpu-operator"
 
 project = "AMD GPU Operator"
-version = "1.3.0"
+version = "1.3.1"
 release = version
 html_title = f"{project} {version}"
 author = "Advanced Micro Devices, Inc."

docs/drivers/installation.md

Lines changed: 2 additions & 2 deletions
@@ -80,7 +80,7 @@ spec:
 serviceType: "NodePort"
 # Node port for metrics exporter service, metrics endpoint $node-ip:$nodePort
 nodePort: 32500
-image: docker.io/rocm/device-metrics-exporter:v1.2.0
+image: docker.io/rocm/device-metrics-exporter:v1.3.1
 
 # Specifythe node to be managed by this DeviceConfig Custom Resource
 selector:
@@ -134,7 +134,7 @@ spec:
 serviceType: "NodePort"
 # Node port for metrics exporter service, metrics endpoint $node-ip:$nodePort
 nodePort: 32500
-image: docker.io/rocm/device-metrics-exporter:v1.2.0
+image: docker.io/rocm/device-metrics-exporter:v1.3.1
 
 # Specifythe node to be managed by this DeviceConfig Custom Resource
 selector:

docs/fulldeviceconfig.rst

Lines changed: 3 additions & 3 deletions
@@ -30,7 +30,7 @@ Below is an example of a full DeviceConfig CR that can be used to install the AM
 apiVersion: amd.com/v1alpha1
 kind: DeviceConfig #New Custom Resource Definition used by the GPU Operator
 metadata:
-# Name of the DeviceConfig CR. Note that the name of device plugin, node-labeller and metric-explorter pods will be prefixed with
+# Name of the DeviceConfig CR. Note that the name of device plugin, node-labeller and metric-exporter pods will be prefixed with
 name: gpu-operator
 namespace: kube-amd-gpu # Namespace for the GPU Operator and it's components
 spec:
@@ -147,7 +147,7 @@ Below is an example of a full DeviceConfig CR that can be used to install the AM
 serviceType: ClusterIP # ServiceType used to expose the Metrics Exporter endpoint. Can be either `ClusterIp` or `NodePort`.
 port: 5000 # Note if specifying NodePort as the serviceType use `32500` as the port number must be between 30000-32767
 # (Optional) Specifying metrics exporter image is optional. Default imagename shown here if not specified.
-image: rocm/device-metrics-exporter:v1.2.0 # Change this to trigger metrics exporter upgrade on CR update
+image: rocm/device-metrics-exporter:v1.3.1 # Change this to trigger metrics exporter upgrade on CR update
 imagePullPolicy: "IfNotPresent" # image pull policy for the metrics exporter container. Either `Always`, `IfNotPresent` or `Never`
 # imagePullPolicy default value is "IfNotPresent" for valid tags, "Always" for no tag or "latest" tag
 config:
@@ -187,7 +187,7 @@ Below is an example of a full DeviceConfig CR that can be used to install the AM
 serviceType: ClusterIP # ServiceType used to expose the Metrics Exporter endpoint. Can be either `ClusterIp` or `NodePort`.
 port: 5000 # Note if specifying NodePort as the serviceType use `32500` as the port number must be between 30000-32767
 # (Optional) Specifying metrics exporter image is optional. Default imagename shown here if not specified.
-image: docker.io/rocm/test-runner:v1.2.0-beta.0 # Change this to trigger metrics exporter upgrade on CR update
+image: docker.io/rocm/test-runner:v1.3.1 # Change this to trigger metrics exporter upgrade on CR update
 imagePullPolicy: "IfNotPresent" # image pull policy for the test runner container. Either `Always`, `IfNotPresent` or `Never`
 # imagePullPolicy default value is "IfNotPresent" for valid tags, "Always" for no tag or "latest" tag
 config:
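
The `imagePullPolicy` comment in the DeviceConfig example states that the default is "IfNotPresent" for valid tags and "Always" for a missing or "latest" tag. A minimal Python sketch of that rule follows. The function name `default_image_pull_policy` is hypothetical, and it assumes that "valid tag" means any non-empty tag other than `latest`. It illustrates the documented behaviour only and is not the operator's actual implementation; digest references (`image@sha256:...`) are out of scope.

```python
def default_image_pull_policy(image: str) -> str:
    """Return the default pull policy described in the DeviceConfig comments.

    Assumption: a "valid tag" is any non-empty tag other than "latest".
    """
    # Look only at the last path component, so a registry port such as
    # "localhost:5000/img" is not mistaken for an image tag.
    last_component = image.rsplit("/", 1)[-1]
    tag = last_component.split(":", 1)[1] if ":" in last_component else ""
    return "Always" if tag in ("", "latest") else "IfNotPresent"


if __name__ == "__main__":
    for ref in (
        "docker.io/rocm/test-runner:v1.3.1",
        "rocm/device-metrics-exporter",
        "rocm/device-metrics-exporter:latest",
    ):
        print(ref, "->", default_image_pull_policy(ref))
```

Pinning an explicit release tag such as `v1.3.1`, as this commit does, therefore leaves the default at "IfNotPresent" under the documented rule.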

docs/installation/kubernetes-helm.md

Lines changed: 6 additions & 6 deletions
@@ -155,7 +155,7 @@ The following parameters are able to be configued when using the Helm Chart. In
 |-----|------|---------|-------------|
 | controllerManager.affinity | object | `{"nodeAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"preference":{"matchExpressions":[{"key":"node-role.kubernetes.io/control-plane","operator":"Exists"}]},"weight":1}]}}` | Deployment affinity configs for controller manager |
 | controllerManager.manager.image.repository | string | `"docker.io/rocm/gpu-operator"` | AMD GPU operator controller manager image repository |
-| controllerManager.manager.image.tag | string | `"v1.2.0"` | AMD GPU operator controller manager image tag |
+| controllerManager.manager.image.tag | string | `"v1.3.1"` | AMD GPU operator controller manager image tag |
 | controllerManager.manager.imagePullPolicy | string | `"Always"` | Image pull policy for AMD GPU operator controller manager pod |
 | controllerManager.manager.imagePullSecrets | string | `""` | Image pull secret name for pulling AMD GPU operator controller manager image if registry needs credential to pull image |
 | controllerManager.manager.resources.limits.cpu | string | `"1000m"` | CPU limits for the controller manager. Consider increasing for large clusters |
@@ -173,12 +173,12 @@ The following parameters are able to be configued when using the Helm Chart. In
 | kmm.controller.manager.containerSecurityContext.allowPrivilegeEscalation | bool | `false` | |
 | kmm.controller.manager.env.relatedImageBuild | string | `"gcr.io/kaniko-project/executor:v1.23.2"` | KMM kaniko builder image for building driver image within cluster |
 | kmm.controller.manager.env.relatedImageBuildPullSecret | string | `""` | Image pull secret name for pulling KMM kaniko builder image if registry needs credential to pull image |
-| kmm.controller.manager.env.relatedImageSign | string | `"docker.io/rocm/kernel-module-management-signimage:v1.2.0"` | KMM signer image for signing driver image's kernel module with given key pairs within cluster |
+| kmm.controller.manager.env.relatedImageSign | string | `"docker.io/rocm/kernel-module-management-signimage:v1.3.1"` | KMM signer image for signing driver image's kernel module with given key pairs within cluster |
 | kmm.controller.manager.env.relatedImageSignPullSecret | string | `""` | Image pull secret name for pulling KMM signer image if registry needs credential to pull image |
-| kmm.controller.manager.env.relatedImageWorker | string | `"docker.io/rocm/kernel-module-management-worker:v1.2.0"` | KMM worker image for loading / unloading driver kernel module on worker nodes |
+| kmm.controller.manager.env.relatedImageWorker | string | `"docker.io/rocm/kernel-module-management-worker:v1.3.1"` | KMM worker image for loading / unloading driver kernel module on worker nodes |
 | kmm.controller.manager.env.relatedImageWorkerPullSecret | string | `""` | Image pull secret name for pulling KMM worker image if registry needs credential to pull image |
 | kmm.controller.manager.image.repository | string | `"docker.io/rocm/kernel-module-management-operator"` | KMM controller manager image repository |
-| kmm.controller.manager.image.tag | string | `"v1.2.0"` | KMM controller manager image tag |
+| kmm.controller.manager.image.tag | string | `"v1.3.1"` | KMM controller manager image tag |
 | kmm.controller.manager.imagePullPolicy | string | `"Always"` | Image pull policy for KMM controller manager pod |
 | kmm.controller.manager.imagePullSecrets | string | `""` | Image pull secret name for pulling KMM controller manager image if registry needs credential to pull image |
 | kmm.controller.manager.resources.limits.cpu | string | `"500m"` | |
@@ -332,7 +332,7 @@ spec:
 serviceType: "NodePort"
 # Node port for metrics exporter service, metrics endpoint $node-ip:$nodePort
 nodePort: 32500
-image: docker.io/rocm/device-metrics-exporter:v1.2.0
+image: docker.io/rocm/device-metrics-exporter:v1.3.1
 
 # Specifythe node to be managed by this DeviceConfig Custom Resource
 selector:
@@ -382,7 +382,7 @@ spec:
 serviceType: "NodePort"
 # Node port for metrics exporter service, metrics endpoint $node-ip:$nodePort
 nodePort: 32500
-image: docker.io/rocm/device-metrics-exporter:v1.2.0
+image: docker.io/rocm/device-metrics-exporter:v1.3.1
 
 # Specifythe node to be managed by this DeviceConfig Custom Resource
 selector:

docs/metrics/exporter.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ metricsExporter:
 nodePort: 32500
 
 # image for the metrics-exporter container
-image: "rocm/device-metrics-exporter:v1.2.0"
+image: "rocm/device-metrics-exporter:v1.3.1"
 
 ```
 
docs/releasenotes.md

Lines changed: 0 additions & 5 deletions
@@ -36,11 +36,6 @@ The AMD GPU Operator v1.3.1 release extends platform support to OpenShift v4.19
 
 ### Documentation Updates
 
-- Updated [Release notes](https://instinct.docs.amd.com/projects/gpu-operator/en/latest/releasenotes.html) detailing new features in v1.3.1.
-- Updated GPU Operator install instructions to include the default DeviceConfig custom resource that gets created and how to skip installing it if desired.
-
-### Known Limitations
-
 > **Note:** All current and historical limitations for the GPU Operator, including their latest statuses and any associated workarounds or fixes, are tracked in the following documentation page: [Known Issues and Limitations](https://instinct.docs.amd.com/projects/gpu-operator/en/latest/knownlimitations.html).
 Please refer to this page regularly for the most up-to-date information.
 
docs/test/auto-unhealthy-device-test.md

Lines changed: 2 additions & 2 deletions
@@ -25,15 +25,15 @@ metricsExporter:
 nodePort: 32500
 
 # image for the metrics-exporter container
-image: "rocm/device-metrics-exporter:v1.2.0"
+image: "rocm/device-metrics-exporter:v1.3.1"
 
 # Specify the test runner config
 testRunner:
 # To enable/disable the test runner, disabled by default
 enable: true
 
 # image for the test runner container
-image: docker.io/rocm/test-runner:v1.2.0-beta.0
+image: docker.io/rocm/test-runner:v1.3.1
 
 # specify the mount for test logs
 logsLocation:
