From 4b6d5107c8f0a830f039309367841a8d0a772aca Mon Sep 17 00:00:00 2001 From: ChrsMark Date: Mon, 15 Sep 2025 15:15:38 +0300 Subject: [PATCH 1/2] Add memory metrics for k8s.pod, k8s.node and container Signed-off-by: ChrsMark --- .chloggen/add_k8s_memory_metrics.yaml | 22 +++ docs/non-normative/k8s-migration.md | 57 +++++++ docs/system/container-metrics.md | 110 +++++++++++++ docs/system/k8s-metrics.md | 218 ++++++++++++++++++++++++++ model/container/metrics.yaml | 92 +++++++++++ model/k8s/metrics.yaml | 162 +++++++++++++++++++ 6 files changed, 661 insertions(+) create mode 100644 .chloggen/add_k8s_memory_metrics.yaml diff --git a/.chloggen/add_k8s_memory_metrics.yaml b/.chloggen/add_k8s_memory_metrics.yaml new file mode 100644 index 0000000000..d548c10b92 --- /dev/null +++ b/.chloggen/add_k8s_memory_metrics.yaml @@ -0,0 +1,22 @@ +# Use this changelog template to create an entry for release notes. +# +# If your change doesn't affect end users you should instead start +# your pull request title with [chore] or use the "Skip Changelog" label. + +# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' +change_type: enhancement + +# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db) +component: k8s + +# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). +note: Add memory metrics for k8s.node, k8s.pod and container resources + +# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. +# The values here must be integers. +issues: [2776] + +# (Optional) One or more lines of additional information to render under the primary note. +# These lines will be padded with 2 spaces and then inserted directly into the document. +# Use pipe (|) for multiline entries. 
+subtext: diff --git a/docs/non-normative/k8s-migration.md b/docs/non-normative/k8s-migration.md index ef604e4242..41c30cf7cc 100644 --- a/docs/non-normative/k8s-migration.md +++ b/docs/non-normative/k8s-migration.md @@ -63,6 +63,9 @@ and one for disabling the old schema called `semconv.k8s.disableLegacy`. Then: - [K8s Node condition metrics](#k8s-node-condition-metrics) - [K8s Filesystem metrics](#k8s-filesystem-metrics) - [K8s Pod Volume metrics](#k8s-pod-volume-metrics) + - [K8s Pod Memory metrics](#k8s-pod-memory-metrics) + - [Container memory metrics](#container-memory-metrics) + - [K8s Node memory metrics](#k8s-node-memory-metrics) - [Container Runtime](#container-runtime) @@ -422,6 +425,60 @@ The changes in these metrics are the following: +### K8s Pod Memory metrics + +The K8s Pod memory metrics implemented by the Collector and specifically the +[k8scluster](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.119.0/receiver/k8sclusterreceiver/documentation.md) +receiver were introduced as semantic conventions in +[#1490](https://github.com/open-telemetry/semantic-conventions/issues/1490). 
+ +The changes in these metrics are the following: + + + +| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | +|------------------------------------------------------------------------------------|------------------------------------------------| +| `k8s.pod.memory.page_faults` | `k8s.pod.memory.paging.faults` with attribute `system.paging.type` | +| `k8s.pod.memory.major_page_faults` | `k8s.pod.memory.paging.faults` with attribute `system.paging.type` | + + + +### Container memory metrics + +The Container memory metrics implemented by the Collector and specifically the +[k8scluster](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.119.0/receiver/k8sclusterreceiver/documentation.md) +receiver were introduced as semantic conventions in +[#1490](https://github.com/open-telemetry/semantic-conventions/issues/1490). + +The changes in these metrics are the following: + + + +| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | +|------------------------------------------------------------------------------------|---------------------------------------------------------------| +| `container.memory.page_faults` | `container.memory.paging.faults` with attribute `system.paging.type` | +| `container.memory.major_page_faults` | `container.memory.paging.faults` with attribute `system.paging.type` | + + + +### K8s Node memory metrics + +The K8s Node memory metrics implemented by the Collector and specifically the +[k8scluster](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.119.0/receiver/k8sclusterreceiver/documentation.md) +receiver were introduced as semantic conventions in +[#1490](https://github.com/open-telemetry/semantic-conventions/issues/1490). 
+ +The changes in these metrics are the following: + + + +| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | +|------------------------------------------------------------------------------------|---------------------------------------------------------------------| +| `k8s.node.memory.page_faults` | `k8s.node.memory.paging.faults` with attribute `system.paging.type` | +| `k8s.node.memory.major_page_faults` | `k8s.node.memory.paging.faults` with attribute `system.paging.type` | + + + ### Container Runtime The container runtime has become more descriptive with changes introduced to semantic conventions diff --git a/docs/system/container-metrics.md b/docs/system/container-metrics.md index a08eded4f6..63e41b6f49 100644 --- a/docs/system/container-metrics.md +++ b/docs/system/container-metrics.md @@ -18,6 +18,10 @@ well-defined APIs (e.g. Kubelet's API or container runtimes). - [Metric: `container.cpu.time`](#metric-containercputime) - [Metric: `container.cpu.usage`](#metric-containercpuusage) - [Metric: `container.memory.usage`](#metric-containermemoryusage) +- [Metric: `container.memory.available`](#metric-containermemoryavailable) +- [Metric: `container.memory.rss`](#metric-containermemoryrss) +- [Metric: `container.memory.working_set`](#metric-containermemoryworking_set) +- [Metric: `container.memory.paging.faults`](#metric-containermemorypagingfaults) - [Metric: `container.disk.io`](#metric-containerdiskio) - [Metric: `container.network.io`](#metric-containernetworkio) - [Metric: `container.filesystem.available`](#metric-containerfilesystemavailable) @@ -161,6 +165,112 @@ This metric is [opt-in][MetricOptIn]. +### Metric: `container.memory.available` + +This metric is [opt-in][MetricOptIn]. 
+ + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `container.memory.available` | UpDownCounter | `By` | Container memory available. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`container`](/docs/registry/entities/container.md#container) | + +**[1]:** Available memory for use. This is defined as the memory limit - workingSetBytes. If memory limit is undefined, the available bytes is omitted. +In general, this metric can be derived from [cadvisor](https://github.com/google/cadvisor/blob/v0.53.0/docs/storage/prometheus.md#prometheus-container-metrics) and by subtracting the `container_memory_working_set_bytes` metric from the `container_spec_memory_limit_bytes` metric. +In K8s, this metric is derived from the [MemoryStats.AvailableBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) of the Kubelet's stats API. + + + + + + +### Metric: `container.memory.rss` + +This metric is [opt-in][MetricOptIn]. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `container.memory.rss` | UpDownCounter | `By` | Container memory RSS. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`container`](/docs/registry/entities/container.md#container) | + +**[1]:** In general, this metric can be derived from [cadvisor](https://github.com/google/cadvisor/blob/v0.53.0/docs/storage/prometheus.md#prometheus-container-metrics) and specifically the `container_memory_rss` metric. 
+In K8s, this metric is derived from the [MemoryStats.RSSBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) of the Kubelet's stats API. + + + + + + +### Metric: `container.memory.working_set` + +This metric is [opt-in][MetricOptIn]. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `container.memory.working_set` | UpDownCounter | `By` | Container memory working set. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`container`](/docs/registry/entities/container.md#container) | + +**[1]:** In general, this metric can be derived from [cadvisor](https://github.com/google/cadvisor/blob/v0.53.0/docs/storage/prometheus.md#prometheus-container-metrics) and specifically the `container_memory_working_set_bytes` metric. +In K8s, this metric is derived from the [MemoryStats.WorkingSetBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) of the Kubelet's stats API. + + + + + + +### Metric: `container.memory.paging.faults` + +This metric is [opt-in][MetricOptIn]. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `container.memory.paging.faults` | Counter | `{fault}` | Container memory paging faults. 
[1] | ![Development](https://img.shields.io/badge/-development-blue) | [`container`](/docs/registry/entities/container.md#container) | + +**[1]:** In general, this metric can be derived from [cadvisor](https://github.com/google/cadvisor/blob/v0.53.0/docs/storage/prometheus.md#prometheus-container-metrics) and specifically the `container_memory_failures_total{failure_type=pgfault, scope=container}` and `container_memory_failures_total{failure_type=pgmajfault, scope=container}` metrics. +In K8s, this metric is derived from the [MemoryStats.PageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) and [MemoryStats.MajorPageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) fields of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) of the Kubelet's stats API. + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`system.paging.type`](/docs/registry/attributes/system.md) | string | The memory paging type | `minor` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) | + +--- + +`system.paging.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `major` | major | ![Development](https://img.shields.io/badge/-development-blue) | +| `minor` | minor | ![Development](https://img.shields.io/badge/-development-blue) | + + + + + + ### Metric: `container.disk.io` This metric is [opt-in][MetricOptIn].
diff --git a/docs/system/k8s-metrics.md b/docs/system/k8s-metrics.md index a27af2515f..4737ec6627 100644 --- a/docs/system/k8s-metrics.md +++ b/docs/system/k8s-metrics.md @@ -22,6 +22,10 @@ and therefore inherit its attributes, like `k8s.pod.name` and `k8s.pod.uid`. - [Metric: `k8s.pod.cpu.time`](#metric-k8spodcputime) - [Metric: `k8s.pod.cpu.usage`](#metric-k8spodcpuusage) - [Metric: `k8s.pod.memory.usage`](#metric-k8spodmemoryusage) + - [Metric: `k8s.pod.memory.available`](#metric-k8spodmemoryavailable) + - [Metric: `k8s.pod.memory.rss`](#metric-k8spodmemoryrss) + - [Metric: `k8s.pod.memory.working_set`](#metric-k8spodmemoryworking_set) + - [Metric: `k8s.pod.memory.paging.faults`](#metric-k8spodmemorypagingfaults) - [Metric: `k8s.pod.network.io`](#metric-k8spodnetworkio) - [Metric: `k8s.pod.network.errors`](#metric-k8spodnetworkerrors) - [Metric: `k8s.pod.filesystem.available`](#metric-k8spodfilesystemavailable) @@ -46,6 +50,10 @@ and therefore inherit its attributes, like `k8s.pod.name` and `k8s.pod.uid`. - [Metric: `k8s.node.cpu.time`](#metric-k8snodecputime) - [Metric: `k8s.node.cpu.usage`](#metric-k8snodecpuusage) - [Metric: `k8s.node.memory.usage`](#metric-k8snodememoryusage) + - [Metric: `k8s.node.memory.available`](#metric-k8snodememoryavailable) + - [Metric: `k8s.node.memory.rss`](#metric-k8snodememoryrss) + - [Metric: `k8s.node.memory.working_set`](#metric-k8snodememoryworking_set) + - [Metric: `k8s.node.memory.paging.faults`](#metric-k8snodememorypagingfaults) - [Metric: `k8s.node.network.io`](#metric-k8snodenetworkio) - [Metric: `k8s.node.network.errors`](#metric-k8snodenetworkerrors) - [Metric: `k8s.node.filesystem.available`](#metric-k8snodefilesystemavailable) @@ -218,6 +226,111 @@ This metric is [recommended][MetricRecommended]. +### Metric: `k8s.pod.memory.available` + +This metric is [recommended][MetricRecommended]. 
+ + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `k8s.pod.memory.available` | UpDownCounter | `By` | Pod memory available. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.pod`](/docs/registry/entities/k8s.md#k8s-pod) | + +**[1]:** Available memory for use. This is defined as the memory limit - workingSetBytes. If memory limit is undefined, the available bytes is omitted. +This metric is derived from the [MemoryStats.AvailableBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) of the Kubelet's stats API. + + + + + + +### Metric: `k8s.pod.memory.rss` + +This metric is [recommended][MetricRecommended]. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `k8s.pod.memory.rss` | UpDownCounter | `By` | Pod memory RSS. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.pod`](/docs/registry/entities/k8s.md#k8s-pod) | + +**[1]:** The amount of anonymous and swap cache memory (includes transparent hugepages). +This metric is derived from the [MemoryStats.RSSBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) of the Kubelet's stats API. + + + + + + +### Metric: `k8s.pod.memory.working_set` + +This metric is [recommended][MetricRecommended]. 
+ + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `k8s.pod.memory.working_set` | UpDownCounter | `By` | Pod memory working set. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.pod`](/docs/registry/entities/k8s.md#k8s-pod) | + +**[1]:** The amount of working set memory. This includes recently accessed memory, dirty memory, and kernel memory. WorkingSetBytes is <= UsageBytes. +This metric is derived from the [MemoryStats.WorkingSetBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) of the Kubelet's stats API. + + + + + + +### Metric: `k8s.pod.memory.paging.faults` + +This metric is [recommended][MetricRecommended]. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `k8s.pod.memory.paging.faults` | Counter | `{fault}` | Pod memory paging faults. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.pod`](/docs/registry/entities/k8s.md#k8s-pod) | + +**[1]:** Cumulative number of major/minor page faults. +This metric is derived from the [MemoryStats.PageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) and [MemoryStats.MajorPageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) fields of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) of the Kubelet's stats API.
+ +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`system.paging.type`](/docs/registry/attributes/system.md) | string | The memory paging type | `minor` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) | + +--- + +`system.paging.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `major` | major | ![Development](https://img.shields.io/badge/-development-blue) | +| `minor` | minor | ![Development](https://img.shields.io/badge/-development-blue) | + + + + + + ### Metric: `k8s.pod.network.io` This metric is [recommended][MetricRecommended]. @@ -944,6 +1057,111 @@ This metric is [recommended][MetricRecommended]. +### Metric: `k8s.node.memory.available` + +This metric is [recommended][MetricRecommended]. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `k8s.node.memory.available` | UpDownCounter | `By` | Node memory available. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.node`](/docs/registry/entities/k8s.md#k8s-node) | + +**[1]:** Available memory for use. This is defined as the memory limit - workingSetBytes. If memory limit is undefined, the available bytes is omitted. +This metric is derived from the [MemoryStats.AvailableBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [NodeStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#NodeStats) of the Kubelet's stats API. + + + + + + +### Metric: `k8s.node.memory.rss` + +This metric is [recommended][MetricRecommended]. 
+ + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `k8s.node.memory.rss` | UpDownCounter | `By` | Node memory RSS. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.node`](/docs/registry/entities/k8s.md#k8s-node) | + +**[1]:** The amount of anonymous and swap cache memory (includes transparent hugepages). +This metric is derived from the [MemoryStats.RSSBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [NodeStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#NodeStats) of the Kubelet's stats API. + + + + + + +### Metric: `k8s.node.memory.working_set` + +This metric is [recommended][MetricRecommended]. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `k8s.node.memory.working_set` | UpDownCounter | `By` | Node memory working set. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.node`](/docs/registry/entities/k8s.md#k8s-node) | + +**[1]:** The amount of working set memory. This includes recently accessed memory, dirty memory, and kernel memory. WorkingSetBytes is <= UsageBytes. +This metric is derived from the [MemoryStats.WorkingSetBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) field of the [NodeStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#NodeStats) of the Kubelet's stats API. + + + + + + +### Metric: `k8s.node.memory.paging.faults` + +This metric is [recommended][MetricRecommended]. 
+ + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | +| -------- | --------------- | ----------- | -------------- | --------- | ------ | +| `k8s.node.memory.paging.faults` | Counter | `{fault}` | Node memory paging faults. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.node`](/docs/registry/entities/k8s.md#k8s-node) | + +**[1]:** Cumulative number of major/minor page faults. +This metric is derived from the [MemoryStats.PageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) and [MemoryStats.MajorPageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) fields of the [NodeStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#NodeStats) of the Kubelet's stats API. + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`system.paging.type`](/docs/registry/attributes/system.md) | string | The memory paging type | `minor` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) | + +--- + +`system.paging.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `major` | major | ![Development](https://img.shields.io/badge/-development-blue) | +| `minor` | minor | ![Development](https://img.shields.io/badge/-development-blue) | + + + + + + ### Metric: `k8s.node.network.io` This metric is [recommended][MetricRecommended]. diff --git a/model/container/metrics.yaml b/model/container/metrics.yaml index 9e1bd5ea34..86615be509 100644 --- a/model/container/metrics.yaml +++ b/model/container/metrics.yaml @@ -65,6 +65,98 @@ groups: Memory usage of the container. 
instrument: counter unit: "By" + - id: metric.container.memory.available + type: metric + metric_name: container.memory.available + stability: development + annotations: + code_generation: + metric_value_type: int + entity_associations: + - container + brief: "Container memory available." + note: > + Available memory for use. This is defined as the memory limit - workingSetBytes. + If memory limit is undefined, the available bytes is omitted. + + In general, this metric can be derived from + [cadvisor](https://github.com/google/cadvisor/blob/v0.53.0/docs/storage/prometheus.md#prometheus-container-metrics) + and by subtracting the `container_memory_working_set_bytes` metric + from the `container_spec_memory_limit_bytes` metric. + + In K8s, this metric is derived from the + [MemoryStats.AvailableBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) + of the Kubelet's stats API. + instrument: updowncounter + unit: "By" + - id: metric.container.memory.rss + type: metric + metric_name: container.memory.rss + stability: development + entity_associations: + - container + brief: "Container memory RSS." + annotations: + code_generation: + metric_value_type: int + note: > + In general, this metric can be derived from + [cadvisor](https://github.com/google/cadvisor/blob/v0.53.0/docs/storage/prometheus.md#prometheus-container-metrics) + and specifically the `container_memory_rss` metric. + + In K8s, this metric is derived from the + [MemoryStats.RSSBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) + of the Kubelet's stats API. 
+ instrument: updowncounter + unit: "By" + - id: metric.container.memory.working_set + type: metric + metric_name: container.memory.working_set + stability: development + annotations: + code_generation: + metric_value_type: int + entity_associations: + - container + brief: "Container memory working set." + note: > + In general, this metric can be derived from + [cadvisor](https://github.com/google/cadvisor/blob/v0.53.0/docs/storage/prometheus.md#prometheus-container-metrics) + and specifically the `container_memory_working_set_bytes` metric. + + In K8s, this metric is derived from the + [MemoryStats.WorkingSetBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) + of the Kubelet's stats API. + instrument: updowncounter + unit: "By" + - id: metric.container.memory.paging.faults + type: metric + metric_name: container.memory.paging.faults + stability: development + annotations: + code_generation: + metric_value_type: int + entity_associations: + - container + brief: "Container memory paging faults." + attributes: + - ref: system.paging.type + note: > + In general, this metric can be derived from + [cadvisor](https://github.com/google/cadvisor/blob/v0.53.0/docs/storage/prometheus.md#prometheus-container-metrics) + and specifically the `container_memory_failures_total{failure_type=pgfault, scope=container}` + and `container_memory_failures_total{failure_type=pgmajfault, scope=container}` metrics. + + In K8s, this metric is derived from the + [MemoryStats.PageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) and + [MemoryStats.MajorPageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + fields of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) + of the Kubelet's stats API.
+ instrument: counter + unit: "{fault}" # container.disk.io.* metrics and attribute group - id: metric.container.disk.io diff --git a/model/k8s/metrics.yaml b/model/k8s/metrics.yaml index cc56ff3304..05fda97850 100644 --- a/model/k8s/metrics.yaml +++ b/model/k8s/metrics.yaml @@ -58,6 +58,87 @@ groups: Total memory usage of the Pod instrument: gauge unit: "By" + - id: metric.k8s.pod.memory.available + type: metric + metric_name: k8s.pod.memory.available + stability: development + annotations: + code_generation: + metric_value_type: int + entity_associations: + - k8s.pod + brief: "Pod memory available." + note: > + Available memory for use. This is defined as the memory limit - workingSetBytes. + If memory limit is undefined, the available bytes is omitted. + + This metric is derived from the + [MemoryStats.AvailableBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) + of the Kubelet's stats API. + instrument: updowncounter + unit: "By" + - id: metric.k8s.pod.memory.rss + type: metric + metric_name: k8s.pod.memory.rss + stability: development + brief: "Pod memory RSS." + annotations: + code_generation: + metric_value_type: int + entity_associations: + - k8s.pod + note: > + The amount of anonymous and swap cache memory (includes transparent hugepages). + + This metric is derived from the + [MemoryStats.RSSBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) + of the Kubelet's stats API. + instrument: updowncounter + unit: "By" + - id: metric.k8s.pod.memory.working_set + type: metric + metric_name: k8s.pod.memory.working_set + stability: development + entity_associations: + - k8s.pod + brief: "Pod memory working set." 
+ annotations: + code_generation: + metric_value_type: int + note: > + The amount of working set memory. This includes recently accessed memory, + dirty memory, and kernel memory. WorkingSetBytes is <= UsageBytes. + + This metric is derived from the + [MemoryStats.WorkingSetBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) + of the Kubelet's stats API. + instrument: updowncounter + unit: "By" + - id: metric.k8s.pod.memory.paging.faults + type: metric + metric_name: k8s.pod.memory.paging.faults + stability: development + brief: "Pod memory paging faults." + annotations: + code_generation: + metric_value_type: int + entity_associations: + - k8s.pod + attributes: + - ref: system.paging.type + note: > + Cumulative number of major/minor page faults. + + This metric is derived from the + [MemoryStats.PageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) and + [MemoryStats.MajorPageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + fields of the [PodStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#PodStats) + of the Kubelet's stats API. + instrument: counter + unit: "{fault}" # k8s.pod.network.* metrics - id: metric.k8s.pod.network.io @@ -508,6 +589,87 @@ groups: entity_associations: - k8s.node unit: "By" + - id: metric.k8s.node.memory.available + type: metric + metric_name: k8s.node.memory.available + stability: development + brief: "Node memory available." + annotations: + code_generation: + metric_value_type: int + entity_associations: + - k8s.node + note: > + Available memory for use. This is defined as the memory limit - workingSetBytes. + If memory limit is undefined, the available bytes is omitted.
+ + This metric is derived from the + [MemoryStats.AvailableBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [NodeStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#NodeStats) + of the Kubelet's stats API. + instrument: updowncounter + unit: "By" + - id: metric.k8s.node.memory.rss + type: metric + metric_name: k8s.node.memory.rss + stability: development + brief: "Node memory RSS." + annotations: + code_generation: + metric_value_type: int + entity_associations: + - k8s.node + note: > + The amount of anonymous and swap cache memory (includes transparent hugepages). + + This metric is derived from the + [MemoryStats.RSSBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [NodeStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#NodeStats) + of the Kubelet's stats API. + instrument: updowncounter + unit: "By" + - id: metric.k8s.node.memory.working_set + type: metric + metric_name: k8s.node.memory.working_set + stability: development + brief: "Node memory working set." + annotations: + code_generation: + metric_value_type: int + entity_associations: + - k8s.node + note: > + The amount of working set memory. This includes recently accessed memory, + dirty memory, and kernel memory. WorkingSetBytes is <= UsageBytes. + + This metric is derived from the + [MemoryStats.WorkingSetBytes](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + field of the [NodeStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#NodeStats) + of the Kubelet's stats API. + instrument: updowncounter + unit: "By" + - id: metric.k8s.node.memory.paging.faults + type: metric + metric_name: k8s.node.memory.paging.faults + stability: development + brief: "Node memory paging faults." 
+ annotations: + code_generation: + metric_value_type: int + entity_associations: + - k8s.node + attributes: + - ref: system.paging.type + note: > + Cumulative number of major/minor page faults. + + This metric is derived from the + [MemoryStats.PageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) and + [MemoryStats.MajorPageFaults](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#MemoryStats) + fields of the [NodeStats.Memory](https://pkg.go.dev/k8s.io/kubelet@v0.34.0/pkg/apis/stats/v1alpha1#NodeStats) + of the Kubelet's stats API. + instrument: counter + unit: "{fault}" # k8s.node.network.* metrics - id: metric.k8s.node.network.io From 005573c81db89b4a122bbaadeb33c33709165955 Mon Sep 17 00:00:00 2001 From: ChrsMark Date: Thu, 2 Oct 2025 10:54:51 +0300 Subject: [PATCH 2/2] improve migration note Signed-off-by: ChrsMark --- docs/non-normative/k8s-migration.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/non-normative/k8s-migration.md b/docs/non-normative/k8s-migration.md index 41c30cf7cc..1d201c7c06 100644 --- a/docs/non-normative/k8s-migration.md +++ b/docs/non-normative/k8s-migration.md @@ -436,10 +436,10 @@ The changes in these metrics are the following: -| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | -|------------------------------------------------------------------------------------|------------------------------------------------| -| `k8s.pod.memory.page_faults` | `k8s.pod.memory.paging.faults` with attribute `system.paging.type` | -| `k8s.pod.memory.major_page_faults` | `k8s.pod.memory.paging.faults` with attribute `system.paging.type` | +| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | +|------------------------------------------------------------------------------------|---------------------------------------------------------------------| +| 
`k8s.pod.memory.page_faults` | `k8s.pod.memory.paging.faults` with attribute `system.paging.type` set to `minor` | +| `k8s.pod.memory.major_page_faults` | `k8s.pod.memory.paging.faults` with attribute `system.paging.type` set to `major` | @@ -454,10 +454,10 @@ The changes in these metrics are the following: -| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | -|------------------------------------------------------------------------------------|---------------------------------------------------------------| -| `container.memory.page_faults` | `container.memory.paging.faults` with attribute `system.paging.type` | -| `container.memory.major_page_faults` | `container.memory.paging.faults` with attribute `system.paging.type` | +| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | +|------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------| +| `container.memory.page_faults` | `container.memory.paging.faults` with attribute `system.paging.type` set to `minor` | +| `container.memory.major_page_faults` | `container.memory.paging.faults` with attribute `system.paging.type` set to `major` | @@ -472,10 +472,10 @@ The changes in these metrics are the following: -| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | -|------------------------------------------------------------------------------------|---------------------------------------------------------------------| -| `k8s.node.memory.page_faults` | `k8s.node.memory.paging.faults` with attribute `system.paging.type` | -| `k8s.node.memory.major_page_faults` | `k8s.node.memory.paging.faults` with attribute `system.paging.type` | +| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New | 
+|------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------| +| `k8s.node.memory.page_faults` | `k8s.node.memory.paging.faults` with attribute `system.paging.type` set to `minor` | +| `k8s.node.memory.major_page_faults` | `k8s.node.memory.paging.faults` with attribute `system.paging.type` set to `major` |
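The migration tables above fold each pair of legacy page-fault metrics into a single `*.memory.paging.faults` metric distinguished by `system.paging.type`. As a rough illustration only, the renaming could be sketched as below. The `LEGACY_TO_NEW` table and the `migrate_metric` helper are hypothetical names made up for this sketch; they are not part of the Collector or of the semantic conventions.

```python
# Hypothetical lookup table built from the migration tables in this patch:
# legacy Collector metric name -> (new metric name, system.paging.type value).
LEGACY_TO_NEW = {
    "k8s.pod.memory.page_faults": ("k8s.pod.memory.paging.faults", "minor"),
    "k8s.pod.memory.major_page_faults": ("k8s.pod.memory.paging.faults", "major"),
    "container.memory.page_faults": ("container.memory.paging.faults", "minor"),
    "container.memory.major_page_faults": ("container.memory.paging.faults", "major"),
    "k8s.node.memory.page_faults": ("k8s.node.memory.paging.faults", "minor"),
    "k8s.node.memory.major_page_faults": ("k8s.node.memory.paging.faults", "major"),
}


def migrate_metric(name, attributes=None):
    """Return (new_name, new_attributes) for a metric name.

    Names that are not in the table are returned unchanged. The input
    attribute mapping is copied, never mutated.
    """
    attrs = dict(attributes or {})
    if name not in LEGACY_TO_NEW:
        return name, attrs
    new_name, paging_type = LEGACY_TO_NEW[name]
    attrs["system.paging.type"] = paging_type
    return new_name, attrs
```

With this sketch, both `k8s.node.memory.page_faults` and `k8s.node.memory.major_page_faults` resolve to `k8s.node.memory.paging.faults`, and only the attribute value tells them apart.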