This repository contains several applications to support Kubernetes integration with the CloudZero platform, including:
- CloudZero Webhook Server - provides telemetry to the CloudZero platform, enabling complex cost allocation and analysis. This webhook application securely receives resource provisioning and deprovisioning requests from the Kubernetes API. It collects resource labels, annotations, and relationship metadata between resources, ultimately supporting the identification of CSP resources not directly connected to a Kubernetes node.
- CloudZero Collector - implements a Prometheus-compliant interface for metrics collection, writing the metrics payloads to files in a shared location for consumption by the shipper. Today the collector classifies incoming metrics data and saves it into either cost telemetry files or observability files. These files are compressed on disk to save space (a minimal compression sketch follows this list).
- CloudZero Shipper - monitors shared locations for metrics file creation, allocates pre-signed S3 PUT URLs for customers (using the CloudZero upload API), and then uploads data to the AWS S3 bucket at set intervals. This approach protects against invalid API keys and enables end-to-end file tracking.
- CloudZero Agent Validator - the validator application is part of the agent's pod lifecycle hooks. It is responsible for performing basic validation checks and for notifying the CloudZero platform of installation status changes (initializing, started, stopping). This application runs during the lifecycle hook, then exits when complete.
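To make the compression step above concrete, here is a minimal Go sketch of writing a Brotli-compressed JSON payload to disk. It assumes the third-party `github.com/andybalholm/brotli` package and a hypothetical file name; it illustrates the idea rather than the collector's actual implementation.

```go
// Sketch: write a JSON payload to disk with Brotli compression.
// The file name and payload shape are assumptions for illustration.
package main

import (
	"encoding/json"
	"os"

	"github.com/andybalholm/brotli"
)

func main() {
	payload := map[string]any{
		"labels":  []map[string]string{{"name": "__name__", "value": "cloudzero_pod_labels"}},
		"samples": []map[string]any{{"value": 1.0, "timestamp": "1733378003953"}},
	}

	f, err := os.Create("metrics-0001.json.br") // hypothetical file name
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Wrap the file in a Brotli writer; Close flushes the compressed stream.
	w := brotli.NewWriter(f)
	defer w.Close()

	if err := json.NewEncoder(w).Encode(payload); err != nil {
		panic(err)
	}
}
```

Brotli generally achieves better compression ratios than gzip on text payloads such as JSON, which helps keep the shared data directory small.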
Note also the Agent component, which is responsible for executing metrics scrape jobs at various intervals. It communicates with the kube-state-metrics and cAdvisor exporters to collect metrics, then forwards them to the CloudZero Collector via the Prometheus remote write protocol.
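For context, a remote write v1 payload is a protobuf-encoded `WriteRequest` that is snappy-compressed and sent over HTTP POST. The Go sketch below shows a minimal client of that shape; the collector URL and the sample metric are illustrative assumptions, not the agent's actual configuration.

```go
// Sketch: send one sample to a remote write v1 endpoint.
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"time"

	"github.com/golang/snappy"
	"github.com/prometheus/prometheus/prompb"
)

func main() {
	// Build a WriteRequest containing a single series with one sample.
	req := &prompb.WriteRequest{
		Timeseries: []prompb.TimeSeries{{
			Labels: []prompb.Label{
				{Name: "__name__", Value: "container_cpu_usage_seconds_total"},
				{Name: "namespace", Value: "default"},
			},
			Samples: []prompb.Sample{
				{Value: 1.0, Timestamp: time.Now().UnixMilli()},
			},
		}},
	}

	// Remote write v1: protobuf-encode, then snappy-compress.
	raw, err := req.Marshal()
	if err != nil {
		panic(err)
	}
	compressed := snappy.Encode(nil, raw)

	// POST to the collector's endpoint (URL is hypothetical).
	httpReq, err := http.NewRequest(http.MethodPost,
		"http://cloudzero-collector:8080/api/v1/write", bytes.NewReader(compressed))
	if err != nil {
		panic(err)
	}
	httpReq.Header.Set("Content-Type", "application/x-protobuf")
	httpReq.Header.Set("Content-Encoding", "snappy")
	httpReq.Header.Set("X-Prometheus-Remote-Write-Version", "0.1.0")

	resp, err := http.DefaultClient.Do(httpReq)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

In practice the agent ships batches of scraped series rather than a single sample, but the wire format is the same.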
The easiest way to get started with the CloudZero Webhook Server is by
using the cloudzero-agent Helm chart from the cloudzero-charts
repository.
See the Development Guide for comprehensive information about:
- Building and testing components
- Deployment workflows
- Multi-cluster development
- Making changes to the codebase
See the Installation Guide for details.
See the Configuration Guide for details.
```sh
# Remove build artifacts
make clean

# Delete KIND cluster and cleanup
make kind-down
```

The applications are based on a scratch container, so no shell is available. The container images are less than 8MB.
To monitor the data directory, you must deploy a debug container as follows:

- Deploy a debug container:

  ```sh
  kubectl apply -f cluster/deployments/debug/deployment.yaml
  ```

- Attach to the shell of the debug container:

  ```sh
  kubectl exec -it temp-shell -- /bin/sh
  ```

To inspect the data directory:

```sh
cd /cloudzero/data
```
```sh
eksctl delete cluster -f cluster/cluster.yaml --disable-nodegroup-eviction
```

This project provides a collector service, written in Go, which comprises two applications:
- Collector - the collector application exposes a Prometheus remote write API which can receive POST requests from Prometheus in either v1 or v2 encoded format. It decodes the messages, then writes them to the `data` directory as Brotli-compressed JSON.
- Shipper - the shipper application watches the data directory for completed parquet files on a regular interval (e.g., 10 minutes), then calls the CloudZero upload API to allocate S3 pre-signed PUT URLs. These URLs are used to upload the files. The application has the ability to compress the files before sending them to S3.
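Once a URL has been allocated, the upload itself is a plain HTTP PUT against the pre-signed URL. The Go sketch below shows the general mechanism; the URL and file path are placeholders, and this is an illustration of the approach rather than the shipper's actual code.

```go
// Sketch: upload a local file to a pre-signed S3 PUT URL.
package main

import (
	"fmt"
	"net/http"
	"os"
)

func uploadToPresignedURL(url, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return err
	}

	req, err := http.NewRequest(http.MethodPut, url, f)
	if err != nil {
		return err
	}
	// S3 expects a known Content-Length for pre-signed PUT uploads.
	req.ContentLength = info.Size()

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("upload failed: %s", resp.Status)
	}
	return nil
}

func main() {
	// Both arguments are placeholders for illustration.
	if err := uploadToPresignedURL("https://example-bucket.s3.amazonaws.com/abc?X-Amz-Signature=TODO", "metrics-0001.json.br"); err != nil {
		panic(err)
	}
}
```

Because a pre-signed URL embeds its own authorization, the shipper needs no AWS credentials of its own; it only needs a valid CloudZero API key to allocate the URLs.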
The output of the CloudZero Webhook Server (formerly Insights Controller) application is a JSON object
that represents cloudzero metrics, which is POSTed to the CloudZero remote
write API. The format of these objects is based on the Prometheus Timeseries
protobuf message, defined in the
Prometheus types.proto.
Protobuf definitions for the cloudzero metrics are in the proto/ directory.
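As an approximation of the format, the Go sketch below constructs the pod-labels object from the first example in this section using the upstream `prompb` types; the definitions this project actually uses live in `proto/`, so treat this as illustrative only.

```go
// Sketch: one cloudzero_pod_labels object as a Prometheus TimeSeries.
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/prompb"
)

func main() {
	ts := prompb.TimeSeries{
		Labels: []prompb.Label{
			{Name: "__name__", Value: "cloudzero_pod_labels"},
			{Name: "namespace", Value: "default"},
			{Name: "pod", Value: "hello-28889630-955wd"},
			{Name: "resource_type", Value: "pod"},
			{Name: "label_job-name", Value: "hello-28889630"},
		},
		// Value and timestamp mirror the JSON examples below.
		Samples: []prompb.Sample{{Value: 1.0, Timestamp: 1733378003953}},
	}
	fmt.Printf("%+v\n", ts)
}
```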
There are four kinds of objects that can be sent:
- Pod metrics

  Valid metric names:

  - `cloudzero_pod_labels`
  - `cloudzero_pod_annotations`

  Labels:

  - `__name__`; will be one of the valid pod metric names
  - `namespace`; the namespace that the pod is launched in
  - `resource_type`; will always be `pod` for pod metrics

  Example:

```json
{
"labels": [
{
"name": "__name__",
"value": "cloudzero_pod_labels"
},
{
"name": "namespace",
"value": "default"
},
{
"name": "pod",
"value": "hello-28889630-955wd"
},
{
"name": "resource_type",
"value": "pod"
},
{
"name": "label_batch.kubernetes.io/controller-uid",
"value": "cc52c38d-b461-40ab-a65d-2d5a68ac08e5"
},
{
"name": "label_batch.kubernetes.io/job-name",
"value": "hello-28889630"
},
{
"name": "label_controller-uid",
"value": "cc52c38d-b461-40ab-a65d-2d5a68ac08e5"
},
{
"name": "label_job-name",
"value": "hello-28889630"
}
],
"samples": [
{
"value": 1.0,
"timestamp": "1733378003953"
}
]
}
```

- Workload Metrics

  Valid metric names:

  - `cloudzero_deployment_labels`
  - `cloudzero_deployment_annotations`
  - `cloudzero_statefulset_labels`
  - `cloudzero_statefulset_annotations`
  - `cloudzero_daemonset_labels`
  - `cloudzero_daemonset_annotations`
  - `cloudzero_job_labels`
  - `cloudzero_job_annotations`
  - `cloudzero_cronjob_labels`
  - `cloudzero_cronjob_annotations`

  Labels:

  - `__name__`; will be one of the valid workload metric names
  - `namespace`; the namespace that the workload is launched in
  - `workload`; the name of the workload
  - `resource_type`; will be one of `deployment`, `statefulset`, `daemonset`, `job`, or `cronjob`

  Example:

```json
{
"labels": [
{
"name": "__name__",
"value": "cloudzero_deployment_labels"
},
{
"name": "namespace",
"value": "default"
},
{
"name": "workload",
"value": "hello"
},
{
"name": "resource_type",
"value": "deployment"
},
{
"name": "label_component",
"value": "greeting"
},
{
"name": "label_foo",
"value": "bar"
}
],
"samples": [
{
"value": 1.0,
"timestamp": "1733378003953"
}
]
}
```

- Namespace Metrics

  Valid metric names:

  - `cloudzero_namespace_labels`
  - `cloudzero_namespace_annotations`

  Labels:

  - `__name__`; will be one of the valid namespace metric names
  - `namespace`; the name of the namespace
  - `resource_type`; will always be `namespace` for namespace metrics

  Example:

```json
{
"labels": [
{
"name": "__name__",
"value": "cloudzero_namespace_labels"
},
{
"name": "namespace",
"value": "default"
},
{
"name": "resource_type",
"value": "namespace"
},
{
"name": "label_engr.os.com/component",
"value": "foo"
},
{
"name": "label_kubernetes.io/metadata.name",
"value": "default"
}
],
"samples": [
{
"value": 1.0,
"timestamp": "1733880410225"
}
]
}
```

- Node Metrics

  Valid metric names:

  - `cloudzero_node_labels`
  - `cloudzero_node_annotations`

  Labels:

  - `__name__`; will be one of the valid node metric names
  - `node`; the name of the node
  - `resource_type`; will always be `node` for node metrics

  Example:

```json
{
"labels": [
{
"name": "__name__",
"value": "cloudzero_node_labels"
},
{
"name": "resource_type",
"value": "node"
},
{
"name": "label_alpha.eksctl.io/nodegroup-name",
"value": "spot-nodes"
},
{
"name": "label_beta.kubernetes.io/arch",
"value": "amd64"
}
],
"samples": [
{
"value": 1.0,
"timestamp": "1733880410225"
}
]
}
```

We appreciate feedback and contributions to this repo! Before you get started, please see the following:
Contact [email protected] for usage questions or specific cases. See the CloudZero Docs for general information on CloudZero.
Please do not report security vulnerabilities on the public GitHub issue tracker. Email [email protected] instead.
CloudZero is the only cloud cost intelligence platform that puts engineering in control by connecting technical decisions to business results:
- Cost Allocation And Tagging - Organize and allocate cloud spend in new ways, increase tagging coverage, or work on showback.
- Kubernetes Cost Visibility - Understand your Kubernetes spend alongside total spend across containerized and non-containerized environments.
- FinOps And Financial Reporting - Operationalize reporting on metrics such as cost per customer, COGS, and gross margin. Forecast spend, reconcile invoices, and easily investigate variance.
- Engineering Accountability - Foster a cost-conscious culture where engineers understand spend, proactively consider cost, and get immediate feedback with fewer interruptions and faster, more efficient innovation.
- Optimization And Reducing Waste - Focus on immediately reducing spend by understanding where you have waste, inefficiencies, and discounting opportunities.
Learn more about CloudZero on our website: www.cloudzero.com.
This project is licensed under the Apache 2.0 LICENSE.

