Mig otel #6
base: master
Conversation
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Removed namespace gating for webhook metrics; migrated in-repo metrics from OpenCensus to OpenTelemetry (meters, instruments, attributes); added OTEL observability config keys; updated many dependencies and vendored packages; propagated context through informer/metrics call sites and adapted tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Controller as Reconciler/Controller
    participant Recorder as Recorder (pkg/*metrics)
    participant Meter as OTEL Meter
    participant Exporter as OTEL Exporter
    Controller->>Recorder: NewRecorder(ctx)
    Recorder->>Meter: Get Meter from MeterProvider
    Recorder->>Recorder: initializeInstruments(cfg) (create histograms/counters/gauges)
    Controller->>Recorder: DurationAndCount(ctx, pr, before)
    Recorder->>Recorder: compute duration & attributes
    Recorder->>Meter: prDurationHistogram.Record(ctx, value, attrs)
    Recorder->>Meter: prTotalCounter.Add(ctx, 1, attrs)
    Controller->>Recorder: RunningPipelineRuns(ctx, lister)
    Recorder->>Recorder: aggregate running counts & attrs
    Recorder->>Meter: runningPRsGauge.Add(ctx, delta, attrs)
    Note over Meter,Exporter: MeterProvider exports asynchronously to configured exporters
    Meter-->>Exporter: push metrics
```
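For readers less familiar with the OpenTelemetry metrics API, here is a minimal sketch of the flow in the diagram above: a recorder obtains a Meter, creates its instruments once, and records a duration plus a count with attributes. The instrument names, attribute keys, and function names are illustrative assumptions rather than the exact identifiers used in this PR.

```go
package metricsexample

import (
	"context"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// recorder holds instruments created once from the Meter (per the diagram's
// NewRecorder -> initializeInstruments step). Field names are illustrative.
type recorder struct {
	prDuration metric.Float64Histogram
	prTotal    metric.Int64Counter
}

func newRecorder() (*recorder, error) {
	// Instrumentation scope name is an assumption, not the Tekton value.
	meter := otel.Meter("pipelinerun-metrics")

	dur, err := meter.Float64Histogram(
		"pipelinerun_duration_seconds",
		metric.WithDescription("Duration of PipelineRun execution"),
		metric.WithUnit("s"),
	)
	if err != nil {
		return nil, err
	}
	total, err := meter.Int64Counter(
		"pipelinerun_total",
		metric.WithDescription("Number of completed PipelineRuns"),
	)
	if err != nil {
		return nil, err
	}
	return &recorder{prDuration: dur, prTotal: total}, nil
}

// durationAndCount mirrors the DurationAndCount step: record the histogram
// and bump the counter with the same attribute set.
func (r *recorder) durationAndCount(ctx context.Context, pipeline, namespace, status string, d time.Duration) {
	attrs := metric.WithAttributes(
		attribute.String("pipeline", pipeline),
		attribute.String("namespace", namespace),
		attribute.String("status", status),
	)
	r.prDuration.Record(ctx, d.Seconds(), attrs)
	r.prTotal.Add(ctx, 1, attrs)
}
```

The recorded values then flow to whichever exporters the configured MeterProvider pushes to, as in the diagram's final step.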
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120+ minutes
Pre-merge checks and finishing touches: ❌ Failed checks (2 warnings, 1 inconclusive)
Pull Request Overview
This PR migrates Tekton Pipeline metrics from OpenCensus to OpenTelemetry while improving configuration and test structure.
- Migration from OpenCensus to OpenTelemetry for metrics collection and reporting
- Simplified test implementations replacing complex metric validation with basic API verification
- Updated configuration structure for observability settings
Reviewed Changes
Copilot reviewed 7 out of 983 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/taskrunmetrics/metrics.go | Migrated from OpenCensus to OpenTelemetry metric instruments |
| pkg/taskrunmetrics/metrics_test.go | Simplified tests to focus on API functionality rather than metric validation |
| pkg/pipelinerunmetrics/metrics.go | Migrated from OpenCensus to OpenTelemetry metric instruments |
| pkg/pipelinerunmetrics/metrics_test.go | Simplified tests to focus on API functionality rather than metric validation |
| go.mod | Updated dependencies for OpenTelemetry support |
| config/config-observability.yaml | Added OpenTelemetry configuration options |
| cmd/webhook/main.go | Removed legacy metrics configuration |
pkg/taskrunmetrics/metrics_test.go
Outdated
t.Error("Current TaskRunsCount recording expected to return error but got nil") | ||
} | ||
if err := metrics.RecordPodLatency(ctx, nil, nil); err == nil { | ||
if err := r.RecordPodLatency(ctx, &v1.TaskRun{}, &corev1.Pod{}); err == nil { |
The parameter order in the test call doesn't match the actual function signature. The function signature shows RecordPodLatency(ctx context.Context, tr *v1.TaskRun, pod *corev1.Pod)
but the test is calling with empty objects. This should use the correct parameter order.
pkg/taskrunmetrics/metrics_test.go
Outdated
```go
	}

	if err := metrics.RecordPodLatency(ctx, td.pod, td.taskRun); td.expectingError && err == nil {
	if err := metrics.RecordPodLatency(ctx, td.taskRun, td.pod); td.expectingError && err == nil {
```
The parameter order is incorrect here. Based on the function signature change in the main file, this should be RecordPodLatency(ctx, td.taskRun, td.pod), but the original code was RecordPodLatency(ctx, td.pod, td.taskRun). The test should match the actual function signature.
```go
	r.mutex.Lock()
	defer r.mutex.Unlock()

	trs, err := lister.List(labels.Everything())
	if err != nil {
		return err
	}
```
The mutex is locked at the beginning of the function but the lister.List()
call could potentially be expensive and doesn't need to be protected by the mutex. Consider moving the mutex lock after the list operation to reduce lock contention.
Suggested change:

```go
	trs, err := lister.List(labels.Everything())
	if err != nil {
		return err
	}
	r.mutex.Lock()
	defer r.mutex.Unlock()
```
```go
	r.mutex.Lock()
	defer r.mutex.Unlock()

	prs, err := lister.List(labels.Everything())
```
Similar to the TaskRun metrics, the expensive lister.List()
operation is being performed while holding the mutex lock. This could cause unnecessary blocking and should be moved outside the critical section.
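Both mutex comments ask for the same reordering; a small sketch of the pattern follows, using a stand-in lister interface since the generated Tekton listers are not shown in this hunk (type and field names are illustrative):

```go
package metricsexample

import (
	"sync"

	"k8s.io/apimachinery/pkg/labels"
)

// runLister stands in for the generated PipelineRun/TaskRun lister.
type runLister interface {
	List(selector labels.Selector) ([]runInfo, error)
}

type runInfo struct{ done bool }

type recorder struct {
	mutex        sync.Mutex
	runningCount int
}

// countRunning does the potentially expensive informer-cache read before
// taking the lock, then locks only while mutating recorder state.
func (r *recorder) countRunning(lister runLister) error {
	runs, err := lister.List(labels.Everything()) // no lock held here
	if err != nil {
		return err
	}
	running := 0
	for _, run := range runs {
		if !run.done {
			running++
		}
	}

	r.mutex.Lock()
	defer r.mutex.Unlock()
	r.runningCount = running // metric emission omitted; see the gauge discussion later in this review
	return nil
}
```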
19e4c65 to 103b041 (Compare)
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
vendor/github.com/prometheus/procfs/Makefile.common (1)
284-289: Pin govulncheck for reproducibility.

install-govulncheck pulls golang.org/x/vuln/cmd/govulncheck@latest, so every new upstream release changes tool behavior under us. This breaks the reproducible-toolchain guarantee the rest of the Makefile provides (note how promu and golangci-lint are version-pinned). Please lock govulncheck to a specific tag and update intentionally when needed.

vendor/github.com/gorilla/websocket/client.go (1)
303-323: Don't reuse TLSClientConfig for proxy handshakes

When the proxy URL is HTTPS we now clone d.TLSClientConfig both while establishing the tunnel (via netDialFromURL) and again here for the post-CONNECT TLS. In many deployments TLSClientConfig.RootCAs is trimmed down to only trust the backend service; reusing that same config for the proxy TLS session causes the CONNECT step to fail with "x509: unknown authority" (or forces operators to add the proxy's CA, widening trust). This is a regression compared to the previous HTTP-proxy path where the custom CA bundle only applied to the origin server. We need a separate (or default) TLS config for the proxy leg so that backend-specific settings don't break HTTPS proxies. (go.dev)

vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go (1)
473-491: Error messages should list the actually offered formats, not the defaults; unify wording.

When zstd is requested but unavailable (or an unknown encoding is selected), the error currently prints defaultCompressionFormats(), which may differ from the user-provided OfferedCompressions. Also the message mixes "not implemented yet" and "not recognized". Prefer "not supported" and show the current offers.

```diff
 import (
 	"compress/gzip"
 	"errors"
 	"fmt"
 	"io"
 	"net/http"
 	"strconv"
+	"strings"
 	"sync"
 	"time"
 )
@@
 	switch selected {
 	case "zstd":
 		if internal.NewZstdWriter == nil {
-			// The content encoding was not implemented yet.
-			return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+			// The requested encoding is not supported in this build.
+			return nil, "", func() {}, fmt.Errorf("content compression format not supported: %s. Valid formats are: %s", selected, strings.Join(compressions, ", "))
 		}
 		writer, closeWriter, err := internal.NewZstdWriter(rw)
 		return writer, selected, closeWriter, err
@@
 	default:
-		// The content encoding was not implemented yet.
-		return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+		// The requested encoding is not supported.
+		return nil, "", func() {}, fmt.Errorf("content compression format not supported: %s. Valid formats are: %s", selected, strings.Join(compressions, ", "))
 	}
```
🧹 Nitpick comments (11)
vendor/github.com/prometheus/procfs/internal/util/sysreadfile.go (1)
60-60: Trim once to avoid redundant work.

SysReadFile already returns a trimmed string, so the extra strings.TrimSpace(string(data)) re-trims the same value. Dropping the second trim keeps the helpers lean without changing behavior.

```diff
-	return strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
+	return strconv.ParseUint(data, 10, 64)
```

```diff
-	return strconv.ParseInt(strings.TrimSpace(string(data)), 10, 64)
+	return strconv.ParseInt(data, 10, 64)
```

Also applies to: 69-69
vendor/github.com/go-logr/zapr/README.md (1)
39-62: Add Go language hint and swap hard tabs for spaces in slog example.

The slog example still triggers markdownlint (missing fenced language and hard tabs). Let's tag the block as Go and reindent with spaces so CI stays clean.

````diff
-```
+```go
 package main
 
 import (
-	"fmt"
-	"log/slog"
-
-	"github.com/go-logr/logr/slogr"
-	"github.com/go-logr/zapr"
-	"go.uber.org/zap"
+    "fmt"
+    "log/slog"
+
+    "github.com/go-logr/logr/slogr"
+    "github.com/go-logr/zapr"
+    "go.uber.org/zap"
 )
 
 func main() {
-	var log *slog.Logger
+    var log *slog.Logger
 
-	zapLog, err := zap.NewDevelopment()
-	if err != nil {
-		panic(fmt.Sprintf("who watches the watchmen (%v)?", err))
-	}
-	log = slog.New(slogr.NewSlogHandler(zapr.NewLogger(zapLog)))
+    zapLog, err := zap.NewDevelopment()
+    if err != nil {
+        panic(fmt.Sprintf("who watches the watchmen (%v)?", err))
+    }
+    log = slog.New(slogr.NewSlogHandler(zapr.NewLogger(zapLog)))
 
-	log.Info("Logr in action!", "the answer", 42)
+    log.Info("Logr in action!", "the answer", 42)
 }
````

pkg/pipelinerunmetrics/metrics_test.go (1)
220-247: Trim unused expectations from the table.

expectedPipelineTag, expectedPipelinerunTag, expectedNamespaceTag, and expectedCount are no longer referenced after the OTEL refactor, so they now act as dead weight in the table definition. Dropping them keeps the test intent obvious and avoids implying we still verify those values.

```diff
-		expectedPipelineTag    string
-		expectedPipelinerunTag string
-		expectedNamespaceTag   string
-		expectedCount          int64
```

vendor/github.com/prometheus/client_golang/prometheus/promhttp/internal/compression.go (1)
20-21: Document the NewZstdWriter contract (nil semantics, lifecycle) and consider a named factory type.

Make it explicit that it is nil unless zstd is compiled in, and that callers must always invoke closeWriter(). A small named type also improves readability.

```diff
 package internal
 
 import (
 	"io"
 )
 
-// NewZstdWriter enables zstd write support if non-nil.
-var NewZstdWriter func(rw io.Writer) (_ io.Writer, closeWriter func(), _ error)
+// ZstdWriterFactory returns a writer that applies Zstandard compression to rw.
+// The returned closeWriter must be called exactly once to flush and release resources.
+type ZstdWriterFactory func(rw io.Writer) (_ io.Writer, closeWriter func(), _ error)
+
+// NewZstdWriter is nil unless zstd support is built in (set by platform/tag-specific files).
+// When non-nil, use it to wrap the response writer; always call the returned closeWriter().
+var NewZstdWriter ZstdWriterFactory
```

vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go (4)
68-74: defaultCompressionFormats: clarify zstd availability in doc or name.

Since zstd is included only when internal.NewZstdWriter != nil, add a short comment to reflect that to avoid confusion at call sites.

```diff
-func defaultCompressionFormats() []Compression {
+// defaultCompressionFormats returns formats supported by this build (zstd only if available).
+func defaultCompressionFormats() []Compression {
```
145-154: Filter out "zstd" from OfferedCompressions when zstd is unavailable.

If users configure OfferedCompressions to include zstd while internal.NewZstdWriter is nil, negotiation will pick zstd, then we error and fall back to identity. Proactively filtering avoids noisy logs and an error path in normal operation.

```diff
-	if !opts.DisableCompression {
-		offers := defaultCompressionFormats()
-		if len(opts.OfferedCompressions) > 0 {
-			offers = opts.OfferedCompressions
-		}
-		for _, comp := range offers {
-			compressions = append(compressions, string(comp))
-		}
-	}
+	if !opts.DisableCompression {
+		offers := defaultCompressionFormats()
+		if len(opts.OfferedCompressions) > 0 {
+			offers = opts.OfferedCompressions
+		}
+		for _, comp := range offers {
+			if comp == Zstd && internal.NewZstdWriter == nil {
+				// Skip zstd if the factory is not available in this build.
+				continue
+			}
+			compressions = append(compressions, string(comp))
+		}
+	}
```
391-398: Doc nit: default OfferedCompressions now depends on zstd availability.

Comment says "defaults to identity, gzip and zstd," but zstd is only available when the factory is present. Suggest reflecting this to avoid surprising users.

```diff
-	// OfferedCompressions is a set of encodings (compressions) handler will
-	// try to offer when negotiating with the client. This defaults to identity, gzip
-	// and zstd.
+	// OfferedCompressions is a set of encodings (compressions) the handler will
+	// try to offer when negotiating with the client. Defaults to identity and gzip,
+	// and zstd only when this build includes zstd support.
```
423-437: New option: add brief operational guidance.

Consider noting that enabling created samples can increase series cardinality and scrape size; recommend enabling only with scrapers that support zero-ingestion of created timestamps (e.g., Prometheus >= 2.50 with the feature flag).
vendor/github.com/grafana/regexp/exec.go (1)
275-279: Panics on unexpected opcodes lack context.

panic("bad inst") / "unhandled" provide little diagnostics in production crashes. Consider enriching the panic with opcode/pc (or guard with build tags for debug-only). Given vendoring, this can be deferred.
Also applies to: 335-337, 448-450
vendor/github.com/grafana/regexp/regexp.go (2)
1261-1283: Avoid shadowing the strings package in Split().

Local variable named strings shadows the imported package, reducing readability. Rename to parts.

```diff
-	strings := make([]string, 0, len(matches))
+	parts := make([]string, 0, len(matches))
@@
-		if n > 0 && len(strings) >= n-1 {
+		if n > 0 && len(parts) >= n-1 {
 			break
 		}
@@
-		if match[1] != 0 {
-			strings = append(strings, s[beg:end])
+		if match[1] != 0 {
+			parts = append(parts, s[beg:end])
 		}
 		beg = match[1]
@@
-	if end != len(s) {
-		strings = append(strings, s[beg:])
+	if end != len(s) {
+		parts = append(parts, s[beg:])
 	}
-	return strings
+	return parts
```
989-996: Prefer a non-reserved identifier instead of a 'rune' variable.

Using rune as a variable name shadows the widely-used type alias, hurting readability. Rename to r and use r, size := utf8.DecodeRuneInString(...).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (8)
go.sum
is excluded by!**/*.sum
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service.pb.go
is excluded by!**/*.pb.go
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service.pb.gw.go
is excluded by!**/*.pb.gw.go
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service_grpc.pb.go
is excluded by!**/*.pb.go
vendor/go.opentelemetry.io/proto/otlp/metrics/v1/metrics.pb.go
is excluded by!**/*.pb.go
vendor/k8s.io/api/apps/v1/generated.pb.go
is excluded by!**/*.pb.go
vendor/k8s.io/api/apps/v1beta1/generated.pb.go
is excluded by!**/*.pb.go
vendor/k8s.io/api/apps/v1beta2/generated.pb.go
is excluded by!**/*.pb.go
📒 Files selected for processing (107)
cmd/webhook/main.go
(0 hunks)config/config-observability.yaml
(2 hunks)go.mod
(8 hunks)pkg/pipelinerunmetrics/metrics.go
(3 hunks)pkg/pipelinerunmetrics/metrics_test.go
(5 hunks)pkg/reconciler/pipelinerun/pipelinerun.go
(1 hunks)pkg/taskrunmetrics/metrics.go
(7 hunks)pkg/taskrunmetrics/metrics_test.go
(9 hunks)vendor/github.com/go-logr/logr/slogr/slogr.go
(1 hunks)vendor/github.com/go-logr/zapr/.gitignore
(1 hunks)vendor/github.com/go-logr/zapr/.golangci.yaml
(1 hunks)vendor/github.com/go-logr/zapr/LICENSE
(1 hunks)vendor/github.com/go-logr/zapr/README.md
(1 hunks)vendor/github.com/go-logr/zapr/slogzapr.go
(1 hunks)vendor/github.com/go-logr/zapr/zapr.go
(1 hunks)vendor/github.com/go-logr/zapr/zapr_noslog.go
(1 hunks)vendor/github.com/go-logr/zapr/zapr_slog.go
(1 hunks)vendor/github.com/golang/protobuf/proto/buffer.go
(0 hunks)vendor/github.com/golang/protobuf/proto/defaults.go
(0 hunks)vendor/github.com/golang/protobuf/proto/deprecated.go
(0 hunks)vendor/github.com/golang/protobuf/proto/discard.go
(0 hunks)vendor/github.com/golang/protobuf/proto/extensions.go
(0 hunks)vendor/github.com/golang/protobuf/proto/properties.go
(0 hunks)vendor/github.com/golang/protobuf/proto/proto.go
(0 hunks)vendor/github.com/golang/protobuf/proto/registry.go
(0 hunks)vendor/github.com/golang/protobuf/proto/text_decode.go
(0 hunks)vendor/github.com/golang/protobuf/proto/text_encode.go
(0 hunks)vendor/github.com/golang/protobuf/proto/wire.go
(0 hunks)vendor/github.com/golang/protobuf/proto/wrappers.go
(0 hunks)vendor/github.com/google/gofuzz/.travis.yml
(0 hunks)vendor/github.com/google/gofuzz/CONTRIBUTING.md
(0 hunks)vendor/github.com/google/gofuzz/fuzz.go
(0 hunks)vendor/github.com/gorilla/websocket/README.md
(2 hunks)vendor/github.com/gorilla/websocket/client.go
(7 hunks)vendor/github.com/gorilla/websocket/compression.go
(1 hunks)vendor/github.com/gorilla/websocket/conn.go
(19 hunks)vendor/github.com/gorilla/websocket/proxy.go
(3 hunks)vendor/github.com/gorilla/websocket/server.go
(6 hunks)vendor/github.com/gorilla/websocket/tls_handshake.go
(0 hunks)vendor/github.com/gorilla/websocket/tls_handshake_116.go
(0 hunks)vendor/github.com/gorilla/websocket/x_net_proxy.go
(0 hunks)vendor/github.com/grafana/regexp/.gitignore
(1 hunks)vendor/github.com/grafana/regexp/LICENSE
(1 hunks)vendor/github.com/grafana/regexp/README.md
(1 hunks)vendor/github.com/grafana/regexp/backtrack.go
(1 hunks)vendor/github.com/grafana/regexp/exec.go
(1 hunks)vendor/github.com/grafana/regexp/onepass.go
(1 hunks)vendor/github.com/grafana/regexp/regexp.go
(1 hunks)vendor/github.com/openzipkin/zipkin-go/LICENSE
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/annotation.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/doc.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/endpoint.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/kind.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/span.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/span_id.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/traceid.go
(0 hunks)vendor/github.com/prometheus/client_golang/prometheus/collectorfunc.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/desc.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/go_collector_latest.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/histogram.go
(6 hunks)vendor/github.com/prometheus/client_golang/prometheus/internal/difflib.go
(6 hunks)vendor/github.com/prometheus/client_golang/prometheus/internal/go_runtime_metrics.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/metric.go
(3 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector.go
(2 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.c
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_nocgo_darwin.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_not_supported.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_procfsenabled.go
(2 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_windows.go
(2 hunks)vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go
(7 hunks)vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/promhttp/internal/compression.go
(2 hunks)vendor/github.com/prometheus/client_golang/prometheus/summary.go
(4 hunks)vendor/github.com/prometheus/client_golang/prometheus/vec.go
(5 hunks)vendor/github.com/prometheus/client_golang/prometheus/wrap.go
(2 hunks)vendor/github.com/prometheus/common/expfmt/text_parse.go
(1 hunks)vendor/github.com/prometheus/common/model/alert.go
(1 hunks)vendor/github.com/prometheus/common/model/labels.go
(2 hunks)vendor/github.com/prometheus/common/model/metric.go
(2 hunks)vendor/github.com/prometheus/common/model/time.go
(3 hunks)vendor/github.com/prometheus/otlptranslator/.gitignore
(1 hunks)vendor/github.com/prometheus/otlptranslator/.golangci.yml
(1 hunks)vendor/github.com/prometheus/otlptranslator/CODE_OF_CONDUCT.md
(1 hunks)vendor/github.com/prometheus/otlptranslator/LICENSE
(0 hunks)vendor/github.com/prometheus/otlptranslator/MAINTAINERS.md
(1 hunks)vendor/github.com/prometheus/otlptranslator/README.md
(1 hunks)vendor/github.com/prometheus/otlptranslator/SECURITY.md
(1 hunks)vendor/github.com/prometheus/otlptranslator/constants.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/metric_namer.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/metric_type.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/normalize_label.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/strconv.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/unit_namer.go
(1 hunks)vendor/github.com/prometheus/procfs/.golangci.yml
(1 hunks)vendor/github.com/prometheus/procfs/Makefile.common
(4 hunks)vendor/github.com/prometheus/procfs/README.md
(1 hunks)vendor/github.com/prometheus/procfs/arp.go
(1 hunks)vendor/github.com/prometheus/procfs/fs.go
(1 hunks)vendor/github.com/prometheus/procfs/fs_statfs_notype.go
(1 hunks)vendor/github.com/prometheus/procfs/fscache.go
(2 hunks)vendor/github.com/prometheus/procfs/internal/fs/fs.go
(1 hunks)vendor/github.com/prometheus/procfs/internal/util/parse.go
(2 hunks)vendor/github.com/prometheus/procfs/internal/util/sysreadfile.go
(2 hunks)vendor/github.com/prometheus/procfs/mdstat.go
(1 hunks)vendor/github.com/prometheus/procfs/meminfo.go
(11 hunks)
⛔ Files not processed due to max files limit (41)
- vendor/github.com/prometheus/procfs/mountstats.go
- vendor/github.com/prometheus/procfs/net_dev_snmp6.go
- vendor/github.com/prometheus/procfs/net_ip_socket.go
- vendor/github.com/prometheus/procfs/net_protocols.go
- vendor/github.com/prometheus/procfs/net_tcp.go
- vendor/github.com/prometheus/procfs/net_unix.go
- vendor/github.com/prometheus/procfs/proc.go
- vendor/github.com/prometheus/procfs/proc_cgroup.go
- vendor/github.com/prometheus/procfs/proc_io.go
- vendor/github.com/prometheus/procfs/proc_netstat.go
- vendor/github.com/prometheus/procfs/proc_smaps.go
- vendor/github.com/prometheus/procfs/proc_snmp.go
- vendor/github.com/prometheus/procfs/proc_snmp6.go
- vendor/github.com/prometheus/procfs/proc_stat.go
- vendor/github.com/prometheus/procfs/proc_statm.go
- vendor/github.com/prometheus/procfs/proc_status.go
- vendor/github.com/prometheus/procfs/proc_sys.go
- vendor/github.com/prometheus/procfs/softirqs.go
- vendor/go.opencensus.io/plugin/ochttp/client.go
- vendor/go.opencensus.io/plugin/ochttp/client_stats.go
- vendor/go.opencensus.io/plugin/ochttp/doc.go
- vendor/go.opencensus.io/plugin/ochttp/propagation/b3/b3.go
- vendor/go.opencensus.io/plugin/ochttp/propagation/tracecontext/propagation.go
- vendor/go.opencensus.io/plugin/ochttp/route.go
- vendor/go.opencensus.io/plugin/ochttp/server.go
- vendor/go.opencensus.io/plugin/ochttp/span_annotating_client_trace.go
- vendor/go.opencensus.io/plugin/ochttp/stats.go
- vendor/go.opencensus.io/plugin/ochttp/trace.go
- vendor/go.opencensus.io/plugin/ochttp/wrapped_body.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconv/env.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconv/gen.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconv/httpconv.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconv/util.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/transport.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/version.go
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/LICENSE
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/doc.go
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/internal/deprecatedruntime/doc.go
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/internal/deprecatedruntime/runtime.go
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/internal/x/README.md
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/internal/x/x.go
💤 Files with no reviewable changes (28)
- vendor/github.com/golang/protobuf/proto/discard.go
- vendor/github.com/prometheus/otlptranslator/LICENSE
- vendor/github.com/openzipkin/zipkin-go/model/span_id.go
- vendor/github.com/openzipkin/zipkin-go/model/traceid.go
- vendor/github.com/openzipkin/zipkin-go/model/annotation.go
- vendor/github.com/google/gofuzz/CONTRIBUTING.md
- vendor/github.com/openzipkin/zipkin-go/model/span.go
- vendor/github.com/golang/protobuf/proto/deprecated.go
- vendor/github.com/golang/protobuf/proto/wrappers.go
- cmd/webhook/main.go
- vendor/github.com/openzipkin/zipkin-go/LICENSE
- vendor/github.com/golang/protobuf/proto/proto.go
- vendor/github.com/gorilla/websocket/tls_handshake_116.go
- vendor/github.com/golang/protobuf/proto/text_encode.go
- vendor/github.com/openzipkin/zipkin-go/model/kind.go
- vendor/github.com/gorilla/websocket/tls_handshake.go
- vendor/github.com/golang/protobuf/proto/registry.go
- vendor/github.com/openzipkin/zipkin-go/model/endpoint.go
- vendor/github.com/golang/protobuf/proto/extensions.go
- vendor/github.com/golang/protobuf/proto/text_decode.go
- vendor/github.com/golang/protobuf/proto/defaults.go
- vendor/github.com/golang/protobuf/proto/properties.go
- vendor/github.com/openzipkin/zipkin-go/model/doc.go
- vendor/github.com/golang/protobuf/proto/wire.go
- vendor/github.com/gorilla/websocket/x_net_proxy.go
- vendor/github.com/google/gofuzz/.travis.yml
- vendor/github.com/google/gofuzz/fuzz.go
- vendor/github.com/golang/protobuf/proto/buffer.go
🧰 Additional context used
🧬 Code graph analysis (6)
vendor/github.com/prometheus/client_golang/prometheus/metric.go (1)
vendor/github.com/prometheus/client_golang/prometheus/histogram.go (1)
Histogram
(249-261)
pkg/pipelinerunmetrics/metrics.go (3)
pkg/apis/pipeline/v1/pipelinerun_types.go (3)
PipelineRunReasonCancelled
(345-345)PipelineRun
(51-60)PipelineRun
(73-75)pkg/apis/config/metrics.go (1)
Metrics
(101-109)pkg/taskrunmetrics/metrics.go (3)
Recorder
(46-65)NewRecorder
(74-91)OnStore
(191-207)
vendor/github.com/prometheus/client_golang/prometheus/histogram.go (1)
vendor/github.com/prometheus/client_golang/prometheus/metric.go (1)
Metric
(33-59)
pkg/taskrunmetrics/metrics_test.go (2)
pkg/pipelinerunmetrics/metrics.go (2)
Recorder
(47-63)NewRecorder
(72-88)pkg/taskrunmetrics/metrics.go (2)
Recorder
(46-65)NewRecorder
(74-91)
pkg/taskrunmetrics/metrics.go (4)
pkg/apis/config/metrics.go (7)
Metrics
(101-109)TaskrunLevelAtTaskrun
(52-52)TaskrunLevelAtTask
(54-54)TaskrunLevelAtNS
(56-56)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/pipelinerunmetrics/metrics.go (3)
Recorder
(47-63)NewRecorder
(72-88)OnStore
(312-328)pkg/apis/pipeline/register.go (4)
PipelineTaskLabelKey
(47-47)TaskLabelKey
(29-29)PipelineLabelKey
(38-38)PipelineRunLabelKey
(41-41)pkg/pod/status.go (2)
ReasonExceededResourceQuota
(60-60)ReasonExceededNodeResources
(64-64)
pkg/pipelinerunmetrics/metrics_test.go (2)
pkg/pipelinerunmetrics/metrics.go (1)
NewRecorder
(72-88)pkg/apis/config/metrics.go (3)
PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)
🪛 ast-grep (0.39.5)
vendor/github.com/gorilla/websocket/client.go
[warning] 501-501: MinVersion is missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported and insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. Add MinVersion: tls.VersionTLS13 to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures
(missing-ssl-minversion-go)
🪛 markdownlint-cli2 (0.18.1)
vendor/github.com/go-logr/zapr/README.md
39-39: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
43-43: Hard tabs
Column: 1
(MD010, no-hard-tabs)
44-44: Hard tabs
Column: 1
(MD010, no-hard-tabs)
46-46: Hard tabs
Column: 1
(MD010, no-hard-tabs)
47-47: Hard tabs
Column: 1
(MD010, no-hard-tabs)
48-48: Hard tabs
Column: 1
(MD010, no-hard-tabs)
52-52: Hard tabs
Column: 1
(MD010, no-hard-tabs)
54-54: Hard tabs
Column: 1
(MD010, no-hard-tabs)
55-55: Hard tabs
Column: 1
(MD010, no-hard-tabs)
56-56: Hard tabs
Column: 1
(MD010, no-hard-tabs)
57-57: Hard tabs
Column: 1
(MD010, no-hard-tabs)
58-58: Hard tabs
Column: 1
(MD010, no-hard-tabs)
60-60: Hard tabs
Column: 1
(MD010, no-hard-tabs)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (27)
vendor/github.com/prometheus/procfs/README.md (1)
50-61
: Documentation update looks correct.The refreshed fixture paths now point to
testdata/fixtures
, matching the relocated test assets and keeping the workflow instructions consistent.vendor/github.com/prometheus/common/model/alert.go (1)
68-69
: Comment clarification looks good.The updated wording makes the ResolvedAt contract clearer without changing behavior.
vendor/github.com/prometheus/otlptranslator/.gitignore (1)
1-25
: Gitignore patterns look goodMatches the standard Go template; no concerns here.
vendor/github.com/prometheus/procfs/fscache.go (1)
165-179
: Comment spelling aligns with US English.Thanks for standardizing the comment spelling. No functional impact, and this keeps terminology consistent.
vendor/github.com/prometheus/client_golang/prometheus/desc.go (1)
192-200
: Nil-guard tightensDesc.String()
safetyThe extra guard keeps us from dereferencing
d.variableLabels
when it’s nil, soString()
can no longer panic for descriptors without variable labels. Nice defensive hardening.vendor/github.com/prometheus/client_golang/prometheus/go_collector_latest.go (1)
291-292
: Help text punctuation looks goodAppending the trailing period gives the help string a tidy, sentence-like ending without affecting behavior. 👍
vendor/github.com/prometheus/common/expfmt/text_parse.go (1)
348-349
: Logic equivalence preserved.Rewriting the guard with De Morgan’s law keeps the filtering behavior identical while improving readability. Looks good.
vendor/github.com/prometheus/client_golang/prometheus/vec.go (1)
82-83
: Embedded forwarding reads cleanlySwitching to the promoted metricMap helpers keeps the behavior identical while dropping the extra field hop, so MetricVec still exposes Delete* and GetMetricWith* exactly as documented. (pkg.go.dev)
Also applies to: 104-105, 117-118, 219-220, 247-248
vendor/github.com/prometheus/procfs/mdstat.go (1)
126-135
: Recognition of reshape state looks good.Adding the reshape detection keeps the parser aligned with the kernel’s mdstat output and preserves the existing state precedence logic.
vendor/github.com/prometheus/procfs/Makefile.common (2)
160-166
: Do not hard-fail when golangci-lint is skipped.
common-format
now always depends on$(GOLANGCI_LINT)
. WhenSKIP_GOLANGCI_LINT
or CI detection blanks this variable, the dependency list collapses to empty, so the target still runs. However, inside the recipe we call$(GOLANGCI_LINT) fmt
only if the variable is non-empty—good. No action required.
256-259
: Target rename is compatible.Defining
common-proto
keepsmake proto
working thanks to the%: common-%
forwarding rule earlier in the file. The new name also matches the convention used throughout this Makefile. Looks good.vendor/github.com/prometheus/client_golang/prometheus/summary.go (1)
229-247
: Time source injection keeps default behavior intact.Defaulting
opts.now
totime.Now
preserves prior semantics, and persisting it on the struct gives tests a clean hook for deterministic timing. Looks solid.pkg/taskrunmetrics/metrics.go (1)
318-321: Release the mutex before calling the lister.

Locking r.mutex while lister.List(...) walks the informer cache needlessly serializes every caller and keeps the metrics instruments/config locked through a potentially heavy read. Please keep the existing pattern from the earlier review: fetch trs first, then lock just long enough to update the recorder state. (Same concern was flagged previously.)

pkg/taskrunmetrics/metrics_test.go (2)
256-260: Fix RecordPodLatency argument order

Same issue as above: the test should call RecordPodLatency with the TaskRun before the Pod. Leaving it as-is misorders the parameters and breaks compilation.

```diff
-	if err := metrics.RecordPodLatency(ctx, td.pod, td.taskRun); td.expectingError && err == nil {
+	if err := metrics.RecordPodLatency(ctx, td.taskRun, td.pod); td.expectingError && err == nil {
```
72-73: Fix RecordPodLatency argument order

Recorder.RecordPodLatency takes (ctx, taskRun, pod). Passing the pod first violates the signature and will not compile once wired up against the implementation. Swap the arguments here.

```diff
-	if err := r.RecordPodLatency(ctx, &corev1.Pod{}, &v1.TaskRun{}); err == nil {
+	if err := r.RecordPodLatency(ctx, &v1.TaskRun{}, &corev1.Pod{}); err == nil {
```

pkg/pipelinerunmetrics/metrics.go (1)
205-218: Keep lister.List() outside the critical section

This mirrors the earlier review feedback: we still invoke lister.List(labels.Everything()) while holding r.mutex, which can block other recorders waiting on the same lock while we hit the informer cache. Please gather the list before taking the mutex and only lock while updating recorder state.

vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_nocgo_darwin.go (1)
18-20
: Graceful no-cgo fallback confirmed.Returning
errNotImplemented
here cleanly lines up with theprocessCollect
guard, so Darwin builds without cgo skip the memory gauges without spurious errors.vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.go (1)
24-32
: Cgo-backed memory collection looks solid.Bridging through
get_memory_info
and normalizing intomemoryInfo
aligns with the caller’s expectations, giving accurate RSS/VSize data when cgo is available.vendor/github.com/prometheus/otlptranslator/.golangci.yml (1)
17-37
: nilnesserr is supported by the CI’s golangci-lint version
CI installs golangci-lint v2.1.6 (via the GitHub Actionversion: v2.1.6
), which includes thenilnesserr
linter, so no version bump or hold-off is needed.vendor/github.com/prometheus/otlptranslator/metric_type.go (1)
21-35
: Metric type enumeration aligns with OTel model.The exported constants mirror the OpenTelemetry metric kinds and look good.
vendor/github.com/gorilla/websocket/compression.go (1)
36-40
: Pooling fallback looks solid.The extra defensive Reset handling keeps the pooled reader usable without risking panics if the contract changes; nice catch.
vendor/github.com/gorilla/websocket/conn.go (1)
183-193
: Mask key generation upgrade looks good.Switching to
crypto/rand
for mask generation is a nice improvement for compliance without affecting tests thanks to the overridable reader.vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go (2)
201-212
: Minor: ensure closeWriter is always set to a no-op on error.negotiateEncodingWriter already returns a no-op closer on error, so this is safe. No action needed; just calling out that this invariant is relied upon.
If you prefer defensive code, you can set closeWriter = func(){} in the error branch.
217-222
: No gating required: WithCreatedLines exists and is no-op outside OpenMetrics formats
Vendored expfmt defines WithCreatedLines (vendor/github.com/prometheus/common/expfmt/openmetrics_create.go:49) and its encode logic ignores extra options for non-OpenMetrics formats (vendor/github.com/prometheus/common/expfmt/encode.go:148–149).vendor/github.com/grafana/regexp/exec.go (2)
333-374
: Sparse queue indexing relies on pre-sized slices.add() indexes q.sparse[pc] without bounds checks. This is fine only if q.sparse is sized to len(prog.Inst). get() appears to ensure that, but please keep these coupled: any future changes to machine sizing must preserve this invariant.
521-550
: Verified referenced symbols and LICENSE presenceConfirmed
backtrack
,compileOnePass
,onePassPrefix
,onePassNext
, typeonePassProg
, and theLICENSE
file are all present invendor/github.com/grafana/regexp
. No further action needed.vendor/github.com/grafana/regexp/regexp.go (1)
820-841: Parity check with stdlib regexp (behavioral regressions).

Given this reimplementation mirrors Go's regexp, please run a small parity suite comparing Find*/Split/Replace* against stdlib for representative cases (ASCII/UTF-8, empty matches, prefixes, named groups). This will catch subtle divergences. I can provide scaffolding if helpful.
Also applies to: 843-867, 1028-1073, 1149-1190, 1191-1233
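As a sketch of that scaffolding (patterns and inputs are illustrative, not an exhaustive suite), a parity test comparing the vendored fork against the stdlib could look like:

```go
package regexpparity_test

import (
	stdregexp "regexp"
	"testing"

	grafanaregexp "github.com/grafana/regexp"
)

// TestFindStringIndexParity compares FindStringIndex between the fork and the
// stdlib for matches that start after index 0 or follow a literal prefix.
func TestFindStringIndexParity(t *testing.T) {
	cases := []struct{ pattern, input string }{
		{`foo\d+`, "xx foo42 yy"}, // match starts after index 0
		{`abc(x|y)`, "zzabcy"},    // literal prefix skipping
		{`\p{L}+`, "123 héllo"},   // UTF-8 input
		{`a*`, "bbb"},             // empty match
	}
	for _, c := range cases {
		want := stdregexp.MustCompile(c.pattern).FindStringIndex(c.input)
		got := grafanaregexp.MustCompile(c.pattern).FindStringIndex(c.input)
		if (want == nil) != (got == nil) || (want != nil && (want[0] != got[0] || want[1] != got[1])) {
			t.Errorf("pattern %q on %q: stdlib=%v fork=%v", c.pattern, c.input, want, got)
		}
	}
}
```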
```go
	r.runningTRsGauge.Add(ctx, float64(runningTrs))
	r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota))
	r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode))
```
Fix running TaskRun gauges to add deltas, not totals.
Float64UpDownCounter.Add
expects per-call deltas. Adding the full count on every scrape makes the exported value soar cumulatively (e.g., 5 running TaskRuns sampled twice yields 10). That breaks the “current number running” metric and any alerts derived from it. Please either switch to an asynchronous gauge/observable callback or maintain the previous sample and add only the delta before updating the cached value. For example:
type Recorder struct {
runningTRsThrottledByNodeGauge metric.Float64UpDownCounter
+ lastRunningTRs int
+ lastTrsThrottledByQuota int
+ lastTrsThrottledByNode int
}
@@
r.podLatencyHistogram = podLatencyHistogram
+ r.lastRunningTRs = 0
+ r.lastTrsThrottledByQuota = 0
+ r.lastTrsThrottledByNode = 0
@@
- r.runningTRsGauge.Add(ctx, float64(runningTrs))
- r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota))
- r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode))
+ r.runningTRsGauge.Add(ctx, float64(runningTrs-r.lastRunningTRs))
+ r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota-r.lastTrsThrottledByQuota))
+ r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode-r.lastTrsThrottledByNode))
+
+ r.lastRunningTRs = runningTrs
+ r.lastTrsThrottledByQuota = trsThrottledByQuota
+ r.lastTrsThrottledByNode = trsThrottledByNode
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:

```go
	r.runningTRsGauge.Add(ctx, float64(runningTrs))
	r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota))
	r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode))
```

Suggested change:

```go
// pkg/taskrunmetrics/metrics.go
type Recorder struct {
	// existing fields...
	runningTRsGauge                 metric.Float64UpDownCounter
	runningTRsThrottledByQuotaGauge metric.Float64UpDownCounter
	runningTRsThrottledByNodeGauge  metric.Float64UpDownCounter
	lastRunningTRs                  int
	lastTrsThrottledByQuota         int
	lastTrsThrottledByNode          int
	// other fields...
}

func NewRecorder(/* params */) *Recorder {
	r := &Recorder{
		// existing initializations...
		runningTRsGauge:                 /* ... */,
		runningTRsThrottledByQuotaGauge: /* ... */,
		runningTRsThrottledByNodeGauge:  /* ... */,
		podLatencyHistogram:             podLatencyHistogram,
		lastRunningTRs:                  0,
		lastTrsThrottledByQuota:         0,
		lastTrsThrottledByNode:          0,
	}
	return r
}

func (r *Recorder) Record(ctx context.Context, /* params */) error {
	// ... prior code ...
	r.runningTRsGauge.Add(ctx, float64(runningTrs-r.lastRunningTRs))
	r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota-r.lastTrsThrottledByQuota))
	r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode-r.lastTrsThrottledByNode))
	r.lastRunningTRs = runningTrs
	r.lastTrsThrottledByQuota = trsThrottledByQuota
	r.lastTrsThrottledByNode = trsThrottledByNode
	// ... remaining code ...
}
```
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics.go around lines 344-347, the code is calling
Float64UpDownCounter.Add with the full totals which results in cumulative
inflation; change this to emit deltas instead: either replace these counters
with asynchronous observable gauges (register callbacks that return the current
running/throttled counts) or keep a small cached previous value per metric and
call Add(ctx, current-prev) then update the cache; implement one of these two
approaches and remove the existing direct Add(ctx, float64(total)) calls so only
the per-scrape delta is added.
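For comparison, here is a minimal sketch of the first option named above: an asynchronous observable gauge whose callback reports the current snapshot, so no delta bookkeeping is needed. The instrument name, scope name, and field names are illustrative assumptions, not the PR's actual code.

```go
package metricsexample

import (
	"context"
	"sync/atomic"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/metric"
)

type recorder struct {
	runningTRs atomic.Int64 // updated by the periodic lister scan
}

func newRecorder() (*recorder, error) {
	r := &recorder{}
	meter := otel.Meter("taskrun-metrics") // scope name is illustrative

	gauge, err := meter.Int64ObservableGauge(
		"running_taskruns",
		metric.WithDescription("Number of currently running TaskRuns"),
	)
	if err != nil {
		return nil, err
	}
	// The SDK invokes this callback on each collection; we observe the latest
	// snapshot instead of calling Add with running totals.
	if _, err := meter.RegisterCallback(func(_ context.Context, o metric.Observer) error {
		o.ObserveInt64(gauge, r.runningTRs.Load())
		return nil
	}, gauge); err != nil {
		return nil, err
	}
	return r, nil
}

// setRunning is called from the lister scan with the current total.
func (r *recorder) setRunning(n int64) { r.runningTRs.Store(n) }
```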
```go
		fields = append(fields, zapIt(keyStr, val))
		i += 2
	}
```
Fix go1.21 build: provide a zapIt implementation.

Under Go 1.21, zapr_noslog.go is excluded, so there is no definition of zapIt. This line and other call sites leave zapr.go uncompilable on the very toolchain that introduces slogzapr.go. Please add a go1.21-compatible definition (for example alongside the slog adapter) so the package builds across supported Go versions.
🤖 Prompt for AI Agents
In vendor/github.com/go-logr/zapr/zapr.go around lines 164-166, calls to zapIt
are failing under Go 1.21 because the zapIt implementation in zapr_noslog.go is
excluded; add a go1.21-compatible implementation (either directly in this file
or in a new file without build tags) named zapIt(key string, val interface{})
zap.Field that mirrors expected behavior: convert common types and the
slog.Value wrapper into appropriate zap.Field values (fallback to zap.Any for
unknown types) and preserve existing semantics used by other call sites so the
package compiles under Go 1.21.
```go
func (b *brNetConn) Read(p []byte) (n int, err error) {
	if b.br != nil {
		// Limit read to buferred data.
		if n := b.br.Buffered(); len(p) > n {
			p = p[:n]
		}
		n, err = b.br.Read(p)
		if b.br.Buffered() == 0 {
			b.br = nil
		}
		return n, err
	}
	return b.Conn.Read(p)
```
Avoid returning 0 bytes with nil error.

When br.Buffered() is zero we now return 0, nil even though the caller's buffer length is >0. That violates the io.Reader contract and can make consumers spin forever, depending on how they treat (0, nil) reads. (typeerror.org)

We should drop through to the underlying connection in that case instead of handing back a zero-length read.
func (b *brNetConn) Read(p []byte) (n int, err error) {
if b.br != nil {
- // Limit read to buferred data.
- if n := b.br.Buffered(); len(p) > n {
- p = p[:n]
- }
- n, err = b.br.Read(p)
- if b.br.Buffered() == 0 {
- b.br = nil
- }
- return n, err
+ if buffered := b.br.Buffered(); buffered > 0 {
+ // Limit read to buffered data.
+ if len(p) > buffered {
+ p = p[:buffered]
+ }
+ n, err = b.br.Read(p)
+ if b.br.Buffered() == 0 {
+ b.br = nil
+ }
+ return n, err
+ }
+ b.br = nil
}
return b.Conn.Read(p)
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:

```go
func (b *brNetConn) Read(p []byte) (n int, err error) {
	if b.br != nil {
		// Limit read to buferred data.
		if n := b.br.Buffered(); len(p) > n {
			p = p[:n]
		}
		n, err = b.br.Read(p)
		if b.br.Buffered() == 0 {
			b.br = nil
		}
		return n, err
	}
	return b.Conn.Read(p)
```

Suggested:

```go
func (b *brNetConn) Read(p []byte) (n int, err error) {
	if b.br != nil {
		// Only read from the buffer if there's data available.
		if buffered := b.br.Buffered(); buffered > 0 {
			// Limit read to buffered data.
			if len(p) > buffered {
				p = p[:buffered]
			}
			n, err = b.br.Read(p)
			if b.br.Buffered() == 0 {
				b.br = nil
			}
			return n, err
		}
		// No buffered data—fall back to the underlying connection.
		b.br = nil
	}
	return b.Conn.Read(p)
}
```
🤖 Prompt for AI Agents
In vendor/github.com/gorilla/websocket/server.go around lines 354 to 366 the
Read method can return (0, nil) when b.br.Buffered() == 0 which violates
io.Reader expectations; change the logic so that if Buffered() is zero you clear
b.br and fall through to call and return from the underlying b.Conn.Read(p)
instead of returning 0,nil; otherwise keep the existing behavior of slicing p to
the buffered size, reading from b.br, and clearing b.br when drained.
```go
			matched = true
			if len(m.matchcap) > 0 {
				m.matchcap[0] = 0
				m.matchcap[1] = pos
			}
			goto Return
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Possible wrong match start index in one-pass path (start set to 0).
When InstMatch fires, m.matchcap[0] is set to 0, not the search start pos. For unanchored scans or after prefix skipping, this can report an incorrect start offset.
Suggest:
- Initialize start once to pos before the loop.
- On InstMatch, only set end.
@@
- matched := false
+ matched := false
for i := range m.matchcap {
m.matchcap[i] = -1
}
@@
- i, _ := m.inputs.init(ir, ib, is)
+ i, _ := m.inputs.init(ir, ib, is)
+ if ncap > 0 {
+ m.matchcap[0] = pos
+ }
@@
- case syntax.InstMatch:
- matched = true
- if len(m.matchcap) > 0 {
- m.matchcap[0] = 0
- m.matchcap[1] = pos
- }
+ case syntax.InstMatch:
+ matched = true
+ if len(m.matchcap) > 0 {
+ m.matchcap[1] = pos
+ }
goto Return
Please add a quick test comparing FindStringIndex/FindIndex results vs stdlib regexp for cases where the match starts after index 0 and where a literal prefix is skipped.
Also applies to: 415-417
🏁 Script executed:
#!/usr/bin/env bash
set -e
echo "=== doOnePass definition & context ==="
rg -n 'func doOnePass' -g 'vendor/github.com/grafana/regexp/exec.go' -C5
echo "=== InstMatch case ==="
rg -n 'case .*InstMatch' -g 'vendor/github.com/grafana/regexp/exec.go' -C5
echo "=== matchcap initialization ==="
rg -n 'matchcap\[' -g 'vendor/github.com/grafana/regexp/exec.go' -C5
Length of output: 38
🏁 Script executed:
#!/usr/bin/env bash
set -e
echo "=== snippet m.matchcap[0] ==="
rg -n "m.matchcap\\[0\\] = 0" -C3 .
echo "=== locate doOnePass ==="
rg -n "doOnePass" -C3 .
echo "=== locate InstMatch ==="
rg -n "InstMatch" -C3 .
Length of output: 10485
🏁 Script executed:
#!/usr/bin/env bash
sed -n '390,480p' vendor/github.com/grafana/regexp/exec.go
Length of output: 2088
🏁 Script executed:
#!/usr/bin/env bash
sed -n '480,550p' vendor/github.com/grafana/regexp/exec.go
Length of output: 1527
Record correct start offset in one-pass engine

- After any literal-prefix skip (i.e. immediately after pos += len(re.prefix)), if ncap > 0 set m.matchcap[0] = pos; do not initialize it before the skip.
- In the case syntax.InstMatch block remove m.matchcap[0] = 0 and only assign m.matchcap[1] = pos.
- Add a test verifying that FindStringIndex/FindIndex on a pattern with a literal prefix reports the same start/end indices as the stdlib regexp for matches beyond index 0.
🤖 Prompt for AI Agents
In vendor/github.com/grafana/regexp/exec.go around lines 451-456, the one-pass
engine records the match start offset too early and resets it incorrectly: after
any literal-prefix skip (i.e. after pos += len(re.prefix)) if ncap > 0 set
m.matchcap[0] = pos (do not initialize it before the skip); in the case
syntax.InstMatch remove the line that sets m.matchcap[0] = 0 and only assign
m.matchcap[1] = pos; also add a unit test that runs FindStringIndex/FindIndex on
a pattern with a literal prefix and asserts the start/end indices match the
stdlib regexp for matches starting beyond index 0.
```go
func ParseDurationAllowNegative(s string) (Duration, error) {
	if s == "" || s[0] != '-' {
		return ParseDuration(s)
	}

	d, err := ParseDuration(s[1:])

	return -d, err
}
```
Ensure negative durations round-trip through JSON/YAML.

ParseDurationAllowNegative exposes negative values, and Duration.String now produces "-1m" (and MarshalJSON wraps that), but UnmarshalJSON, UnmarshalText, and UnmarshalYAML still invoke ParseDuration, which bails out on any leading -. As soon as a negative Duration is marshaled or read from config, unmarshalling fails, making the new feature unusable. Please update the unmarshal paths to call the negative-aware helper so serialization remains symmetric.
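A quick illustration of the asymmetry, assuming the negative-aware String()/MarshalJSON behavior described above:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"

	"github.com/prometheus/common/model"
)

func main() {
	d := model.Duration(-time.Minute)
	b, _ := json.Marshal(d) // marshals via String(), producing "-1m"
	fmt.Println(string(b))

	var out model.Duration
	// Fails while UnmarshalJSON still delegates to ParseDuration, which
	// rejects the leading '-'.
	if err := json.Unmarshal(b, &out); err != nil {
		fmt.Println("round-trip failed:", err)
	}
}
```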
Apply this diff:
func (d Duration) MarshalJSON() ([]byte, error) {
return json.Marshal(d.String())
}
func (d *Duration) UnmarshalJSON(bytes []byte) error {
var s string
if err := json.Unmarshal(bytes, &s); err != nil {
return err
}
- dur, err := ParseDuration(s)
+ dur, err := ParseDurationAllowNegative(s)
if err != nil {
return err
}
*d = dur
return nil
}
@@
func (d *Duration) UnmarshalText(text []byte) error {
var err error
- *d, err = ParseDuration(string(text))
+ *d, err = ParseDurationAllowNegative(string(text))
return err
}
@@
func (d *Duration) UnmarshalYAML(unmarshal func(interface{}) error) error {
var s string
if err := unmarshal(&s); err != nil {
return err
}
- dur, err := ParseDuration(s)
+ dur, err := ParseDurationAllowNegative(s)
if err != nil {
return err
}
*d = dur
return nil
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func ParseDurationAllowNegative(s string) (Duration, error) {
	if s == "" || s[0] != '-' {
		return ParseDuration(s)
	}

	d, err := ParseDuration(s[1:])

	return -d, err
}

func (d Duration) MarshalJSON() ([]byte, error) {
	return json.Marshal(d.String())
}

func (d *Duration) UnmarshalJSON(bytes []byte) error {
	var s string
	if err := json.Unmarshal(bytes, &s); err != nil {
		return err
	}
	dur, err := ParseDurationAllowNegative(s)
	if err != nil {
		return err
	}
	*d = dur
	return nil
}

func (d *Duration) UnmarshalText(text []byte) error {
	var err error
	*d, err = ParseDurationAllowNegative(string(text))
	return err
}

func (d *Duration) UnmarshalYAML(unmarshal func(interface{}) error) error {
	var s string
	if err := unmarshal(&s); err != nil {
		return err
	}
	dur, err := ParseDurationAllowNegative(s)
	if err != nil {
		return err
	}
	*d = dur
	return nil
}
```
🤖 Prompt for AI Agents
vendor/github.com/prometheus/common/model/time.go around lines 262-270 and the
UnmarshalJSON/UnmarshalText/UnmarshalYAML methods: currently parsing during
unmarshal uses ParseDuration which rejects leading '-' so negative durations
fail to round-trip; update the unmarshal paths to call
ParseDurationAllowNegative instead of ParseDuration (ensure you strip any
surrounding quotes or whitespace the same way as before) so negative values are
accepted; keep error handling and type assignments identical, only swap the
parser call to the negative-aware helper to restore symmetric marshal/unmarshal
behavior.
```go
	if mn.Namespace != "" {
		namespace := strings.Join(strings.FieldsFunc(mn.Namespace, func(r rune) bool {
			return invalidMetricCharRE.MatchString(string(r))
		}), "_")
		return namespace + "_" + metricName
```
Still guard namespace-prefixed names against leading digits.

When mn.Namespace resolves to something starting with a digit (e.g., "123"), return namespace + "_" + metricName emits 123_metric, which violates Prometheus' [a-zA-Z_:][a-zA-Z0-9_:]* constraint. The digit guard below never runs because we return early. Please funnel the combined name through the same unicode.IsDigit check before returning.
- return namespace + "_" + metricName
+ fullName := namespace + "_" + metricName
+ if fullName != "" && unicode.IsDigit(rune(fullName[0])) {
+ fullName = "_" + fullName
+ }
+ return fullName
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/otlptranslator/metric_namer.go around lines 122
to 126, the function returns the combined namespace + "_" + metricName
immediately which skips the leading-digit guard; change the flow to build the
combined name into a variable (e.g., combined := namespace + "_" + metricName)
and then run the same leading-digit check used elsewhere (inspect the first rune
with unicode.IsDigit and if true prefix an underscore) before returning the
combined value so namespace-prefixed names that start with a digit become valid
Prometheus metric names.
```go
	if namespace != "" {
		nameTokens = append([]string{namespace}, nameTokens...)
	}
```
Normalize namespace tokens before prefixing.

Here we prepend the raw namespace string, so values like "otel-collector" inject - into the supposedly "normalized" name, again breaking the Prometheus regex. We need to run the namespace through the same tokenization as the metric name before appending.
- if namespace != "" {
- nameTokens = append([]string{namespace}, nameTokens...)
- }
+ if namespace != "" {
+ namespaceTokens := strings.FieldsFunc(namespace, func(r rune) bool {
+ return !isValidCompliantMetricChar(r)
+ })
+ if len(namespaceTokens) > 0 {
+ nameTokens = append(namespaceTokens, nameTokens...)
+ }
+ }
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/otlptranslator/metric_namer.go around lines 187
to 189, the code currently prepends the raw namespace string which can contain
invalid characters; instead, run the namespace through the same
tokenization/normalization path used for the metric name (split into tokens,
normalize/lowercase/strip or replace invalid chars, and filter out empty tokens)
and then prepend those normalized namespace tokens to nameTokens; if
tokenization yields no tokens, skip prepending to avoid injecting invalid
characters into the final Prometheus metric name.
if !strings.HasPrefix(hexString, "0x") {
	return 0, errors.New("invalid format: hex string does not start with '0x'")
}
return strconv.ParseUint(hexString[2:], 16, 64)
Allow both 0x/0X hex prefixes.
Procfs writers frequently emit uppercase 0X in addition to lowercase 0x; rejecting the uppercase form turns otherwise valid kernel-provided values into hard failures. (lkml.iu.edu) Please accept both cases before parsing.
- if !strings.HasPrefix(hexString, "0x") {
+ if len(hexString) < 2 || (hexString[:2] != "0x" && hexString[:2] != "0X") {
return 0, errors.New("invalid format: hex string does not start with '0x'")
}
- return strconv.ParseUint(hexString[2:], 16, 64)
+ return strconv.ParseUint(hexString[2:], 16, 64)
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/procfs/internal/util/parse.go around lines 122
to 125, the code only accepts a lowercase "0x" prefix and rejects "0X" which
valid procfs writers may emit; change the prefix check to accept both "0x" and
"0X" (e.g., check hexString has length >=2 and hexString[0]=='0' and
(hexString[1]=='x' || hexString[1]=='X') or normalize the prefix case) and then
call strconv.ParseUint on the slice starting after the two-character prefix;
ensure you still validate length before slicing to avoid panics.
m.Zswapped = &val
m.ZswapBytes = &valBytes
case "Dirty:":
Fix assignment for ZswappedBytes.
Line 311 stores the byte-normalized value for Zswapped: into m.ZswapBytes, leaving ZswappedBytes nil and overwriting the bytes value captured for Zswap:. This breaks consumers that expect ZswappedBytes to be populated. Please assign the value to the correct field.
- case "Zswapped:":
- m.Zswapped = &val
- m.ZswapBytes = &valBytes
+ case "Zswapped:":
+ m.Zswapped = &val
+ m.ZswappedBytes = &valBytes
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/procfs/meminfo.go around lines 309 to 311, the
code assigns the byte-normalized value for "Zswapped:" to m.ZswapBytes instead
of m.ZswappedBytes, leaving m.ZswappedBytes nil and overwriting the Zswap bytes
value; change the assignment so the normalized bytes value is stored in
m.ZswappedBytes (and ensure m.ZswapBytes remains assigned only for the "Zswap:"
key).
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
config/config-observability.yaml (2 hunks)
pkg/pipelinerunmetrics/metrics.go (3 hunks)
pkg/pipelinerunmetrics/metrics_test.go (5 hunks)
pkg/reconciler/pipelinerun/pipelinerun.go (1 hunks)
pkg/taskrunmetrics/metrics.go (7 hunks)
pkg/taskrunmetrics/metrics_test.go (9 hunks)
test/util.go (0 hunks)
💤 Files with no reviewable changes (1)
- test/util.go
🧰 Additional context used
🧬 Code graph analysis (4)
pkg/pipelinerunmetrics/metrics_test.go (3)
  pkg/pipelinerunmetrics/metrics.go (1)
    NewRecorder (72-88)
  pkg/taskrunmetrics/metrics.go (1)
    NewRecorder (74-91)
  pkg/apis/config/metrics.go (3)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
pkg/pipelinerunmetrics/metrics.go (3)
  pkg/apis/pipeline/v1/pipelinerun_types.go (3)
    PipelineRunReasonCancelled (345-345)
    PipelineRun (51-60)
    PipelineRun (73-75)
  pkg/apis/config/metrics.go (4)
    Metrics (101-109)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
  pkg/taskrunmetrics/metrics.go (3)
    Recorder (46-65)
    NewRecorder (74-91)
    OnStore (191-207)
pkg/taskrunmetrics/metrics.go (4)
  pkg/apis/config/metrics.go (7)
    Metrics (101-109)
    TaskrunLevelAtTaskrun (52-52)
    TaskrunLevelAtTask (54-54)
    TaskrunLevelAtNS (56-56)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
  pkg/pipelinerunmetrics/metrics.go (3)
    Recorder (47-63)
    NewRecorder (72-88)
    OnStore (312-328)
  pkg/apis/pipeline/register.go (4)
    PipelineTaskLabelKey (47-47)
    TaskLabelKey (29-29)
    PipelineLabelKey (38-38)
    PipelineRunLabelKey (41-41)
  pkg/pod/status.go (2)
    ReasonExceededResourceQuota (60-60)
    ReasonExceededNodeResources (64-64)
pkg/taskrunmetrics/metrics_test.go (2)
  pkg/pipelinerunmetrics/metrics.go (2)
    Recorder (47-63)
    NewRecorder (72-88)
  pkg/taskrunmetrics/metrics.go (2)
    Recorder (46-65)
    NewRecorder (74-91)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (5)
pkg/reconciler/pipelinerun/pipelinerun.go (1)
311-313: Nice context propagation for metrics.
Threading the reconcile context through to DurationAndCount keeps span data and cancellation semantics intact for metrics emission. Thanks for tightening that up.
pkg/taskrunmetrics/metrics_test.go (1)
175-200: Appreciate the lightweight lister double.
The minimal mockTaskRunLister keeps this test focused on recorder wiring without pulling informer plumbing into the mix—good call.
config/config-observability.yaml (1)
40-101: Clearer OTel guidance in the sample config.
The expanded example docs make it much easier for operators to flip on OTLP exporters while still seeing the legacy knobs—nice documentation upgrade.
pkg/pipelinerunmetrics/metrics.go (1)
206-248: Fix RunningPipelineRuns gauge semantics and trim the critical section.
Float64UpDownCounter.Add expects per-call deltas. Feeding it the full count on every scrape (and trying to "reset" with Add(ctx, 0)) makes the exported gauge march upward forever, so alerts will scream even when the number of running PipelineRuns stays flat. This also contradicts the OTEL guidance that synchronous UpDownCounters should receive incremental updates. (opentelemetry.netlify.app)
On top of that, we grab r.mutex before invoking lister.List(...). Listing can block on the informer cache, so we end up serializing every other metric call behind a potentially expensive read.
Please compute the totals first, then lock only while updating recorder state and emit the delta since the last snapshot. Initializing the cached values when (re)building instruments keeps the gauges consistent after config changes.
@@ - runningPRsWaitingOnPipelineResolutionGauge metric.Float64UpDownCounter - runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter + runningPRsWaitingOnPipelineResolutionGauge metric.Float64UpDownCounter + runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter + + runningPRsLastValue float64 + runningPRsWaitingOnPipelineLastValue float64 + runningPRsWaitingOnTaskLastValue float64 @@ - runningPRsWaitingOnTaskResolutionGauge, err := r.meter.Float64UpDownCounter( + runningPRsWaitingOnTaskResolutionGauge, err := r.meter.Float64UpDownCounter( @@ - r.runningPRsWaitingOnTaskResolutionGauge = runningPRsWaitingOnTaskResolutionGauge + r.runningPRsWaitingOnTaskResolutionGauge = runningPRsWaitingOnTaskResolutionGauge + + r.runningPRsLastValue = 0 + r.runningPRsWaitingOnPipelineLastValue = 0 + r.runningPRsWaitingOnTaskLastValue = 0 @@ - r.mutex.Lock() - defer r.mutex.Unlock() - - prs, err := lister.List(labels.Everything()) + prs, err := lister.List(labels.Everything()) @@ - r.runningPRsGauge.Add(ctx, 0, metric.WithAttributes()) - r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, 0, metric.WithAttributes()) - r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, 0, metric.WithAttributes()) - runningCount := 0 @@ - r.runningPRsGauge.Add(ctx, float64(runningCount), metric.WithAttributes()) - r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, float64(waitingOnPipelineCount), metric.WithAttributes()) - r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, float64(waitingOnTaskCount), metric.WithAttributes()) + r.mutex.Lock() + defer r.mutex.Unlock() + + runningDelta := float64(runningCount) - r.runningPRsLastValue + pipelineDelta := float64(waitingOnPipelineCount) - r.runningPRsWaitingOnPipelineLastValue + taskDelta := float64(waitingOnTaskCount) - r.runningPRsWaitingOnTaskLastValue + + if runningDelta != 0 { + r.runningPRsGauge.Add(ctx, runningDelta) + } + if pipelineDelta != 0 { + r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, pipelineDelta) + } + if taskDelta != 0 { + r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, taskDelta) + } + + r.runningPRsLastValue = float64(runningCount) + r.runningPRsWaitingOnPipelineLastValue = float64(waitingOnPipelineCount) + r.runningPRsWaitingOnTaskLastValue = float64(waitingOnTaskCount)pkg/taskrunmetrics/metrics.go (1)
312-347: Make TaskRun running gauges emit deltas and move lister.List outside the lock.
Just like the PipelineRun gauges, we currently add the full counts to Float64UpDownCounter on every scrape. That makes the exported value balloon cumulatively instead of reflecting "current number of running TaskRuns," which breaks dashboards and alerting. OTEL's synchronous UpDownCounter contract is to add positive/negative deltas, not absolute totals. (opentelemetry.netlify.app)
We also enter the critical section before calling lister.List, so a slow informer read will block every other metric update.
Capture the counts first, lock only while mutating recorder state, and publish the delta relative to the last snapshot (resetting that cache whenever instruments are rebuilt):
@@ - runningTRsThrottledByNodeGauge metric.Float64UpDownCounter - podLatencyHistogram metric.Float64Histogram + runningTRsThrottledByNodeGauge metric.Float64UpDownCounter + podLatencyHistogram metric.Float64Histogram + + runningTRsLastValue float64 + trsThrottledByQuotaLastValue float64 + trsThrottledByNodeLastValue float64 @@ - r.runningTRsThrottledByNodeGauge = runningTRsThrottledByNodeGauge + r.runningTRsThrottledByNodeGauge = runningTRsThrottledByNodeGauge + + r.runningTRsLastValue = 0 + r.trsThrottledByQuotaLastValue = 0 + r.trsThrottledByNodeLastValue = 0 @@ - r.mutex.Lock() - defer r.mutex.Unlock() - - trs, err := lister.List(labels.Everything()) + trs, err := lister.List(labels.Everything()) @@ - r.runningTRsGauge.Add(ctx, float64(runningTrs)) - r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota)) - r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode)) + r.mutex.Lock() + defer r.mutex.Unlock() + + runningDelta := float64(runningTrs) - r.runningTRsLastValue + quotaDelta := float64(trsThrottledByQuota) - r.trsThrottledByQuotaLastValue + nodeDelta := float64(trsThrottledByNode) - r.trsThrottledByNodeLastValue + + if runningDelta != 0 { + r.runningTRsGauge.Add(ctx, runningDelta) + } + if quotaDelta != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, quotaDelta) + } + if nodeDelta != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, nodeDelta) + } + + r.runningTRsLastValue = float64(runningTrs) + r.trsThrottledByQuotaLastValue = float64(trsThrottledByQuota) + r.trsThrottledByNodeLastValue = float64(trsThrottledByNode)
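One possible way around the per-scrape delta bookkeeping suggested above is an asynchronous instrument that observes the current value on each collection cycle. The sketch below is illustrative only (package placement, metric name, and helper names are assumptions, not the PR's implementation), using the OTel Go metric API's ObservableGauge:

package pipelinerunmetrics

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/metric"
	"k8s.io/apimachinery/pkg/labels"

	listers "github.com/tektoncd/pipeline/pkg/client/listers/pipeline/v1"
)

// registerRunningPipelineRunsGauge reports the current number of running
// PipelineRuns from a callback, so the exported value never accumulates.
func registerRunningPipelineRunsGauge(lister listers.PipelineRunLister) error {
	meter := otel.GetMeterProvider().Meter("tekton_pipelines_controller")
	gauge, err := meter.Int64ObservableGauge(
		"running_pipelineruns",
		metric.WithDescription("Number of PipelineRuns currently executing"),
	)
	if err != nil {
		return err
	}
	// The callback runs on every collection cycle and observes the value as-is,
	// so no last-value cache or delta computation is required.
	_, err = meter.RegisterCallback(func(_ context.Context, o metric.Observer) error {
		prs, err := lister.List(labels.Everything())
		if err != nil {
			return err
		}
		var running int64
		for _, pr := range prs {
			if !pr.IsDone() {
				running++
			}
		}
		o.ObserveInt64(gauge, running)
		return nil
	}, gauge)
	return err
}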
This upgrades knative to 1.19 and other dependent dependencies.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
go.mod (1)
3-3
: go directive should be a minor version (usego 1.24
)The
go
directive typically uses the minor version (e.g.,go 1.24
). If you need to pin an exact toolchain patch, add a separatetoolchain go1.24.x
line.Apply:
-go 1.24.0 +go 1.24 +toolchain go1.24.0
🧹 Nitpick comments (16)
go.mod (1)
137-137
: Dual gengo module paths present (k8s.io/gengo
andk8s.io/gengo/v2
)
go.mod includes bothk8s.io/gengo v0.0.0-20240404160639-a0386bf69313
(line 243) andk8s.io/gengo/v2 v2.0.0-20250207200755-1244d31929d7
(line 137) as indirect dependencies. This is expected for transitive deps on different major versions. If you control or can update upstream modules, consider consolidating on the/v2
path to reduce confusion.pkg/pipelinerunmetrics/metrics_test.go (4)
303-305
: Isolate OTEL provider per-test with cleanup.Setting a global meter provider without restoring it can leak across tests. Capture the old provider and restore it in t.Cleanup.
- provider := metric.NewMeterProvider(metric.WithReader(reader)) - otel.SetMeterProvider(provider) + provider := metric.NewMeterProvider(metric.WithReader(reader)) + prev := otel.GetMeterProvider() + otel.SetMeterProvider(provider) + t.Cleanup(func() { otel.SetMeterProvider(prev) })
307-316
: Avoid mutating the package-global recorder in tests.Directly assigning to r couples tests to internals and can race with once.Do. Prefer resetting once and using NewRecorder with the test provider.
- // Create a new recorder with the test meter provider. - r = &Recorder{ - initialized: true, - ReportingPeriod: 30 * time.Second, - meter: provider.Meter("tekton_pipelines_controller"), - } - cfg := config.FromContextOrDefaults(ctx) - r.cfg = cfg.Metrics - if err := r.initializeInstruments(cfg.Metrics); err != nil { - t.Fatalf("initializeInstruments: %v", err) - } + once = sync.Once{} + rec, err := NewRecorder(ctx) + if err != nil { + t.Fatalf("NewRecorder: %v", err) + } + r = recNote: add import "sync" at top.
179-189
: Consider asserting histogram/count emission for DurationAndCount.You already wire OTEL readers elsewhere; doing the same here lets you assert at least one datapoint was recorded for duration/total count.
364-383
: Add assertions for “waiting on resolution” gauges.This test currently only checks initialization. Drive cases that should increment:
- waiting_on_pipeline_resolution
- waiting_on_task_resolution
Then collect via ManualReader and assert expected attribute sets/values, similar to CountAtAllLevels.
pkg/taskrunmetrics/metrics_test.go (6)
224-227
: Restore OTEL provider after test to prevent cross-test leakage.Add t.Cleanup to put the previous provider back.
reader := metric.NewManualReader() provider := metric.NewMeterProvider(metric.WithReader(reader)) -otel.SetMeterProvider(provider) +prev := otel.GetMeterProvider() +otel.SetMeterProvider(provider) +t.Cleanup(func() { otel.SetMeterProvider(prev) })
222-223
: Reset singleton once per-test using t.Cleanup.Make the reset self-contained and future-proof.
-once = sync.Once{} +t.Cleanup(func() { once = sync.Once{} }) +once = sync.Once{}
251-272
: Optional: also assert expected attributes.Beyond the value 1, you can validate whether a namespace attribute is present or cluster-level is used, depending on config. Helps catch tag regression.
85-96
: Ensure NewRecorder sees the test provider.Reset once and add provider restore here too, like in other tests, to avoid ordering flakiness.
119-184
: Consider validating duration/total count emission.Mirroring the OTEL reader pattern from other tests would let you assert a histogram/count was recorded for TaskRun durations.
431-469
: Fix typos in test field names.Rename expetected* to expected* for clarity.
- expetectedPipeline string - expetectedPipelineRun string + expectedPipeline string + expectedPipelineRun string ... - if pipeline != test.expetectedPipeline { + if pipeline != test.expectedPipeline { ... - if pipelineRun != test.expetectedPipelineRun { + if pipelineRun != test.expectedPipelineRun {config/config-observability.yaml (1)
40-74
: Clarify supported values for OTEL config keys
Updateconfig/config-observability.yaml
to list each key’s allowed values and map them to the exporter setup (e.g.,
- metrics-protocol: prometheus | grpc | http/protobuf | none
- metrics-endpoint: host:port or host%zone:port (or bracketed equivalents)
- metrics-export-interval: duration string (e.g., “30s”, “1m”)
- tracing-protocol: grpc | http/protobuf | none | stdout
- tracing-endpoint: same format as metrics-endpoint
- tracing-sampling-rate: 0.0–1.0
- runtime-profiling: enabled | disabled
- runtime-export-interval: duration string)
And add a reference or link to the parsing logic in vendor/knative.dev/pkg/observability/metrics/config.go, tracing/config.go, and runtime/config.go.pkg/pipelinerunmetrics/metrics.go (2)
284-287
: Log errors during periodic reportingCurrently errors are swallowed. Mirror taskrun reporter and log warnings to aid ops.
case <-delay.C: // Every 30s surface a metric for the number of running pipelines. if err := r.RunningPipelineRuns(ctx, lister); err != nil { - // Log error but continue reporting - // In a real implementation, you might want to use a logger here + // TODO: plumb logger; for now use zap via a helper if available + // logger.Warnw("Failed to log running pipelineruns metric", zap.Error(err)) }
172-175
: Guard against nil/zero StartTime before computing durationDefensive check avoids panics for edge states.
duration := time.Since(pr.Status.StartTime.Time) if pr.Status.CompletionTime != nil { duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time) } +// Optional: if StartTime is zero, skip recording +if pr.Status.StartTime.IsZero() { + return nil +}Please confirm StartTime is always set before this path in your controller.
pkg/taskrunmetrics/metrics.go (2)
307-307
: Consider aligning counter attributes with duration metrics
trTotalCounter
currently records only status. For better cardinality control and parity with durations, consider adding namespace (and optionally task/pipeline labels when configured).- r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status))) + r.trTotalCounter.Add(ctx, 1, metric.WithAttributes( + attribute.String("status", status), + attribute.String("namespace", tr.Namespace), + ))
25-33
: Unify error handling packagesThis file uses github.com/pkg/errors, while pipelinerunmetrics uses std
errors
. Prefer one strategy (stdliberrors
+fmt.Errorf("…: %w", err)
) for consistency.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (33)
config/config-observability.yaml (2 hunks)
go.mod (8 hunks)
pkg/client/clientset/versioned/fake/clientset_generated.go (2 hunks)
pkg/client/clientset/versioned/typed/pipeline/v1/pipeline_client.go (3 hunks)
pkg/client/clientset/versioned/typed/pipeline/v1alpha1/pipeline_client.go (3 hunks)
pkg/client/clientset/versioned/typed/pipeline/v1beta1/pipeline_client.go (3 hunks)
pkg/client/informers/externalversions/pipeline/v1/pipeline.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1/pipelinerun.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1/task.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1/taskrun.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1alpha1/run.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1alpha1/stepaction.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1alpha1/verificationpolicy.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/customrun.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/pipeline.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/pipelinerun.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/stepaction.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/task.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/taskrun.go (1 hunks)
pkg/client/resolution/clientset/versioned/fake/clientset_generated.go (2 hunks)
pkg/client/resolution/clientset/versioned/typed/resolution/v1alpha1/resolution_client.go (3 hunks)
pkg/client/resolution/clientset/versioned/typed/resolution/v1beta1/resolution_client.go (3 hunks)
pkg/client/resolution/informers/externalversions/resolution/v1alpha1/resolutionrequest.go (1 hunks)
pkg/client/resolution/informers/externalversions/resolution/v1beta1/resolutionrequest.go (1 hunks)
pkg/client/resource/clientset/versioned/fake/clientset_generated.go (2 hunks)
pkg/client/resource/clientset/versioned/typed/resource/v1alpha1/resource_client.go (3 hunks)
pkg/client/resource/informers/externalversions/resource/v1alpha1/pipelineresource.go (1 hunks)
pkg/pipelinerunmetrics/metrics.go (3 hunks)
pkg/pipelinerunmetrics/metrics_test.go (5 hunks)
pkg/reconciler/pipelinerun/pipelinerun.go (1 hunks)
pkg/taskrunmetrics/metrics.go (7 hunks)
pkg/taskrunmetrics/metrics_test.go (8 hunks)
test/util.go (0 hunks)
💤 Files with no reviewable changes (1)
- test/util.go
🧰 Additional context used
🧬 Code graph analysis (5)
pkg/client/informers/externalversions/pipeline/v1beta1/stepaction.go (2)
  pkg/client/informers/externalversions/pipeline/v1beta1/interface.go (1)
    Interface (26-39)
  pkg/client/injection/reconciler/pipeline/v1beta1/stepaction/reconciler.go (1)
    Interface (44-52)
pkg/taskrunmetrics/metrics.go (5)
  pkg/apis/config/metrics.go (7)
    Metrics (101-109)
    TaskrunLevelAtTaskrun (52-52)
    TaskrunLevelAtTask (54-54)
    TaskrunLevelAtNS (56-56)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
  pkg/pipelinerunmetrics/metrics.go (3)
    Recorder (47-63)
    NewRecorder (72-88)
    OnStore (332-348)
  pkg/apis/pipeline/register.go (4)
    PipelineTaskLabelKey (47-47)
    TaskLabelKey (29-29)
    PipelineLabelKey (38-38)
    PipelineRunLabelKey (41-41)
  vendor/go.opentelemetry.io/otel/attribute/set.go (1)
    NewSet (183-186)
  pkg/pod/status.go (2)
    ReasonExceededResourceQuota (60-60)
    ReasonExceededNodeResources (64-64)
pkg/pipelinerunmetrics/metrics_test.go (5)
  pkg/apis/pipeline/v1/pipelinerun_types.go (4)
    PipelineRun (51-60)
    PipelineRun (73-75)
    PipelineRunSpec (250-286)
    PipelineRunStatus (321-326)
  pkg/pipelinerunmetrics/metrics.go (2)
    NewRecorder (72-88)
    Recorder (47-63)
  pkg/names/generate.go (1)
    SimpleNameGenerator (48-48)
  pkg/apis/config/metrics.go (4)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
    Metrics (101-109)
  vendor/go.opentelemetry.io/otel/attribute/set.go (1)
    NewSet (183-186)
pkg/taskrunmetrics/metrics_test.go (5)
  pkg/pipelinerunmetrics/metrics.go (2)
    Recorder (47-63)
    NewRecorder (72-88)
  pkg/taskrunmetrics/metrics.go (2)
    Recorder (46-65)
    NewRecorder (74-91)
  pkg/names/generate.go (1)
    SimpleNameGenerator (48-48)
  pkg/apis/config/metrics.go (1)
    Metrics (101-109)
  pkg/pod/status.go (2)
    ReasonExceededResourceQuota (60-60)
    ReasonExceededNodeResources (64-64)
pkg/pipelinerunmetrics/metrics.go (4)
  pkg/apis/config/metrics.go (4)
    Metrics (101-109)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
  pkg/taskrunmetrics/metrics.go (3)
    Recorder (46-65)
    NewRecorder (74-91)
    OnStore (191-207)
  vendor/go.opentelemetry.io/otel/metric.go (1)
    GetMeterProvider (35-37)
  vendor/go.opentelemetry.io/otel/metric/instrument.go (3)
    WithDescription (116-116)
    WithUnit (197-197)
    WithExplicitBucketBoundaries (202-202)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (24)
pkg/client/resource/informers/externalversions/resource/v1alpha1/pipelineresource.go (1)
73-84
: Context-aware List/Watch wiring looks solidThanks for threading the caller-supplied context through to the client while preserving the existing tweak hooks; this keeps the generated informer compatible with the newer
cache.ListWatch
APIs without changing behavior for legacy callers.pkg/client/resource/clientset/versioned/typed/resource/v1alpha1/resource_client.go (1)
48-61
: Dropping the no-op error path is correct.
setConfigDefaults
cannot fail with the generated codec setup, so removing the unused error plumbing in both constructors keeps the flow in sync with upstream client-gen output. Looks good.pkg/client/informers/externalversions/pipeline/v1alpha1/verificationpolicy.go (1)
79-83
: FixListWatch
field name to avoid compile failure. Thecache.ListWatch
struct exposes the field asWatchWithContextFunc
, notWatchFuncWithContext
; this code as written won’t compile. Rename the field to match the API (the existing function body can stay the same).(pkg.go.dev)- WatchFuncWithContext: func(ctx context.Context, options v1.ListOptions) (watch.Interface, error) { + WatchWithContextFunc: func(ctx context.Context, options v1.ListOptions) (watch.Interface, error) {Likely an incorrect or invalid review comment.
pkg/client/resolution/informers/externalversions/resolution/v1beta1/resolutionrequest.go (1)
73-84
: Nice addition of context-aware hooks.Wiring through
ListWithContextFunc
/WatchFuncWithContext
keeps the tweakListOptions behavior while finally letting callers honor cancellation deadlines—exactly what we’ve needed for informer parity.pkg/client/informers/externalversions/pipeline/v1/pipelinerun.go (1)
61-84
: Context-aware hooks look goodThanks for wiring both the legacy
ListFunc
/WatchFunc
paths and the newListWithContextFunc
/WatchFuncWithContext
variants. That keeps existing callers happy while letting context-aware code avoidcontext.TODO()
. Nicely done.pkg/client/resolution/clientset/versioned/typed/resolution/v1beta1/resolution_client.go (1)
83-92
: Config defaults updated in step with upstream generatorSwitching to
rest.CodecFactoryForGeneratedClient(...).WithoutConversion()
and dropping the unused error return mirrors the newer client-go generator output, so the client keeps the same behavior without the redundant plumbing. Nice cleanup.pkg/client/resolution/informers/externalversions/resolution/v1alpha1/resolutionrequest.go (1)
73-84
: Context-aware ListWatch wiring looks solidWiring both
ListWithContextFunc
andWatchFuncWithContext
while retaining the legacy fallbacks ensures cancellation can flow through informers without breaking older consumers. Good upgrade.pkg/client/resolution/clientset/versioned/typed/resolution/v1alpha1/resolution_client.go (1)
83-92
: Client defaults kept consistent with new scheme helpersMirroring the v1beta1 change here keeps both clients aligned with the current client-go defaults, so everything builds against the updated codec factory without vestigial error handling. Looks good.
pkg/client/clientset/versioned/typed/pipeline/v1alpha1/pipeline_client.go (1)
58-64
: Approve non-error defaulting consistency Verified consistent NewForConfig/AndClient signatures and implementations across v1alpha1, v1beta1, and v1 clientsets.pkg/client/clientset/versioned/typed/pipeline/v1beta1/pipeline_client.go (1)
73-75
: setConfigDefaults update verified — definitions no longer return an error and no callers capture one.pkg/reconciler/pipelinerun/pipelinerun.go (1)
306-315
: All DurationAndCount call sites updated. Verified every invocation (including tests) accepts the newctx
parameter.go.mod (2)
132-132
: Drop the incorrectgo.yaml.in/yaml/v2
indirect dependencyRemove the line at go.mod:132 (
go.yaml.in/yaml/v2 v2.4.2 // indirect
); the project already usesgopkg.in/yaml.v2 v2.4.0 // indirect
and deleting the typo prevents unknown module path errors.
23-23
: Drop OpenCensus from go.mod after confirming no usage
No non-vendor imports ofgo.opencensus.io
or related packages were found—please manually verify there are truly no remaining call sites before removinggo.opencensus.io v0.24.0
(line 23 in go.mod).pkg/client/clientset/versioned/typed/pipeline/v1/pipeline_client.go (1)
63-69
: Verify all generated clients use HTTPClientFor→NewForConfigAndClient
- Pipeline v1/v1beta1/v1alpha1 wrappers correctly call
setConfigDefaults
→HTTPClientFor
→NewForConfigAndClient
.- Confirm resource and resolution client wrappers follow the same pattern:
rg -nP --type=go 'func NewForConfig\(' pkg/client/clientset/versioned/typed
pkg/pipelinerunmetrics/metrics_test.go (1)
324-360
: Great end-to-end OTEL assertions; resolves earlier feedback.Capturing tekton_pipelines_controller_running_pipelineruns via ManualReader and diffing attribute sets meaningfully tests the per-level aggregation. This addresses the prior “restore meaningful assertions” comment.
pkg/taskrunmetrics/metrics_test.go (3)
418-423
: Align RecordPodLatency argument order.Match the function signature (ctx, tr, pod) if confirmed; current call passes (ctx, pod, tr).
- if err := metrics.RecordPodLatency(ctx, td.pod, td.taskRun); td.expectingError && err == nil { + if err := metrics.RecordPodLatency(ctx, td.taskRun, td.pod); td.expectingError && err == nil {
315-321
: Good structure and coverage for throttled metrics.End-to-end capture with attribute.Set maps for quota/node throttling looks solid.
72-81
: Ignore parameter order suggestion; call sites match the current signature. The method is defined as RecordPodLatency(ctx, pod, tr) and the test calls exactly that, so no changes are needed.Likely an incorrect or invalid review comment.
config/config-observability.yaml (1)
95-101
: Tekton metrics keys align with struct fields.The keys mirror config.Metrics options (levels, duration types, reason, running-pipelinerun level). Looks consistent.
pkg/pipelinerunmetrics/metrics.go (3)
211-217
: Move lister.List() outside the mutex to reduce contentionHolding the lock while listing is unnecessary and can block other recorders. Acquire the lock only when computing and publishing deltas.
- r.mutex.Lock() - defer r.mutex.Unlock() - prs, err := lister.List(labels.Everything()) if err != nil { return fmt.Errorf("failed to list pipelineruns: %w", err) } +// lock later only for publishing/delta updates
232-237
: Verify “waiting on task resolution” heuristicIncrementing per TaskRun child with a PipelineTaskName may over-count (counts children, not PRs) and may not reflect “waiting on resolution” specifically.
- Is there a status/condition that directly indicates a task resolution wait you can check?
- If the intent is per-PR, consider incrementing once per PR when any child qualifies.
261-268
: UpDownCounter must receive deltas, not totals — current logic inflates values each scrapeAdding the full count on each run makes the exported series grow cumulatively. Compute per-attribute deltas or switch to an async observable gauge. Below adds a small cache to emit deltas and resets it on re-init.
@@ type Recorder struct { @@ - runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter + runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter + + // caches for delta computation + lastRunningPRsCounts map[attribute.Set]float64 + lastWaitingOnPipeline float64 + lastWaitingOnTask float64 @@ func (r *Recorder) initializeInstruments(cfg *config.Metrics) error { @@ r.runningPRsWaitingOnTaskResolutionGauge = runningPRsWaitingOnTaskResolutionGauge - - return nil + // reset caches on (re)initialization + r.lastRunningPRsCounts = make(map[attribute.Set]float64) + r.lastWaitingOnPipeline = 0 + r.lastWaitingOnTask = 0 + return nil } @@ -func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error { +func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error { @@ - r.mutex.Lock() - defer r.mutex.Unlock() - prs, err := lister.List(labels.Everything()) if err != nil { return fmt.Errorf("failed to list pipelineruns: %w", err) } @@ - // Report running_pipelineruns. - for attrSet, value := range currentCounts { - r.runningPRsGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - - r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, waitingOnPipelineCount, metric.WithAttributes()) - r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, waitingOnTaskCount, metric.WithAttributes()) + // Emit deltas under lock. + r.mutex.Lock() + defer r.mutex.Unlock() + + // per-attribute deltas (new minus previous) + for attrSet, curr := range currentCounts { + prev := r.lastRunningPRsCounts[attrSet] + if delta := curr - prev; delta != 0 { + r.runningPRsGauge.Add(ctx, delta, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + // attributes that disappeared need a negative delta to zero them + for attrSet, prev := range r.lastRunningPRsCounts { + if _, ok := currentCounts[attrSet]; !ok && prev != 0 { + r.runningPRsGauge.Add(ctx, -prev, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + r.lastRunningPRsCounts = currentCounts + + if d := waitingOnPipelineCount - r.lastWaitingOnPipeline; d != 0 { + r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, d) + } + if d := waitingOnTaskCount - r.lastWaitingOnTask; d != 0 { + r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, d) + } + r.lastWaitingOnPipeline = waitingOnPipelineCount + r.lastWaitingOnTask = waitingOnTaskCountAlternative (recommended long-term): replace these with ObservableUpDownCounter/ObservableGauge callbacks that observe the current values directly. I can draft that if you prefer.
pkg/taskrunmetrics/metrics.go (2)
318-324
: Avoid holding the mutex while calling lister.List()Listing can be expensive and doesn’t need the lock. Lock only for publishing.
- r.mutex.Lock() - defer r.mutex.Unlock() - trs, err := lister.List(labels.Everything()) if err != nil { return err } +// lock later only for publishing/delta updates
353-359
: UpDownCounters must add deltas; current code adds totals and inflates metricsEmit the difference from the previous sample, and clear caches on re-init. Also compute per-attribute deltas for throttled metrics.
@@ type Recorder struct { @@ runningTRsThrottledByNodeGauge metric.Float64UpDownCounter podLatencyHistogram metric.Float64Histogram @@ ReportingPeriod time.Duration + + // caches for delta computation + lastRunningTRs int + lastTrsThrottledByQuota map[attribute.Set]float64 + lastTrsThrottledByNode map[attribute.Set]float64 } @@ func (r *Recorder) initializeInstruments(cfg *config.Metrics) error { @@ r.podLatencyHistogram = podLatencyHistogram - - return nil + // reset caches on (re)initialization + r.lastRunningTRs = 0 + r.lastTrsThrottledByQuota = make(map[attribute.Set]float64) + r.lastTrsThrottledByNode = make(map[attribute.Set]float64) + return nil } @@ func (r *Recorder) RunningTaskRuns(ctx context.Context, lister listers.TaskRunLister) error { @@ - r.mutex.Lock() - defer r.mutex.Unlock() - trs, err := lister.List(labels.Everything()) if err != nil { return err } @@ - r.runningTRsGauge.Add(ctx, float64(runningTrs)) - for attrSet, value := range trsThrottledByQuota { - r.runningTRsThrottledByQuotaGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - for attrSet, value := range trsThrottledByNode { - r.runningTRsThrottledByNodeGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } + r.mutex.Lock() + defer r.mutex.Unlock() + + // running taskruns (single series) + if d := runningTrs - r.lastRunningTRs; d != 0 { + r.runningTRsGauge.Add(ctx, float64(d)) + } + r.lastRunningTRs = runningTrs + + // throttled by quota (per-attr deltas) + for attrSet, curr := range trsThrottledByQuota { + prev := r.lastTrsThrottledByQuota[attrSet] + if delta := curr - prev; delta != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, delta, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + for attrSet, prev := range r.lastTrsThrottledByQuota { + if _, ok := trsThrottledByQuota[attrSet]; !ok && prev != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, -prev, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + r.lastTrsThrottledByQuota = trsThrottledByQuota + + // throttled by node (per-attr deltas) + for attrSet, curr := range trsThrottledByNode { + prev := r.lastTrsThrottledByNode[attrSet] + if delta := curr - prev; delta != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, delta, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + for attrSet, prev := range r.lastTrsThrottledByNode { + if _, ok := trsThrottledByNode[attrSet]; !ok && prev != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, -prev, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + r.lastTrsThrottledByNode = trsThrottledByNodeIf you prefer, I can sketch an ObservableGauge-based solution to observe current values directly.
+	var opts metav1.ListOptions
+	if watchActcion, ok := action.(testing.WatchActionImpl); ok {
+		opts = watchActcion.ListOptions
+	}
 	gvr := action.GetResource()
 	ns := action.GetNamespace()
-	watch, err := o.Watch(gvr, ns)
+	watch, err := o.Watch(gvr, ns, opts)
🛠️ Refactor suggestion | 🟠 Major
Same pointer handling gap here.
Please mirror the change above so both testing.WatchActionImpl values and pointers feed through, otherwise cloned watch actions will drop their list options. Same diff applies.
🤖 Prompt for AI Agents
In pkg/client/clientset/versioned/fake/clientset_generated.go around lines 57 to
63, the code only type-asserts action to testing.WatchActionImpl, missing the
pointer variant, so cloned watch actions that are pointers lose their
ListOptions; update the type assertion to handle both testing.WatchActionImpl
and *testing.WatchActionImpl (e.g., try value assertion then pointer assertion)
and set opts from the matched value’s ListOptions so both pointer and
non-pointer watch actions preserve their ListOptions.
+	var opts metav1.ListOptions
+	if watchActcion, ok := action.(testing.WatchActionImpl); ok {
+		opts = watchActcion.ListOptions
+	}
 	gvr := action.GetResource()
 	ns := action.GetNamespace()
-	watch, err := o.Watch(gvr, ns)
+	watch, err := o.Watch(gvr, ns, opts)
🛠️ Refactor suggestion | 🟠 Major
Capture pointer WatchActionImpl as well to avoid dropping options.
Right now we only propagate ListOptions if the action is the concrete testing.WatchActionImpl value. When the fake client clones actions via DeepCopy() it can hand reactors a *testing.WatchActionImpl, so the type assertion will fail and we silently fall back to the zero value (losing label/field selectors, RV, etc.). Please handle both forms, e.g.:
- if watchActcion, ok := action.(testing.WatchActionImpl); ok {
- opts = watchActcion.ListOptions
- }
+ switch act := action.(type) {
+ case testing.WatchActionImpl:
+ opts = act.ListOptions
+ case *testing.WatchActionImpl:
+ opts = act.ListOptions
+ }
🤖 Prompt for AI Agents
In pkg/client/resolution/clientset/versioned/fake/clientset_generated.go around
lines 55 to 61, the code only type-asserts a concrete testing.WatchActionImpl
and so drops ListOptions when the action is a *testing.WatchActionImpl; update
the logic to handle both the value and pointer forms (e.g. check for
testing.WatchActionImpl and *testing.WatchActionImpl and set opts from the
matched type) so ListOptions (label/field selectors, resourceVersion, etc.) are
preserved; keep the rest of the flow unchanged.
+	var opts metav1.ListOptions
+	if watchActcion, ok := action.(testing.WatchActionImpl); ok {
+		opts = watchActcion.ListOptions
+	}
 	gvr := action.GetResource()
 	ns := action.GetNamespace()
-	watch, err := o.Watch(gvr, ns)
+	watch, err := o.Watch(gvr, ns, opts)
🛠️ Refactor suggestion | 🟠 Major
Propagate options for pointer watch actions.
As with the other fake clientsets, capture both the value and pointer forms of WatchActionImpl so cloned watch actions don't lose their ListOptions.
🤖 Prompt for AI Agents
In pkg/client/resource/clientset/versioned/fake/clientset_generated.go around
lines 53 to 59, the code only type-asserts action to the value form
testing.WatchActionImpl and therefore loses ListOptions when the action is a
pointer; update the type assertion to handle both the value and pointer forms:
first attempt action.(testing.WatchActionImpl) and set opts from its
ListOptions, then attempt action.(*testing.WatchActionImpl) and set opts from
the pointer’s ListOptions if the first assertion fails, so cloned pointer watch
actions preserve their ListOptions before calling o.Watch.
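Since the same gap shows up in all three generated fake clientsets, one option is a small shared helper. The sketch below is hypothetical (package name and placement are assumptions, not part of the PR) and simply shows the value-or-pointer extraction:

package fake

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clienttesting "k8s.io/client-go/testing"
)

// watchListOptions extracts the ListOptions carried by a watch action,
// whether the reactor received the action by value or by pointer.
func watchListOptions(action clienttesting.Action) metav1.ListOptions {
	switch act := action.(type) {
	case clienttesting.WatchActionImpl:
		return act.ListOptions
	case *clienttesting.WatchActionImpl:
		return act.ListOptions
	default:
		return metav1.ListOptions{}
	}
}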
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
vendor/github.com/prometheus/client_golang/prometheus/internal/go_runtime_metrics.go (1)
1-26
: Avoid patching vendored dependencies; move change upstream or use a fork/replace.Direct edits under vendor/ will be overwritten by go mod vendor and are hard to maintain. Prefer:
- Fork upstream and pin via go.mod replace, or
- Upstream the change and bump the module, or
- Maintain an automated patch step in the build.
Example replace snippet:
replace github.com/prometheus/client_golang => github.com/your-org/client_golang vX.Y.Z
vendor/github.com/gorilla/websocket/server.go (1)
100-114
: Honor server preference when selecting subprotocols.The code iterates client protocols outermost, which picks the client's preference. The doc string promises server-preferred order. Iterate server protocols first and return the first present in the client list.
Apply:
- clientProtocols := Subprotocols(r) - for _, clientProtocol := range clientProtocols { - for _, serverProtocol := range u.Subprotocols { - if clientProtocol == serverProtocol { - return clientProtocol - } - } - } + clientProtocols := Subprotocols(r) + for _, serverProtocol := range u.Subprotocols { + for _, clientProtocol := range clientProtocols { + if clientProtocol == serverProtocol { + return serverProtocol + } + } + }go.mod (2)
3-3
: Ensure Go toolchain versions align to ≥1.24
- Update test/gohelloworld/Dockerfile’s base image from
golang:1.15
to at leastgolang:1.24
.- If you use GitHub Actions (none found), add or update
actions/setup-go
to pingo-version: '1.24'
or newer.
23-24
: Migrate or remove remaining OpenCensus imports in tests
The only non-vendor references togo.opencensus.io
are in your test files. Update or delete these before dropping the module:
- test/wait.go:54 imports "go.opencensus.io/trace"
- test/custom_task_test.go:38 imports "go.opencensus.io/trace"
Ensure these tests no longer use OpenCensus (e.g. switch to OpenTelemetry) before removing the
go.opencensus.io
andcontrib.go.opencensus.io/*
deps.pkg/pipelinerunmetrics/metrics.go (1)
169-201
: Respect CountWithReason for prTotalCounter to avoid cardinality spikesCounter attributes always include reason; this can explode series. Make inclusion of reason conditional on cfg.CountWithReason and align with other dims.
- r.prTotalCounter.Add(ctx, 1, metric.WithAttributes(attrs...)) + countAttrs := []attribute.KeyValue{ + attribute.String("namespace", pr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + countAttrs = append(countAttrs, attribute.String("reason", reason)) + } + if r.insertTag != nil { + countAttrs = append(countAttrs, r.insertTag(pipelineName, pr.Name)...) + } + r.prTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))pkg/taskrunmetrics/metrics.go (1)
279-283
: Guard against nil StartTime in DurationAndCountAvoid panic when StartTime is unset.
- duration := time.Since(tr.Status.StartTime.Time) - if tr.Status.CompletionTime != nil { - duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time) - } + if tr.Status.StartTime == nil || tr.Status.StartTime.IsZero() { + return nil + } + duration := time.Since(tr.Status.StartTime.Time) + if tr.Status.CompletionTime != nil { + duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time) + }
🧹 Nitpick comments (22)
vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.c (2)
16-19
: Optional: include header that declares mach_port_deallocate.Some SDKs require
<mach/mach_port.h>
(or<mach/mach.h>
) explicitly.#include <mach/mach_init.h> #include <mach/task.h> #include <mach/mach_vm.h> +#include <mach/mach_port.h>
35-36
: Remove unused variable to avoid -Wunused-variable warning.
task
is never used.- task_t task = MACH_PORT_NULL;
vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go (1)
41-49
: Optional: prefer x/sys/unix.Getrlimit over syscall.Getrlimit.You already import
golang.org/x/sys/unix
; using it avoids deprecatedsyscall
usage on some platforms.-func getSoftLimit(which int) (uint64, error) { - rlimit := syscall.Rlimit{} - - if err := syscall.Getrlimit(which, &rlimit); err != nil { +func getSoftLimit(which int) (uint64, error) { + var rlimit unix.Rlimit + if err := unix.Getrlimit(which, &rlimit); err != nil { return 0, err } return rlimit.Cur, nil }vendor/github.com/go-logr/zapr/slogzapr.go (2)
56-59
: Only emit numeric verbosity for negative levels.The comment says “Record verbosity for info entries,” but the current code adds a 0 for info and warn. Usually we only attach the numeric verbosity for V-level (debug) logs.
- if !isError && zl.numericLevelKey != "" { - // Record verbosity for info entries. - fields = append(fields, zap.Int(zl.numericLevelKey, intLevel)) - } + if record.Level < 0 && zl.numericLevelKey != "" { + fields = append(fields, zap.Int(zl.numericLevelKey, intLevel)) + }
122-136
: Error from AddObject ignored (non-blocking).enc.AddObject returns an error that’s discarded. zap encoders rarely error, so this is fine, but if you want belt-and-suspenders, consider capturing once (debug-only) to aid troubleshooting.
vendor/github.com/gorilla/websocket/compression.go (1)
36-40
: Close pooled reader before falling back in decompressNoContextTakeover
The Reset-error path replacesfr
without closing the originally pooled reader, which—if ever triggered—will shrink the pool and leak buffers. Even thoughReset
never fails today, guard against future changes by closing the old reader first:vendor/github.com/gorilla/websocket/compression.go @@ -36,7 +36,10 @@ func decompressNoContextTakeover(r io.Reader) io.ReadCloser { mr := io.MultiReader(r, strings.NewReader(tail)) if err := fr.(flate.Resetter).Reset(mr, nil); err != nil { // Reset never fails, but handle error in case that changes. - fr = flate.NewReader(mr) + old := fr + fr = flate.NewReader(mr) + _ = old.Close() } return &flateReadWrapper{fr} }This vendor file is identical to upstream at commit e064f32e3674.
vendor/github.com/grafana/regexp/backtrack.go (1)
62-69
: Guard against zero-length programs in maxBitStateLen.Division by zero is possible if len(prog.Inst) == 0. Add a fast return for that case.
func maxBitStateLen(prog *syntax.Prog) int { - if !shouldBacktrack(prog) { + if !shouldBacktrack(prog) || len(prog.Inst) == 0 { return 0 } return maxBacktrackVector / len(prog.Inst) }vendor/github.com/grafana/regexp/onepass.go (1)
158-206
: Simplify mergeRuneSets: remove confusing defer, return early on failure.The deferred mutation to local vars is redundant since early returns already emit noRune/noNext. Removing it clarifies control flow and avoids confusion.
- ok := true - defer func() { - if !ok { - merged = nil - next = nil - } - }() + ok := true @@ - if !ok { - return noRune, noNext - } + if !ok { + return noRune, noNext + }vendor/github.com/grafana/regexp/exec.go (1)
512-550
: Parity check: add a quick test to compare indices vs stdlib.Add tests that assert FindStringIndex/FindIndex equality with stdlib for:
- anchored one‑pass patterns with literal prefix (match not at 0),
- capturing groups,
- case‑folded single‑rune prefixes.
I can draft a focused test file if helpful.
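A rough sketch of what such a parity test could look like; the file placement, pattern set, and inputs below are illustrative assumptions, not a drafted file:

package regexp_test

import (
	"reflect"
	std "regexp"
	"testing"

	graf "github.com/grafana/regexp"
)

func TestFindStringIndexMatchesStdlib(t *testing.T) {
	cases := []struct{ pat, in string }{
		{`abc(\d+)`, "xx abc123"}, // literal prefix, capturing group, match not at offset 0
		{`^abc(\d+)$`, "abc123"},  // anchored one-pass pattern
		{`(?i)k.`, "zzKx"},        // case-folded single-rune prefix
	}
	for _, c := range cases {
		want := std.MustCompile(c.pat).FindStringIndex(c.in)
		got := graf.MustCompile(c.pat).FindStringIndex(c.in)
		if !reflect.DeepEqual(want, got) {
			t.Errorf("pattern %q on %q: stdlib %v, grafana %v", c.pat, c.in, want, got)
		}
	}
}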
vendor/github.com/grafana/regexp/regexp.go (4)
1251-1283
: Avoid shadowing the importedstrings
package in SplitLocal var
strings
shadows the imported package name. Rename for clarity.- matches := re.FindAllStringIndex(s, n) - strings := make([]string, 0, len(matches)) + matches := re.FindAllStringIndex(s, n) + parts := make([]string, 0, len(matches)) @@ - if n > 0 && len(strings) >= n-1 { + if n > 0 && len(parts) >= n-1 { break } @@ - if match[1] != 0 { - strings = append(strings, s[beg:end]) + if match[1] != 0 { + parts = append(parts, s[beg:end]) } beg = match[1] } @@ - if end != len(s) { - strings = append(strings, s[beg:]) + if end != len(s) { + parts = append(parts, s[beg:]) } - return strings + return parts
9-12
: Modernize doc link (optional)Update golang.org shortlink to go.dev.
-// https://golang.org/s/re2syntax, except for \C. +// https://go.dev/s/re2syntax, except for \C.
540-571
: Hot‑path caution for convenience Match funcs*These compile the pattern on every call. For hot paths, prefer Compile/MustCompile once and reuse the Regexp.
68-69
: Compat tests vs stdlib to de‑risk forked behaviorRecommend adding a small test matrix comparing this package to stdlib regexp for core ops (MatchString, FindStringSubmatch, ReplaceAllString, Split) to catch divergences early.
// vendor/github.com/grafana/regexp/compat_test.go package regexp_test import ( std "regexp" graf "github.com/grafana/regexp" "testing" ) func TestCompat_Basic(t *testing.T) { cases := []string{`a+b?`, `^foo(\d+)bar$`, `(?P<name>\w+)`, `a*`, `[[:alpha:]]+`, `^$`} for _, pat := range cases { stdre := std.MustCompile(pat) gre := graf.MustCompile(pat) in := []string{"", "foo123bar", "aab", "xyz", "aaab?", "αβγ", "foo42bar baz"} for _, s := range in { if stdre.MatchString(s) != gre.MatchString(s) { t.Fatalf("MatchString diverged for %q on %q", pat, s) } if gotS, gotG := stdre.FindString(s), gre.FindString(s); gotS != gotG { t.Fatalf("FindString diverged for %q on %q: %q vs %q", pat, s, gotS, gotG) } if gotS, gotG := stdre.Split(s, -1), gre.Split(s, -1); len(gotS) != len(gotG) { t.Fatalf("Split len diverged for %q on %q: %v vs %v", pat, s, gotS, gotG) } } } }vendor/github.com/prometheus/client_golang/prometheus/wrap.go (1)
81-103
: Defensive copy of Labels to avoid concurrent map mutation/racesLabels is a map; callers might mutate it after passing in. Snapshot it when wrapping to prevent data races and surprising changes.
-func WrapCollectorWith(labels Labels, c Collector) Collector { - return &wrappingCollector{ - wrappedCollector: c, - labels: labels, - } -} +func WrapCollectorWith(labels Labels, c Collector) Collector { + var labelsCopy Labels + if labels != nil { + labelsCopy = make(Labels, len(labels)) + for k, v := range labels { + labelsCopy[k] = v + } + } + return &wrappingCollector{ + wrappedCollector: c, + labels: labelsCopy, + } +}vendor/github.com/prometheus/client_golang/prometheus/histogram.go (2)
1967-1994
: Nit: local variable naming.nativehistogram is verbose and stutters; use nh or m.
Apply this diff:
- nativehistogram, err := NewConstNativeHistogram(desc, + nh, err := NewConstNativeHistogram(desc, ... - if err != nil { + if err != nil { panic(err) } - return nativehistogram + return nh
2006-2056
: makeBucketsFromMap: mirror of runtime encoder — OK; watch for negative values.Given validateCount will now reject negatives, this should be safe. If input counts are per‑bucket, consider cumulating before appendDelta; otherwise, add a brief comment that counts are cumulative to avoid misuse.
vendor/github.com/gorilla/websocket/conn.go (1)
994-996
: Nit: fix comment typos ("best effor" → "best effort").Purely cosmetic.
-// Make a best effor to send a close message describing the problem. +// Make a best effort to send a close message describing the problem. ... -// Make a best effor to send the close message. +// Make a best effort to send the close message.Also applies to: 1151-1152
vendor/github.com/gorilla/websocket/server.go (1)
356-356
: Typo.“buferred” → “buffered”.
vendor/github.com/gorilla/websocket/client.go (1)
500-506
: Optionally set a stricter TLS minimum when cloning config.For hardened defaults, set
MinVersion
when not provided to avoid legacy protocol negotiation.Apply:
func cloneTLSConfig(cfg *tls.Config) *tls.Config { if cfg == nil { - return &tls.Config{} + return &tls.Config{MinVersion: tls.VersionTLS13} } - return cfg.Clone() + c := cfg.Clone() + if c.MinVersion == 0 { + c.MinVersion = tls.VersionTLS13 + } + return c }Note: adopt only if compatible with your supported endpoints.
pkg/pipelinerunmetrics/metrics_test.go (1)
117-191
: Consider asserting duration/count metrics too.Nice job wiring the OTEL
ManualReader
. You can mirror the approach here to validateDurationAndCount
emitted the expected histogram/counter with tags, not just “no error”.I can draft a small reader-based assertion for one success case if helpful.
pkg/taskrunmetrics/metrics_test.go (1)
88-91
: Restore OTEL global MeterProvider after testsSetting a global provider without restoring can leak state across tests. Capture the previous provider and restore it via t.Cleanup in each test that calls SetMeterProvider.
- reader := metric.NewManualReader() - provider := metric.NewMeterProvider(metric.WithReader(reader)) - otel.SetMeterProvider(provider) + reader := metric.NewManualReader() + provider := metric.NewMeterProvider(metric.WithReader(reader)) + prev := otel.GetMeterProvider() + otel.SetMeterProvider(provider) + t.Cleanup(func() { otel.SetMeterProvider(prev) })Also applies to: 224-227, 319-321
pkg/taskrunmetrics/metrics.go (1)
389-407
: Lock scope in RecordPodLatency can be reducedNo recorder state is mutated until the final Record call. Consider locking only around the Record to minimize contention.
- r.mutex.Lock() - defer r.mutex.Unlock() - scheduledTime := getScheduledTime(pod) ... - r.podLatencyHistogram.Record(ctx, float64(latency.Milliseconds()), metric.WithAttributes(attrs...)) + r.mutex.Lock() + r.podLatencyHistogram.Record(ctx, float64(latency.Milliseconds()), metric.WithAttributes(attrs...)) + r.mutex.Unlock()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
go.sum is excluded by !**/*.sum
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service.pb.go is excluded by !**/*.pb.go
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service.pb.gw.go is excluded by !**/*.pb.gw.go
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service_grpc.pb.go is excluded by !**/*.pb.go
vendor/go.opentelemetry.io/proto/otlp/metrics/v1/metrics.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (107)
- cmd/webhook/main.go (0 hunks)
- config/config-observability.yaml (2 hunks)
- go.mod (8 hunks)
- pkg/client/clientset/versioned/fake/clientset_generated.go (2 hunks)
- pkg/client/clientset/versioned/typed/pipeline/v1/pipeline_client.go (3 hunks)
- pkg/client/clientset/versioned/typed/pipeline/v1alpha1/pipeline_client.go (3 hunks)
- pkg/client/clientset/versioned/typed/pipeline/v1beta1/pipeline_client.go (3 hunks)
- pkg/client/informers/externalversions/pipeline/v1/pipeline.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1/pipelinerun.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1/task.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1/taskrun.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1alpha1/run.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1alpha1/stepaction.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1alpha1/verificationpolicy.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/customrun.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/pipeline.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/pipelinerun.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/stepaction.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/task.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/taskrun.go (1 hunks)
- pkg/client/resolution/clientset/versioned/fake/clientset_generated.go (2 hunks)
- pkg/client/resolution/clientset/versioned/typed/resolution/v1alpha1/resolution_client.go (3 hunks)
- pkg/client/resolution/clientset/versioned/typed/resolution/v1beta1/resolution_client.go (3 hunks)
- pkg/client/resolution/informers/externalversions/resolution/v1alpha1/resolutionrequest.go (1 hunks)
- pkg/client/resolution/informers/externalversions/resolution/v1beta1/resolutionrequest.go (1 hunks)
- pkg/client/resource/clientset/versioned/fake/clientset_generated.go (2 hunks)
- pkg/client/resource/clientset/versioned/typed/resource/v1alpha1/resource_client.go (3 hunks)
- pkg/client/resource/informers/externalversions/resource/v1alpha1/pipelineresource.go (1 hunks)
- pkg/pipelinerunmetrics/metrics.go (3 hunks)
- pkg/pipelinerunmetrics/metrics_test.go (5 hunks)
- pkg/reconciler/pipelinerun/pipelinerun.go (1 hunks)
- pkg/taskrunmetrics/metrics.go (7 hunks)
- pkg/taskrunmetrics/metrics_test.go (8 hunks)
- test/util.go (0 hunks)
- vendor/github.com/go-logr/logr/slogr/slogr.go (1 hunks)
- vendor/github.com/go-logr/zapr/.gitignore (1 hunks)
- vendor/github.com/go-logr/zapr/.golangci.yaml (1 hunks)
- vendor/github.com/go-logr/zapr/LICENSE (1 hunks)
- vendor/github.com/go-logr/zapr/README.md (1 hunks)
- vendor/github.com/go-logr/zapr/slogzapr.go (1 hunks)
- vendor/github.com/go-logr/zapr/zapr.go (1 hunks)
- vendor/github.com/go-logr/zapr/zapr_noslog.go (1 hunks)
- vendor/github.com/go-logr/zapr/zapr_slog.go (1 hunks)
- vendor/github.com/golang/protobuf/proto/buffer.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/defaults.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/deprecated.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/discard.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/extensions.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/properties.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/proto.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/registry.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/text_decode.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/text_encode.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/wire.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/wrappers.go (0 hunks)
- vendor/github.com/google/gofuzz/.travis.yml (0 hunks)
- vendor/github.com/google/gofuzz/CONTRIBUTING.md (0 hunks)
- vendor/github.com/google/gofuzz/fuzz.go (0 hunks)
- vendor/github.com/gorilla/websocket/README.md (2 hunks)
- vendor/github.com/gorilla/websocket/client.go (7 hunks)
- vendor/github.com/gorilla/websocket/compression.go (1 hunks)
- vendor/github.com/gorilla/websocket/conn.go (19 hunks)
- vendor/github.com/gorilla/websocket/proxy.go (3 hunks)
- vendor/github.com/gorilla/websocket/server.go (6 hunks)
- vendor/github.com/gorilla/websocket/tls_handshake.go (0 hunks)
- vendor/github.com/gorilla/websocket/tls_handshake_116.go (0 hunks)
- vendor/github.com/gorilla/websocket/x_net_proxy.go (0 hunks)
- vendor/github.com/grafana/regexp/.gitignore (1 hunks)
- vendor/github.com/grafana/regexp/LICENSE (1 hunks)
- vendor/github.com/grafana/regexp/README.md (1 hunks)
- vendor/github.com/grafana/regexp/backtrack.go (1 hunks)
- vendor/github.com/grafana/regexp/exec.go (1 hunks)
- vendor/github.com/grafana/regexp/onepass.go (1 hunks)
- vendor/github.com/grafana/regexp/regexp.go (1 hunks)
- vendor/github.com/openzipkin/zipkin-go/LICENSE (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/annotation.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/doc.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/endpoint.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/kind.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/span.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/span_id.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/traceid.go (0 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/collectorfunc.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/desc.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/go_collector_latest.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/histogram.go (6 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/internal/difflib.go (6 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/internal/go_runtime_metrics.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/metric.go (3 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector.go (2 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.c (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_nocgo_darwin.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_not_supported.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_procfsenabled.go (2 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_windows.go (2 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go (7 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/promhttp/internal/compression.go (2 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/summary.go (4 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/vec.go (5 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/wrap.go (2 hunks)
- vendor/github.com/prometheus/common/expfmt/text_parse.go (1 hunks)
- vendor/github.com/prometheus/common/model/alert.go (1 hunks)
- vendor/github.com/prometheus/common/model/labels.go (2 hunks)
- vendor/github.com/prometheus/common/model/metric.go (2 hunks)
⛔ Files not processed due to max files limit (54)
- vendor/github.com/prometheus/common/model/time.go
- vendor/github.com/prometheus/otlptranslator/.gitignore
- vendor/github.com/prometheus/otlptranslator/.golangci.yml
- vendor/github.com/prometheus/otlptranslator/CODE_OF_CONDUCT.md
- vendor/github.com/prometheus/otlptranslator/LICENSE
- vendor/github.com/prometheus/otlptranslator/MAINTAINERS.md
- vendor/github.com/prometheus/otlptranslator/README.md
- vendor/github.com/prometheus/otlptranslator/SECURITY.md
- vendor/github.com/prometheus/otlptranslator/constants.go
- vendor/github.com/prometheus/otlptranslator/metric_namer.go
- vendor/github.com/prometheus/otlptranslator/metric_type.go
- vendor/github.com/prometheus/otlptranslator/normalize_label.go
- vendor/github.com/prometheus/otlptranslator/strconv.go
- vendor/github.com/prometheus/otlptranslator/unit_namer.go
- vendor/github.com/prometheus/procfs/.golangci.yml
- vendor/github.com/prometheus/procfs/Makefile.common
- vendor/github.com/prometheus/procfs/README.md
- vendor/github.com/prometheus/procfs/arp.go
- vendor/github.com/prometheus/procfs/fs.go
- vendor/github.com/prometheus/procfs/fs_statfs_notype.go
- vendor/github.com/prometheus/procfs/fscache.go
- vendor/github.com/prometheus/procfs/internal/fs/fs.go
- vendor/github.com/prometheus/procfs/internal/util/parse.go
- vendor/github.com/prometheus/procfs/internal/util/sysreadfile.go
- vendor/github.com/prometheus/procfs/mdstat.go
- vendor/github.com/prometheus/procfs/meminfo.go
- vendor/github.com/prometheus/procfs/mountstats.go
- vendor/github.com/prometheus/procfs/net_dev_snmp6.go
- vendor/github.com/prometheus/procfs/net_ip_socket.go
- vendor/github.com/prometheus/procfs/net_protocols.go
- vendor/github.com/prometheus/procfs/net_tcp.go
- vendor/github.com/prometheus/procfs/net_unix.go
- vendor/github.com/prometheus/procfs/proc.go
- vendor/github.com/prometheus/procfs/proc_cgroup.go
- vendor/github.com/prometheus/procfs/proc_io.go
- vendor/github.com/prometheus/procfs/proc_netstat.go
- vendor/github.com/prometheus/procfs/proc_smaps.go
- vendor/github.com/prometheus/procfs/proc_snmp.go
- vendor/github.com/prometheus/procfs/proc_snmp6.go
- vendor/github.com/prometheus/procfs/proc_stat.go
- vendor/github.com/prometheus/procfs/proc_statm.go
- vendor/github.com/prometheus/procfs/proc_status.go
- vendor/github.com/prometheus/procfs/proc_sys.go
- vendor/github.com/prometheus/procfs/softirqs.go
- vendor/go.opencensus.io/plugin/ochttp/client.go
- vendor/go.opencensus.io/plugin/ochttp/client_stats.go
- vendor/go.opencensus.io/plugin/ochttp/doc.go
- vendor/go.opencensus.io/plugin/ochttp/propagation/b3/b3.go
- vendor/go.opencensus.io/plugin/ochttp/propagation/tracecontext/propagation.go
- vendor/go.opencensus.io/plugin/ochttp/route.go
- vendor/go.opencensus.io/plugin/ochttp/server.go
- vendor/go.opencensus.io/plugin/ochttp/span_annotating_client_trace.go
- vendor/go.opencensus.io/plugin/ochttp/stats.go
- vendor/go.opencensus.io/plugin/ochttp/trace.go
💤 Files with no reviewable changes (28)
- vendor/github.com/gorilla/websocket/tls_handshake.go
- vendor/github.com/openzipkin/zipkin-go/model/endpoint.go
- vendor/github.com/gorilla/websocket/tls_handshake_116.go
- vendor/github.com/openzipkin/zipkin-go/model/kind.go
- vendor/github.com/golang/protobuf/proto/registry.go
- vendor/github.com/golang/protobuf/proto/properties.go
- vendor/github.com/openzipkin/zipkin-go/model/traceid.go
- vendor/github.com/golang/protobuf/proto/defaults.go
- vendor/github.com/openzipkin/zipkin-go/model/span.go
- vendor/github.com/golang/protobuf/proto/text_decode.go
- vendor/github.com/openzipkin/zipkin-go/model/span_id.go
- vendor/github.com/golang/protobuf/proto/proto.go
- vendor/github.com/google/gofuzz/fuzz.go
- vendor/github.com/golang/protobuf/proto/deprecated.go
- test/util.go
- vendor/github.com/google/gofuzz/CONTRIBUTING.md
- vendor/github.com/openzipkin/zipkin-go/model/annotation.go
- vendor/github.com/golang/protobuf/proto/extensions.go
- cmd/webhook/main.go
- vendor/github.com/golang/protobuf/proto/wire.go
- vendor/github.com/openzipkin/zipkin-go/LICENSE
- vendor/github.com/golang/protobuf/proto/discard.go
- vendor/github.com/openzipkin/zipkin-go/model/doc.go
- vendor/github.com/golang/protobuf/proto/wrappers.go
- vendor/github.com/golang/protobuf/proto/text_encode.go
- vendor/github.com/gorilla/websocket/x_net_proxy.go
- vendor/github.com/google/gofuzz/.travis.yml
- vendor/github.com/golang/protobuf/proto/buffer.go
✅ Files skipped from review due to trivial changes (2)
- vendor/github.com/go-logr/zapr/.gitignore
- vendor/github.com/prometheus/common/model/metric.go
🚧 Files skipped from review as they are similar to previous changes (24)
- vendor/github.com/go-logr/zapr/.golangci.yaml
- vendor/github.com/prometheus/client_golang/prometheus/go_collector_latest.go
- vendor/github.com/prometheus/client_golang/prometheus/summary.go
- pkg/client/informers/externalversions/pipeline/v1alpha1/run.go
- pkg/client/resolution/informers/externalversions/resolution/v1beta1/resolutionrequest.go
- pkg/client/resolution/informers/externalversions/resolution/v1alpha1/resolutionrequest.go
- config/config-observability.yaml
- pkg/client/informers/externalversions/pipeline/v1/taskrun.go
- vendor/github.com/go-logr/zapr/zapr.go
- pkg/client/clientset/versioned/typed/pipeline/v1beta1/pipeline_client.go
- pkg/client/informers/externalversions/pipeline/v1beta1/task.go
- vendor/github.com/prometheus/client_golang/prometheus/desc.go
- vendor/github.com/go-logr/zapr/LICENSE
- pkg/reconciler/pipelinerun/pipelinerun.go
- vendor/github.com/grafana/regexp/LICENSE
- pkg/client/informers/externalversions/pipeline/v1beta1/pipelinerun.go
- pkg/client/resolution/clientset/versioned/fake/clientset_generated.go
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_nocgo_darwin.go
- vendor/github.com/prometheus/common/model/labels.go
- pkg/client/resolution/clientset/versioned/typed/resolution/v1beta1/resolution_client.go
- vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go
- vendor/github.com/go-logr/zapr/zapr_noslog.go
- vendor/github.com/grafana/regexp/README.md
- pkg/client/informers/externalversions/pipeline/v1beta1/customrun.go
🧰 Additional context used
🧬 Code graph analysis (7)
pkg/client/informers/externalversions/pipeline/v1beta1/stepaction.go (1)
- pkg/client/informers/externalversions/pipeline/v1beta1/interface.go (1): Interface (26-39)
vendor/github.com/prometheus/client_golang/prometheus/metric.go (1)
- vendor/github.com/prometheus/client_golang/prometheus/histogram.go (1): Histogram (249-261)
vendor/github.com/prometheus/client_golang/prometheus/histogram.go (1)
- vendor/github.com/prometheus/client_golang/prometheus/metric.go (1): Metric (33-59)
pkg/pipelinerunmetrics/metrics.go (2)
- pkg/apis/config/metrics.go (4): Metrics (101-109), PipelinerunLevelAtPipelinerun (65-65), PipelinerunLevelAtPipeline (68-68), PipelinerunLevelAtNS (71-71)
- pkg/taskrunmetrics/metrics.go (3): Recorder (46-65), NewRecorder (74-91), OnStore (191-207)
pkg/taskrunmetrics/metrics_test.go (4)
- pkg/taskrunmetrics/metrics.go (2): Recorder (46-65), NewRecorder (74-91)
- pkg/names/generate.go (1): SimpleNameGenerator (48-48)
- pkg/apis/config/metrics.go (1): Metrics (101-109)
- pkg/pod/status.go (2): ReasonExceededResourceQuota (60-60), ReasonExceededNodeResources (64-64)
pkg/taskrunmetrics/metrics.go (5)
- pkg/apis/config/metrics.go (7): Metrics (101-109), TaskrunLevelAtTaskrun (52-52), TaskrunLevelAtTask (54-54), TaskrunLevelAtNS (56-56), PipelinerunLevelAtPipelinerun (65-65), PipelinerunLevelAtPipeline (68-68), PipelinerunLevelAtNS (71-71)
- pkg/pipelinerunmetrics/metrics.go (3): Recorder (47-63), NewRecorder (72-88), OnStore (332-348)
- pkg/apis/pipeline/register.go (4): PipelineTaskLabelKey (47-47), TaskLabelKey (29-29), PipelineLabelKey (38-38), PipelineRunLabelKey (41-41)
- vendor/go.opentelemetry.io/otel/attribute/set.go (1): NewSet (183-186)
- pkg/pod/status.go (2): ReasonExceededResourceQuota (60-60), ReasonExceededNodeResources (64-64)
pkg/pipelinerunmetrics/metrics_test.go (2)
- pkg/pipelinerunmetrics/metrics.go (2): NewRecorder (72-88), Recorder (47-63)
- pkg/apis/config/metrics.go (4): PipelinerunLevelAtPipelinerun (65-65), PipelinerunLevelAtPipeline (68-68), PipelinerunLevelAtNS (71-71), Metrics (101-109)
🪛 ast-grep (0.39.5)
vendor/github.com/gorilla/websocket/client.go
[warning] 501-501: MinVersion is missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported and insecure browsers (such as Internet Explorer 10) may it be necessary to enable TLS 1.0 to provide support. Add 'MinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures
(missing-ssl-minversion-go)
🪛 markdownlint-cli2 (0.18.1)
vendor/github.com/go-logr/zapr/README.md
10-10: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
duration := time.Since(pr.Status.StartTime.Time)
if pr.Status.CompletionTime != nil {
	duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
}
Guard against nil StartTime in DurationAndCount
If StartTime is nil, this panics. Bail out or skip recording.
- duration := time.Since(pr.Status.StartTime.Time)
- if pr.Status.CompletionTime != nil {
- duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
- }
+ if pr.Status.StartTime == nil || pr.Status.StartTime.IsZero() {
+ return nil
+ }
+ duration := time.Since(pr.Status.StartTime.Time)
+ if pr.Status.CompletionTime != nil {
+ duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
+ }
🤖 Prompt for AI Agents
In pkg/pipelinerunmetrics/metrics.go around lines 172-176, the code assumes
pr.Status.StartTime exists and uses pr.Status.StartTime.Time which can be
nil/zero and cause a panic; add a guard that checks if StartTime is nil (if it's
a pointer) or StartTime.IsZero() (if it's a metav1.Time value) and if so bail
out or skip recording the metric, otherwise compute duration as before and
continue; ensure the nil/zero check is performed before any use of
StartTime.Time.
if pr.Status.ChildReferences != nil {
	for _, childRef := range pr.Status.ChildReferences {
		if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" {
			waitingOnTaskCount++
		}
	}
}
Over-counts “waiting on task resolution” per child; count each PipelineRun at most once
Incrementing inside the loop counts multiple TaskRuns per PR. If the intent is “number of PRs waiting…”, gate by a boolean.
- if pr.Status.ChildReferences != nil {
- for _, childRef := range pr.Status.ChildReferences {
- if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" {
- waitingOnTaskCount++
- }
- }
- }
+ if pr.Status.ChildReferences != nil {
+ counted := false
+ for _, cr := range pr.Status.ChildReferences {
+ if cr.Kind == "TaskRun" && cr.PipelineTaskName != "" {
+ counted = true
+ break
+ }
+ }
+ if counted {
+ waitingOnTaskCount++
+ }
+ }
🤖 Prompt for AI Agents
pkg/pipelinerunmetrics/metrics.go around lines 232 to 238: the loop currently
increments waitingOnTaskCount for every TaskRun childReference, which
over-counts when a PipelineRun has multiple TaskRuns; change the logic to count
each PipelineRun at most once by using a boolean flag (e.g., foundWaitingTask)
initialized false before the inner loop, set it true when you encounter the
first childRef with Kind == "TaskRun" and PipelineTaskName != "", break out of
the inner loop immediately, and after the loop increment waitingOnTaskCount only
if the flag is true.
r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status)))
Align TaskRun total counter attributes; respect CountWithReason
Currently only “status” is attached. Include namespace and optionally reason depending on cfg.CountWithReason to avoid uncontrolled cardinality and to match PR totals.
- r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status)))
+ countAttrs := []attribute.KeyValue{
+ attribute.String("namespace", tr.Namespace),
+ attribute.String("status", status),
+ }
+ if r.cfg != nil && r.cfg.CountWithReason {
+ countAttrs = append(countAttrs, attribute.String("reason", reason))
+ }
+ r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics.go around lines 307 to 308, the call to
r.trTotalCounter.Add only attaches the "status" attribute; update it to also
include the TaskRun's namespace (attribute.String("namespace", namespace)) and,
if cfg.CountWithReason is true, include the reason attribute
(attribute.String("reason", reason)); ensure the code reads cfg.CountWithReason
and conditionally appends the reason attribute to the slice passed into
metric.WithAttributes so we avoid uncontrolled cardinality when CountWithReason
is disabled.
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
🧩 Analysis chain
Vendor license/NOTICE verification
This file carries Go Authors BSD header. Confirm LICENSE for this vendored module is included and compliant (BSD 3‑Clause), and repository NOTICE/THIRD_PARTY files updated if required.
🏁 Script executed:
#!/bin/bash
# Check for license files alongside the vendored package and at repo root
fd -a -H -I '^LICENSE(|\.md|\.txt)?$' vendor/github.com/grafana/regexp -x ls -l {} || true
fd -a -H -I 'LICENSE*|NOTICE*|THIRD*PARTY*' -d 1 . -x ls -l {} || true
# Grep for BSD in vendor license (if present)
rg -n --hidden -S 'BSD' vendor/github.com/grafana/regexp || true
Length of output: 583
🏁 Script executed:
#!/bin/bash
# Search entire repo for NOTICE* or THIRD_PARTY* files
fd -H -I 'NOTICE*|THIRD*PARTY*'
Length of output: 622
Add NOTICE reference for vendored grafana/regexp
The module’s BSD-style LICENSE exists at vendor/github.com/grafana/regexp/LICENSE, but there’s no corresponding NOTICE or THIRD_PARTY entry in the repo. Add a NOTICE file under vendor/github.com/grafana/regexp or update the project’s NOTICE/THIRD_PARTY documentation to include grafana/regexp.
🤖 Prompt for AI Agents
In vendor/github.com/grafana/regexp/regexp.go around lines 1 to 3, the vendored
grafana/regexp package is missing a NOTICE/THIRD_PARTY entry; add a NOTICE file
under vendor/github.com/grafana/regexp (or update the repository-level NOTICE or
THIRD_PARTY file) that cites the module name, the BSD-style license and path to
vendor/github.com/grafana/regexp/LICENSE, and any required attribution text per
the license; ensure the new NOTICE includes the package name, copyright holder,
and a pointer to the bundled LICENSE, or alternatively add an entry to the
project-wide THIRD_PARTY/NOTICE listing with the same information.
 func ExponentialBucketsRange(minBucket, maxBucket float64, count int) []float64 {
 	if count < 1 {
 		panic("ExponentialBucketsRange count needs a positive count")
 	}
-	if min <= 0 {
+	if minBucket <= 0 {
 		panic("ExponentialBucketsRange min needs to be greater than 0")
 	}

 	// Formula for exponential buckets.
 	// max = min*growthFactor^(bucketCount-1)

 	// We know max/min and highest bucket. Solve for growthFactor.
-	growthFactor := math.Pow(max/min, 1.0/float64(count-1))
+	growthFactor := math.Pow(maxBucket/minBucket, 1.0/float64(count-1))

 	// Now that we know growthFactor, solve for each bucket.
 	buckets := make([]float64, count)
 	for i := 1; i <= count; i++ {
-		buckets[i-1] = min * math.Pow(growthFactor, float64(i-1))
+		buckets[i-1] = minBucket * math.Pow(growthFactor, float64(i-1))
 	}
ExponentialBucketsRange: missing input validation for max > min (and count == 1 edge).
Without checks, callers can get non‑increasing buckets (later panics) or odd results for count==1 when min!=max. Add explicit guards.
Apply this diff:
func ExponentialBucketsRange(minBucket, maxBucket float64, count int) []float64 {
if count < 1 {
panic("ExponentialBucketsRange count needs a positive count")
}
if minBucket <= 0 {
panic("ExponentialBucketsRange min needs to be greater than 0")
}
+ if count == 1 {
+ if maxBucket != minBucket {
+ panic("ExponentialBucketsRange with count==1 requires max == min")
+ }
+ return []float64{minBucket}
+ }
+ if !(maxBucket > minBucket) {
+ panic("ExponentialBucketsRange requires max > min when count > 1")
+ }
// Formula for exponential buckets.
// max = min*growthFactor^(bucketCount-1)
// We know max/min and highest bucket. Solve for growthFactor.
growthFactor := math.Pow(maxBucket/minBucket, 1.0/float64(count-1))
// Now that we know growthFactor, solve for each bucket.
buckets := make([]float64, count)
for i := 1; i <= count; i++ {
buckets[i-1] = minBucket * math.Pow(growthFactor, float64(i-1))
}
return buckets
}
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/client_golang/prometheus/histogram.go around
lines 339 to 357, add input validation: ensure maxBucket is greater than
minBucket (panic if maxBucket <= minBucket) and handle the count==1 edge
explicitly (if count == 1 then only allow it when minBucket == maxBucket and
return a single-element slice, otherwise panic). Keep the existing checks for
count<1 and minBucket<=0, and perform the new validations before computing
growthFactor so we never compute invalid or non‑increasing buckets.
//go:build !windows && !js && !wasip1 && !darwin
// +build !windows,!js,!wasip1,!darwin
🧩 Analysis chain
Darwin excluded: verify alternate implementation exists.
Excluding darwin here is fine if a darwin-specific file covers Describe/Collect; otherwise mac builds will miss these metrics.
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Process collector variants in vendor:"
fd -t f 'process_collector*' vendor/github.com/prometheus/client_golang/prometheus | sort
Length of output: 726
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== process_collector_darwin.go (first 200 lines) ==="
sed -n '1,200p' vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go
echo
echo "=== Searching for Describe/Collect implementations ==="
rg -nP '\bfunc \(.*ProcessCollector.*\)\s+(Describe|Collect)' -C2 vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go || true
Length of output: 4308
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== process_collector.go (first 200 lines) ==="
sed -n '1,200p' vendor/github.com/prometheus/client_golang/prometheus/process_collector.go
echo
for file in vendor/github.com/prometheus/client_golang/prometheus/process_collector*.go; do
echo "=== In $file ==="
rg -nP 'func \\(.*processCollector.*\\) (Describe|Collect)' -C1 "$file" || true
done
Length of output: 5967
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Searching for describe implementations in all variants ==="
rg -nP 'func \\(c \\*processCollector\\) describe' -C2 vendor/github.com/prometheus/client_golang/prometheus/process_collector*.go || true
Length of output: 62
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== process_collector_procfsenabled.go (first 200 lines) ==="
sed -n '1,200p' vendor/github.com/prometheus/client_golang/prometheus/process_collector_procfsenabled.go
Length of output: 2972
Implement Describe in process_collector_darwin.go: that file defines processCollect but no describe method, so macOS builds won’t emit any descriptors—add a matching func (c *processCollector) describe to expose metrics on Darwin.
🤖 Prompt for AI Agents
In
vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go
around where processCollect is defined (lines near 14–16), there is no Describe
method implemented so Darwin builds won't expose metric descriptors; add a
method with signature func (c *processCollector) Describe(ch chan<-
*prometheus.Desc) that sends each metric descriptor contained in the
processCollector to the provided channel (mirror the Describe implementation
from the non-Darwin variant), ensuring the collector satisfies
prometheus.Collector and that all descriptors used in Collect/ProcessCollect are
emitted.
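A rough sketch of what that could look like, mirroring the non-Darwin variant; the descriptor field names below (cpuTotal, rss, and so on) are assumptions and should be replaced with whatever the shared processCollector struct actually declares:

// Describe emits the collector's metric descriptors so the registry can
// validate and expose them on Darwin as well (sketch only).
func (c *processCollector) Describe(ch chan<- *Desc) {
	ch <- c.cpuTotal
	ch <- c.openFDs
	ch <- c.maxFDs
	ch <- c.vsize
	ch <- c.maxVsize
	ch <- c.rss
	ch <- c.startTime
}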
 		if internal.NewZstdWriter == nil {
 			// The content encoding was not implemented yet.
 			return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
 		}

-		z.Reset(rw)
-		return z, selected, func() { _ = z.Close() }, nil
+		writer, closeWriter, err := internal.NewZstdWriter(rw)
+		return writer, selected, closeWriter, err
 	case "gzip":
Compile-time bug: wrong fmt verb for slice argument.
defaultCompressionFormats() returns []Compression, but the error strings use %s (string), causing a compile error. Use %v or join into a string.
Apply this minimal fix:
- return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+ return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %v", selected, defaultCompressionFormats())
and
- return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+ return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %v", selected, defaultCompressionFormats())
Optionally, for nicer output:
+ // helper to stringify defaults for messages
+ valid := make([]string, 0, len(defaultCompressionFormats()))
+ for _, c := range defaultCompressionFormats() { valid = append(valid, string(c)) }
- return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+ return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, strings.Join(valid, ", "))
(Remember to import "strings" if using the optional improvement.)
Also applies to: 490-491
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go around
lines 475-481 (and similarly at 490-491), the error message uses the %s verb
while defaultCompressionFormats() returns a []Compression slice, causing a
compile error; change the format verb to %v (or convert the slice to a joined
string and import "strings" if you prefer more readable output) so the slice is
printed correctly and update both error calls consistently.
The 'hack/update-codegen.sh' script was failing with 'streaming_list_type_json_tags' API rule violations. This was caused by incorrect JSON tags on list types that prevented them from being correctly interpreted as streaming lists by the Kubernetes API. This commit fixes the following:
- Removes ',omitempty' from the 'Items' field in the v1 and v1beta1 PipelineRunList structs.
- Adds ',omitempty' to the 'ListMeta' field in the v1beta1 ResolutionRequestList struct.
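For reference, a minimal sketch of the tag shape that rule expects; the field layout is assumed from standard Kubernetes list types, and the real definitions live in pkg/apis/pipeline/v1/pipelinerun_types.go and the v1beta1 equivalents:

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// PipelineRunList sketch: metadata keeps omitempty, Items must not carry it,
// so the API machinery can treat the list as streamable.
type PipelineRunList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []PipelineRun `json:"items"`
}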
This isn't supported after knative otel's migration.
Force-pushed from bab5a44 to a2ea129
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
go.mod (1)
23-23
: Do not drop OpenCensus deps before migrating tests
Migrate or remove the OpenCensus imports in test/wait.go (line 54) and test/custom_task_test.go (line 38) (both import go.opencensus.io/trace) before removing go.opencensus.io and contrib.go.opencensus.io from go.mod.
pkg/taskrunmetrics/metrics.go (1)
276-283
: Harden DurationAndCount: nil StartTime/Condition, and align counter attributes. Prevent panics and align counter attributes with config to control cardinality.
- r.mutex.Lock()
- defer r.mutex.Unlock()
-
- duration := time.Since(tr.Status.StartTime.Time)
- if tr.Status.CompletionTime != nil {
-     duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time)
- }
+ r.mutex.Lock()
+ defer r.mutex.Unlock()
+ if tr.Status.StartTime == nil || tr.Status.StartTime.IsZero() {
+     return nil
+ }
+ duration := time.Since(tr.Status.StartTime.Time)
+ if tr.Status.CompletionTime != nil {
+     duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time)
+ }
@@
- cond := tr.Status.GetCondition(apis.ConditionSucceeded)
+ cond := tr.Status.GetCondition(apis.ConditionSucceeded)
+ if cond == nil {
+     return nil
+ }
@@
- attrs := []attribute.KeyValue{
-     attribute.String("namespace", tr.Namespace),
-     attribute.String("status", status),
-     attribute.String("reason", reason),
- }
+ attrs := []attribute.KeyValue{
+     attribute.String("namespace", tr.Namespace),
+     attribute.String("status", status),
+ }
+ if r.cfg != nil && r.cfg.CountWithReason {
+     attrs = append(attrs, attribute.String("reason", reason))
+ }
@@
- r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status)))
+ countAttrs := []attribute.KeyValue{
+     attribute.String("namespace", tr.Namespace),
+     attribute.String("status", status),
+ }
+ if r.cfg != nil && r.cfg.CountWithReason {
+     countAttrs = append(countAttrs, attribute.String("reason", reason))
+ }
+ r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))
Also applies to: 286-297, 307-308
🧹 Nitpick comments (15)
config/300-crds/300-taskrun.yaml (1)
4739-4744
: New Sidecar stopSignal: verify controller support and consider parity
- Please confirm the controller/plumbing maps this field into Pod specs for the targeted Kubernetes version and that any required feature gates are handled in install docs/manifests.
- Consider adding the same field for Step/StepTemplate to avoid API asymmetry.
- Optional: add a schema pattern or enum guidance (e.g., SIGTERM or numeric) to prevent invalid values.
config/300-crds/300-task.yaml (1)
692-697
: New lifecycle.stopSignal: add version/feature-gate note and basic validation hint. The field matches upstream K8s (alpha in v1.33 behind ContainerStopSignals; requires .spec.os.name). Suggest making this explicit in the description to prevent unexpected Pod rejections on clusters without the gate, and optionally add a light format check to catch typos (e.g., “^SIG[A-Z0-9]+$”). (kubernetes.io)
Proposed doc tweak (apply to each stopSignal description below):
- StopSignal defines which signal will be sent to a container when it is being stopped.
- If not specified, the default is defined by the container runtime in use.
- StopSignal can only be set for Pods with a non-empty .spec.os.name
+ StopSignal defines which signal will be sent to a container when it is being stopped.
+ If not specified, the default is defined by the container runtime in use.
+ Requires Kubernetes v1.33+ with the ContainerStopSignals feature gate enabled on apiserver and kubelet,
+ and can only be set for Pods with a non-empty .spec.os.name. On Windows nodes, only SIGTERM or SIGKILL are allowed.
Optional CEL validation (if you want an early guard in the CRD; safe to defer):
+ x-kubernetes-validations:
+   - rule: "self.matches('^SIG[A-Z0-9]+$')"
+     message: "stopSignal must look like a POSIX signal name, e.g., SIGTERM, SIGUSR1"
To check cluster support before surfacing this in Tekton-generated Pods, ensure your controller only forwards stopSignal when the API server’s ContainerStopSignals gate is on. (kubernetes.io)
Also applies to: 2061-2067, 3391-3397, 5123-5127
go.mod (1)
95-96
: Prefer a tagged release for gorilla/websocket over a pseudo-version. Pinning github.com/gorilla/websocket to a commit pseudo-version can hurt reproducibility and vulnerability tracking. Prefer the latest stable tag unless you need an unreleased fix. Check for a suitable tag providing your fix and update if possible.
pkg/pipelinerunmetrics/metrics_test.go (3)
183-189
: Optional: Assert recorded values using a manual reader (like below tests). Here you only check initialization. For stronger coverage, wire a metric.NewManualReader and assert one duration datapoint and counter increment as done in the “AtAllLevels” test.
301-315
: Isolate global state in tests (MeterProvider/recorder).This test mutates package globals (
otel
provider andr
). Addt.Cleanup
to restore the original MeterProvider and reset the global recorder/once to avoid cross‑test bleed.Example:
orig := otel.GetMeterProvider() t.Cleanup(func() { otel.SetMeterProvider(orig) r = nil once = sync.Once{} })Note: import
sync
and ensureonce
is accessible in this package.
364-383
: This test doesn’t assert anything about resolution-wait metrics.It only checks initialization/no error. Consider adding a manual reader and asserting
*_waiting_on_{pipeline,task}_resolution
counters produce expected datapoints.pkg/taskrunmetrics/metrics_test.go (2)
222-227
: Clean up global state in OTEL tests.These tests reset
once
and set a global MeterProvider but don’t restore it. Addt.Cleanup
to restore the previous provider and ensure deterministic behavior across tests.Example:
orig := otel.GetMeterProvider() t.Cleanup(func() { otel.SetMeterProvider(orig) once = sync.Once{} })Also applies to: 318-323
431-449
: Minor: fix typos in field names in table tests.
expetectedPipeline
/expetectedPipelineRun
are misspelled. Consider renaming toexpectedPipeline
/expectedPipelineRun
for clarity.Also applies to: 462-468
config/config-observability.yaml (3)
95-101
: Document allowed values and set a safe default for metrics.running-pipelinerun.levelEmpty string is ambiguous and can surprise users. Recommend documenting allowed values and defaulting to "namespace" to curb cardinality.
- # Tekton-specific metrics configuration + # Tekton-specific metrics configuration + # metrics.running-pipelinerun.level controls attribute granularity for the running gauge. + # Allowed values: "pipelinerun" | "pipeline" | "namespace" + # Default: "namespace" (lowest cardinality) - metrics.running-pipelinerun.level: "" + metrics.running-pipelinerun.level: "namespace"
40-74
: Clarify exporter semantics vs. controller reporting periodThese keys configure OTEL exporter/SDK behavior but not the controller’s internal 30s reporting loop (ReportingPeriod). Add a note to avoid confusion.
metrics-export-interval: "" + # Note: This is the OTEL SDK/export interval, not the Tekton controller's + # internal ReportingPeriod for periodic gauges. Those remain 30s unless + # wired explicitly in code.
95-101
: Add example knob for namespace label on throttle metricsTaskRun code reads cfg.ThrottleWithNamespace, but this example omits it. Add for discoverability.
metrics.count.enable-reason: "false" + # When true, throttle metrics will include the "namespace" attribute. + metrics.throttle.with-namespace: "false"pkg/pipelinerunmetrics/metrics.go (2)
245-254
: Avoid holding the lock while building attr sets; capture level before loopYou only need cfg to decide attribute shape. Read it once, then loop without the mutex.
- switch r.cfg.RunningPipelinerunLevel { + level := "" + if r.cfg != nil { + level = r.cfg.RunningPipelinerunLevel + } + switch level {
292-319
: getPipelineTagName: consider label-based fallbackIf a resolved PipelineRef name is injected via labels (e.g., PipelineLabelKey), you might prefer that over “anonymous” to improve attribution. Optional.
pkg/taskrunmetrics/metrics.go (2)
389-407
: Shrink lock scope in RecordPodLatency (optional)Lock is not needed for computing timestamps/attrs; hold it only for the Record call.
- r.mutex.Lock() - defer r.mutex.Unlock() - scheduledTime := getScheduledTime(pod) @@ - attrs := []attribute.KeyValue{ + attrs := []attribute.KeyValue{ attribute.String("namespace", tr.Namespace), attribute.String("pod", pod.Name), } attrs = append(attrs, r.insertTaskTag(taskName, tr.Name)...) - r.podLatencyHistogram.Record(ctx, float64(latency.Milliseconds()), metric.WithAttributes(attrs...)) + r.mutex.Lock() + r.podLatencyHistogram.Record(ctx, float64(latency.Milliseconds()), metric.WithAttributes(attrs...)) + r.mutex.Unlock()
25-34
: Prefer a single errors package for consistencyFile uses github.com/pkg/errors while PipelineRun metrics uses stdlib errors. Consider standardizing (stdlib is sufficient here).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
cmd/webhook/main.go
(0 hunks)config/300-crds/300-pipelinerun.yaml
(0 hunks)config/300-crds/300-task.yaml
(16 hunks)config/300-crds/300-taskrun.yaml
(7 hunks)config/config-observability.yaml
(2 hunks)go.mod
(8 hunks)pkg/apis/pipeline/v1/openapi_generated.go
(1 hunks)pkg/apis/pipeline/v1/pipelinerun_types.go
(1 hunks)pkg/apis/pipeline/v1/swagger.json
(1 hunks)pkg/apis/pipeline/v1beta1/openapi_generated.go
(1 hunks)pkg/apis/pipeline/v1beta1/pipelinerun_types.go
(1 hunks)pkg/apis/pipeline/v1beta1/swagger.json
(1 hunks)pkg/apis/resolution/v1beta1/resolution_request_types.go
(1 hunks)pkg/pipelinerunmetrics/metrics.go
(3 hunks)pkg/pipelinerunmetrics/metrics_test.go
(5 hunks)pkg/taskrunmetrics/metrics.go
(6 hunks)pkg/taskrunmetrics/metrics_test.go
(8 hunks)test/util.go
(0 hunks)
💤 Files with no reviewable changes (3)
- config/300-crds/300-pipelinerun.yaml
- cmd/webhook/main.go
- test/util.go
🧰 Additional context used
🧬 Code graph analysis (6)
pkg/apis/pipeline/v1beta1/pipelinerun_types.go (1)
pkg/apis/pipeline/v1/pipelinerun_types.go (2)
PipelineRun
(51-60)PipelineRun
(73-75)
pkg/apis/pipeline/v1/pipelinerun_types.go (1)
pkg/apis/pipeline/v1beta1/pipelinerun_types.go (2)
PipelineRun
(50-59)PipelineRun
(72-74)
pkg/pipelinerunmetrics/metrics_test.go (3)
pkg/pipelinerunmetrics/metrics.go (2)
NewRecorder
(72-88)Recorder
(47-63)pkg/names/generate.go (1)
SimpleNameGenerator
(48-48)pkg/apis/config/metrics.go (4)
PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)Metrics
(101-109)
pkg/taskrunmetrics/metrics.go (3)
pkg/apis/config/metrics.go (7)
Metrics
(101-109)TaskrunLevelAtTaskrun
(52-52)TaskrunLevelAtTask
(54-54)TaskrunLevelAtNS
(56-56)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/pipelinerunmetrics/metrics.go (3)
Recorder
(47-63)NewRecorder
(72-88)OnStore
(332-348)pkg/apis/pipeline/register.go (4)
PipelineTaskLabelKey
(47-47)TaskLabelKey
(29-29)PipelineLabelKey
(38-38)PipelineRunLabelKey
(41-41)
pkg/pipelinerunmetrics/metrics.go (3)
pkg/apis/pipeline/v1/pipelinerun_types.go (3)
PipelineRunReasonCancelled
(345-345)PipelineRun
(51-60)PipelineRun
(73-75)pkg/apis/config/metrics.go (4)
Metrics
(101-109)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/taskrunmetrics/metrics.go (3)
Recorder
(46-65)NewRecorder
(74-91)OnStore
(191-207)
pkg/taskrunmetrics/metrics_test.go (3)
pkg/taskrunmetrics/metrics.go (2)
Recorder
(46-65)NewRecorder
(74-91)pkg/apis/config/metrics.go (1)
Metrics
(101-109)pkg/pod/status.go (2)
ReasonExceededResourceQuota
(60-60)ReasonExceededNodeResources
(64-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (16)
pkg/apis/resolution/v1beta1/resolution_request_types.go (1)
54-54
: LGTM:omitempty
matches core K8s list conventions.Making
metadata
optional keeps the JSON shape aligned with Kubernetes list types and avoids emitting empty objects. No further action needed.pkg/apis/pipeline/v1/openapi_generated.go (1)
1253-1254
: Requireitems
in the schemaGood call on marking
items
as required—this mirrors the Go struct (json:"items"
) and matches Kubernetes list conventions so clients now get an accurate contract.config/300-crds/300-taskrun.yaml (3)
4472-4512
: EnvFromSource description broadened: LGTMAccurately reflects ConfigMap/Secret sources; prefix wording matches K8s docs. No schema impact.
5835-5875
: StepTemplate envFrom wording: LGTMClarifies CM/Secret usage and prefix; consistent with core API semantics.
6390-6430
: Steps envFrom wording: LGTMSame clarity improvements as above; consistent and non-breaking.
config/300-crds/300-task.yaml (1)
425-425
: EnvFromSource description tweak reads well; repo-wide consistency confirmed. Found matching descriptions in config/300-crds/300-task.yaml (lines 425, 1790, 3122, 4855, 6218, 6773) and config/300-crds/300-taskrun.yaml (lines 4472, 5835, 6390). No other CRDs include this field.pkg/pipelinerunmetrics/metrics_test.go (1)
265-276
: Tests won’t compile:attribute.Set
is not a valid map key; use a canonical string key.Maps keyed by
attribute.Set
are invalid/unreliable. Convert expected/got tomap[string]float64
using a stable key derived from the attribute set.Apply:
@@ - expected map[attribute.Set]float64 + expected map[string]float64 @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr1")): 1, - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"), attribute.String("pipelinerun", "pr2")): 1, - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"), attribute.String("pipelinerun", "pr3")): 1, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr6")): 1, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr7")): 1, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline3"), attribute.String("pipelinerun", "pr8")): 1, - }, + expected: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr1"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"), attribute.String("pipelinerun", "pr2"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"), attribute.String("pipelinerun", "pr3"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr6"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr7"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline3"), attribute.String("pipelinerun", "pr8"))): 1, + }, @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1")): 1, - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2")): 2, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1")): 2, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline3")): 1, - }, + expected: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"))): 2, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"))): 2, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline3"))): 1, + }, @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1")): 3, - attribute.NewSet(attribute.String("namespace", "testns2")): 3, - }, + expected: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"))): 3, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 3, + }, @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(): 6, - }, + expected: map[string]float64{ + attrKey(attribute.NewSet()): 6, + }, @@ - got := make(map[attribute.Set]float64) + got := make(map[string]float64) 
sum, ok := runningPRsMetric.Data.(metricdata.Sum[float64]) if !ok { t.Fatalf("metric data is not a Sum[float64]: %T", runningPRsMetric.Data) } for _, dp := range sum.DataPoints { - got[dp.Attributes] = dp.Value + got[attrKey(dp.Attributes)] = dp.Value }Add helper and imports:
@@ -import ( +import ( + "sort" + "strings" @@ )Place in this file:
// attrKey builds a deterministic key from an attribute set. func attrKey(s attribute.Set) string { kvs := s.ToSlice() sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) }) var b strings.Builder for _, kv := range kvs { b.WriteString(string(kv.Key)) b.WriteByte('=') b.WriteString(kv.Value.Emit()) b.WriteByte('|') } return b.String() }Also applies to: 279-285, 289-293, 296-299, 348-355
pkg/taskrunmetrics/metrics_test.go (2)
292-314
: Tests won’t compile/rely on non-comparableattribute.Set
keys; use canonical strings.Convert expected/got maps to
map[string]float64
and stringify attribute sets.Apply:
@@ - expectedQuota map[attribute.Set]float64 - expectedNode map[attribute.Set]float64 + expectedQuota map[string]float64 + expectedNode map[string]float64 @@ - expectedQuota: map[attribute.Set]float64{ - attribute.NewSet(): 3, - }, + expectedQuota: map[string]float64{ + attrKey(attribute.NewSet()): 3, + }, - expectedNode: map[attribute.Set]float64{ - attribute.NewSet(): 2, - }, + expectedNode: map[string]float64{ + attrKey(attribute.NewSet()): 2, + }, @@ - expectedQuota: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1")): 2, - attribute.NewSet(attribute.String("namespace", "testns2")): 1, - }, + expectedQuota: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"))): 2, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 1, + }, - expectedNode: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns2")): 1, - attribute.NewSet(attribute.String("namespace", "testns3")): 1, - }, + expectedNode: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns3"))): 1, + }, @@ - gotQuota := make(map[attribute.Set]float64) - gotNode := make(map[attribute.Set]float64) + gotQuota := make(map[string]float64) + gotNode := make(map[string]float64) @@ - gotQuota[dp.Attributes] = dp.Value + gotQuota[attrKey(dp.Attributes)] = dp.Value @@ - gotNode[dp.Attributes] = dp.Value + gotNode[attrKey(dp.Attributes)] = dp.ValueAdd helper/imports (if not already present in this file):
@@ -import ( +import ( + "sort" + "strings" @@ )func attrKey(s attribute.Set) string { kvs := s.ToSlice() sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) }) var b strings.Builder for _, kv := range kvs { b.WriteString(string(kv.Key)) b.WriteByte('=') b.WriteString(kv.Value.Emit()) b.WriteByte('|') } return b.String() }Also applies to: 338-356, 358-364
418-423
: Incorrect assumption aboutRecordPodLatency
signature
The method in pkg/taskrunmetrics/metrics.go is defined asfunc (r *Recorder) RecordPodLatency(ctx context.Context, pod *corev1.Pod, tr *v1.TaskRun) errorso the test calls using
(ctx, pod, taskRun)
already match the signature and require no change.Likely an incorrect or invalid review comment.
go.mod (2)
141-145
: Resolve AWS ECR replace/require divergence.I ran
rg -n 'replace .*aws-sdk-go-v2' go.mod rg -n 'github.com/aws/aws-sdk-go-v2/service/(ecr|ecrpublic)' go.modand found no other conflicts. Please manually verify there are no additional replace or require entries for these modules.
Either:
- Keep the
replace
and remove the explicitrequire
entries forservice/ecr
andservice/ecrpublic
, or- Drop the
replace
once upstreams are compatible and require the intended versions directly.Proposed cleanup (if keeping replace):
require ( @@ - github.com/aws/aws-sdk-go-v2/service/ecr v1.18.11 // indirect - github.com/aws/aws-sdk-go-v2/service/ecrpublic v1.16.2 // indirect + # removed: versions controlled by replace block )Add a short comment citing the upstream issue/PR and set a TODO with a re-evaluation date.
3-3
: Ignore patch version warning for thego
directive
Thego
directive in go.mod supports full release versions including patch (e.g.,go 1.24.0
) (tip.golang.org).Likely an incorrect or invalid review comment.
pkg/pipelinerunmetrics/metrics.go (3)
205-218
: Don’t hold the mutex while calling lister.List(); shorten the critical sectionMove the expensive list outside the critical section to avoid blocking updates/dynamic reconfig.
- r.mutex.Lock() - defer r.mutex.Unlock() - - prs, err := lister.List(labels.Everything()) + prs, err := lister.List(labels.Everything()) if err != nil { return fmt.Errorf("failed to list pipelineruns: %w", err) } + // lock only when reading/updating recorder state (below)
219-238
: Count “waiting on task resolution” at most once per PipelineRunCurrent loop increments per TaskRun child and inflates the count.
- if pr.Status.ChildReferences != nil { - for _, childRef := range pr.Status.ChildReferences { - if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" { - waitingOnTaskCount++ - } - } - } + if pr.Status.ChildReferences != nil { + counted := false + for _, cr := range pr.Status.ChildReferences { + if cr.Kind == "TaskRun" && cr.PipelineTaskName != "" { + counted = true + break + } + } + if counted { + waitingOnTaskCount++ + } + }
261-268: Fix UpDownCounter usage: add deltas, not totals (running + waiting gauges)

Adding absolute totals on each scrape makes gauges monotonically increase. Track the previous snapshot (per-attribute for running) and emit deltas; reset on instrument rebuild.
- // Report running_pipelineruns. - for attrSet, value := range currentCounts { - r.runningPRsGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, waitingOnPipelineCount, metric.WithAttributes()) - r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, waitingOnTaskCount, metric.WithAttributes()) + r.mutex.Lock() + defer r.mutex.Unlock() + + // Emit per-attr deltas and update cache. + if r.runningPrev == nil { + r.runningPrev = make(map[attribute.Set]float64) + } + // union of keys: negative deltas for disappeared keys + seen := make(map[attribute.Set]struct{}) + for set, curr := range currentCounts { + prev := r.runningPrev[set] + delta := curr - prev + if delta != 0 { + r.runningPRsGauge.Add(ctx, delta, metric.WithAttributes(set.ToSlice()...)) + } + r.runningPrev[set] = curr + seen[set] = struct{}{} + } + for set, prev := range r.runningPrev { + if _, ok := seen[set]; !ok && prev != 0 { + // went to zero, emit negative delta + r.runningPRsGauge.Add(ctx, -prev, metric.WithAttributes(set.ToSlice()...)) + delete(r.runningPrev, set) + } + } + + // cluster-wide waiting gauges + if delta := waitingOnPipelineCount - r.waitingOnPipelinePrev; delta != 0 { + r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, delta, metric.WithAttributes()) + r.waitingOnPipelinePrev = waitingOnPipelineCount + } + if delta := waitingOnTaskCount - r.waitingOnTaskPrev; delta != 0 { + r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, delta, metric.WithAttributes()) + r.waitingOnTaskPrev = waitingOnTaskCount + }And extend Recorder + reset on init:
 type Recorder struct {
@@
 	runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter
@@
+	// cached snapshots for delta-based UpDownCounter updates
+	runningPrev           map[attribute.Set]float64
+	waitingOnPipelinePrev float64
+	waitingOnTaskPrev     float64
 }
@@
 func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
@@
 	r.runningPRsWaitingOnTaskResolutionGauge = runningPRsWaitingOnTaskResolutionGauge
+
+	// reset caches on rebuild
+	r.runningPrev = make(map[attribute.Set]float64)
+	r.waitingOnPipelinePrev = 0
+	r.waitingOnTaskPrev = 0

pkg/taskrunmetrics/metrics.go (2)
318-325: Move lister.List() outside the critical section

Reduce lock contention; only lock when reading/updating recorder state.
-	r.mutex.Lock()
-	defer r.mutex.Unlock()
-
-	trs, err := lister.List(labels.Everything())
+	trs, err := lister.List(labels.Everything())
 	if err != nil {
 		return err
 	}
+	// lock later only for shared state updates
326-351: Fix UpDownCounter usage for running/throttled TaskRuns: emit deltas and handle attr-set churn

Totals on each tick inflate gauges. Track previous snapshots per attribute set and add only the delta; reset caches on instrument rebuild.
- runningTrs := 0 - trsThrottledByQuota := make(map[attribute.Set]float64) - trsThrottledByNode := make(map[attribute.Set]float64) + runningTrs := 0 + trsThrottledByQuota := make(map[attribute.Set]float64) + trsThrottledByNode := make(map[attribute.Set]float64) @@ - r.runningTRsGauge.Add(ctx, float64(runningTrs)) - for attrSet, value := range trsThrottledByQuota { - r.runningTRsThrottledByQuotaGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - for attrSet, value := range trsThrottledByNode { - r.runningTRsThrottledByNodeGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } + r.mutex.Lock() + defer r.mutex.Unlock() + // running + deltaRun := float64(runningTrs) - r.lastRunningTRs + if deltaRun != 0 { + r.runningTRsGauge.Add(ctx, deltaRun) + r.lastRunningTRs = float64(runningTrs) + } + // throttled by quota + if r.lastThrottledByQuota == nil { + r.lastThrottledByQuota = make(map[attribute.Set]float64) + } + seen := make(map[attribute.Set]struct{}) + for set, curr := range trsThrottledByQuota { + prev := r.lastThrottledByQuota[set] + if d := curr - prev; d != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, d, metric.WithAttributes(set.ToSlice()...)) + } + r.lastThrottledByQuota[set] = curr + seen[set] = struct{}{} + } + for set, prev := range r.lastThrottledByQuota { + if _, ok := seen[set]; !ok && prev != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, -prev, metric.WithAttributes(set.ToSlice()...)) + delete(r.lastThrottledByQuota, set) + } + } + // throttled by node + if r.lastThrottledByNode == nil { + r.lastThrottledByNode = make(map[attribute.Set]float64) + } + seen = make(map[attribute.Set]struct{}) + for set, curr := range trsThrottledByNode { + prev := r.lastThrottledByNode[set] + if d := curr - prev; d != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, d, metric.WithAttributes(set.ToSlice()...)) + } + r.lastThrottledByNode[set] = curr + seen[set] = struct{}{} + } + for set, prev := range r.lastThrottledByNode { + if _, ok := seen[set]; !ok && prev != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, -prev, metric.WithAttributes(set.ToSlice()...)) + delete(r.lastThrottledByNode, set) + } + }And extend Recorder + reset on init:
 type Recorder struct {
@@
 	podLatencyHistogram metric.Float64Histogram
@@
+	// caches for delta-based updates
+	lastRunningTRs       float64
+	lastThrottledByQuota map[attribute.Set]float64
+	lastThrottledByNode  map[attribute.Set]float64
 }
@@
 func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
@@
 	r.podLatencyHistogram = podLatencyHistogram
+
+	// reset caches on rebuild
+	r.lastRunningTRs = 0
+	r.lastThrottledByQuota = make(map[attribute.Set]float64)
+	r.lastThrottledByNode = make(map[attribute.Set]float64)

Also applies to: 353-360
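To make the UpDownCounter point concrete, here is a minimal, self-contained sketch (not code from this PR; the instrument name and the values are illustrative) of why adding absolute totals on every tick inflates the exported value, while adding deltas keeps it in line with the snapshot:

```go
package main

import (
	"context"

	"go.opentelemetry.io/otel"
)

func main() {
	ctx := context.Background()
	meter := otel.GetMeterProvider().Meter("example")

	// An UpDownCounter accumulates every Add call; it is never "set" to a value.
	gauge, _ := meter.Float64UpDownCounter("running_taskruns_example")

	// Reporting the absolute total each scrape double-counts:
	gauge.Add(ctx, 3) // exported sum: 3
	gauge.Add(ctx, 3) // exported sum: 6, although only 3 TaskRuns are running

	// Reporting the delta against the previous snapshot keeps the sum correct:
	prev, curr := 3.0, 3.0
	if d := curr - prev; d != 0 {
		gauge.Add(ctx, d) // no change, nothing added
	}
}
```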
	duration := time.Since(pr.Status.StartTime.Time)
	if pr.Status.CompletionTime != nil {
		duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
	}
Guard nil StartTime/Condition to avoid panics
Dereferencing StartTime or using a nil Condition can panic. Bail out early or skip until fields are set.
- duration := time.Since(pr.Status.StartTime.Time)
- if pr.Status.CompletionTime != nil {
- duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
- }
+ if pr.Status.StartTime == nil || pr.Status.StartTime.IsZero() {
+ return nil
+ }
+ duration := time.Since(pr.Status.StartTime.Time)
+ if pr.Status.CompletionTime != nil {
+ duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
+ }
@@
- cond := pr.Status.GetCondition(apis.ConditionSucceeded)
+ cond := pr.Status.GetCondition(apis.ConditionSucceeded)
+ if cond == nil {
+ return nil
+ }
@@
- attrs := []attribute.KeyValue{
- attribute.String("namespace", pr.Namespace),
- attribute.String("status", status),
- attribute.String("reason", reason),
- }
+ attrs := []attribute.KeyValue{
+ attribute.String("namespace", pr.Namespace),
+ attribute.String("status", status),
+ }
+ if r.cfg != nil && r.cfg.CountWithReason {
+ attrs = append(attrs, attribute.String("reason", reason))
+ }
Also applies to: 179-189
🤖 Prompt for AI Agents
In pkg/pipelinerunmetrics/metrics.go around lines 172-175 and 179-189, the code
assumes pr.Status.StartTime and certain Conditions are non-nil and dereferences
them, which can panic; add guards that check pr.Status.StartTime is set (not
nil/zero) before calling Time/Since/Sub and check the Condition pointer for nil
before accessing fields; if StartTime or the needed Condition is missing, bail
out early or skip emitting/updating metrics for this PipelineRun to avoid
panics.
- Updated `pipelinerunmetrics` and `taskrunmetrics` to use OpenTelemetry instruments (histograms, counters, gauges) for creating and recording metrics.
- Introduced new OpenTelemetry configurations in `config/config-observability.yaml` for exporters and protocols, while marking legacy OpenCensus settings as deprecated.
- Rewrote the test suites for `pipelinerunmetrics` and `taskrunmetrics` to be compatible with the new OpenTelemetry-based implementation.
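As a rough sketch of the instrument-creation pattern this change moves to (assumptions: the global MeterProvider is configured elsewhere, and the attribute values here are made up), the recorders obtain a Meter once, create named instruments from it, and then record observations with attributes:

```go
package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

func main() {
	ctx := context.Background()

	// Meter from the globally registered MeterProvider (a no-op if none is set).
	meter := otel.GetMeterProvider().Meter("tekton_pipelines_controller")

	// Instruments are created once and reused for every observation.
	durationHist, err := meter.Float64Histogram(
		"tekton_pipelines_controller_pipelinerun_duration_seconds",
		metric.WithDescription("The pipelinerun execution time in seconds"),
		metric.WithUnit("s"),
	)
	if err != nil {
		panic(err)
	}

	// Example observation; the namespace/status values are illustrative.
	durationHist.Record(ctx, 42.0,
		metric.WithAttributes(
			attribute.String("namespace", "default"),
			attribute.String("status", "success"),
		))
}
```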
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
go.mod (1)
3-3: Invalid Go version directive

go.mod requires go 1.24 (no patch) or another valid released version. 1.24.0 will be rejected by go tooling.

-go 1.24.0
+go 1.24

pkg/pipelinerunmetrics/metrics.go (1)
172-201: Guard nil StartTime/Condition and gate reason cardinality

Avoid panics when StartTime/Condition are unset; include “reason” only when configured for counters (to control cardinality).
- duration := time.Since(pr.Status.StartTime.Time) - if pr.Status.CompletionTime != nil { - duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time) - } + if pr.Status.StartTime == nil || pr.Status.StartTime.IsZero() { + return nil + } + duration := time.Since(pr.Status.StartTime.Time) + if pr.Status.CompletionTime != nil { + duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time) + } @@ - cond := pr.Status.GetCondition(apis.ConditionSucceeded) + cond := pr.Status.GetCondition(apis.ConditionSucceeded) + if cond == nil { + return nil + } @@ - attrs := []attribute.KeyValue{ - attribute.String("namespace", pr.Namespace), - attribute.String("status", status), - attribute.String("reason", reason), - } + attrs := []attribute.KeyValue{ + attribute.String("namespace", pr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + attrs = append(attrs, attribute.String("reason", reason)) + } @@ - r.prDurationHistogram.Record(ctx, duration.Seconds(), metric.WithAttributes(attrs...)) - r.prTotalCounter.Add(ctx, 1, metric.WithAttributes(attrs...)) + r.prDurationHistogram.Record(ctx, duration.Seconds(), metric.WithAttributes(attrs...)) + // For counters, keep reason optional to bound cardinality. + countAttrs := []attribute.KeyValue{ + attribute.String("namespace", pr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + countAttrs = append(countAttrs, attribute.String("reason", reason)) + } + r.prTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))
🧹 Nitpick comments (7)
go.mod (4)
123-131
: Align OpenTelemetry contrib/exporter versionsContrib is at
v0.62.0
, butexporters/prometheus
isv0.59.1
. Prefer matching contrib minor to avoid subtle API/ABI drift.- go.opentelemetry.io/otel/exporters/prometheus v0.59.1 // indirect + go.opentelemetry.io/otel/exporters/prometheus v0.62.0 // indirectPlease verify downstream compatibility before bumping.
23-23
: Remove stale OpenCensus deps if fully migrated to OTEL
go.opencensus.io
and ocagent/prometheus exporters remain. If no longer referenced, drop them to reduce surface and transitive deps.
- Search and remove unused OC imports/usages; keep only if still required.
Also applies to: 147-149
95-96
: Pin stable websocket release (optional)A pseudo-version pinned to a commit can be brittle. If possible, prefer the latest stable tag (e.g.,
v1.5.x
).
133-133
: Deprecated package
golang.org/x/tools/go/expect v0.1.0-deprecated
is marked deprecated. Confirm it’s intentionally required; otherwise, remove.pkg/taskrunmetrics/metrics_test.go (1)
85-95
: Test isolation: reset package singleton
NewRecorder
uses a package-levelsync.Once
. Reset it at test start to avoid cross-test coupling when run with-shuffle
.func TestUpdateConfig(t *testing.T) { - // Test that the config is updated when it changes, and not when it doesn't. + // Ensure a fresh recorder for this test. + once = sync.Once{} + // Test that the config is updated when it changes, and not when it doesn't.pkg/pipelinerunmetrics/metrics_test.go (1)
374-383: Strengthen assertions (optional)

TestRecordRunningPipelineRunsResolutionWaitCounts only checks init. Consider wiring a ManualReader here too and asserting the two “waiting” gauges produced expected values.
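A minimal sketch of what such a test could look like (assuming it lives in the pipelinerunmetrics test package; driving the recorder through a lister from the fake informer is elided, and the test name is illustrative):

```go
package pipelinerunmetrics

import (
	"context"
	"testing"

	"go.opentelemetry.io/otel"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/metric/metricdata"
)

func TestRunningPipelineRunsWaitGauges(t *testing.T) {
	ctx := context.Background()

	// Back the global MeterProvider with a ManualReader so the test can pull
	// whatever the recorder exports.
	reader := sdkmetric.NewManualReader()
	otel.SetMeterProvider(sdkmetric.NewMeterProvider(sdkmetric.WithReader(reader)))

	r, err := NewRecorder(ctx)
	if err != nil {
		t.Fatalf("NewRecorder: %v", err)
	}
	_ = r // drive the recorder here, e.g. r.RunningPipelineRuns(ctx, <lister from the fake informer>)

	var rm metricdata.ResourceMetrics
	if err := reader.Collect(ctx, &rm); err != nil {
		t.Fatalf("Collect: %v", err)
	}
	// Walk rm.ScopeMetrics and assert the two "waiting" gauges produced the expected values.
}
```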
19-37
: Helper: canonical attribute key (and imports)Add a small helper to stringify attribute sets deterministically.
import ( "context" "errors" "fmt" + "sort" + "strings" "sync" "time" @@ ) + +// attrKey builds a deterministic key from an attribute set. +func attrKey(s attribute.Set) string { + kvs := s.ToSlice() + sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) }) + var b strings.Builder + for _, kv := range kvs { + b.WriteString(string(kv.Key)) + b.WriteByte('=') + b.WriteString(kv.Value.Emit()) + b.WriteByte('|') + } + return b.String() +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- config/config-observability.yaml (2 hunks)
- go.mod (8 hunks)
- pkg/pipelinerunmetrics/metrics.go (3 hunks)
- pkg/pipelinerunmetrics/metrics_test.go (5 hunks)
- pkg/taskrunmetrics/metrics.go (6 hunks)
- pkg/taskrunmetrics/metrics_test.go (8 hunks)
- test/util.go (0 hunks)
💤 Files with no reviewable changes (1)
- test/util.go
🚧 Files skipped from review as they are similar to previous changes (1)
- config/config-observability.yaml
🧰 Additional context used
🧬 Code graph analysis (4)
pkg/pipelinerunmetrics/metrics_test.go (3)
pkg/apis/pipeline/v1/pipelinerun_types.go (4)
PipelineRun
(51-60)PipelineRun
(73-75)PipelineRunSpec
(250-286)PipelineRunStatus
(321-326)pkg/pipelinerunmetrics/metrics.go (2)
NewRecorder
(72-88)Recorder
(47-63)pkg/apis/config/metrics.go (4)
PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)Metrics
(101-109)
pkg/pipelinerunmetrics/metrics.go (2)
pkg/apis/config/metrics.go (4)
Metrics
(101-109)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/taskrunmetrics/metrics.go (3)
Recorder
(46-65)NewRecorder
(74-91)OnStore
(191-207)
pkg/taskrunmetrics/metrics_test.go (5)
pkg/pipelinerunmetrics/metrics.go (2)
Recorder
(47-63)NewRecorder
(72-88)pkg/taskrunmetrics/metrics.go (2)
Recorder
(46-65)NewRecorder
(74-91)pkg/names/generate.go (1)
SimpleNameGenerator
(48-48)pkg/apis/config/metrics.go (1)
Metrics
(101-109)pkg/pod/status.go (2)
ReasonExceededResourceQuota
(60-60)ReasonExceededNodeResources
(64-64)
pkg/taskrunmetrics/metrics.go (4)
pkg/apis/config/metrics.go (7)
Metrics
(101-109)TaskrunLevelAtTaskrun
(52-52)TaskrunLevelAtTask
(54-54)TaskrunLevelAtNS
(56-56)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/pipelinerunmetrics/metrics.go (3)
Recorder
(47-63)NewRecorder
(72-88)OnStore
(332-348)pkg/apis/pipeline/register.go (4)
PipelineTaskLabelKey
(47-47)TaskLabelKey
(29-29)PipelineLabelKey
(38-38)PipelineRunLabelKey
(41-41)pkg/pod/status.go (2)
ReasonExceededResourceQuota
(60-60)ReasonExceededNodeResources
(64-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (6)
pkg/taskrunmetrics/metrics_test.go (1)
418-424: LGTM: corrected parameter order for RecordPodLatency

RecordPodLatency(ctx, pod, taskRun) matches the function signature.

pkg/taskrunmetrics/metrics.go (3)
318-325
: Avoid holding the mutex acrosslister.List()
Move the list call outside the critical section to reduce lock contention and avoid blocking metrics updates.
-func (r *Recorder) RunningTaskRuns(ctx context.Context, lister listers.TaskRunLister) error { +func (r *Recorder) RunningTaskRuns(ctx context.Context, lister listers.TaskRunLister) error { if !r.initialized { return errors.New("ignoring the metrics recording, failed to initialize the metrics recorder") } - - r.mutex.Lock() - defer r.mutex.Unlock() - - trs, err := lister.List(labels.Everything()) + trs, err := lister.List(labels.Everything()) if err != nil { return err } + r.mutex.Lock() + defer r.mutex.Unlock()
276-307
: Guard nil StartTime/Condition and align counter attributesPrevent panics and align
trTotalCounter
attributes with namespace and optional reason (per config).- duration := time.Since(tr.Status.StartTime.Time) - if tr.Status.CompletionTime != nil { - duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time) - } + if tr.Status.StartTime == nil || tr.Status.StartTime.IsZero() { + return nil + } + duration := time.Since(tr.Status.StartTime.Time) + if tr.Status.CompletionTime != nil { + duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time) + } @@ - cond := tr.Status.GetCondition(apis.ConditionSucceeded) + cond := tr.Status.GetCondition(apis.ConditionSucceeded) + if cond == nil { + return nil + } @@ - attrs := []attribute.KeyValue{ - attribute.String("namespace", tr.Namespace), - attribute.String("status", status), - attribute.String("reason", reason), - } + attrs := []attribute.KeyValue{ + attribute.String("namespace", tr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + attrs = append(attrs, attribute.String("reason", reason)) + } @@ - r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status))) + countAttrs := []attribute.KeyValue{ + attribute.String("namespace", tr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + countAttrs = append(countAttrs, attribute.String("reason", reason)) + } + r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))
353-360
: Fix cumulative gauges: use deltas and stable keysAdds full totals each scrape; UpDownCounters require deltas. Also replace
map[attribute.Set]
with stable string keys and retain attrs for reporting.- r.runningTRsGauge.Add(ctx, float64(runningTrs)) - for attrSet, value := range trsThrottledByQuota { - r.runningTRsThrottledByQuotaGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - for attrSet, value := range trsThrottledByNode { - r.runningTRsThrottledByNodeGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } + // running taskruns delta + if delta := float64(runningTrs) - r.prevRunningTRs; delta != 0 { + r.runningTRsGauge.Add(ctx, delta) + r.prevRunningTRs = float64(runningTrs) + } + // throttled-by-quota deltas + for key, cur := range trQuotaCur { + prev := r.prevTrsThrottledByQuota[key] + if d := cur - prev; d != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, d, metric.WithAttributes(trQuotaAttrs[key]...)) + } + } + for key, prev := range r.prevTrsThrottledByQuota { + if _, ok := trQuotaCur[key]; !ok && prev != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, -prev, metric.WithAttributes(r.prevTrsThrottledByQuotaAttrs[key]...)) + } + } + r.prevTrsThrottledByQuota = trQuotaCur + r.prevTrsThrottledByQuotaAttrs = trQuotaAttrs + // throttled-by-node deltas + for key, cur := range trNodeCur { + prev := r.prevTrsThrottledByNode[key] + if d := cur - prev; d != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, d, metric.WithAttributes(trNodeAttrs[key]...)) + } + } + for key, prev := range r.prevTrsThrottledByNode { + if _, ok := trNodeCur[key]; !ok && prev != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, -prev, metric.WithAttributes(r.prevTrsThrottledByNodeAttrs[key]...)) + } + } + r.prevTrsThrottledByNode = trNodeCur + r.prevTrsThrottledByNodeAttrs = trNodeAttrsPrecompute current maps earlier:
@@ - trsThrottledByQuota := make(map[attribute.Set]float64) - trsThrottledByNode := make(map[attribute.Set]float64) + trQuotaCur := make(map[string]float64) + trQuotaAttrs := make(map[string][]attribute.KeyValue) + trNodeCur := make(map[string]float64) + trNodeAttrs := make(map[string][]attribute.KeyValue) @@ - attrSet := attribute.NewSet(attrs...) + attrSet := attribute.NewSet(attrs...) + key := attrKey(attrSet) @@ - trsThrottledByQuota[attrSet]++ + trQuotaCur[key]++ + trQuotaAttrs[key] = attrs @@ - trsThrottledByNode[attrSet]++ + trNodeCur[key]++ + trNodeAttrs[key] = attrspkg/pipelinerunmetrics/metrics_test.go (1)
262-299
: Fix: attribute.Set is not a map key; use canonical string keysReplace all
map[attribute.Set]float64
withmap[string]float64
and stringify attribute sets for both expected and got.@@ - expected map[attribute.Set]float64 + expected map[string]float64 @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr1")): 1, - ... - }, + expected: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr1"))): 1, + ... + }, @@ - got := make(map[attribute.Set]float64) + got := make(map[string]float64) @@ - for _, dp := range sum.DataPoints { - got[dp.Attributes] = dp.Value - } + for _, dp := range sum.DataPoints { + got[attrKey(dp.Attributes)] = dp.Value + }Add helper + imports if not present:
@@ -import ( +import ( "context" "testing" "time" + "sort" + "strings" @@ ) @@ +func attrKey(s attribute.Set) string { + kvs := s.ToSlice() + sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) }) + var b strings.Builder + for _, kv := range kvs { + b.WriteString(string(kv.Key)) + b.WriteByte('=') + b.WriteString(kv.Value.Emit()) + b.WriteByte('|') + } + return b.String() +}Also applies to: 348-360
pkg/pipelinerunmetrics/metrics.go (1)
228-238
: Count “waiting on task resolution” once per PipelineRunCurrently increments per child TaskRun; count each PipelineRun at most once.
- if pr.Status.ChildReferences != nil { - for _, childRef := range pr.Status.ChildReferences { - if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" { - waitingOnTaskCount++ - } - } - } + if pr.Status.ChildReferences != nil { + counted := false + for _, cr := range pr.Status.ChildReferences { + if cr.Kind == "TaskRun" && cr.PipelineTaskName != "" { + counted = true + break + } + } + if counted { + waitingOnTaskCount++ + } + }
// Recorder holds OpenTelemetry instruments for PipelineRun metrics
type Recorder struct {
	mutex       sync.Mutex
	initialized bool
	cfg         *config.Metrics

	insertTag func(pipeline,
		pipelinerun string) []tag.Mutator
	meter metric.Meter

	ReportingPeriod time.Duration
	prDurationHistogram metric.Float64Histogram
	prTotalCounter metric.Int64Counter
	runningPRsGauge metric.Float64UpDownCounter
	runningPRsWaitingOnPipelineResolutionGauge metric.Float64UpDownCounter
	runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter

	insertTag func(pipeline, pipelinerun string) []attribute.KeyValue

	hash string
	ReportingPeriod time.Duration
}
🛠️ Refactor suggestion | 🟠 Major
Cache previous snapshot for deltas
Add fields to support delta computation and attribute retention.
type Recorder struct {
@@
- insertTag func(pipeline, pipelinerun string) []attribute.KeyValue
+ insertTag func(pipeline, pipelinerun string) []attribute.KeyValue
+
+ // Cached snapshot for delta-based UpDownCounter updates.
+ prevRunningPRCounts map[string]float64
+ prevRunningPRAttrs map[string][]attribute.KeyValue
+ prevWaitingOnPipeline float64
+ prevWaitingOnTask float64
func (r *Recorder) initializeInstruments(cfg *config.Metrics) error { | ||
r.mutex.Lock() | ||
defer r.mutex.Unlock() | ||
|
||
var prunTag []tag.Key | ||
r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller") | ||
|
||
switch cfg.PipelinerunLevel { | ||
case config.PipelinerunLevelAtPipelinerun: | ||
prunTag = []tag.Key{pipelinerunTag, pipelineTag} | ||
r.insertTag = pipelinerunInsertTag | ||
case config.PipelinerunLevelAtPipeline: | ||
prunTag = []tag.Key{pipelineTag} | ||
r.insertTag = pipelineInsertTag | ||
case config.PipelinerunLevelAtNS: | ||
prunTag = []tag.Key{} | ||
r.insertTag = nilInsertTag | ||
default: | ||
return errors.New("invalid config for PipelinerunLevel: " + cfg.PipelinerunLevel) | ||
} | ||
|
||
var runningPRTag []tag.Key | ||
switch cfg.RunningPipelinerunLevel { | ||
case config.PipelinerunLevelAtPipelinerun: | ||
runningPRTag = []tag.Key{pipelinerunTag, pipelineTag, namespaceTag} | ||
case config.PipelinerunLevelAtPipeline: | ||
runningPRTag = []tag.Key{pipelineTag, namespaceTag} | ||
case config.PipelinerunLevelAtNS: | ||
runningPRTag = []tag.Key{namespaceTag} | ||
default: | ||
runningPRTag = []tag.Key{} | ||
} | ||
|
||
distribution := view.Distribution(10, 30, 60, 300, 900, 1800, 3600, 5400, 10800, 21600, 43200, 86400) | ||
|
||
if cfg.PipelinerunLevel == config.PipelinerunLevelAtPipelinerun { | ||
distribution = view.LastValue() | ||
} else { | ||
switch cfg.DurationPipelinerunType { | ||
case config.DurationTaskrunTypeHistogram: | ||
case config.DurationTaskrunTypeLastValue: | ||
distribution = view.LastValue() | ||
default: | ||
return errors.New("invalid config for DurationTaskrunType: " + cfg.DurationTaskrunType) | ||
} | ||
} | ||
|
||
if cfg.CountWithReason { | ||
prunTag = append(prunTag, reasonTag) | ||
} | ||
|
||
prDurationView = &view.View{ | ||
Description: prDuration.Description(), | ||
Measure: prDuration, | ||
Aggregation: distribution, | ||
TagKeys: append([]tag.Key{statusTag, namespaceTag}, prunTag...), | ||
} | ||
|
||
prTotalView = &view.View{ | ||
Description: prTotal.Description(), | ||
Measure: prTotal, | ||
Aggregation: view.Count(), | ||
TagKeys: []tag.Key{statusTag}, | ||
} | ||
|
||
runningPRsView = &view.View{ | ||
Description: runningPRs.Description(), | ||
Measure: runningPRs, | ||
Aggregation: view.LastValue(), | ||
TagKeys: runningPRTag, | ||
} | ||
|
||
runningPRsWaitingOnPipelineResolutionView = &view.View{ | ||
Description: runningPRsWaitingOnPipelineResolution.Description(), | ||
Measure: runningPRsWaitingOnPipelineResolution, | ||
Aggregation: view.LastValue(), | ||
} | ||
|
||
runningPRsWaitingOnTaskResolutionView = &view.View{ | ||
Description: runningPRsWaitingOnTaskResolution.Description(), | ||
Measure: runningPRsWaitingOnTaskResolution, | ||
Aggregation: view.LastValue(), | ||
} | ||
|
||
return view.Register( | ||
prDurationView, | ||
prTotalView, | ||
runningPRsView, | ||
runningPRsWaitingOnPipelineResolutionView, | ||
runningPRsWaitingOnTaskResolutionView, | ||
prDurationHistogram, err := r.meter.Float64Histogram( | ||
"tekton_pipelines_controller_pipelinerun_duration_seconds", | ||
metric.WithDescription("The pipelinerun execution time in seconds"), | ||
metric.WithUnit("s"), | ||
metric.WithExplicitBucketBoundaries(10, 30, 60, 300, 900, 1800, 3600, 5400, 10800, 21600, 43200, 86400), | ||
) | ||
} | ||
|
||
func viewUnregister() { | ||
view.Unregister(prDurationView, | ||
prTotalView, | ||
runningPRsView, | ||
runningPRsWaitingOnPipelineResolutionView, | ||
runningPRsWaitingOnTaskResolutionView) | ||
} | ||
|
||
// OnStore returns a function that checks if metrics are configured for a config.Store, and registers it if so | ||
func OnStore(logger *zap.SugaredLogger, r *Recorder) func(name string, | ||
value interface{}) { | ||
return func(name string, value interface{}) { | ||
if name == config.GetMetricsConfigName() { | ||
cfg, ok := value.(*config.Metrics) | ||
if !ok { | ||
logger.Error("Failed to do type insertion for extracting metrics config") | ||
return | ||
} | ||
updated := r.updateConfig(cfg) | ||
if !updated { | ||
return | ||
} | ||
// Update metrics according to configuration | ||
viewUnregister() | ||
err := viewRegister(cfg) | ||
if err != nil { | ||
logger.Errorf("Failed to register View %v ", err) | ||
return | ||
} | ||
} | ||
if err != nil { | ||
return fmt.Errorf("failed to create pipelinerun duration histogram: %w", err) | ||
} | ||
} | ||
r.prDurationHistogram = prDurationHistogram |
🛠️ Refactor suggestion | 🟠 Major
Initialize/reset caches during instrument setup
Ensure snapshot is reset when instruments are rebuilt.
func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
@@
- r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller")
+ r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller")
+ // reset delta caches
+ r.prevRunningPRCounts = make(map[string]float64)
+ r.prevRunningPRAttrs = make(map[string][]attribute.KeyValue)
+ r.prevWaitingOnPipeline = 0
+ r.prevWaitingOnTask = 0
Also applies to: 136-155
🤖 Prompt for AI Agents
In pkg/pipelinerunmetrics/metrics.go around lines 90-116 (and likewise apply the
same change at 136-155), when rebuilding instruments under the mutex you must
reset the recorder's cached snapshot so stale data isn't reused; after
creating/assigning new instruments (e.g. after r.prDurationHistogram =
prDurationHistogram) set r.snapshot to an empty/zero value (for example
r.snapshot = metricsSnapshot{} or nil if the type allows) and clear any related
caches so the recorder starts with a fresh snapshot whenever instruments are
reinitialized.
// RunningPipelineRuns logs the number of PipelineRuns running right now | ||
// returns an error if it fails to log the metrics | ||
func (r *Recorder) RunningPipelineRuns(lister listers.PipelineRunLister) error { | ||
r.mutex.Lock() | ||
defer r.mutex.Unlock() | ||
func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error { | ||
if !r.initialized { | ||
return errors.New("ignoring the metrics recording, failed to initialize the metrics recorder") | ||
return fmt.Errorf("ignoring the metrics recording, failed to initialize the metrics recorder") | ||
} | ||
|
||
r.mutex.Lock() | ||
defer r.mutex.Unlock() | ||
|
||
prs, err := lister.List(labels.Everything()) | ||
if err != nil { | ||
return fmt.Errorf("failed to list pipelineruns while generating metrics : %w", err) | ||
return fmt.Errorf("failed to list pipelineruns: %w", err) | ||
} | ||
|
||
var runningPipelineRuns int | ||
var trsWaitResolvingTaskRef int | ||
var prsWaitResolvingPipelineRef int | ||
countMap := map[string]int{} | ||
// This map holds the new counts for the current interval. | ||
currentCounts := make(map[attribute.Set]float64) | ||
var waitingOnPipelineCount float64 | ||
var waitingOnTaskCount float64 | ||
|
||
for _, pr := range prs { | ||
pipelineName := getPipelineTagName(pr) | ||
pipelineRunKey := "" | ||
mutators := []tag.Mutator{ | ||
tag.Insert(namespaceTag, pr.Namespace), | ||
tag.Insert(pipelineTag, pipelineName), | ||
tag.Insert(pipelinerunTag, pr.Name), | ||
} | ||
if r.cfg != nil { | ||
switch r.cfg.RunningPipelinerunLevel { | ||
case runningPRLevelPipelinerun: | ||
pipelineRunKey = pipelineRunKey + "#" + pr.Name | ||
fallthrough | ||
case runningPRLevelPipeline: | ||
pipelineRunKey = pipelineRunKey + "#" + pipelineName | ||
fallthrough | ||
case runningPRLevelNamespace: | ||
pipelineRunKey = pipelineRunKey + "#" + pr.Namespace | ||
case runningPRLevelCluster: | ||
default: | ||
return fmt.Errorf("RunningPipelineRunLevel value \"%s\" is not valid ", r.cfg.RunningPipelinerunLevel) | ||
if pr.Status.GetCondition(apis.ConditionSucceeded).IsUnknown() { | ||
// This PR is running. | ||
|
||
// Handle waiting metrics (these are cluster-wide, no extra attributes). | ||
if pr.Status.PipelineSpec == nil && pr.Spec.PipelineRef != nil { | ||
waitingOnPipelineCount++ | ||
} | ||
} | ||
ctx_, err_ := tag.New(context.Background(), mutators...) | ||
if err_ != nil { | ||
return err | ||
} | ||
if !pr.IsDone() && !pr.IsPending() { | ||
countMap[pipelineRunKey]++ | ||
metrics.Record(ctx_, runningPRs.M(float64(countMap[pipelineRunKey]))) | ||
runningPipelineRuns++ | ||
succeedCondition := pr.Status.GetCondition(apis.ConditionSucceeded) | ||
if succeedCondition != nil && succeedCondition.Status == corev1.ConditionUnknown { | ||
switch succeedCondition.Reason { | ||
case v1.TaskRunReasonResolvingTaskRef: | ||
trsWaitResolvingTaskRef++ | ||
case v1.PipelineRunReasonResolvingPipelineRef.String(): | ||
prsWaitResolvingPipelineRef++ | ||
if pr.Status.ChildReferences != nil { | ||
for _, childRef := range pr.Status.ChildReferences { | ||
if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" { | ||
waitingOnTaskCount++ | ||
} | ||
} | ||
} | ||
} else { | ||
// In case there are no running PipelineRuns for the pipelineRunKey, set the metric value to 0 to ensure | ||
// the metric is set for the key. | ||
if _, exists := countMap[pipelineRunKey]; !exists { | ||
countMap[pipelineRunKey] = 0 | ||
metrics.Record(ctx_, runningPRs.M(0)) | ||
|
||
// Handle running_pipelineruns metric with per-level aggregation. | ||
pipelineName := getPipelineTagName(pr) | ||
var attrs []attribute.KeyValue | ||
|
||
// Build attributes based on configured level. | ||
switch r.cfg.RunningPipelinerunLevel { | ||
case config.PipelinerunLevelAtPipelinerun: | ||
attrs = append(attrs, attribute.String("pipelinerun", pr.Name)) | ||
fallthrough | ||
case config.PipelinerunLevelAtPipeline: | ||
attrs = append(attrs, attribute.String("pipeline", pipelineName)) | ||
fallthrough | ||
case config.PipelinerunLevelAtNS: | ||
attrs = append(attrs, attribute.String("namespace", pr.Namespace)) | ||
} | ||
|
||
attrSet := attribute.NewSet(attrs...) | ||
currentCounts[attrSet]++ | ||
} | ||
} | ||
|
||
ctx, err := tag.New(context.Background()) | ||
if err != nil { | ||
return err | ||
// Report running_pipelineruns. | ||
for attrSet, value := range currentCounts { | ||
r.runningPRsGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) | ||
} | ||
metrics.Record(ctx, runningPRsWaitingOnPipelineResolution.M(float64(prsWaitResolvingPipelineRef))) | ||
metrics.Record(ctx, runningPRsWaitingOnTaskResolution.M(float64(trsWaitResolvingTaskRef))) | ||
metrics.Record(ctx, runningPRs.M(float64(runningPipelineRuns))) | ||
|
||
r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, waitingOnPipelineCount, metric.WithAttributes()) | ||
r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, waitingOnTaskCount, metric.WithAttributes()) | ||
|
||
return nil | ||
} |
Fix cumulative gauges: report deltas and avoid map[attribute.Set]
Current code:
- Adds full totals on every scrape to UpDownCounters (cumulative inflation).
- Uses map[attribute.Set]float64 (fragile; Set is not a safe map key).
- Holds the mutex during lister.List().
Address by:
- Computing deltas vs. a cached previous snapshot (per-attribute-key), including emitting negative deltas for disappeared series.
- Moving
lister.List()
outside the critical section. - Keying by a canonical string and retaining attributes for reporting.
@@
-func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error {
+func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error {
if !r.initialized {
return fmt.Errorf("ignoring the metrics recording, failed to initialize the metrics recorder")
}
-
- r.mutex.Lock()
- defer r.mutex.Unlock()
-
- prs, err := lister.List(labels.Everything())
+ prs, err := lister.List(labels.Everything())
if err != nil {
return fmt.Errorf("failed to list pipelineruns: %w", err)
}
- // This map holds the new counts for the current interval.
- currentCounts := make(map[attribute.Set]float64)
- var waitingOnPipelineCount float64
- var waitingOnTaskCount float64
+ // Current snapshot (keyed by canonical attr key).
+ currentCounts := make(map[string]float64)
+ attrsByKey := make(map[string][]attribute.KeyValue)
+ var waitingOnPipelineCount float64
+ var waitingOnTaskCount float64
@@
- // Handle running_pipelineruns metric with per-level aggregation.
+ // Handle running_pipelineruns metric with per-level aggregation.
pipelineName := getPipelineTagName(pr)
var attrs []attribute.KeyValue
@@
- attrSet := attribute.NewSet(attrs...)
- currentCounts[attrSet]++
+ key := attrKey(attribute.NewSet(attrs...))
+ currentCounts[key]++
+ attrsByKey[key] = attrs
}
}
- // Report running_pipelineruns.
- for attrSet, value := range currentCounts {
- r.runningPRsGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...))
- }
-
- r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, waitingOnPipelineCount, metric.WithAttributes())
- r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, waitingOnTaskCount, metric.WithAttributes())
+ r.mutex.Lock()
+ defer r.mutex.Unlock()
+
+ // Emit deltas for running counts (including negatives for vanished series).
+ for key, cur := range currentCounts {
+ prev := r.prevRunningPRCounts[key]
+ if delta := cur - prev; delta != 0 {
+ r.runningPRsGauge.Add(ctx, delta, metric.WithAttributes(attrsByKey[key]...))
+ }
+ }
+ for key, prev := range r.prevRunningPRCounts {
+ if _, still := currentCounts[key]; !still && prev != 0 {
+ // series disappeared -> emit negative delta using previous attrs
+ r.runningPRsGauge.Add(ctx, -prev, metric.WithAttributes(r.prevRunningPRAttrs[key]...))
+ }
+ }
+ r.prevRunningPRCounts = currentCounts
+ r.prevRunningPRAttrs = attrsByKey
+
+ if delta := waitingOnPipelineCount - r.prevWaitingOnPipeline; delta != 0 {
+ r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, delta, metric.WithAttributes())
+ r.prevWaitingOnPipeline = waitingOnPipelineCount
+ }
+ if delta := waitingOnTaskCount - r.prevWaitingOnTask; delta != 0 {
+ r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, delta, metric.WithAttributes())
+ r.prevWaitingOnTask = waitingOnTaskCount
+ }
return nil
}
Add helpers/state (see struct and init comments below).
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In pkg/pipelinerunmetrics/metrics.go around lines 205 to 270, the current
implementation reads the lister under the recorder mutex, uses
map[attribute.Set]float64 as the snapshot key, and calls Add with full totals on
each scrape (causing cumulative inflation and unsafe map keys). Change to: call
lister.List(...) before acquiring r.mutex; build a canonical string key (e.g.,
join sorted key=value pairs) and keep a map[string]struct{attrs
[]attribute.KeyValue; value float64} for the currentCounts; compute deltas
against a recorder-level previousSnapshot (protected by r.mutex) so you call
r.runningPRsGauge.Add(ctx, delta, metric.WithAttributes(attrs...)) and emit
negative deltas for keys that disappeared; keep attribute slices to pass to
metric.WithAttributes and update previousSnapshot after emitting deltas; add
small helper(s) to build the canonical key and to convert stored attrs to metric
options; and only hold r.mutex while reading/updating previousSnapshot (not
during List).
for _, tc := range []struct { | ||
status corev1.ConditionStatus | ||
reason string | ||
nodeCount float64 | ||
quotaCount float64 | ||
waitCount float64 | ||
addNS bool | ||
}{ | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: "", | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: pod.ReasonExceededResourceQuota, | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: pod.ReasonExceededResourceQuota, | ||
addNS: true, | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: pod.ReasonExceededNodeResources, | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: pod.ReasonExceededNodeResources, | ||
addNS: true, | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: v1.TaskRunReasonResolvingTaskRef, | ||
}, | ||
{ | ||
status: corev1.ConditionFalse, | ||
reason: "", | ||
}, | ||
{ | ||
status: corev1.ConditionFalse, | ||
reason: pod.ReasonExceededResourceQuota, | ||
}, | ||
{ | ||
status: corev1.ConditionFalse, | ||
reason: pod.ReasonExceededNodeResources, | ||
}, | ||
{ | ||
status: corev1.ConditionFalse, | ||
reason: v1.TaskRunReasonResolvingTaskRef, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: "", | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: pod.ReasonExceededResourceQuota, | ||
quotaCount: 3, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: pod.ReasonExceededNodeResources, | ||
nodeCount: 3, | ||
name string | ||
throttleWithNamespace bool | ||
expectedQuota map[attribute.Set]float64 | ||
expectedNode map[attribute.Set]float64 | ||
}{{ | ||
name: "without namespace", | ||
throttleWithNamespace: false, | ||
expectedQuota: map[attribute.Set]float64{ | ||
attribute.NewSet(): 3, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: pod.ReasonExceededResourceQuota, | ||
quotaCount: 3, | ||
addNS: true, | ||
expectedNode: map[attribute.Set]float64{ | ||
attribute.NewSet(): 2, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: pod.ReasonExceededNodeResources, | ||
nodeCount: 3, | ||
addNS: true, | ||
}, { | ||
name: "with namespace", | ||
throttleWithNamespace: true, | ||
expectedQuota: map[attribute.Set]float64{ | ||
attribute.NewSet(attribute.String("namespace", "testns1")): 2, | ||
attribute.NewSet(attribute.String("namespace", "testns2")): 1, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: v1.TaskRunReasonResolvingTaskRef, | ||
waitCount: 3, | ||
expectedNode: map[attribute.Set]float64{ | ||
attribute.NewSet(attribute.String("namespace", "testns2")): 1, | ||
attribute.NewSet(attribute.String("namespace", "testns3")): 1, | ||
}, | ||
} { | ||
unregisterMetrics() | ||
ctx, _ := ttesting.SetupFakeContext(t) | ||
informer := faketaskruninformer.Get(ctx) | ||
for range multiplier { | ||
tr := &v1.TaskRun{ | ||
ObjectMeta: metav1.ObjectMeta{Name: names.SimpleNameGenerator.RestrictLengthWithRandomSuffix("taskrun-"), Namespace: "test"}, | ||
Status: v1.TaskRunStatus{ | ||
Status: duckv1.Status{ | ||
Conditions: duckv1.Conditions{{ | ||
Type: apis.ConditionSucceeded, | ||
Status: tc.status, | ||
Reason: tc.reason, | ||
}}, | ||
}, | ||
}, | ||
}} { |
Fix: attribute.Set is not a map key; use canonical string keys
attribute.Set
isn’t comparable; map[attribute.Set]float64
won’t compile. Convert attribute sets to deterministic strings for map keys in both expected and got.
@@
- expectedQuota map[attribute.Set]float64
- expectedNode map[attribute.Set]float64
+ expectedQuota map[string]float64
+ expectedNode map[string]float64
@@
- expectedQuota: map[attribute.Set]float64{
- attribute.NewSet(): 3,
- },
+ expectedQuota: map[string]float64{
+ attrKey(attribute.NewSet()): 3,
+ },
@@
- expectedNode: map[attribute.Set]float64{
- attribute.NewSet(): 2,
- },
+ expectedNode: map[string]float64{
+ attrKey(attribute.NewSet()): 2,
+ },
@@
- expectedQuota: map[attribute.Set]float64{
- attribute.NewSet(attribute.String("namespace", "testns1")): 2,
- attribute.NewSet(attribute.String("namespace", "testns2")): 1,
- },
+ expectedQuota: map[string]float64{
+ attrKey(attribute.NewSet(attribute.String("namespace", "testns1"))): 2,
+ attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 1,
+ },
@@
- expectedNode: map[attribute.Set]float64{
- attribute.NewSet(attribute.String("namespace", "testns2")): 1,
- attribute.NewSet(attribute.String("namespace", "testns3")): 1,
- },
+ expectedNode: map[string]float64{
+ attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 1,
+ attrKey(attribute.NewSet(attribute.String("namespace", "testns3"))): 1,
+ },
@@
- gotQuota := make(map[attribute.Set]float64)
- gotNode := make(map[attribute.Set]float64)
+ gotQuota := make(map[string]float64)
+ gotNode := make(map[string]float64)
@@
- case "tekton_pipelines_controller_running_taskruns_throttled_by_quota":
+ case "tekton_pipelines_controller_running_taskruns_throttled_by_quota":
for _, dp := range sum.DataPoints {
- gotQuota[dp.Attributes] = dp.Value
+ gotQuota[attrKey(dp.Attributes)] = dp.Value
}
case "tekton_pipelines_controller_running_taskruns_throttled_by_node":
for _, dp := range sum.DataPoints {
- gotNode[dp.Attributes] = dp.Value
+ gotNode[attrKey(dp.Attributes)] = dp.Value
}
Add helper + imports:
@@
import (
"context"
"sync"
"testing"
"time"
+ "sort"
+ "strings"
@@
)
@@
+// attrKey builds a deterministic key from an attribute set.
+func attrKey(s attribute.Set) string {
+ kvs := s.ToSlice()
+ sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) })
+ var b strings.Builder
+ for _, kv := range kvs {
+ b.WriteString(string(kv.Key))
+ b.WriteByte('=')
+ b.WriteString(kv.Value.Emit())
+ b.WriteByte('|')
+ }
+ return b.String()
+}
Also applies to: 336-365
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics_test.go around lines 289 to 314 (and also apply
same change to lines 336–365), the test currently declares maps using
map[attribute.Set]float64 which fails because attribute.Set is not comparable;
change those maps to map[string]float64 and convert attribute.Set keys to a
deterministic canonical string. Add a small helper function (and necessary
imports) to serialize an attribute.Set into a stable string (e.g., sort
attributes and format as key=value pairs) and use that helper when building both
expectedQuota/expectedNode and when constructing the actual result keys before
comparing maps.
// Recorder holds OpenTelemetry instruments for TaskRun metrics
type Recorder struct {
	mutex       sync.Mutex
	initialized bool
	cfg         *config.Metrics

	ReportingPeriod time.Duration
	meter metric.Meter

	insertTaskTag func(task,
		taskrun string) []tag.Mutator
	trDurationHistogram metric.Float64Histogram
	prTRDurationHistogram metric.Float64Histogram
	trTotalCounter metric.Int64Counter
	runningTRsGauge metric.Float64UpDownCounter
	runningTRsThrottledByQuotaGauge metric.Float64UpDownCounter
	runningTRsThrottledByNodeGauge metric.Float64UpDownCounter
	podLatencyHistogram metric.Float64Histogram

	insertPipelineTag func(pipeline,
		pipelinerun string) []tag.Mutator
	insertTaskTag func(task, taskrun string) []attribute.KeyValue
	insertPipelineTag func(pipeline, pipelinerun string) []attribute.KeyValue

	hash string
	ReportingPeriod time.Duration
}
🛠️ Refactor suggestion | 🟠 Major
Add delta caches to Recorder
type Recorder struct {
@@
podLatencyHistogram metric.Float64Histogram
@@
ReportingPeriod time.Duration
+
+ // Delta caches
+ prevRunningTRs float64
+ prevTrsThrottledByQuota map[string]float64
+ prevTrsThrottledByQuotaAttrs map[string][]attribute.KeyValue
+ prevTrsThrottledByNode map[string]float64
+ prevTrsThrottledByNodeAttrs map[string][]attribute.KeyValue
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Recorder holds OpenTelemetry instruments for TaskRun metrics
type Recorder struct {
	mutex       sync.Mutex
	initialized bool
	cfg         *config.Metrics
	ReportingPeriod time.Duration
	meter metric.Meter
	insertTaskTag func(task,
		taskrun string) []tag.Mutator
	trDurationHistogram metric.Float64Histogram
	prTRDurationHistogram metric.Float64Histogram
	trTotalCounter metric.Int64Counter
	runningTRsGauge metric.Float64UpDownCounter
	runningTRsThrottledByQuotaGauge metric.Float64UpDownCounter
	runningTRsThrottledByNodeGauge metric.Float64UpDownCounter
	podLatencyHistogram metric.Float64Histogram
	insertPipelineTag func(pipeline,
		pipelinerun string) []tag.Mutator
	insertTaskTag func(task, taskrun string) []attribute.KeyValue
	insertPipelineTag func(pipeline, pipelinerun string) []attribute.KeyValue
	hash string
	ReportingPeriod time.Duration
}

// Recorder holds OpenTelemetry instruments for TaskRun metrics
type Recorder struct {
	mutex       sync.Mutex
	initialized bool
	cfg         *config.Metrics
	meter metric.Meter
	trDurationHistogram metric.Float64Histogram
	prTRDurationHistogram metric.Float64Histogram
	trTotalCounter metric.Int64Counter
	runningTRsGauge metric.Float64UpDownCounter
	runningTRsThrottledByQuotaGauge metric.Float64UpDownCounter
	runningTRsThrottledByNodeGauge metric.Float64UpDownCounter
	podLatencyHistogram metric.Float64Histogram
	insertTaskTag func(task, taskrun string) []attribute.KeyValue
	insertPipelineTag func(pipeline, pipelinerun string) []attribute.KeyValue
	ReportingPeriod time.Duration

	// Delta caches
	prevRunningTRs float64
	prevTrsThrottledByQuota map[string]float64
	prevTrsThrottledByQuotaAttrs map[string][]attribute.KeyValue
	prevTrsThrottledByNode map[string]float64
	prevTrsThrottledByNodeAttrs map[string][]attribute.KeyValue
}
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics.go around lines 45 to 66, the Recorder struct
needs delta caches added; add fields to store deltas and protect them (e.g.,
deltaCache map[string]float64 and deltaCacheMu sync.Mutex or similar typed cache
and mutex for thread-safety), and ensure callers that construct Recorder
initialize the cache (make(map[string]float64)) and use the mutex when
reading/updating the cache.
func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
	r.mutex.Lock()
	defer r.mutex.Unlock()

	var prunTag []tag.Key
	r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller")

	switch cfg.TaskrunLevel {
	case config.TaskrunLevelAtTaskrun:
		r.insertTaskTag = taskrunInsertTag
	case config.TaskrunLevelAtTask:
		r.insertTaskTag = taskInsertTag
	case config.TaskrunLevelAtNS:
		r.insertTaskTag = nilInsertTag
	default:
		return errors.New("invalid config for TaskrunLevel: " + cfg.TaskrunLevel)
	}
🛠️ Refactor suggestion | 🟠 Major
Initialize/reset delta caches and add attrKey helper/imports
Reset caches when instruments are (re)initialized and add the key helper.
import (
"context"
"fmt"
+ "sort"
+ "strings"
"sync"
"time"
@@
func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
@@
r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller")
+ // reset delta caches
+ r.prevRunningTRs = 0
+ r.prevTrsThrottledByQuota = make(map[string]float64)
+ r.prevTrsThrottledByQuotaAttrs = make(map[string][]attribute.KeyValue)
+ r.prevTrsThrottledByNode = make(map[string]float64)
+ r.prevTrsThrottledByNodeAttrs = make(map[string][]attribute.KeyValue)
@@
r.podLatencyHistogram = podLatencyHistogram
return nil
}
+
+// attrKey builds a deterministic key from an attribute set.
+func attrKey(s attribute.Set) string {
+ kvs := s.ToSlice()
+ sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) })
+ var b strings.Builder
+ for _, kv := range kvs {
+ b.WriteString(string(kv.Key))
+ b.WriteByte('=')
+ b.WriteString(kv.Value.Emit())
+ b.WriteByte('|')
+ }
+ return b.String()
+}
Also applies to: 121-188
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics.go around lines 93 to 109 (also apply same changes
to 121-188), when instruments are initialized or re-initialized you must reset
the delta caches and add the attrKey helper/imports: clear or reassign the maps
used to store deltaTaskrunCounts, deltaTaskCounts (and any other per-interval
cache fields) so they start empty on re-init, and add a small helper function
attrKey that converts label names/values to the attribute.Key/attribute.KeyValue
form used by Otel along with the required import
(go.opentelemetry.io/otel/attribute); ensure the cache reset code is placed
inside the same mutex-protected block after meter setup so re-init is
thread-safe and duplicate the same cache-reset and helper-import usage in the
other instrument initialization block at lines 121-188.
Changes
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
/kind <type>
. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tepRelease Notes
Summary by CodeRabbit
New Features
Improvements
Bug Fixes