Mig otel #6
base: master
Conversation
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Removed namespace gating for webhook metrics; migrated in-repo metrics from OpenCensus to OpenTelemetry (meters, instruments, attributes); added OTEL observability config keys; updated many dependencies and vendored packages; propagated context through informer/metrics call sites and adapted tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Controller as Reconciler/Controller
    participant Recorder as Recorder (pkg/*metrics)
    participant Meter as OTEL Meter
    participant Exporter as OTEL Exporter
    Controller->>Recorder: NewRecorder(ctx)
    Recorder->>Meter: Get Meter from MeterProvider
    Recorder->>Recorder: initializeInstruments(cfg) (create histograms/counters/gauges)
    Controller->>Recorder: DurationAndCount(ctx, pr, before)
    Recorder->>Recorder: compute duration & attributes
    Recorder->>Meter: prDurationHistogram.Record(ctx, value, attrs)
    Recorder->>Meter: prTotalCounter.Add(ctx, 1, attrs)
    Controller->>Recorder: RunningPipelineRuns(ctx, lister)
    Recorder->>Recorder: aggregate running counts & attrs
    Recorder->>Meter: runningPRsGauge.Add(ctx, delta, attrs)
    Note over Meter,Exporter: MeterProvider exports asynchronously to configured exporters
    Meter-->>Exporter: push metrics
```
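For readers less familiar with the OpenTelemetry metrics API, here is a minimal sketch of the flow in the diagram above: a recorder obtains a Meter, creates its instruments once, and records a duration plus a count with attributes. The instrument names, attribute keys, and function names are illustrative assumptions rather than the exact identifiers used in this PR.

```go
package metricsexample

import (
	"context"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// recorder holds instruments created once from the Meter (per the diagram's
// NewRecorder -> initializeInstruments step). Field names are illustrative.
type recorder struct {
	prDuration metric.Float64Histogram
	prTotal    metric.Int64Counter
}

func newRecorder() (*recorder, error) {
	// Instrumentation scope name is an assumption, not the Tekton value.
	meter := otel.Meter("pipelinerun-metrics")

	dur, err := meter.Float64Histogram(
		"pipelinerun_duration_seconds",
		metric.WithDescription("Duration of PipelineRun execution"),
		metric.WithUnit("s"),
	)
	if err != nil {
		return nil, err
	}
	total, err := meter.Int64Counter(
		"pipelinerun_total",
		metric.WithDescription("Number of completed PipelineRuns"),
	)
	if err != nil {
		return nil, err
	}
	return &recorder{prDuration: dur, prTotal: total}, nil
}

// durationAndCount mirrors the DurationAndCount step: record the histogram
// and bump the counter with the same attribute set.
func (r *recorder) durationAndCount(ctx context.Context, pipeline, namespace, status string, d time.Duration) {
	attrs := metric.WithAttributes(
		attribute.String("pipeline", pipeline),
		attribute.String("namespace", namespace),
		attribute.String("status", status),
	)
	r.prDuration.Record(ctx, d.Seconds(), attrs)
	r.prTotal.Add(ctx, 1, attrs)
}
```

The recorded values then flow to whichever exporters the configured MeterProvider pushes to, as in the diagram's final step.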
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120+ minutes
Pre-merge checks and finishing touches: ❌ Failed checks (2 warnings, 1 inconclusive)
Pull Request Overview
This PR migrates Tekton Pipeline metrics from OpenCensus to OpenTelemetry while improving configuration and test structure.
- Migration from OpenCensus to OpenTelemetry for metrics collection and reporting
- Simplified test implementations replacing complex metric validation with basic API verification
- Updated configuration structure for observability settings
Reviewed Changes
Copilot reviewed 7 out of 983 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/taskrunmetrics/metrics.go | Migrated from OpenCensus to OpenTelemetry metric instruments |
| pkg/taskrunmetrics/metrics_test.go | Simplified tests to focus on API functionality rather than metric validation |
| pkg/pipelinerunmetrics/metrics.go | Migrated from OpenCensus to OpenTelemetry metric instruments |
| pkg/pipelinerunmetrics/metrics_test.go | Simplified tests to focus on API functionality rather than metric validation |
| go.mod | Updated dependencies for OpenTelemetry support |
| config/config-observability.yaml | Added OpenTelemetry configuration options |
| cmd/webhook/main.go | Removed legacy metrics configuration |
pkg/taskrunmetrics/metrics_test.go
Outdated
t.Error("Current TaskRunsCount recording expected to return error but got nil") | ||
} | ||
if err := metrics.RecordPodLatency(ctx, nil, nil); err == nil { | ||
if err := r.RecordPodLatency(ctx, &v1.TaskRun{}, &corev1.Pod{}); err == nil { |
The parameter order in the test call doesn't match the actual function signature. The function signature shows RecordPodLatency(ctx context.Context, tr *v1.TaskRun, pod *corev1.Pod)
but the test is calling with empty objects. This should use the correct parameter order.
pkg/taskrunmetrics/metrics_test.go
Outdated
```go
	}

	if err := metrics.RecordPodLatency(ctx, td.pod, td.taskRun); td.expectingError && err == nil {
	if err := metrics.RecordPodLatency(ctx, td.taskRun, td.pod); td.expectingError && err == nil {
```
The parameter order is incorrect here. Based on the function signature change in the main file, this should be RecordPodLatency(ctx, td.taskRun, td.pod), but the original code was RecordPodLatency(ctx, td.pod, td.taskRun). The test should match the actual function signature.
```go
	r.mutex.Lock()
	defer r.mutex.Unlock()

	trs, err := lister.List(labels.Everything())
	if err != nil {
		return err
	}
```
The mutex is locked at the beginning of the function but the lister.List()
call could potentially be expensive and doesn't need to be protected by the mutex. Consider moving the mutex lock after the list operation to reduce lock contention.
Suggested change:

```go
	trs, err := lister.List(labels.Everything())
	if err != nil {
		return err
	}
	r.mutex.Lock()
	defer r.mutex.Unlock()
```
```go
	r.mutex.Lock()
	defer r.mutex.Unlock()

	prs, err := lister.List(labels.Everything())
```
Similar to the TaskRun metrics, the expensive lister.List()
operation is being performed while holding the mutex lock. This could cause unnecessary blocking and should be moved outside the critical section.
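Both mutex comments ask for the same reordering; a small sketch of the pattern follows, using a stand-in lister interface since the generated Tekton listers are not shown in this hunk (type and field names are illustrative):

```go
package metricsexample

import (
	"sync"

	"k8s.io/apimachinery/pkg/labels"
)

// runLister stands in for the generated PipelineRun/TaskRun lister.
type runLister interface {
	List(selector labels.Selector) ([]runInfo, error)
}

type runInfo struct{ done bool }

type recorder struct {
	mutex        sync.Mutex
	runningCount int
}

// countRunning does the potentially expensive informer-cache read before
// taking the lock, then locks only while mutating recorder state.
func (r *recorder) countRunning(lister runLister) error {
	runs, err := lister.List(labels.Everything()) // no lock held here
	if err != nil {
		return err
	}
	running := 0
	for _, run := range runs {
		if !run.done {
			running++
		}
	}

	r.mutex.Lock()
	defer r.mutex.Unlock()
	r.runningCount = running // metric emission omitted; see the gauge discussion later in this review
	return nil
}
```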
19e4c65 to 103b041 (Compare)
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
vendor/github.com/prometheus/procfs/Makefile.common (1)
284-289: Pin govulncheck for reproducibility.

install-govulncheck pulls golang.org/x/vuln/cmd/govulncheck@latest, so every new upstream release changes tool behavior under us. This breaks the reproducible-toolchain guarantee the rest of the Makefile provides (note how promu and golangci-lint are version-pinned). Please lock govulncheck to a specific tag and update intentionally when needed.

vendor/github.com/gorilla/websocket/client.go (1)
303-323: Don't reuse TLSClientConfig for proxy handshakes

When the proxy URL is HTTPS we now clone d.TLSClientConfig both while establishing the tunnel (via netDialFromURL) and again here for the post-CONNECT TLS. In many deployments TLSClientConfig.RootCAs is trimmed down to only trust the backend service; reusing that same config for the proxy TLS session causes the CONNECT step to fail with "x509: unknown authority" (or forces operators to add the proxy's CA, widening trust). This is a regression compared to the previous HTTP-proxy path where the custom CA bundle only applied to the origin server. We need a separate (or default) TLS config for the proxy leg so that backend-specific settings don't break HTTPS proxies. (go.dev)

vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go (1)
473-491: Error messages should list the actually offered formats, not the defaults; unify wording.

When zstd is requested but unavailable (or an unknown encoding is selected), the error currently prints defaultCompressionFormats(), which may differ from the user-provided OfferedCompressions. Also the message mixes "not implemented yet" and "not recognized". Prefer "not supported" and show the current offers.

```diff
 import (
 	"compress/gzip"
 	"errors"
 	"fmt"
 	"io"
 	"net/http"
 	"strconv"
+	"strings"
 	"sync"
 	"time"
 )
@@
 	switch selected {
 	case "zstd":
 		if internal.NewZstdWriter == nil {
-			// The content encoding was not implemented yet.
-			return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+			// The requested encoding is not supported in this build.
+			return nil, "", func() {}, fmt.Errorf("content compression format not supported: %s. Valid formats are: %s", selected, strings.Join(compressions, ", "))
 		}
 		writer, closeWriter, err := internal.NewZstdWriter(rw)
 		return writer, selected, closeWriter, err
@@
 	default:
-		// The content encoding was not implemented yet.
-		return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+		// The requested encoding is not supported.
+		return nil, "", func() {}, fmt.Errorf("content compression format not supported: %s. Valid formats are: %s", selected, strings.Join(compressions, ", "))
 	}
```
🧹 Nitpick comments (11)
vendor/github.com/prometheus/procfs/internal/util/sysreadfile.go (1)
60-60: Trim once to avoid redundant work.

SysReadFile already returns a trimmed string, so the extra strings.TrimSpace(string(data)) re-trims the same value. Dropping the second trim keeps the helpers lean without changing behavior.

```diff
-	return strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
+	return strconv.ParseUint(data, 10, 64)
```

```diff
-	return strconv.ParseInt(strings.TrimSpace(string(data)), 10, 64)
+	return strconv.ParseInt(data, 10, 64)
```

Also applies to: 69-69
vendor/github.com/go-logr/zapr/README.md (1)
39-62: Add Go language hint and swap hard tabs for spaces in slog example.

The slog example still triggers markdownlint (missing fenced language and hard tabs). Let's tag the block as Go and reindent with spaces so CI stays clean.

````diff
-```
+```go
 package main
 
 import (
-	"fmt"
-	"log/slog"
-
-	"github.com/go-logr/logr/slogr"
-	"github.com/go-logr/zapr"
-	"go.uber.org/zap"
+    "fmt"
+    "log/slog"
+
+    "github.com/go-logr/logr/slogr"
+    "github.com/go-logr/zapr"
+    "go.uber.org/zap"
 )
 
 func main() {
-	var log *slog.Logger
+    var log *slog.Logger
 
-	zapLog, err := zap.NewDevelopment()
-	if err != nil {
-		panic(fmt.Sprintf("who watches the watchmen (%v)?", err))
-	}
-	log = slog.New(slogr.NewSlogHandler(zapr.NewLogger(zapLog)))
+    zapLog, err := zap.NewDevelopment()
+    if err != nil {
+        panic(fmt.Sprintf("who watches the watchmen (%v)?", err))
+    }
+    log = slog.New(slogr.NewSlogHandler(zapr.NewLogger(zapLog)))
 
-	log.Info("Logr in action!", "the answer", 42)
+    log.Info("Logr in action!", "the answer", 42)
 }
````

pkg/pipelinerunmetrics/metrics_test.go (1)
220-247: Trim unused expectations from the table.

expectedPipelineTag, expectedPipelinerunTag, expectedNamespaceTag, and expectedCount are no longer referenced after the OTEL refactor, so they now act as dead weight in the table definition. Dropping them keeps the test intent obvious and avoids implying we still verify those values.

```diff
-		expectedPipelineTag    string
-		expectedPipelinerunTag string
-		expectedNamespaceTag   string
-		expectedCount          int64
```

vendor/github.com/prometheus/client_golang/prometheus/promhttp/internal/compression.go (1)
20-21: Document the NewZstdWriter contract (nil semantics, lifecycle) and consider a named factory type.

Make it explicit that it is nil unless zstd is compiled in, and that callers must always invoke closeWriter(). A small named type also improves readability.

```diff
 package internal
 
 import (
 	"io"
 )
 
-// NewZstdWriter enables zstd write support if non-nil.
-var NewZstdWriter func(rw io.Writer) (_ io.Writer, closeWriter func(), _ error)
+// ZstdWriterFactory returns a writer that applies Zstandard compression to rw.
+// The returned closeWriter must be called exactly once to flush and release resources.
+type ZstdWriterFactory func(rw io.Writer) (_ io.Writer, closeWriter func(), _ error)
+
+// NewZstdWriter is nil unless zstd support is built in (set by platform/tag-specific files).
+// When non-nil, use it to wrap the response writer; always call the returned closeWriter().
+var NewZstdWriter ZstdWriterFactory
```

vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go (4)
68-74: defaultCompressionFormats: clarify zstd availability in doc or name.

Since zstd is included only when internal.NewZstdWriter != nil, add a short comment to reflect that to avoid confusion at call sites.

```diff
-func defaultCompressionFormats() []Compression {
+// defaultCompressionFormats returns formats supported by this build (zstd only if available).
+func defaultCompressionFormats() []Compression {
```
145-154: Filter out "zstd" from OfferedCompressions when zstd is unavailable.

If users configure OfferedCompressions to include zstd while internal.NewZstdWriter is nil, negotiation will pick zstd, then we error and fall back to identity. Proactively filtering avoids noisy logs and an error path in normal operation.

```diff
-	if !opts.DisableCompression {
-		offers := defaultCompressionFormats()
-		if len(opts.OfferedCompressions) > 0 {
-			offers = opts.OfferedCompressions
-		}
-		for _, comp := range offers {
-			compressions = append(compressions, string(comp))
-		}
-	}
+	if !opts.DisableCompression {
+		offers := defaultCompressionFormats()
+		if len(opts.OfferedCompressions) > 0 {
+			offers = opts.OfferedCompressions
+		}
+		for _, comp := range offers {
+			if comp == Zstd && internal.NewZstdWriter == nil {
+				// Skip zstd if the factory is not available in this build.
+				continue
+			}
+			compressions = append(compressions, string(comp))
+		}
+	}
```
391-398: Doc nit: default OfferedCompressions now depends on zstd availability.

Comment says "defaults to identity, gzip and zstd," but zstd is only available when the factory is present. Suggest reflecting this to avoid surprising users.

```diff
-	// OfferedCompressions is a set of encodings (compressions) handler will
-	// try to offer when negotiating with the client. This defaults to identity, gzip
-	// and zstd.
+	// OfferedCompressions is a set of encodings (compressions) the handler will
+	// try to offer when negotiating with the client. Defaults to identity and gzip,
+	// and zstd only when this build includes zstd support.
```
423-437: New option: add brief operational guidance.

Consider noting that enabling created samples can increase series cardinality and scrape size; recommend enabling only with scrapers that support zero-ingestion of created timestamps (e.g., Prometheus >= 2.50 with the feature flag).
vendor/github.com/grafana/regexp/exec.go (1)
275-279: Panics on unexpected opcodes lack context.

panic("bad inst") / "unhandled" provide little diagnostics in production crashes. Consider enriching the panic with opcode/pc (or guard with build tags for debug-only). Given vendoring, this can be deferred.
Also applies to: 335-337, 448-450
vendor/github.com/grafana/regexp/regexp.go (2)
1261-1283: Avoid shadowing the strings package in Split().

Local variable named strings shadows the imported package, reducing readability. Rename to parts.

```diff
-	strings := make([]string, 0, len(matches))
+	parts := make([]string, 0, len(matches))
@@
-		if n > 0 && len(strings) >= n-1 {
+		if n > 0 && len(parts) >= n-1 {
 			break
 		}
@@
-		if match[1] != 0 {
-			strings = append(strings, s[beg:end])
+		if match[1] != 0 {
+			parts = append(parts, s[beg:end])
 		}
 		beg = match[1]
@@
-	if end != len(s) {
-		strings = append(strings, s[beg:])
+	if end != len(s) {
+		parts = append(parts, s[beg:])
 	}
-	return strings
+	return parts
```
989-996: Prefer a non-reserved identifier instead of a 'rune' variable.

Using rune as a variable name shadows the widely-used type alias, hurting readability. Rename to r and use r, size := utf8.DecodeRuneInString(...).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (8)
go.sum
is excluded by!**/*.sum
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service.pb.go
is excluded by!**/*.pb.go
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service.pb.gw.go
is excluded by!**/*.pb.gw.go
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service_grpc.pb.go
is excluded by!**/*.pb.go
vendor/go.opentelemetry.io/proto/otlp/metrics/v1/metrics.pb.go
is excluded by!**/*.pb.go
vendor/k8s.io/api/apps/v1/generated.pb.go
is excluded by!**/*.pb.go
vendor/k8s.io/api/apps/v1beta1/generated.pb.go
is excluded by!**/*.pb.go
vendor/k8s.io/api/apps/v1beta2/generated.pb.go
is excluded by!**/*.pb.go
📒 Files selected for processing (107)
cmd/webhook/main.go
(0 hunks)config/config-observability.yaml
(2 hunks)go.mod
(8 hunks)pkg/pipelinerunmetrics/metrics.go
(3 hunks)pkg/pipelinerunmetrics/metrics_test.go
(5 hunks)pkg/reconciler/pipelinerun/pipelinerun.go
(1 hunks)pkg/taskrunmetrics/metrics.go
(7 hunks)pkg/taskrunmetrics/metrics_test.go
(9 hunks)vendor/github.com/go-logr/logr/slogr/slogr.go
(1 hunks)vendor/github.com/go-logr/zapr/.gitignore
(1 hunks)vendor/github.com/go-logr/zapr/.golangci.yaml
(1 hunks)vendor/github.com/go-logr/zapr/LICENSE
(1 hunks)vendor/github.com/go-logr/zapr/README.md
(1 hunks)vendor/github.com/go-logr/zapr/slogzapr.go
(1 hunks)vendor/github.com/go-logr/zapr/zapr.go
(1 hunks)vendor/github.com/go-logr/zapr/zapr_noslog.go
(1 hunks)vendor/github.com/go-logr/zapr/zapr_slog.go
(1 hunks)vendor/github.com/golang/protobuf/proto/buffer.go
(0 hunks)vendor/github.com/golang/protobuf/proto/defaults.go
(0 hunks)vendor/github.com/golang/protobuf/proto/deprecated.go
(0 hunks)vendor/github.com/golang/protobuf/proto/discard.go
(0 hunks)vendor/github.com/golang/protobuf/proto/extensions.go
(0 hunks)vendor/github.com/golang/protobuf/proto/properties.go
(0 hunks)vendor/github.com/golang/protobuf/proto/proto.go
(0 hunks)vendor/github.com/golang/protobuf/proto/registry.go
(0 hunks)vendor/github.com/golang/protobuf/proto/text_decode.go
(0 hunks)vendor/github.com/golang/protobuf/proto/text_encode.go
(0 hunks)vendor/github.com/golang/protobuf/proto/wire.go
(0 hunks)vendor/github.com/golang/protobuf/proto/wrappers.go
(0 hunks)vendor/github.com/google/gofuzz/.travis.yml
(0 hunks)vendor/github.com/google/gofuzz/CONTRIBUTING.md
(0 hunks)vendor/github.com/google/gofuzz/fuzz.go
(0 hunks)vendor/github.com/gorilla/websocket/README.md
(2 hunks)vendor/github.com/gorilla/websocket/client.go
(7 hunks)vendor/github.com/gorilla/websocket/compression.go
(1 hunks)vendor/github.com/gorilla/websocket/conn.go
(19 hunks)vendor/github.com/gorilla/websocket/proxy.go
(3 hunks)vendor/github.com/gorilla/websocket/server.go
(6 hunks)vendor/github.com/gorilla/websocket/tls_handshake.go
(0 hunks)vendor/github.com/gorilla/websocket/tls_handshake_116.go
(0 hunks)vendor/github.com/gorilla/websocket/x_net_proxy.go
(0 hunks)vendor/github.com/grafana/regexp/.gitignore
(1 hunks)vendor/github.com/grafana/regexp/LICENSE
(1 hunks)vendor/github.com/grafana/regexp/README.md
(1 hunks)vendor/github.com/grafana/regexp/backtrack.go
(1 hunks)vendor/github.com/grafana/regexp/exec.go
(1 hunks)vendor/github.com/grafana/regexp/onepass.go
(1 hunks)vendor/github.com/grafana/regexp/regexp.go
(1 hunks)vendor/github.com/openzipkin/zipkin-go/LICENSE
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/annotation.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/doc.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/endpoint.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/kind.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/span.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/span_id.go
(0 hunks)vendor/github.com/openzipkin/zipkin-go/model/traceid.go
(0 hunks)vendor/github.com/prometheus/client_golang/prometheus/collectorfunc.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/desc.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/go_collector_latest.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/histogram.go
(6 hunks)vendor/github.com/prometheus/client_golang/prometheus/internal/difflib.go
(6 hunks)vendor/github.com/prometheus/client_golang/prometheus/internal/go_runtime_metrics.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/metric.go
(3 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector.go
(2 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.c
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_nocgo_darwin.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_not_supported.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_procfsenabled.go
(2 hunks)vendor/github.com/prometheus/client_golang/prometheus/process_collector_windows.go
(2 hunks)vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go
(7 hunks)vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go
(1 hunks)vendor/github.com/prometheus/client_golang/prometheus/promhttp/internal/compression.go
(2 hunks)vendor/github.com/prometheus/client_golang/prometheus/summary.go
(4 hunks)vendor/github.com/prometheus/client_golang/prometheus/vec.go
(5 hunks)vendor/github.com/prometheus/client_golang/prometheus/wrap.go
(2 hunks)vendor/github.com/prometheus/common/expfmt/text_parse.go
(1 hunks)vendor/github.com/prometheus/common/model/alert.go
(1 hunks)vendor/github.com/prometheus/common/model/labels.go
(2 hunks)vendor/github.com/prometheus/common/model/metric.go
(2 hunks)vendor/github.com/prometheus/common/model/time.go
(3 hunks)vendor/github.com/prometheus/otlptranslator/.gitignore
(1 hunks)vendor/github.com/prometheus/otlptranslator/.golangci.yml
(1 hunks)vendor/github.com/prometheus/otlptranslator/CODE_OF_CONDUCT.md
(1 hunks)vendor/github.com/prometheus/otlptranslator/LICENSE
(0 hunks)vendor/github.com/prometheus/otlptranslator/MAINTAINERS.md
(1 hunks)vendor/github.com/prometheus/otlptranslator/README.md
(1 hunks)vendor/github.com/prometheus/otlptranslator/SECURITY.md
(1 hunks)vendor/github.com/prometheus/otlptranslator/constants.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/metric_namer.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/metric_type.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/normalize_label.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/strconv.go
(1 hunks)vendor/github.com/prometheus/otlptranslator/unit_namer.go
(1 hunks)vendor/github.com/prometheus/procfs/.golangci.yml
(1 hunks)vendor/github.com/prometheus/procfs/Makefile.common
(4 hunks)vendor/github.com/prometheus/procfs/README.md
(1 hunks)vendor/github.com/prometheus/procfs/arp.go
(1 hunks)vendor/github.com/prometheus/procfs/fs.go
(1 hunks)vendor/github.com/prometheus/procfs/fs_statfs_notype.go
(1 hunks)vendor/github.com/prometheus/procfs/fscache.go
(2 hunks)vendor/github.com/prometheus/procfs/internal/fs/fs.go
(1 hunks)vendor/github.com/prometheus/procfs/internal/util/parse.go
(2 hunks)vendor/github.com/prometheus/procfs/internal/util/sysreadfile.go
(2 hunks)vendor/github.com/prometheus/procfs/mdstat.go
(1 hunks)vendor/github.com/prometheus/procfs/meminfo.go
(11 hunks)
⛔ Files not processed due to max files limit (41)
- vendor/github.com/prometheus/procfs/mountstats.go
- vendor/github.com/prometheus/procfs/net_dev_snmp6.go
- vendor/github.com/prometheus/procfs/net_ip_socket.go
- vendor/github.com/prometheus/procfs/net_protocols.go
- vendor/github.com/prometheus/procfs/net_tcp.go
- vendor/github.com/prometheus/procfs/net_unix.go
- vendor/github.com/prometheus/procfs/proc.go
- vendor/github.com/prometheus/procfs/proc_cgroup.go
- vendor/github.com/prometheus/procfs/proc_io.go
- vendor/github.com/prometheus/procfs/proc_netstat.go
- vendor/github.com/prometheus/procfs/proc_smaps.go
- vendor/github.com/prometheus/procfs/proc_snmp.go
- vendor/github.com/prometheus/procfs/proc_snmp6.go
- vendor/github.com/prometheus/procfs/proc_stat.go
- vendor/github.com/prometheus/procfs/proc_statm.go
- vendor/github.com/prometheus/procfs/proc_status.go
- vendor/github.com/prometheus/procfs/proc_sys.go
- vendor/github.com/prometheus/procfs/softirqs.go
- vendor/go.opencensus.io/plugin/ochttp/client.go
- vendor/go.opencensus.io/plugin/ochttp/client_stats.go
- vendor/go.opencensus.io/plugin/ochttp/doc.go
- vendor/go.opencensus.io/plugin/ochttp/propagation/b3/b3.go
- vendor/go.opencensus.io/plugin/ochttp/propagation/tracecontext/propagation.go
- vendor/go.opencensus.io/plugin/ochttp/route.go
- vendor/go.opencensus.io/plugin/ochttp/server.go
- vendor/go.opencensus.io/plugin/ochttp/span_annotating_client_trace.go
- vendor/go.opencensus.io/plugin/ochttp/stats.go
- vendor/go.opencensus.io/plugin/ochttp/trace.go
- vendor/go.opencensus.io/plugin/ochttp/wrapped_body.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconv/env.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconv/gen.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconv/httpconv.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconv/util.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/transport.go
- vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/version.go
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/LICENSE
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/doc.go
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/internal/deprecatedruntime/doc.go
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/internal/deprecatedruntime/runtime.go
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/internal/x/README.md
- vendor/go.opentelemetry.io/contrib/instrumentation/runtime/internal/x/x.go
💤 Files with no reviewable changes (28)
- vendor/github.com/golang/protobuf/proto/discard.go
- vendor/github.com/prometheus/otlptranslator/LICENSE
- vendor/github.com/openzipkin/zipkin-go/model/span_id.go
- vendor/github.com/openzipkin/zipkin-go/model/traceid.go
- vendor/github.com/openzipkin/zipkin-go/model/annotation.go
- vendor/github.com/google/gofuzz/CONTRIBUTING.md
- vendor/github.com/openzipkin/zipkin-go/model/span.go
- vendor/github.com/golang/protobuf/proto/deprecated.go
- vendor/github.com/golang/protobuf/proto/wrappers.go
- cmd/webhook/main.go
- vendor/github.com/openzipkin/zipkin-go/LICENSE
- vendor/github.com/golang/protobuf/proto/proto.go
- vendor/github.com/gorilla/websocket/tls_handshake_116.go
- vendor/github.com/golang/protobuf/proto/text_encode.go
- vendor/github.com/openzipkin/zipkin-go/model/kind.go
- vendor/github.com/gorilla/websocket/tls_handshake.go
- vendor/github.com/golang/protobuf/proto/registry.go
- vendor/github.com/openzipkin/zipkin-go/model/endpoint.go
- vendor/github.com/golang/protobuf/proto/extensions.go
- vendor/github.com/golang/protobuf/proto/text_decode.go
- vendor/github.com/golang/protobuf/proto/defaults.go
- vendor/github.com/golang/protobuf/proto/properties.go
- vendor/github.com/openzipkin/zipkin-go/model/doc.go
- vendor/github.com/golang/protobuf/proto/wire.go
- vendor/github.com/gorilla/websocket/x_net_proxy.go
- vendor/github.com/google/gofuzz/.travis.yml
- vendor/github.com/google/gofuzz/fuzz.go
- vendor/github.com/golang/protobuf/proto/buffer.go
🧰 Additional context used
🧬 Code graph analysis (6)
vendor/github.com/prometheus/client_golang/prometheus/metric.go (1)
vendor/github.com/prometheus/client_golang/prometheus/histogram.go (1)
Histogram
(249-261)
pkg/pipelinerunmetrics/metrics.go (3)
pkg/apis/pipeline/v1/pipelinerun_types.go (3)
PipelineRunReasonCancelled
(345-345)PipelineRun
(51-60)PipelineRun
(73-75)pkg/apis/config/metrics.go (1)
Metrics
(101-109)pkg/taskrunmetrics/metrics.go (3)
Recorder
(46-65)NewRecorder
(74-91)OnStore
(191-207)
vendor/github.com/prometheus/client_golang/prometheus/histogram.go (1)
vendor/github.com/prometheus/client_golang/prometheus/metric.go (1)
Metric
(33-59)
pkg/taskrunmetrics/metrics_test.go (2)
pkg/pipelinerunmetrics/metrics.go (2)
Recorder
(47-63)NewRecorder
(72-88)pkg/taskrunmetrics/metrics.go (2)
Recorder
(46-65)NewRecorder
(74-91)
pkg/taskrunmetrics/metrics.go (4)
pkg/apis/config/metrics.go (7)
Metrics
(101-109)TaskrunLevelAtTaskrun
(52-52)TaskrunLevelAtTask
(54-54)TaskrunLevelAtNS
(56-56)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/pipelinerunmetrics/metrics.go (3)
Recorder
(47-63)NewRecorder
(72-88)OnStore
(312-328)pkg/apis/pipeline/register.go (4)
PipelineTaskLabelKey
(47-47)TaskLabelKey
(29-29)PipelineLabelKey
(38-38)PipelineRunLabelKey
(41-41)pkg/pod/status.go (2)
ReasonExceededResourceQuota
(60-60)ReasonExceededNodeResources
(64-64)
pkg/pipelinerunmetrics/metrics_test.go (2)
pkg/pipelinerunmetrics/metrics.go (1)
NewRecorder
(72-88)pkg/apis/config/metrics.go (3)
PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)
🪛 ast-grep (0.39.5)
vendor/github.com/gorilla/websocket/client.go
[warning] 501-501: MinVersion is missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported and insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. Add MinVersion: tls.VersionTLS13 to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures
(missing-ssl-minversion-go)
🪛 markdownlint-cli2 (0.18.1)
vendor/github.com/go-logr/zapr/README.md
39-39: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
43-43: Hard tabs
Column: 1
(MD010, no-hard-tabs)
44-44: Hard tabs
Column: 1
(MD010, no-hard-tabs)
46-46: Hard tabs
Column: 1
(MD010, no-hard-tabs)
47-47: Hard tabs
Column: 1
(MD010, no-hard-tabs)
48-48: Hard tabs
Column: 1
(MD010, no-hard-tabs)
52-52: Hard tabs
Column: 1
(MD010, no-hard-tabs)
54-54: Hard tabs
Column: 1
(MD010, no-hard-tabs)
55-55: Hard tabs
Column: 1
(MD010, no-hard-tabs)
56-56: Hard tabs
Column: 1
(MD010, no-hard-tabs)
57-57: Hard tabs
Column: 1
(MD010, no-hard-tabs)
58-58: Hard tabs
Column: 1
(MD010, no-hard-tabs)
60-60: Hard tabs
Column: 1
(MD010, no-hard-tabs)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (27)
vendor/github.com/prometheus/procfs/README.md (1)
50-61
: Documentation update looks correct.The refreshed fixture paths now point to
testdata/fixtures
, matching the relocated test assets and keeping the workflow instructions consistent.vendor/github.com/prometheus/common/model/alert.go (1)
68-69
: Comment clarification looks good.The updated wording makes the ResolvedAt contract clearer without changing behavior.
vendor/github.com/prometheus/otlptranslator/.gitignore (1)
1-25
: Gitignore patterns look goodMatches the standard Go template; no concerns here.
vendor/github.com/prometheus/procfs/fscache.go (1)
165-179
: Comment spelling aligns with US English.Thanks for standardizing the comment spelling. No functional impact, and this keeps terminology consistent.
vendor/github.com/prometheus/client_golang/prometheus/desc.go (1)
192-200
: Nil-guard tightensDesc.String()
safetyThe extra guard keeps us from dereferencing
d.variableLabels
when it’s nil, soString()
can no longer panic for descriptors without variable labels. Nice defensive hardening.vendor/github.com/prometheus/client_golang/prometheus/go_collector_latest.go (1)
291-292
: Help text punctuation looks goodAppending the trailing period gives the help string a tidy, sentence-like ending without affecting behavior. 👍
vendor/github.com/prometheus/common/expfmt/text_parse.go (1)
348-349
: Logic equivalence preserved.Rewriting the guard with De Morgan’s law keeps the filtering behavior identical while improving readability. Looks good.
vendor/github.com/prometheus/client_golang/prometheus/vec.go (1)
82-83
: Embedded forwarding reads cleanlySwitching to the promoted metricMap helpers keeps the behavior identical while dropping the extra field hop, so MetricVec still exposes Delete* and GetMetricWith* exactly as documented. (pkg.go.dev)
Also applies to: 104-105, 117-118, 219-220, 247-248
vendor/github.com/prometheus/procfs/mdstat.go (1)
126-135
: Recognition of reshape state looks good.Adding the reshape detection keeps the parser aligned with the kernel’s mdstat output and preserves the existing state precedence logic.
vendor/github.com/prometheus/procfs/Makefile.common (2)
160-166
: Do not hard-fail when golangci-lint is skipped.
common-format
now always depends on$(GOLANGCI_LINT)
. WhenSKIP_GOLANGCI_LINT
or CI detection blanks this variable, the dependency list collapses to empty, so the target still runs. However, inside the recipe we call$(GOLANGCI_LINT) fmt
only if the variable is non-empty—good. No action required.
256-259
: Target rename is compatible.Defining
common-proto
keepsmake proto
working thanks to the%: common-%
forwarding rule earlier in the file. The new name also matches the convention used throughout this Makefile. Looks good.vendor/github.com/prometheus/client_golang/prometheus/summary.go (1)
229-247
: Time source injection keeps default behavior intact.Defaulting
opts.now
totime.Now
preserves prior semantics, and persisting it on the struct gives tests a clean hook for deterministic timing. Looks solid.pkg/taskrunmetrics/metrics.go (1)
318-321: Release the mutex before calling the lister.

Locking r.mutex while lister.List(...) walks the informer cache needlessly serializes every caller and keeps the metrics instruments/config locked through a potentially heavy read. Please keep the existing pattern from the earlier review: fetch trs first, then lock just long enough to update the recorder state. (Same concern was flagged previously.)

pkg/taskrunmetrics/metrics_test.go (2)
256-260: Fix RecordPodLatency argument order

Same issue as above: the test should call RecordPodLatency with the TaskRun before the Pod. Leaving it as-is misorders the parameters and breaks compilation.

```diff
-	if err := metrics.RecordPodLatency(ctx, td.pod, td.taskRun); td.expectingError && err == nil {
+	if err := metrics.RecordPodLatency(ctx, td.taskRun, td.pod); td.expectingError && err == nil {
```
72-73: Fix RecordPodLatency argument order

Recorder.RecordPodLatency takes (ctx, taskRun, pod). Passing the pod first violates the signature and will not compile once wired up against the implementation. Swap the arguments here.

```diff
-	if err := r.RecordPodLatency(ctx, &corev1.Pod{}, &v1.TaskRun{}); err == nil {
+	if err := r.RecordPodLatency(ctx, &v1.TaskRun{}, &corev1.Pod{}); err == nil {
```

pkg/pipelinerunmetrics/metrics.go (1)
205-218: Keep lister.List() outside the critical section

This mirrors the earlier review feedback: we still invoke lister.List(labels.Everything()) while holding r.mutex, which can block other recorders waiting on the same lock while we hit the informer cache. Please gather the list before taking the mutex and only lock while updating recorder state.

vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_nocgo_darwin.go (1)
18-20
: Graceful no-cgo fallback confirmed.Returning
errNotImplemented
here cleanly lines up with theprocessCollect
guard, so Darwin builds without cgo skip the memory gauges without spurious errors.vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.go (1)
24-32
: Cgo-backed memory collection looks solid.Bridging through
get_memory_info
and normalizing intomemoryInfo
aligns with the caller’s expectations, giving accurate RSS/VSize data when cgo is available.vendor/github.com/prometheus/otlptranslator/.golangci.yml (1)
17-37
: nilnesserr is supported by the CI’s golangci-lint version
CI installs golangci-lint v2.1.6 (via the GitHub Actionversion: v2.1.6
), which includes thenilnesserr
linter, so no version bump or hold-off is needed.vendor/github.com/prometheus/otlptranslator/metric_type.go (1)
21-35
: Metric type enumeration aligns with OTel model.The exported constants mirror the OpenTelemetry metric kinds and look good.
vendor/github.com/gorilla/websocket/compression.go (1)
36-40
: Pooling fallback looks solid.The extra defensive Reset handling keeps the pooled reader usable without risking panics if the contract changes; nice catch.
vendor/github.com/gorilla/websocket/conn.go (1)
183-193
: Mask key generation upgrade looks good.Switching to
crypto/rand
for mask generation is a nice improvement for compliance without affecting tests thanks to the overridable reader.vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go (2)
201-212
: Minor: ensure closeWriter is always set to a no-op on error.negotiateEncodingWriter already returns a no-op closer on error, so this is safe. No action needed; just calling out that this invariant is relied upon.
If you prefer defensive code, you can set closeWriter = func(){} in the error branch.
217-222
: No gating required: WithCreatedLines exists and is no-op outside OpenMetrics formats
Vendored expfmt defines WithCreatedLines (vendor/github.com/prometheus/common/expfmt/openmetrics_create.go:49) and its encode logic ignores extra options for non-OpenMetrics formats (vendor/github.com/prometheus/common/expfmt/encode.go:148–149).vendor/github.com/grafana/regexp/exec.go (2)
333-374
: Sparse queue indexing relies on pre-sized slices.add() indexes q.sparse[pc] without bounds checks. This is fine only if q.sparse is sized to len(prog.Inst). get() appears to ensure that, but please keep these coupled: any future changes to machine sizing must preserve this invariant.
521-550
: Verified referenced symbols and LICENSE presenceConfirmed
backtrack
,compileOnePass
,onePassPrefix
,onePassNext
, typeonePassProg
, and theLICENSE
file are all present invendor/github.com/grafana/regexp
. No further action needed.vendor/github.com/grafana/regexp/regexp.go (1)
820-841: Parity check with stdlib regexp (behavioral regressions).

Given this reimplementation mirrors Go's regexp, please run a small parity suite comparing Find*/Split/Replace* against stdlib for representative cases (ASCII/UTF-8, empty matches, prefixes, named groups). This will catch subtle divergences. I can provide scaffolding if helpful.
Also applies to: 843-867, 1028-1073, 1149-1190, 1191-1233
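As a sketch of that scaffolding (patterns and inputs are illustrative, not an exhaustive suite), a parity test comparing the vendored fork against the stdlib could look like:

```go
package regexpparity_test

import (
	stdregexp "regexp"
	"testing"

	grafanaregexp "github.com/grafana/regexp"
)

// TestFindStringIndexParity compares FindStringIndex between the fork and the
// stdlib for matches that start after index 0 or follow a literal prefix.
func TestFindStringIndexParity(t *testing.T) {
	cases := []struct{ pattern, input string }{
		{`foo\d+`, "xx foo42 yy"}, // match starts after index 0
		{`abc(x|y)`, "zzabcy"},    // literal prefix skipping
		{`\p{L}+`, "123 héllo"},   // UTF-8 input
		{`a*`, "bbb"},             // empty match
	}
	for _, c := range cases {
		want := stdregexp.MustCompile(c.pattern).FindStringIndex(c.input)
		got := grafanaregexp.MustCompile(c.pattern).FindStringIndex(c.input)
		if (want == nil) != (got == nil) || (want != nil && (want[0] != got[0] || want[1] != got[1])) {
			t.Errorf("pattern %q on %q: stdlib=%v fork=%v", c.pattern, c.input, want, got)
		}
	}
}
```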
```go
	r.runningTRsGauge.Add(ctx, float64(runningTrs))
	r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota))
	r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode))
```
Fix running TaskRun gauges to add deltas, not totals.
Float64UpDownCounter.Add
expects per-call deltas. Adding the full count on every scrape makes the exported value soar cumulatively (e.g., 5 running TaskRuns sampled twice yields 10). That breaks the “current number running” metric and any alerts derived from it. Please either switch to an asynchronous gauge/observable callback or maintain the previous sample and add only the delta before updating the cached value. For example:
type Recorder struct {
runningTRsThrottledByNodeGauge metric.Float64UpDownCounter
+ lastRunningTRs int
+ lastTrsThrottledByQuota int
+ lastTrsThrottledByNode int
}
@@
r.podLatencyHistogram = podLatencyHistogram
+ r.lastRunningTRs = 0
+ r.lastTrsThrottledByQuota = 0
+ r.lastTrsThrottledByNode = 0
@@
- r.runningTRsGauge.Add(ctx, float64(runningTrs))
- r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota))
- r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode))
+ r.runningTRsGauge.Add(ctx, float64(runningTrs-r.lastRunningTRs))
+ r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota-r.lastTrsThrottledByQuota))
+ r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode-r.lastTrsThrottledByNode))
+
+ r.lastRunningTRs = runningTrs
+ r.lastTrsThrottledByQuota = trsThrottledByQuota
+ r.lastTrsThrottledByNode = trsThrottledByNode
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:

```go
	r.runningTRsGauge.Add(ctx, float64(runningTrs))
	r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota))
	r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode))
```

Suggested change:

```go
// pkg/taskrunmetrics/metrics.go
type Recorder struct {
	// existing fields...
	runningTRsGauge                 metric.Float64UpDownCounter
	runningTRsThrottledByQuotaGauge metric.Float64UpDownCounter
	runningTRsThrottledByNodeGauge  metric.Float64UpDownCounter
	lastRunningTRs                  int
	lastTrsThrottledByQuota         int
	lastTrsThrottledByNode          int
	// other fields...
}

func NewRecorder(/* params */) *Recorder {
	r := &Recorder{
		// existing initializations...
		runningTRsGauge:                 /* ... */,
		runningTRsThrottledByQuotaGauge: /* ... */,
		runningTRsThrottledByNodeGauge:  /* ... */,
		podLatencyHistogram:             podLatencyHistogram,
		lastRunningTRs:                  0,
		lastTrsThrottledByQuota:         0,
		lastTrsThrottledByNode:          0,
	}
	return r
}

func (r *Recorder) Record(ctx context.Context, /* params */) error {
	// ... prior code ...
	r.runningTRsGauge.Add(ctx, float64(runningTrs-r.lastRunningTRs))
	r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota-r.lastTrsThrottledByQuota))
	r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode-r.lastTrsThrottledByNode))
	r.lastRunningTRs = runningTrs
	r.lastTrsThrottledByQuota = trsThrottledByQuota
	r.lastTrsThrottledByNode = trsThrottledByNode
	// ... remaining code ...
}
```
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics.go around lines 344-347, the code is calling
Float64UpDownCounter.Add with the full totals which results in cumulative
inflation; change this to emit deltas instead: either replace these counters
with asynchronous observable gauges (register callbacks that return the current
running/throttled counts) or keep a small cached previous value per metric and
call Add(ctx, current-prev) then update the cache; implement one of these two
approaches and remove the existing direct Add(ctx, float64(total)) calls so only
the per-scrape delta is added.
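For comparison, here is a minimal sketch of the first option named above: an asynchronous observable gauge whose callback reports the current snapshot, so no delta bookkeeping is needed. The instrument name, scope name, and field names are illustrative assumptions, not the PR's actual code.

```go
package metricsexample

import (
	"context"
	"sync/atomic"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/metric"
)

type recorder struct {
	runningTRs atomic.Int64 // updated by the periodic lister scan
}

func newRecorder() (*recorder, error) {
	r := &recorder{}
	meter := otel.Meter("taskrun-metrics") // scope name is illustrative

	gauge, err := meter.Int64ObservableGauge(
		"running_taskruns",
		metric.WithDescription("Number of currently running TaskRuns"),
	)
	if err != nil {
		return nil, err
	}
	// The SDK invokes this callback on each collection; we observe the latest
	// snapshot instead of calling Add with running totals.
	if _, err := meter.RegisterCallback(func(_ context.Context, o metric.Observer) error {
		o.ObserveInt64(gauge, r.runningTRs.Load())
		return nil
	}, gauge); err != nil {
		return nil, err
	}
	return r, nil
}

// setRunning is called from the lister scan with the current total.
func (r *recorder) setRunning(n int64) { r.runningTRs.Store(n) }
```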
```go
		fields = append(fields, zapIt(keyStr, val))
		i += 2
	}
```
Fix go1.21 build: provide a zapIt implementation.

Under Go 1.21, zapr_noslog.go is excluded, so there is no definition of zapIt. This line and other call sites leave zapr.go uncompilable on the very toolchain that introduces slogzapr.go. Please add a go1.21-compatible definition (for example alongside the slog adapter) so the package builds across supported Go versions.
🤖 Prompt for AI Agents
In vendor/github.com/go-logr/zapr/zapr.go around lines 164-166, calls to zapIt
are failing under Go 1.21 because the zapIt implementation in zapr_noslog.go is
excluded; add a go1.21-compatible implementation (either directly in this file
or in a new file without build tags) named zapIt(key string, val interface{})
zap.Field that mirrors expected behavior: convert common types and the
slog.Value wrapper into appropriate zap.Field values (fallback to zap.Any for
unknown types) and preserve existing semantics used by other call sites so the
package compiles under Go 1.21.
```go
func (b *brNetConn) Read(p []byte) (n int, err error) {
	if b.br != nil {
		// Limit read to buferred data.
		if n := b.br.Buffered(); len(p) > n {
			p = p[:n]
		}
		n, err = b.br.Read(p)
		if b.br.Buffered() == 0 {
			b.br = nil
		}
		return n, err
	}
	return b.Conn.Read(p)
```
Avoid returning 0 bytes with nil error.

When br.Buffered() is zero we now return 0, nil even though the caller's buffer length is >0. That violates the io.Reader contract and can make consumers spin forever, depending on how they treat (0, nil) reads. (typeerror.org)

We should drop through to the underlying connection in that case instead of handing back a zero-length read.
func (b *brNetConn) Read(p []byte) (n int, err error) {
if b.br != nil {
- // Limit read to buferred data.
- if n := b.br.Buffered(); len(p) > n {
- p = p[:n]
- }
- n, err = b.br.Read(p)
- if b.br.Buffered() == 0 {
- b.br = nil
- }
- return n, err
+ if buffered := b.br.Buffered(); buffered > 0 {
+ // Limit read to buffered data.
+ if len(p) > buffered {
+ p = p[:buffered]
+ }
+ n, err = b.br.Read(p)
+ if b.br.Buffered() == 0 {
+ b.br = nil
+ }
+ return n, err
+ }
+ b.br = nil
}
return b.Conn.Read(p)
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:

```go
func (b *brNetConn) Read(p []byte) (n int, err error) {
	if b.br != nil {
		// Limit read to buferred data.
		if n := b.br.Buffered(); len(p) > n {
			p = p[:n]
		}
		n, err = b.br.Read(p)
		if b.br.Buffered() == 0 {
			b.br = nil
		}
		return n, err
	}
	return b.Conn.Read(p)
```

Suggested:

```go
func (b *brNetConn) Read(p []byte) (n int, err error) {
	if b.br != nil {
		// Only read from the buffer if there's data available.
		if buffered := b.br.Buffered(); buffered > 0 {
			// Limit read to buffered data.
			if len(p) > buffered {
				p = p[:buffered]
			}
			n, err = b.br.Read(p)
			if b.br.Buffered() == 0 {
				b.br = nil
			}
			return n, err
		}
		// No buffered data—fall back to the underlying connection.
		b.br = nil
	}
	return b.Conn.Read(p)
}
```
🤖 Prompt for AI Agents
In vendor/github.com/gorilla/websocket/server.go around lines 354 to 366 the
Read method can return (0, nil) when b.br.Buffered() == 0 which violates
io.Reader expectations; change the logic so that if Buffered() is zero you clear
b.br and fall through to call and return from the underlying b.Conn.Read(p)
instead of returning 0,nil; otherwise keep the existing behavior of slicing p to
the buffered size, reading from b.br, and clearing b.br when drained.
```go
			matched = true
			if len(m.matchcap) > 0 {
				m.matchcap[0] = 0
				m.matchcap[1] = pos
			}
			goto Return
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Possible wrong match start index in one-pass path (start set to 0).
When InstMatch fires, m.matchcap[0] is set to 0, not the search start pos. For unanchored scans or after prefix skipping, this can report an incorrect start offset.
Suggest:
- Initialize start once to pos before the loop.
- On InstMatch, only set end.
@@
- matched := false
+ matched := false
for i := range m.matchcap {
m.matchcap[i] = -1
}
@@
- i, _ := m.inputs.init(ir, ib, is)
+ i, _ := m.inputs.init(ir, ib, is)
+ if ncap > 0 {
+ m.matchcap[0] = pos
+ }
@@
- case syntax.InstMatch:
- matched = true
- if len(m.matchcap) > 0 {
- m.matchcap[0] = 0
- m.matchcap[1] = pos
- }
+ case syntax.InstMatch:
+ matched = true
+ if len(m.matchcap) > 0 {
+ m.matchcap[1] = pos
+ }
goto Return
Please add a quick test comparing FindStringIndex/FindIndex results vs stdlib regexp for cases where the match starts after index 0 and where a literal prefix is skipped.
Also applies to: 415-417
🏁 Script executed:
#!/usr/bin/env bash
set -e
echo "=== doOnePass definition & context ==="
rg -n 'func doOnePass' -g 'vendor/github.com/grafana/regexp/exec.go' -C5
echo "=== InstMatch case ==="
rg -n 'case .*InstMatch' -g 'vendor/github.com/grafana/regexp/exec.go' -C5
echo "=== matchcap initialization ==="
rg -n 'matchcap\[' -g 'vendor/github.com/grafana/regexp/exec.go' -C5
Length of output: 38
🏁 Script executed:
#!/usr/bin/env bash
set -e
echo "=== snippet m.matchcap[0] ==="
rg -n "m.matchcap\\[0\\] = 0" -C3 .
echo "=== locate doOnePass ==="
rg -n "doOnePass" -C3 .
echo "=== locate InstMatch ==="
rg -n "InstMatch" -C3 .
Length of output: 10485
🏁 Script executed:
#!/usr/bin/env bash
sed -n '390,480p' vendor/github.com/grafana/regexp/exec.go
Length of output: 2088
🏁 Script executed:
#!/usr/bin/env bash
sed -n '480,550p' vendor/github.com/grafana/regexp/exec.go
Length of output: 1527
Record correct start offset in one-pass engine

- After any literal-prefix skip (i.e. immediately after pos += len(re.prefix)), if ncap > 0 set m.matchcap[0] = pos; do not initialize it before the skip.
- In the case syntax.InstMatch block remove m.matchcap[0] = 0 and only assign m.matchcap[1] = pos.
- Add a test verifying that FindStringIndex/FindIndex on a pattern with a literal prefix reports the same start/end indices as the stdlib regexp for matches beyond index 0.
🤖 Prompt for AI Agents
In vendor/github.com/grafana/regexp/exec.go around lines 451-456, the one-pass
engine records the match start offset too early and resets it incorrectly: after
any literal-prefix skip (i.e. after pos += len(re.prefix)) if ncap > 0 set
m.matchcap[0] = pos (do not initialize it before the skip); in the case
syntax.InstMatch remove the line that sets m.matchcap[0] = 0 and only assign
m.matchcap[1] = pos; also add a unit test that runs FindStringIndex/FindIndex on
a pattern with a literal prefix and asserts the start/end indices match the
stdlib regexp for matches starting beyond index 0.
```go
func ParseDurationAllowNegative(s string) (Duration, error) {
	if s == "" || s[0] != '-' {
		return ParseDuration(s)
	}

	d, err := ParseDuration(s[1:])

	return -d, err
}
```
Ensure negative durations round-trip through JSON/YAML.

ParseDurationAllowNegative exposes negative values, and Duration.String now produces "-1m" (and MarshalJSON wraps that), but UnmarshalJSON, UnmarshalText, and UnmarshalYAML still invoke ParseDuration, which bails out on any leading -. As soon as a negative Duration is marshaled or read from config, unmarshalling fails, making the new feature unusable. Please update the unmarshal paths to call the negative-aware helper so serialization remains symmetric.
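A quick illustration of the asymmetry, assuming the negative-aware String()/MarshalJSON behavior described above:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"

	"github.com/prometheus/common/model"
)

func main() {
	d := model.Duration(-time.Minute)
	b, _ := json.Marshal(d) // marshals via String(), producing "-1m"
	fmt.Println(string(b))

	var out model.Duration
	// Fails while UnmarshalJSON still delegates to ParseDuration, which
	// rejects the leading '-'.
	if err := json.Unmarshal(b, &out); err != nil {
		fmt.Println("round-trip failed:", err)
	}
}
```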
Apply this diff:
func (d Duration) MarshalJSON() ([]byte, error) {
return json.Marshal(d.String())
}
func (d *Duration) UnmarshalJSON(bytes []byte) error {
var s string
if err := json.Unmarshal(bytes, &s); err != nil {
return err
}
- dur, err := ParseDuration(s)
+ dur, err := ParseDurationAllowNegative(s)
if err != nil {
return err
}
*d = dur
return nil
}
@@
func (d *Duration) UnmarshalText(text []byte) error {
var err error
- *d, err = ParseDuration(string(text))
+ *d, err = ParseDurationAllowNegative(string(text))
return err
}
@@
func (d *Duration) UnmarshalYAML(unmarshal func(interface{}) error) error {
var s string
if err := unmarshal(&s); err != nil {
return err
}
- dur, err := ParseDuration(s)
+ dur, err := ParseDurationAllowNegative(s)
if err != nil {
return err
}
*d = dur
return nil
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func ParseDurationAllowNegative(s string) (Duration, error) {
	if s == "" || s[0] != '-' {
		return ParseDuration(s)
	}

	d, err := ParseDuration(s[1:])

	return -d, err
}

func (d Duration) MarshalJSON() ([]byte, error) {
	return json.Marshal(d.String())
}

func (d *Duration) UnmarshalJSON(bytes []byte) error {
	var s string
	if err := json.Unmarshal(bytes, &s); err != nil {
		return err
	}
	dur, err := ParseDurationAllowNegative(s)
	if err != nil {
		return err
	}
	*d = dur
	return nil
}

func (d *Duration) UnmarshalText(text []byte) error {
	var err error
	*d, err = ParseDurationAllowNegative(string(text))
	return err
}

func (d *Duration) UnmarshalYAML(unmarshal func(interface{}) error) error {
	var s string
	if err := unmarshal(&s); err != nil {
		return err
	}
	dur, err := ParseDurationAllowNegative(s)
	if err != nil {
		return err
	}
	*d = dur
	return nil
}
```
🤖 Prompt for AI Agents
vendor/github.com/prometheus/common/model/time.go around lines 262-270 and the
UnmarshalJSON/UnmarshalText/UnmarshalYAML methods: currently parsing during
unmarshal uses ParseDuration which rejects leading '-' so negative durations
fail to round-trip; update the unmarshal paths to call
ParseDurationAllowNegative instead of ParseDuration (ensure you strip any
surrounding quotes or whitespace the same way as before) so negative values are
accepted; keep error handling and type assignments identical, only swap the
parser call to the negative-aware helper to restore symmetric marshal/unmarshal
behavior.
```go
	if mn.Namespace != "" {
		namespace := strings.Join(strings.FieldsFunc(mn.Namespace, func(r rune) bool {
			return invalidMetricCharRE.MatchString(string(r))
		}), "_")
		return namespace + "_" + metricName
```
Still guard namespace-prefixed names against leading digits.

When mn.Namespace resolves to something starting with a digit (e.g., "123"), return namespace + "_" + metricName emits 123_metric, which violates Prometheus' [a-zA-Z_:][a-zA-Z0-9_:]* constraint. The digit guard below never runs because we return early. Please funnel the combined name through the same unicode.IsDigit check before returning.
- return namespace + "_" + metricName
+ fullName := namespace + "_" + metricName
+ if fullName != "" && unicode.IsDigit(rune(fullName[0])) {
+ fullName = "_" + fullName
+ }
+ return fullName
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/otlptranslator/metric_namer.go around lines 122
to 126, the function returns the combined namespace + "_" + metricName
immediately which skips the leading-digit guard; change the flow to build the
combined name into a variable (e.g., combined := namespace + "_" + metricName)
and then run the same leading-digit check used elsewhere (inspect the first rune
with unicode.IsDigit and if true prefix an underscore) before returning the
combined value so namespace-prefixed names that start with a digit become valid
Prometheus metric names.
```go
	if namespace != "" {
		nameTokens = append([]string{namespace}, nameTokens...)
	}
```
Normalize namespace tokens before prefixing.

Here we prepend the raw namespace string, so values like "otel-collector" inject - into the supposedly "normalized" name, again breaking the Prometheus regex. We need to run the namespace through the same tokenization as the metric name before appending.
- if namespace != "" {
- nameTokens = append([]string{namespace}, nameTokens...)
- }
+ if namespace != "" {
+ namespaceTokens := strings.FieldsFunc(namespace, func(r rune) bool {
+ return !isValidCompliantMetricChar(r)
+ })
+ if len(namespaceTokens) > 0 {
+ nameTokens = append(namespaceTokens, nameTokens...)
+ }
+ }
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/otlptranslator/metric_namer.go around lines 187
to 189, the code currently prepends the raw namespace string which can contain
invalid characters; instead, run the namespace through the same
tokenization/normalization path used for the metric name (split into tokens,
normalize/lowercase/strip or replace invalid chars, and filter out empty tokens)
and then prepend those normalized namespace tokens to nameTokens; if
tokenization yields no tokens, skip prepending to avoid injecting invalid
characters into the final Prometheus metric name.
if !strings.HasPrefix(hexString, "0x") {
	return 0, errors.New("invalid format: hex string does not start with '0x'")
}
return strconv.ParseUint(hexString[2:], 16, 64)
Allow both 0x/0X hex prefixes.
Procfs writers frequently emit uppercase 0X in addition to lowercase 0x; rejecting the uppercase form turns otherwise valid kernel-provided values into hard failures. (lkml.iu.edu) Please accept both cases before parsing.
- if !strings.HasPrefix(hexString, "0x") {
+ if len(hexString) < 2 || (hexString[:2] != "0x" && hexString[:2] != "0X") {
return 0, errors.New("invalid format: hex string does not start with '0x'")
}
- return strconv.ParseUint(hexString[2:], 16, 64)
+ return strconv.ParseUint(hexString[2:], 16, 64)
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/procfs/internal/util/parse.go around lines 122
to 125, the code only accepts a lowercase "0x" prefix and rejects "0X" which
valid procfs writers may emit; change the prefix check to accept both "0x" and
"0X" (e.g., check hexString has length >=2 and hexString[0]=='0' and
(hexString[1]=='x' || hexString[1]=='X') or normalize the prefix case) and then
call strconv.ParseUint on the slice starting after the two-character prefix;
ensure you still validate length before slicing to avoid panics.
m.Zswapped = &val
m.ZswapBytes = &valBytes
case "Dirty:":
Fix assignment for ZswappedBytes.
Line 311 stores the byte-normalized value for Zswapped: into m.ZswapBytes, leaving ZswappedBytes nil and overwriting the bytes value captured for Zswap:. This breaks consumers that expect ZswappedBytes to be populated. Please assign the value to the correct field.
- case "Zswapped:":
- m.Zswapped = &val
- m.ZswapBytes = &valBytes
+ case "Zswapped:":
+ m.Zswapped = &val
+ m.ZswappedBytes = &valBytes
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/procfs/meminfo.go around lines 309 to 311, the
code assigns the byte-normalized value for "Zswapped:" to m.ZswapBytes instead
of m.ZswappedBytes, leaving m.ZswappedBytes nil and overwriting the Zswap bytes
value; change the assignment so the normalized bytes value is stored in
m.ZswappedBytes (and ensure m.ZswapBytes remains assigned only for the "Zswap:"
key).
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
config/config-observability.yaml (2 hunks)
pkg/pipelinerunmetrics/metrics.go (3 hunks)
pkg/pipelinerunmetrics/metrics_test.go (5 hunks)
pkg/reconciler/pipelinerun/pipelinerun.go (1 hunks)
pkg/taskrunmetrics/metrics.go (7 hunks)
pkg/taskrunmetrics/metrics_test.go (9 hunks)
test/util.go (0 hunks)
💤 Files with no reviewable changes (1)
- test/util.go
🧰 Additional context used
🧬 Code graph analysis (4)
pkg/pipelinerunmetrics/metrics_test.go (3)
  pkg/pipelinerunmetrics/metrics.go (1)
    NewRecorder (72-88)
  pkg/taskrunmetrics/metrics.go (1)
    NewRecorder (74-91)
  pkg/apis/config/metrics.go (3)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
pkg/pipelinerunmetrics/metrics.go (3)
  pkg/apis/pipeline/v1/pipelinerun_types.go (3)
    PipelineRunReasonCancelled (345-345)
    PipelineRun (51-60)
    PipelineRun (73-75)
  pkg/apis/config/metrics.go (4)
    Metrics (101-109)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
  pkg/taskrunmetrics/metrics.go (3)
    Recorder (46-65)
    NewRecorder (74-91)
    OnStore (191-207)
pkg/taskrunmetrics/metrics.go (4)
  pkg/apis/config/metrics.go (7)
    Metrics (101-109)
    TaskrunLevelAtTaskrun (52-52)
    TaskrunLevelAtTask (54-54)
    TaskrunLevelAtNS (56-56)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
  pkg/pipelinerunmetrics/metrics.go (3)
    Recorder (47-63)
    NewRecorder (72-88)
    OnStore (312-328)
  pkg/apis/pipeline/register.go (4)
    PipelineTaskLabelKey (47-47)
    TaskLabelKey (29-29)
    PipelineLabelKey (38-38)
    PipelineRunLabelKey (41-41)
  pkg/pod/status.go (2)
    ReasonExceededResourceQuota (60-60)
    ReasonExceededNodeResources (64-64)
pkg/taskrunmetrics/metrics_test.go (2)
  pkg/pipelinerunmetrics/metrics.go (2)
    Recorder (47-63)
    NewRecorder (72-88)
  pkg/taskrunmetrics/metrics.go (2)
    Recorder (46-65)
    NewRecorder (74-91)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (5)
pkg/reconciler/pipelinerun/pipelinerun.go (1)
311-313: Nice context propagation for metrics.
Threading the reconcile context through to DurationAndCount keeps span data and cancellation semantics intact for metrics emission. Thanks for tightening that up.
pkg/taskrunmetrics/metrics_test.go (1)
175-200: Appreciate the lightweight lister double.
The minimal mockTaskRunLister keeps this test focused on recorder wiring without pulling informer plumbing into the mix—good call.
config/config-observability.yaml (1)
40-101: Clearer OTel guidance in the sample config.
The expanded example docs make it much easier for operators to flip on OTLP exporters while still seeing the legacy knobs—nice documentation upgrade.
pkg/pipelinerunmetrics/metrics.go (1)
206-248: Fix RunningPipelineRuns gauge semantics and trim the critical section.
Float64UpDownCounter.Add expects per-call deltas. Feeding it the full count on every scrape (and trying to "reset" with Add(ctx, 0)) makes the exported gauge march upward forever, so alerts will scream even when the number of running PipelineRuns stays flat. This also contradicts the OTEL guidance that synchronous UpDownCounters should receive incremental updates. (opentelemetry.netlify.app)
On top of that, we grab r.mutex before invoking lister.List(...). Listing can block on the informer cache, so we end up serializing every other metric call behind a potentially expensive read.
Please compute the totals first, then lock only while updating recorder state and emit the delta since the last snapshot. Initializing the cached values when (re)building instruments keeps the gauges consistent after config changes.
@@ - runningPRsWaitingOnPipelineResolutionGauge metric.Float64UpDownCounter - runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter + runningPRsWaitingOnPipelineResolutionGauge metric.Float64UpDownCounter + runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter + + runningPRsLastValue float64 + runningPRsWaitingOnPipelineLastValue float64 + runningPRsWaitingOnTaskLastValue float64 @@ - runningPRsWaitingOnTaskResolutionGauge, err := r.meter.Float64UpDownCounter( + runningPRsWaitingOnTaskResolutionGauge, err := r.meter.Float64UpDownCounter( @@ - r.runningPRsWaitingOnTaskResolutionGauge = runningPRsWaitingOnTaskResolutionGauge + r.runningPRsWaitingOnTaskResolutionGauge = runningPRsWaitingOnTaskResolutionGauge + + r.runningPRsLastValue = 0 + r.runningPRsWaitingOnPipelineLastValue = 0 + r.runningPRsWaitingOnTaskLastValue = 0 @@ - r.mutex.Lock() - defer r.mutex.Unlock() - - prs, err := lister.List(labels.Everything()) + prs, err := lister.List(labels.Everything()) @@ - r.runningPRsGauge.Add(ctx, 0, metric.WithAttributes()) - r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, 0, metric.WithAttributes()) - r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, 0, metric.WithAttributes()) - runningCount := 0 @@ - r.runningPRsGauge.Add(ctx, float64(runningCount), metric.WithAttributes()) - r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, float64(waitingOnPipelineCount), metric.WithAttributes()) - r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, float64(waitingOnTaskCount), metric.WithAttributes()) + r.mutex.Lock() + defer r.mutex.Unlock() + + runningDelta := float64(runningCount) - r.runningPRsLastValue + pipelineDelta := float64(waitingOnPipelineCount) - r.runningPRsWaitingOnPipelineLastValue + taskDelta := float64(waitingOnTaskCount) - r.runningPRsWaitingOnTaskLastValue + + if runningDelta != 0 { + r.runningPRsGauge.Add(ctx, runningDelta) + } + if pipelineDelta != 0 { + r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, pipelineDelta) + } + if taskDelta != 0 { + r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, taskDelta) + } + + r.runningPRsLastValue = float64(runningCount) + r.runningPRsWaitingOnPipelineLastValue = float64(waitingOnPipelineCount) + r.runningPRsWaitingOnTaskLastValue = float64(waitingOnTaskCount)pkg/taskrunmetrics/metrics.go (1)
312-347: Make TaskRun running gauges emit deltas and move lister.List outside the lock.
Just like the PipelineRun gauges, we currently add the full counts to Float64UpDownCounter on every scrape. That makes the exported value balloon cumulatively instead of reflecting "current number of running TaskRuns," which breaks dashboards and alerting. OTEL's synchronous UpDownCounter contract is to add positive/negative deltas, not absolute totals. (opentelemetry.netlify.app)
We also enter the critical section before calling lister.List, so a slow informer read will block every other metric update.
Capture the counts first, lock only while mutating recorder state, and publish the delta relative to the last snapshot (resetting that cache whenever instruments are rebuilt):
@@ - runningTRsThrottledByNodeGauge metric.Float64UpDownCounter - podLatencyHistogram metric.Float64Histogram + runningTRsThrottledByNodeGauge metric.Float64UpDownCounter + podLatencyHistogram metric.Float64Histogram + + runningTRsLastValue float64 + trsThrottledByQuotaLastValue float64 + trsThrottledByNodeLastValue float64 @@ - r.runningTRsThrottledByNodeGauge = runningTRsThrottledByNodeGauge + r.runningTRsThrottledByNodeGauge = runningTRsThrottledByNodeGauge + + r.runningTRsLastValue = 0 + r.trsThrottledByQuotaLastValue = 0 + r.trsThrottledByNodeLastValue = 0 @@ - r.mutex.Lock() - defer r.mutex.Unlock() - - trs, err := lister.List(labels.Everything()) + trs, err := lister.List(labels.Everything()) @@ - r.runningTRsGauge.Add(ctx, float64(runningTrs)) - r.runningTRsThrottledByQuotaGauge.Add(ctx, float64(trsThrottledByQuota)) - r.runningTRsThrottledByNodeGauge.Add(ctx, float64(trsThrottledByNode)) + r.mutex.Lock() + defer r.mutex.Unlock() + + runningDelta := float64(runningTrs) - r.runningTRsLastValue + quotaDelta := float64(trsThrottledByQuota) - r.trsThrottledByQuotaLastValue + nodeDelta := float64(trsThrottledByNode) - r.trsThrottledByNodeLastValue + + if runningDelta != 0 { + r.runningTRsGauge.Add(ctx, runningDelta) + } + if quotaDelta != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, quotaDelta) + } + if nodeDelta != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, nodeDelta) + } + + r.runningTRsLastValue = float64(runningTrs) + r.trsThrottledByQuotaLastValue = float64(trsThrottledByQuota) + r.trsThrottledByNodeLastValue = float64(trsThrottledByNode)
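One possible way around the per-scrape delta bookkeeping suggested above is an asynchronous instrument that observes the current value on each collection cycle. The sketch below is illustrative only (package placement, metric name, and helper names are assumptions, not the PR's implementation), using the OTel Go metric API's ObservableGauge:

package pipelinerunmetrics

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/metric"
	"k8s.io/apimachinery/pkg/labels"

	listers "github.com/tektoncd/pipeline/pkg/client/listers/pipeline/v1"
)

// registerRunningPipelineRunsGauge reports the current number of running
// PipelineRuns from a callback, so the exported value never accumulates.
func registerRunningPipelineRunsGauge(lister listers.PipelineRunLister) error {
	meter := otel.GetMeterProvider().Meter("tekton_pipelines_controller")
	gauge, err := meter.Int64ObservableGauge(
		"running_pipelineruns",
		metric.WithDescription("Number of PipelineRuns currently executing"),
	)
	if err != nil {
		return err
	}
	// The callback runs on every collection cycle and observes the value as-is,
	// so no last-value cache or delta computation is required.
	_, err = meter.RegisterCallback(func(_ context.Context, o metric.Observer) error {
		prs, err := lister.List(labels.Everything())
		if err != nil {
			return err
		}
		var running int64
		for _, pr := range prs {
			if !pr.IsDone() {
				running++
			}
		}
		o.ObserveInt64(gauge, running)
		return nil
	}, gauge)
	return err
}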
This upgrades knative to 1.19 and other dependent dependencies.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
go.mod (1)
3-3
: go directive should be a minor version (usego 1.24
)The
go
directive typically uses the minor version (e.g.,go 1.24
). If you need to pin an exact toolchain patch, add a separatetoolchain go1.24.x
line.Apply:
-go 1.24.0 +go 1.24 +toolchain go1.24.0
🧹 Nitpick comments (16)
go.mod (1)
137-137
: Dual gengo module paths present (k8s.io/gengo
andk8s.io/gengo/v2
)
go.mod includes bothk8s.io/gengo v0.0.0-20240404160639-a0386bf69313
(line 243) andk8s.io/gengo/v2 v2.0.0-20250207200755-1244d31929d7
(line 137) as indirect dependencies. This is expected for transitive deps on different major versions. If you control or can update upstream modules, consider consolidating on the/v2
path to reduce confusion.pkg/pipelinerunmetrics/metrics_test.go (4)
303-305
: Isolate OTEL provider per-test with cleanup.Setting a global meter provider without restoring it can leak across tests. Capture the old provider and restore it in t.Cleanup.
- provider := metric.NewMeterProvider(metric.WithReader(reader)) - otel.SetMeterProvider(provider) + provider := metric.NewMeterProvider(metric.WithReader(reader)) + prev := otel.GetMeterProvider() + otel.SetMeterProvider(provider) + t.Cleanup(func() { otel.SetMeterProvider(prev) })
307-316
: Avoid mutating the package-global recorder in tests.Directly assigning to r couples tests to internals and can race with once.Do. Prefer resetting once and using NewRecorder with the test provider.
- // Create a new recorder with the test meter provider. - r = &Recorder{ - initialized: true, - ReportingPeriod: 30 * time.Second, - meter: provider.Meter("tekton_pipelines_controller"), - } - cfg := config.FromContextOrDefaults(ctx) - r.cfg = cfg.Metrics - if err := r.initializeInstruments(cfg.Metrics); err != nil { - t.Fatalf("initializeInstruments: %v", err) - } + once = sync.Once{} + rec, err := NewRecorder(ctx) + if err != nil { + t.Fatalf("NewRecorder: %v", err) + } + r = recNote: add import "sync" at top.
179-189
: Consider asserting histogram/count emission for DurationAndCount.You already wire OTEL readers elsewhere; doing the same here lets you assert at least one datapoint was recorded for duration/total count.
364-383
: Add assertions for “waiting on resolution” gauges.This test currently only checks initialization. Drive cases that should increment:
- waiting_on_pipeline_resolution
- waiting_on_task_resolution
Then collect via ManualReader and assert expected attribute sets/values, similar to CountAtAllLevels.
pkg/taskrunmetrics/metrics_test.go (6)
224-227
: Restore OTEL provider after test to prevent cross-test leakage.Add t.Cleanup to put the previous provider back.
reader := metric.NewManualReader() provider := metric.NewMeterProvider(metric.WithReader(reader)) -otel.SetMeterProvider(provider) +prev := otel.GetMeterProvider() +otel.SetMeterProvider(provider) +t.Cleanup(func() { otel.SetMeterProvider(prev) })
222-223
: Reset singleton once per-test using t.Cleanup.Make the reset self-contained and future-proof.
-once = sync.Once{} +t.Cleanup(func() { once = sync.Once{} }) +once = sync.Once{}
251-272
: Optional: also assert expected attributes.Beyond the value 1, you can validate whether a namespace attribute is present or cluster-level is used, depending on config. Helps catch tag regression.
85-96
: Ensure NewRecorder sees the test provider.Reset once and add provider restore here too, like in other tests, to avoid ordering flakiness.
119-184
: Consider validating duration/total count emission.Mirroring the OTEL reader pattern from other tests would let you assert a histogram/count was recorded for TaskRun durations.
431-469
: Fix typos in test field names.Rename expetected* to expected* for clarity.
- expetectedPipeline string - expetectedPipelineRun string + expectedPipeline string + expectedPipelineRun string ... - if pipeline != test.expetectedPipeline { + if pipeline != test.expectedPipeline { ... - if pipelineRun != test.expetectedPipelineRun { + if pipelineRun != test.expectedPipelineRun {config/config-observability.yaml (1)
40-74
: Clarify supported values for OTEL config keys
Updateconfig/config-observability.yaml
to list each key’s allowed values and map them to the exporter setup (e.g.,
- metrics-protocol: prometheus | grpc | http/protobuf | none
- metrics-endpoint: host:port or host%zone:port (or bracketed equivalents)
- metrics-export-interval: duration string (e.g., “30s”, “1m”)
- tracing-protocol: grpc | http/protobuf | none | stdout
- tracing-endpoint: same format as metrics-endpoint
- tracing-sampling-rate: 0.0–1.0
- runtime-profiling: enabled | disabled
- runtime-export-interval: duration string)
And add a reference or link to the parsing logic in vendor/knative.dev/pkg/observability/metrics/config.go, tracing/config.go, and runtime/config.go.pkg/pipelinerunmetrics/metrics.go (2)
284-287
: Log errors during periodic reportingCurrently errors are swallowed. Mirror taskrun reporter and log warnings to aid ops.
case <-delay.C: // Every 30s surface a metric for the number of running pipelines. if err := r.RunningPipelineRuns(ctx, lister); err != nil { - // Log error but continue reporting - // In a real implementation, you might want to use a logger here + // TODO: plumb logger; for now use zap via a helper if available + // logger.Warnw("Failed to log running pipelineruns metric", zap.Error(err)) }
172-175
: Guard against nil/zero StartTime before computing durationDefensive check avoids panics for edge states.
duration := time.Since(pr.Status.StartTime.Time) if pr.Status.CompletionTime != nil { duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time) } +// Optional: if StartTime is zero, skip recording +if pr.Status.StartTime.IsZero() { + return nil +}Please confirm StartTime is always set before this path in your controller.
pkg/taskrunmetrics/metrics.go (2)
307-307
: Consider aligning counter attributes with duration metrics
trTotalCounter
currently records only status. For better cardinality control and parity with durations, consider adding namespace (and optionally task/pipeline labels when configured).- r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status))) + r.trTotalCounter.Add(ctx, 1, metric.WithAttributes( + attribute.String("status", status), + attribute.String("namespace", tr.Namespace), + ))
25-33
: Unify error handling packagesThis file uses github.com/pkg/errors, while pipelinerunmetrics uses std
errors
. Prefer one strategy (stdliberrors
+fmt.Errorf("…: %w", err)
) for consistency.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (33)
config/config-observability.yaml (2 hunks)
go.mod (8 hunks)
pkg/client/clientset/versioned/fake/clientset_generated.go (2 hunks)
pkg/client/clientset/versioned/typed/pipeline/v1/pipeline_client.go (3 hunks)
pkg/client/clientset/versioned/typed/pipeline/v1alpha1/pipeline_client.go (3 hunks)
pkg/client/clientset/versioned/typed/pipeline/v1beta1/pipeline_client.go (3 hunks)
pkg/client/informers/externalversions/pipeline/v1/pipeline.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1/pipelinerun.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1/task.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1/taskrun.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1alpha1/run.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1alpha1/stepaction.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1alpha1/verificationpolicy.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/customrun.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/pipeline.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/pipelinerun.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/stepaction.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/task.go (1 hunks)
pkg/client/informers/externalversions/pipeline/v1beta1/taskrun.go (1 hunks)
pkg/client/resolution/clientset/versioned/fake/clientset_generated.go (2 hunks)
pkg/client/resolution/clientset/versioned/typed/resolution/v1alpha1/resolution_client.go (3 hunks)
pkg/client/resolution/clientset/versioned/typed/resolution/v1beta1/resolution_client.go (3 hunks)
pkg/client/resolution/informers/externalversions/resolution/v1alpha1/resolutionrequest.go (1 hunks)
pkg/client/resolution/informers/externalversions/resolution/v1beta1/resolutionrequest.go (1 hunks)
pkg/client/resource/clientset/versioned/fake/clientset_generated.go (2 hunks)
pkg/client/resource/clientset/versioned/typed/resource/v1alpha1/resource_client.go (3 hunks)
pkg/client/resource/informers/externalversions/resource/v1alpha1/pipelineresource.go (1 hunks)
pkg/pipelinerunmetrics/metrics.go (3 hunks)
pkg/pipelinerunmetrics/metrics_test.go (5 hunks)
pkg/reconciler/pipelinerun/pipelinerun.go (1 hunks)
pkg/taskrunmetrics/metrics.go (7 hunks)
pkg/taskrunmetrics/metrics_test.go (8 hunks)
test/util.go (0 hunks)
💤 Files with no reviewable changes (1)
- test/util.go
🧰 Additional context used
🧬 Code graph analysis (5)
pkg/client/informers/externalversions/pipeline/v1beta1/stepaction.go (2)
  pkg/client/informers/externalversions/pipeline/v1beta1/interface.go (1)
    Interface (26-39)
  pkg/client/injection/reconciler/pipeline/v1beta1/stepaction/reconciler.go (1)
    Interface (44-52)
pkg/taskrunmetrics/metrics.go (5)
  pkg/apis/config/metrics.go (7)
    Metrics (101-109)
    TaskrunLevelAtTaskrun (52-52)
    TaskrunLevelAtTask (54-54)
    TaskrunLevelAtNS (56-56)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
  pkg/pipelinerunmetrics/metrics.go (3)
    Recorder (47-63)
    NewRecorder (72-88)
    OnStore (332-348)
  pkg/apis/pipeline/register.go (4)
    PipelineTaskLabelKey (47-47)
    TaskLabelKey (29-29)
    PipelineLabelKey (38-38)
    PipelineRunLabelKey (41-41)
  vendor/go.opentelemetry.io/otel/attribute/set.go (1)
    NewSet (183-186)
  pkg/pod/status.go (2)
    ReasonExceededResourceQuota (60-60)
    ReasonExceededNodeResources (64-64)
pkg/pipelinerunmetrics/metrics_test.go (5)
  pkg/apis/pipeline/v1/pipelinerun_types.go (4)
    PipelineRun (51-60)
    PipelineRun (73-75)
    PipelineRunSpec (250-286)
    PipelineRunStatus (321-326)
  pkg/pipelinerunmetrics/metrics.go (2)
    NewRecorder (72-88)
    Recorder (47-63)
  pkg/names/generate.go (1)
    SimpleNameGenerator (48-48)
  pkg/apis/config/metrics.go (4)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
    Metrics (101-109)
  vendor/go.opentelemetry.io/otel/attribute/set.go (1)
    NewSet (183-186)
pkg/taskrunmetrics/metrics_test.go (5)
  pkg/pipelinerunmetrics/metrics.go (2)
    Recorder (47-63)
    NewRecorder (72-88)
  pkg/taskrunmetrics/metrics.go (2)
    Recorder (46-65)
    NewRecorder (74-91)
  pkg/names/generate.go (1)
    SimpleNameGenerator (48-48)
  pkg/apis/config/metrics.go (1)
    Metrics (101-109)
  pkg/pod/status.go (2)
    ReasonExceededResourceQuota (60-60)
    ReasonExceededNodeResources (64-64)
pkg/pipelinerunmetrics/metrics.go (4)
  pkg/apis/config/metrics.go (4)
    Metrics (101-109)
    PipelinerunLevelAtPipelinerun (65-65)
    PipelinerunLevelAtPipeline (68-68)
    PipelinerunLevelAtNS (71-71)
  pkg/taskrunmetrics/metrics.go (3)
    Recorder (46-65)
    NewRecorder (74-91)
    OnStore (191-207)
  vendor/go.opentelemetry.io/otel/metric.go (1)
    GetMeterProvider (35-37)
  vendor/go.opentelemetry.io/otel/metric/instrument.go (3)
    WithDescription (116-116)
    WithUnit (197-197)
    WithExplicitBucketBoundaries (202-202)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (24)
pkg/client/resource/informers/externalversions/resource/v1alpha1/pipelineresource.go (1)
73-84
: Context-aware List/Watch wiring looks solidThanks for threading the caller-supplied context through to the client while preserving the existing tweak hooks; this keeps the generated informer compatible with the newer
cache.ListWatch
APIs without changing behavior for legacy callers.pkg/client/resource/clientset/versioned/typed/resource/v1alpha1/resource_client.go (1)
48-61
: Dropping the no-op error path is correct.
setConfigDefaults
cannot fail with the generated codec setup, so removing the unused error plumbing in both constructors keeps the flow in sync with upstream client-gen output. Looks good.pkg/client/informers/externalversions/pipeline/v1alpha1/verificationpolicy.go (1)
79-83
: FixListWatch
field name to avoid compile failure. Thecache.ListWatch
struct exposes the field asWatchWithContextFunc
, notWatchFuncWithContext
; this code as written won’t compile. Rename the field to match the API (the existing function body can stay the same).(pkg.go.dev)- WatchFuncWithContext: func(ctx context.Context, options v1.ListOptions) (watch.Interface, error) { + WatchWithContextFunc: func(ctx context.Context, options v1.ListOptions) (watch.Interface, error) {Likely an incorrect or invalid review comment.
pkg/client/resolution/informers/externalversions/resolution/v1beta1/resolutionrequest.go (1)
73-84
: Nice addition of context-aware hooks.Wiring through
ListWithContextFunc
/WatchFuncWithContext
keeps the tweakListOptions behavior while finally letting callers honor cancellation deadlines—exactly what we’ve needed for informer parity.pkg/client/informers/externalversions/pipeline/v1/pipelinerun.go (1)
61-84
: Context-aware hooks look goodThanks for wiring both the legacy
ListFunc
/WatchFunc
paths and the newListWithContextFunc
/WatchFuncWithContext
variants. That keeps existing callers happy while letting context-aware code avoidcontext.TODO()
. Nicely done.pkg/client/resolution/clientset/versioned/typed/resolution/v1beta1/resolution_client.go (1)
83-92
: Config defaults updated in step with upstream generatorSwitching to
rest.CodecFactoryForGeneratedClient(...).WithoutConversion()
and dropping the unused error return mirrors the newer client-go generator output, so the client keeps the same behavior without the redundant plumbing. Nice cleanup.pkg/client/resolution/informers/externalversions/resolution/v1alpha1/resolutionrequest.go (1)
73-84
: Context-aware ListWatch wiring looks solidWiring both
ListWithContextFunc
andWatchFuncWithContext
while retaining the legacy fallbacks ensures cancellation can flow through informers without breaking older consumers. Good upgrade.pkg/client/resolution/clientset/versioned/typed/resolution/v1alpha1/resolution_client.go (1)
83-92
: Client defaults kept consistent with new scheme helpersMirroring the v1beta1 change here keeps both clients aligned with the current client-go defaults, so everything builds against the updated codec factory without vestigial error handling. Looks good.
pkg/client/clientset/versioned/typed/pipeline/v1alpha1/pipeline_client.go (1)
58-64
: Approve non-error defaulting consistency Verified consistent NewForConfig/AndClient signatures and implementations across v1alpha1, v1beta1, and v1 clientsets.pkg/client/clientset/versioned/typed/pipeline/v1beta1/pipeline_client.go (1)
73-75
: setConfigDefaults update verified — definitions no longer return an error and no callers capture one.pkg/reconciler/pipelinerun/pipelinerun.go (1)
306-315
: All DurationAndCount call sites updated. Verified every invocation (including tests) accepts the newctx
parameter.go.mod (2)
132-132
: Drop the incorrectgo.yaml.in/yaml/v2
indirect dependencyRemove the line at go.mod:132 (
go.yaml.in/yaml/v2 v2.4.2 // indirect
); the project already usesgopkg.in/yaml.v2 v2.4.0 // indirect
and deleting the typo prevents unknown module path errors.
23-23
: Drop OpenCensus from go.mod after confirming no usage
No non-vendor imports ofgo.opencensus.io
or related packages were found—please manually verify there are truly no remaining call sites before removinggo.opencensus.io v0.24.0
(line 23 in go.mod).pkg/client/clientset/versioned/typed/pipeline/v1/pipeline_client.go (1)
63-69
: Verify all generated clients use HTTPClientFor→NewForConfigAndClient
- Pipeline v1/v1beta1/v1alpha1 wrappers correctly call
setConfigDefaults
→HTTPClientFor
→NewForConfigAndClient
.- Confirm resource and resolution client wrappers follow the same pattern:
rg -nP --type=go 'func NewForConfig\(' pkg/client/clientset/versioned/typed
pkg/pipelinerunmetrics/metrics_test.go (1)
324-360
: Great end-to-end OTEL assertions; resolves earlier feedback.Capturing tekton_pipelines_controller_running_pipelineruns via ManualReader and diffing attribute sets meaningfully tests the per-level aggregation. This addresses the prior “restore meaningful assertions” comment.
pkg/taskrunmetrics/metrics_test.go (3)
418-423
: Align RecordPodLatency argument order.Match the function signature (ctx, tr, pod) if confirmed; current call passes (ctx, pod, tr).
- if err := metrics.RecordPodLatency(ctx, td.pod, td.taskRun); td.expectingError && err == nil { + if err := metrics.RecordPodLatency(ctx, td.taskRun, td.pod); td.expectingError && err == nil {
315-321
: Good structure and coverage for throttled metrics.End-to-end capture with attribute.Set maps for quota/node throttling looks solid.
72-81
: Ignore parameter order suggestion; call sites match the current signature. The method is defined as RecordPodLatency(ctx, pod, tr) and the test calls exactly that, so no changes are needed.Likely an incorrect or invalid review comment.
config/config-observability.yaml (1)
95-101
: Tekton metrics keys align with struct fields.The keys mirror config.Metrics options (levels, duration types, reason, running-pipelinerun level). Looks consistent.
pkg/pipelinerunmetrics/metrics.go (3)
211-217
: Move lister.List() outside the mutex to reduce contentionHolding the lock while listing is unnecessary and can block other recorders. Acquire the lock only when computing and publishing deltas.
- r.mutex.Lock() - defer r.mutex.Unlock() - prs, err := lister.List(labels.Everything()) if err != nil { return fmt.Errorf("failed to list pipelineruns: %w", err) } +// lock later only for publishing/delta updates
232-237
: Verify “waiting on task resolution” heuristicIncrementing per TaskRun child with a PipelineTaskName may over-count (counts children, not PRs) and may not reflect “waiting on resolution” specifically.
- Is there a status/condition that directly indicates a task resolution wait you can check?
- If the intent is per-PR, consider incrementing once per PR when any child qualifies.
261-268
: UpDownCounter must receive deltas, not totals — current logic inflates values each scrapeAdding the full count on each run makes the exported series grow cumulatively. Compute per-attribute deltas or switch to an async observable gauge. Below adds a small cache to emit deltas and resets it on re-init.
@@ type Recorder struct { @@ - runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter + runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter + + // caches for delta computation + lastRunningPRsCounts map[attribute.Set]float64 + lastWaitingOnPipeline float64 + lastWaitingOnTask float64 @@ func (r *Recorder) initializeInstruments(cfg *config.Metrics) error { @@ r.runningPRsWaitingOnTaskResolutionGauge = runningPRsWaitingOnTaskResolutionGauge - - return nil + // reset caches on (re)initialization + r.lastRunningPRsCounts = make(map[attribute.Set]float64) + r.lastWaitingOnPipeline = 0 + r.lastWaitingOnTask = 0 + return nil } @@ -func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error { +func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error { @@ - r.mutex.Lock() - defer r.mutex.Unlock() - prs, err := lister.List(labels.Everything()) if err != nil { return fmt.Errorf("failed to list pipelineruns: %w", err) } @@ - // Report running_pipelineruns. - for attrSet, value := range currentCounts { - r.runningPRsGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - - r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, waitingOnPipelineCount, metric.WithAttributes()) - r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, waitingOnTaskCount, metric.WithAttributes()) + // Emit deltas under lock. + r.mutex.Lock() + defer r.mutex.Unlock() + + // per-attribute deltas (new minus previous) + for attrSet, curr := range currentCounts { + prev := r.lastRunningPRsCounts[attrSet] + if delta := curr - prev; delta != 0 { + r.runningPRsGauge.Add(ctx, delta, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + // attributes that disappeared need a negative delta to zero them + for attrSet, prev := range r.lastRunningPRsCounts { + if _, ok := currentCounts[attrSet]; !ok && prev != 0 { + r.runningPRsGauge.Add(ctx, -prev, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + r.lastRunningPRsCounts = currentCounts + + if d := waitingOnPipelineCount - r.lastWaitingOnPipeline; d != 0 { + r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, d) + } + if d := waitingOnTaskCount - r.lastWaitingOnTask; d != 0 { + r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, d) + } + r.lastWaitingOnPipeline = waitingOnPipelineCount + r.lastWaitingOnTask = waitingOnTaskCountAlternative (recommended long-term): replace these with ObservableUpDownCounter/ObservableGauge callbacks that observe the current values directly. I can draft that if you prefer.
pkg/taskrunmetrics/metrics.go (2)
318-324
: Avoid holding the mutex while calling lister.List()Listing can be expensive and doesn’t need the lock. Lock only for publishing.
- r.mutex.Lock() - defer r.mutex.Unlock() - trs, err := lister.List(labels.Everything()) if err != nil { return err } +// lock later only for publishing/delta updates
353-359
: UpDownCounters must add deltas; current code adds totals and inflates metricsEmit the difference from the previous sample, and clear caches on re-init. Also compute per-attribute deltas for throttled metrics.
@@ type Recorder struct { @@ runningTRsThrottledByNodeGauge metric.Float64UpDownCounter podLatencyHistogram metric.Float64Histogram @@ ReportingPeriod time.Duration + + // caches for delta computation + lastRunningTRs int + lastTrsThrottledByQuota map[attribute.Set]float64 + lastTrsThrottledByNode map[attribute.Set]float64 } @@ func (r *Recorder) initializeInstruments(cfg *config.Metrics) error { @@ r.podLatencyHistogram = podLatencyHistogram - - return nil + // reset caches on (re)initialization + r.lastRunningTRs = 0 + r.lastTrsThrottledByQuota = make(map[attribute.Set]float64) + r.lastTrsThrottledByNode = make(map[attribute.Set]float64) + return nil } @@ func (r *Recorder) RunningTaskRuns(ctx context.Context, lister listers.TaskRunLister) error { @@ - r.mutex.Lock() - defer r.mutex.Unlock() - trs, err := lister.List(labels.Everything()) if err != nil { return err } @@ - r.runningTRsGauge.Add(ctx, float64(runningTrs)) - for attrSet, value := range trsThrottledByQuota { - r.runningTRsThrottledByQuotaGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - for attrSet, value := range trsThrottledByNode { - r.runningTRsThrottledByNodeGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } + r.mutex.Lock() + defer r.mutex.Unlock() + + // running taskruns (single series) + if d := runningTrs - r.lastRunningTRs; d != 0 { + r.runningTRsGauge.Add(ctx, float64(d)) + } + r.lastRunningTRs = runningTrs + + // throttled by quota (per-attr deltas) + for attrSet, curr := range trsThrottledByQuota { + prev := r.lastTrsThrottledByQuota[attrSet] + if delta := curr - prev; delta != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, delta, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + for attrSet, prev := range r.lastTrsThrottledByQuota { + if _, ok := trsThrottledByQuota[attrSet]; !ok && prev != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, -prev, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + r.lastTrsThrottledByQuota = trsThrottledByQuota + + // throttled by node (per-attr deltas) + for attrSet, curr := range trsThrottledByNode { + prev := r.lastTrsThrottledByNode[attrSet] + if delta := curr - prev; delta != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, delta, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + for attrSet, prev := range r.lastTrsThrottledByNode { + if _, ok := trsThrottledByNode[attrSet]; !ok && prev != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, -prev, metric.WithAttributes(attrSet.ToSlice()...)) + } + } + r.lastTrsThrottledByNode = trsThrottledByNodeIf you prefer, I can sketch an ObservableGauge-based solution to observe current values directly.
+	var opts metav1.ListOptions
+	if watchActcion, ok := action.(testing.WatchActionImpl); ok {
+		opts = watchActcion.ListOptions
+	}
 	gvr := action.GetResource()
 	ns := action.GetNamespace()
-	watch, err := o.Watch(gvr, ns)
+	watch, err := o.Watch(gvr, ns, opts)
🛠️ Refactor suggestion | 🟠 Major
Same pointer handling gap here.
Please mirror the change above so both testing.WatchActionImpl values and pointers feed through, otherwise cloned watch actions will drop their list options. Same diff applies.
🤖 Prompt for AI Agents
In pkg/client/clientset/versioned/fake/clientset_generated.go around lines 57 to
63, the code only type-asserts action to testing.WatchActionImpl, missing the
pointer variant, so cloned watch actions that are pointers lose their
ListOptions; update the type assertion to handle both testing.WatchActionImpl
and *testing.WatchActionImpl (e.g., try value assertion then pointer assertion)
and set opts from the matched value’s ListOptions so both pointer and
non-pointer watch actions preserve their ListOptions.
+	var opts metav1.ListOptions
+	if watchActcion, ok := action.(testing.WatchActionImpl); ok {
+		opts = watchActcion.ListOptions
+	}
 	gvr := action.GetResource()
 	ns := action.GetNamespace()
-	watch, err := o.Watch(gvr, ns)
+	watch, err := o.Watch(gvr, ns, opts)
🛠️ Refactor suggestion | 🟠 Major
Capture pointer WatchActionImpl as well to avoid dropping options.
Right now we only propagate ListOptions if the action is the concrete testing.WatchActionImpl value. When the fake client clones actions via DeepCopy() it can hand reactors a *testing.WatchActionImpl, so the type assertion will fail and we silently fall back to the zero value (losing label/field selectors, RV, etc.). Please handle both forms, e.g.:
- if watchActcion, ok := action.(testing.WatchActionImpl); ok {
- opts = watchActcion.ListOptions
- }
+ switch act := action.(type) {
+ case testing.WatchActionImpl:
+ opts = act.ListOptions
+ case *testing.WatchActionImpl:
+ opts = act.ListOptions
+ }
🤖 Prompt for AI Agents
In pkg/client/resolution/clientset/versioned/fake/clientset_generated.go around
lines 55 to 61, the code only type-asserts a concrete testing.WatchActionImpl
and so drops ListOptions when the action is a *testing.WatchActionImpl; update
the logic to handle both the value and pointer forms (e.g. check for
testing.WatchActionImpl and *testing.WatchActionImpl and set opts from the
matched type) so ListOptions (label/field selectors, resourceVersion, etc.) are
preserved; keep the rest of the flow unchanged.
+	var opts metav1.ListOptions
+	if watchActcion, ok := action.(testing.WatchActionImpl); ok {
+		opts = watchActcion.ListOptions
+	}
 	gvr := action.GetResource()
 	ns := action.GetNamespace()
-	watch, err := o.Watch(gvr, ns)
+	watch, err := o.Watch(gvr, ns, opts)
🛠️ Refactor suggestion | 🟠 Major
Propagate options for pointer watch actions.
As with the other fake clientsets, capture both the value and pointer forms of WatchActionImpl so cloned watch actions don't lose their ListOptions.
🤖 Prompt for AI Agents
In pkg/client/resource/clientset/versioned/fake/clientset_generated.go around
lines 53 to 59, the code only type-asserts action to the value form
testing.WatchActionImpl and therefore loses ListOptions when the action is a
pointer; update the type assertion to handle both the value and pointer forms:
first attempt action.(testing.WatchActionImpl) and set opts from its
ListOptions, then attempt action.(*testing.WatchActionImpl) and set opts from
the pointer’s ListOptions if the first assertion fails, so cloned pointer watch
actions preserve their ListOptions before calling o.Watch.
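Since the same gap shows up in all three generated fake clientsets, one option is a small shared helper. The sketch below is hypothetical (package name and placement are assumptions, not part of the PR) and simply shows the value-or-pointer extraction:

package fake

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clienttesting "k8s.io/client-go/testing"
)

// watchListOptions extracts the ListOptions carried by a watch action,
// whether the reactor received the action by value or by pointer.
func watchListOptions(action clienttesting.Action) metav1.ListOptions {
	switch act := action.(type) {
	case clienttesting.WatchActionImpl:
		return act.ListOptions
	case *clienttesting.WatchActionImpl:
		return act.ListOptions
	default:
		return metav1.ListOptions{}
	}
}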
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
vendor/github.com/prometheus/client_golang/prometheus/internal/go_runtime_metrics.go (1)
1-26
: Avoid patching vendored dependencies; move change upstream or use a fork/replace.Direct edits under vendor/ will be overwritten by go mod vendor and are hard to maintain. Prefer:
- Fork upstream and pin via go.mod replace, or
- Upstream the change and bump the module, or
- Maintain an automated patch step in the build.
Example replace snippet:
replace github.com/prometheus/client_golang => github.com/your-org/client_golang vX.Y.Z
vendor/github.com/gorilla/websocket/server.go (1)
100-114
: Honor server preference when selecting subprotocols.The code iterates client protocols outermost, which picks the client's preference. The doc string promises server-preferred order. Iterate server protocols first and return the first present in the client list.
Apply:
- clientProtocols := Subprotocols(r) - for _, clientProtocol := range clientProtocols { - for _, serverProtocol := range u.Subprotocols { - if clientProtocol == serverProtocol { - return clientProtocol - } - } - } + clientProtocols := Subprotocols(r) + for _, serverProtocol := range u.Subprotocols { + for _, clientProtocol := range clientProtocols { + if clientProtocol == serverProtocol { + return serverProtocol + } + } + }go.mod (2)
3-3
: Ensure Go toolchain versions align to ≥1.24
- Update test/gohelloworld/Dockerfile’s base image from
golang:1.15
to at leastgolang:1.24
.- If you use GitHub Actions (none found), add or update
actions/setup-go
to pingo-version: '1.24'
or newer.
23-24
: Migrate or remove remaining OpenCensus imports in tests
The only non-vendor references togo.opencensus.io
are in your test files. Update or delete these before dropping the module:
- test/wait.go:54 imports "go.opencensus.io/trace"
- test/custom_task_test.go:38 imports "go.opencensus.io/trace"
Ensure these tests no longer use OpenCensus (e.g. switch to OpenTelemetry) before removing the
go.opencensus.io
andcontrib.go.opencensus.io/*
deps.pkg/pipelinerunmetrics/metrics.go (1)
169-201
: Respect CountWithReason for prTotalCounter to avoid cardinality spikesCounter attributes always include reason; this can explode series. Make inclusion of reason conditional on cfg.CountWithReason and align with other dims.
- r.prTotalCounter.Add(ctx, 1, metric.WithAttributes(attrs...)) + countAttrs := []attribute.KeyValue{ + attribute.String("namespace", pr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + countAttrs = append(countAttrs, attribute.String("reason", reason)) + } + if r.insertTag != nil { + countAttrs = append(countAttrs, r.insertTag(pipelineName, pr.Name)...) + } + r.prTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))pkg/taskrunmetrics/metrics.go (1)
279-283
: Guard against nil StartTime in DurationAndCountAvoid panic when StartTime is unset.
- duration := time.Since(tr.Status.StartTime.Time) - if tr.Status.CompletionTime != nil { - duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time) - } + if tr.Status.StartTime == nil || tr.Status.StartTime.IsZero() { + return nil + } + duration := time.Since(tr.Status.StartTime.Time) + if tr.Status.CompletionTime != nil { + duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time) + }
🧹 Nitpick comments (22)
vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.c (2)
16-19
: Optional: include header that declares mach_port_deallocate.Some SDKs require
<mach/mach_port.h>
(or<mach/mach.h>
) explicitly.#include <mach/mach_init.h> #include <mach/task.h> #include <mach/mach_vm.h> +#include <mach/mach_port.h>
35-36
: Remove unused variable to avoid -Wunused-variable warning.
task
is never used.- task_t task = MACH_PORT_NULL;
vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go (1)
41-49
: Optional: prefer x/sys/unix.Getrlimit over syscall.Getrlimit.You already import
golang.org/x/sys/unix
; using it avoids deprecatedsyscall
usage on some platforms.-func getSoftLimit(which int) (uint64, error) { - rlimit := syscall.Rlimit{} - - if err := syscall.Getrlimit(which, &rlimit); err != nil { +func getSoftLimit(which int) (uint64, error) { + var rlimit unix.Rlimit + if err := unix.Getrlimit(which, &rlimit); err != nil { return 0, err } return rlimit.Cur, nil }vendor/github.com/go-logr/zapr/slogzapr.go (2)
56-59
: Only emit numeric verbosity for negative levels.The comment says “Record verbosity for info entries,” but the current code adds a 0 for info and warn. Usually we only attach the numeric verbosity for V-level (debug) logs.
- if !isError && zl.numericLevelKey != "" { - // Record verbosity for info entries. - fields = append(fields, zap.Int(zl.numericLevelKey, intLevel)) - } + if record.Level < 0 && zl.numericLevelKey != "" { + fields = append(fields, zap.Int(zl.numericLevelKey, intLevel)) + }
122-136
: Error from AddObject ignored (non-blocking).enc.AddObject returns an error that’s discarded. zap encoders rarely error, so this is fine, but if you want belt-and-suspenders, consider capturing once (debug-only) to aid troubleshooting.
vendor/github.com/gorilla/websocket/compression.go (1)
36-40
: Close pooled reader before falling back in decompressNoContextTakeover
The Reset-error path replacesfr
without closing the originally pooled reader, which—if ever triggered—will shrink the pool and leak buffers. Even thoughReset
never fails today, guard against future changes by closing the old reader first:vendor/github.com/gorilla/websocket/compression.go @@ -36,7 +36,10 @@ func decompressNoContextTakeover(r io.Reader) io.ReadCloser { mr := io.MultiReader(r, strings.NewReader(tail)) if err := fr.(flate.Resetter).Reset(mr, nil); err != nil { // Reset never fails, but handle error in case that changes. - fr = flate.NewReader(mr) + old := fr + fr = flate.NewReader(mr) + _ = old.Close() } return &flateReadWrapper{fr} }This vendor file is identical to upstream at commit e064f32e3674.
vendor/github.com/grafana/regexp/backtrack.go (1)
62-69
: Guard against zero-length programs in maxBitStateLen.Division by zero is possible if len(prog.Inst) == 0. Add a fast return for that case.
func maxBitStateLen(prog *syntax.Prog) int { - if !shouldBacktrack(prog) { + if !shouldBacktrack(prog) || len(prog.Inst) == 0 { return 0 } return maxBacktrackVector / len(prog.Inst) }vendor/github.com/grafana/regexp/onepass.go (1)
158-206
: Simplify mergeRuneSets: remove confusing defer, return early on failure.The deferred mutation to local vars is redundant since early returns already emit noRune/noNext. Removing it clarifies control flow and avoids confusion.
- ok := true - defer func() { - if !ok { - merged = nil - next = nil - } - }() + ok := true @@ - if !ok { - return noRune, noNext - } + if !ok { + return noRune, noNext + }vendor/github.com/grafana/regexp/exec.go (1)
512-550
: Parity check: add a quick test to compare indices vs stdlib.Add tests that assert FindStringIndex/FindIndex equality with stdlib for:
- anchored one‑pass patterns with literal prefix (match not at 0),
- capturing groups,
- case‑folded single‑rune prefixes.
I can draft a focused test file if helpful.
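A rough sketch of what such a parity test could look like; the file placement, pattern set, and inputs below are illustrative assumptions, not a drafted file:

package regexp_test

import (
	"reflect"
	std "regexp"
	"testing"

	graf "github.com/grafana/regexp"
)

func TestFindStringIndexMatchesStdlib(t *testing.T) {
	cases := []struct{ pat, in string }{
		{`abc(\d+)`, "xx abc123"}, // literal prefix, capturing group, match not at offset 0
		{`^abc(\d+)$`, "abc123"},  // anchored one-pass pattern
		{`(?i)k.`, "zzKx"},        // case-folded single-rune prefix
	}
	for _, c := range cases {
		want := std.MustCompile(c.pat).FindStringIndex(c.in)
		got := graf.MustCompile(c.pat).FindStringIndex(c.in)
		if !reflect.DeepEqual(want, got) {
			t.Errorf("pattern %q on %q: stdlib %v, grafana %v", c.pat, c.in, want, got)
		}
	}
}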
vendor/github.com/grafana/regexp/regexp.go (4)
1251-1283
: Avoid shadowing the importedstrings
package in SplitLocal var
strings
shadows the imported package name. Rename for clarity.- matches := re.FindAllStringIndex(s, n) - strings := make([]string, 0, len(matches)) + matches := re.FindAllStringIndex(s, n) + parts := make([]string, 0, len(matches)) @@ - if n > 0 && len(strings) >= n-1 { + if n > 0 && len(parts) >= n-1 { break } @@ - if match[1] != 0 { - strings = append(strings, s[beg:end]) + if match[1] != 0 { + parts = append(parts, s[beg:end]) } beg = match[1] } @@ - if end != len(s) { - strings = append(strings, s[beg:]) + if end != len(s) { + parts = append(parts, s[beg:]) } - return strings + return parts
9-12
: Modernize doc link (optional)Update golang.org shortlink to go.dev.
-// https://golang.org/s/re2syntax, except for \C. +// https://go.dev/s/re2syntax, except for \C.
540-571
: Hot‑path caution for convenience Match funcs*These compile the pattern on every call. For hot paths, prefer Compile/MustCompile once and reuse the Regexp.
68-69
: Compat tests vs stdlib to de‑risk forked behaviorRecommend adding a small test matrix comparing this package to stdlib regexp for core ops (MatchString, FindStringSubmatch, ReplaceAllString, Split) to catch divergences early.
// vendor/github.com/grafana/regexp/compat_test.go package regexp_test import ( std "regexp" graf "github.com/grafana/regexp" "testing" ) func TestCompat_Basic(t *testing.T) { cases := []string{`a+b?`, `^foo(\d+)bar$`, `(?P<name>\w+)`, `a*`, `[[:alpha:]]+`, `^$`} for _, pat := range cases { stdre := std.MustCompile(pat) gre := graf.MustCompile(pat) in := []string{"", "foo123bar", "aab", "xyz", "aaab?", "αβγ", "foo42bar baz"} for _, s := range in { if stdre.MatchString(s) != gre.MatchString(s) { t.Fatalf("MatchString diverged for %q on %q", pat, s) } if gotS, gotG := stdre.FindString(s), gre.FindString(s); gotS != gotG { t.Fatalf("FindString diverged for %q on %q: %q vs %q", pat, s, gotS, gotG) } if gotS, gotG := stdre.Split(s, -1), gre.Split(s, -1); len(gotS) != len(gotG) { t.Fatalf("Split len diverged for %q on %q: %v vs %v", pat, s, gotS, gotG) } } } }vendor/github.com/prometheus/client_golang/prometheus/wrap.go (1)
81-103
: Defensive copy of Labels to avoid concurrent map mutation/racesLabels is a map; callers might mutate it after passing in. Snapshot it when wrapping to prevent data races and surprising changes.
-func WrapCollectorWith(labels Labels, c Collector) Collector { - return &wrappingCollector{ - wrappedCollector: c, - labels: labels, - } -} +func WrapCollectorWith(labels Labels, c Collector) Collector { + var labelsCopy Labels + if labels != nil { + labelsCopy = make(Labels, len(labels)) + for k, v := range labels { + labelsCopy[k] = v + } + } + return &wrappingCollector{ + wrappedCollector: c, + labels: labelsCopy, + } +}vendor/github.com/prometheus/client_golang/prometheus/histogram.go (2)
1967-1994
: Nit: local variable naming.nativehistogram is verbose and stutters; use nh or m.
Apply this diff:
- nativehistogram, err := NewConstNativeHistogram(desc, + nh, err := NewConstNativeHistogram(desc, ... - if err != nil { + if err != nil { panic(err) } - return nativehistogram + return nh
2006-2056
: makeBucketsFromMap: mirror of runtime encoder — OK; watch for negative values.Given validateCount will now reject negatives, this should be safe. If input counts are per‑bucket, consider cumulating before appendDelta; otherwise, add a brief comment that counts are cumulative to avoid misuse.
vendor/github.com/gorilla/websocket/conn.go (1)
994-996
: Nit: fix comment typos ("best effor" → "best effort").Purely cosmetic.
-// Make a best effor to send a close message describing the problem. +// Make a best effort to send a close message describing the problem. ... -// Make a best effor to send the close message. +// Make a best effort to send the close message.Also applies to: 1151-1152
vendor/github.com/gorilla/websocket/server.go (1)
356-356
: Typo.“buferred” → “buffered”.
vendor/github.com/gorilla/websocket/client.go (1)
500-506
: Optionally set a stricter TLS minimum when cloning config.For hardened defaults, set
MinVersion
when not provided to avoid legacy protocol negotiation.Apply:
func cloneTLSConfig(cfg *tls.Config) *tls.Config { if cfg == nil { - return &tls.Config{} + return &tls.Config{MinVersion: tls.VersionTLS13} } - return cfg.Clone() + c := cfg.Clone() + if c.MinVersion == 0 { + c.MinVersion = tls.VersionTLS13 + } + return c }Note: adopt only if compatible with your supported endpoints.
pkg/pipelinerunmetrics/metrics_test.go (1)
117-191
: Consider asserting duration/count metrics too.Nice job wiring the OTEL
ManualReader
. You can mirror the approach here to validateDurationAndCount
emitted the expected histogram/counter with tags, not just “no error”.I can draft a small reader-based assertion for one success case if helpful.
pkg/taskrunmetrics/metrics_test.go (1)
88-91
: Restore OTEL global MeterProvider after testsSetting a global provider without restoring can leak state across tests. Capture the previous provider and restore it via t.Cleanup in each test that calls SetMeterProvider.
- reader := metric.NewManualReader() - provider := metric.NewMeterProvider(metric.WithReader(reader)) - otel.SetMeterProvider(provider) + reader := metric.NewManualReader() + provider := metric.NewMeterProvider(metric.WithReader(reader)) + prev := otel.GetMeterProvider() + otel.SetMeterProvider(provider) + t.Cleanup(func() { otel.SetMeterProvider(prev) })Also applies to: 224-227, 319-321
pkg/taskrunmetrics/metrics.go (1)
389-407
: Lock scope in RecordPodLatency can be reducedNo recorder state is mutated until the final Record call. Consider locking only around the Record to minimize contention.
- r.mutex.Lock() - defer r.mutex.Unlock() - scheduledTime := getScheduledTime(pod) ... - r.podLatencyHistogram.Record(ctx, float64(latency.Milliseconds()), metric.WithAttributes(attrs...)) + r.mutex.Lock() + r.podLatencyHistogram.Record(ctx, float64(latency.Milliseconds()), metric.WithAttributes(attrs...)) + r.mutex.Unlock()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
go.sum is excluded by !**/*.sum
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service.pb.go is excluded by !**/*.pb.go
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service.pb.gw.go is excluded by !**/*.pb.gw.go
vendor/go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service_grpc.pb.go is excluded by !**/*.pb.go
vendor/go.opentelemetry.io/proto/otlp/metrics/v1/metrics.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (107)
- cmd/webhook/main.go (0 hunks)
- config/config-observability.yaml (2 hunks)
- go.mod (8 hunks)
- pkg/client/clientset/versioned/fake/clientset_generated.go (2 hunks)
- pkg/client/clientset/versioned/typed/pipeline/v1/pipeline_client.go (3 hunks)
- pkg/client/clientset/versioned/typed/pipeline/v1alpha1/pipeline_client.go (3 hunks)
- pkg/client/clientset/versioned/typed/pipeline/v1beta1/pipeline_client.go (3 hunks)
- pkg/client/informers/externalversions/pipeline/v1/pipeline.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1/pipelinerun.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1/task.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1/taskrun.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1alpha1/run.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1alpha1/stepaction.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1alpha1/verificationpolicy.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/customrun.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/pipeline.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/pipelinerun.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/stepaction.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/task.go (1 hunks)
- pkg/client/informers/externalversions/pipeline/v1beta1/taskrun.go (1 hunks)
- pkg/client/resolution/clientset/versioned/fake/clientset_generated.go (2 hunks)
- pkg/client/resolution/clientset/versioned/typed/resolution/v1alpha1/resolution_client.go (3 hunks)
- pkg/client/resolution/clientset/versioned/typed/resolution/v1beta1/resolution_client.go (3 hunks)
- pkg/client/resolution/informers/externalversions/resolution/v1alpha1/resolutionrequest.go (1 hunks)
- pkg/client/resolution/informers/externalversions/resolution/v1beta1/resolutionrequest.go (1 hunks)
- pkg/client/resource/clientset/versioned/fake/clientset_generated.go (2 hunks)
- pkg/client/resource/clientset/versioned/typed/resource/v1alpha1/resource_client.go (3 hunks)
- pkg/client/resource/informers/externalversions/resource/v1alpha1/pipelineresource.go (1 hunks)
- pkg/pipelinerunmetrics/metrics.go (3 hunks)
- pkg/pipelinerunmetrics/metrics_test.go (5 hunks)
- pkg/reconciler/pipelinerun/pipelinerun.go (1 hunks)
- pkg/taskrunmetrics/metrics.go (7 hunks)
- pkg/taskrunmetrics/metrics_test.go (8 hunks)
- test/util.go (0 hunks)
- vendor/github.com/go-logr/logr/slogr/slogr.go (1 hunks)
- vendor/github.com/go-logr/zapr/.gitignore (1 hunks)
- vendor/github.com/go-logr/zapr/.golangci.yaml (1 hunks)
- vendor/github.com/go-logr/zapr/LICENSE (1 hunks)
- vendor/github.com/go-logr/zapr/README.md (1 hunks)
- vendor/github.com/go-logr/zapr/slogzapr.go (1 hunks)
- vendor/github.com/go-logr/zapr/zapr.go (1 hunks)
- vendor/github.com/go-logr/zapr/zapr_noslog.go (1 hunks)
- vendor/github.com/go-logr/zapr/zapr_slog.go (1 hunks)
- vendor/github.com/golang/protobuf/proto/buffer.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/defaults.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/deprecated.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/discard.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/extensions.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/properties.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/proto.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/registry.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/text_decode.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/text_encode.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/wire.go (0 hunks)
- vendor/github.com/golang/protobuf/proto/wrappers.go (0 hunks)
- vendor/github.com/google/gofuzz/.travis.yml (0 hunks)
- vendor/github.com/google/gofuzz/CONTRIBUTING.md (0 hunks)
- vendor/github.com/google/gofuzz/fuzz.go (0 hunks)
- vendor/github.com/gorilla/websocket/README.md (2 hunks)
- vendor/github.com/gorilla/websocket/client.go (7 hunks)
- vendor/github.com/gorilla/websocket/compression.go (1 hunks)
- vendor/github.com/gorilla/websocket/conn.go (19 hunks)
- vendor/github.com/gorilla/websocket/proxy.go (3 hunks)
- vendor/github.com/gorilla/websocket/server.go (6 hunks)
- vendor/github.com/gorilla/websocket/tls_handshake.go (0 hunks)
- vendor/github.com/gorilla/websocket/tls_handshake_116.go (0 hunks)
- vendor/github.com/gorilla/websocket/x_net_proxy.go (0 hunks)
- vendor/github.com/grafana/regexp/.gitignore (1 hunks)
- vendor/github.com/grafana/regexp/LICENSE (1 hunks)
- vendor/github.com/grafana/regexp/README.md (1 hunks)
- vendor/github.com/grafana/regexp/backtrack.go (1 hunks)
- vendor/github.com/grafana/regexp/exec.go (1 hunks)
- vendor/github.com/grafana/regexp/onepass.go (1 hunks)
- vendor/github.com/grafana/regexp/regexp.go (1 hunks)
- vendor/github.com/openzipkin/zipkin-go/LICENSE (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/annotation.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/doc.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/endpoint.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/kind.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/span.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/span_id.go (0 hunks)
- vendor/github.com/openzipkin/zipkin-go/model/traceid.go (0 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/collectorfunc.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/desc.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/go_collector_latest.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/histogram.go (6 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/internal/difflib.go (6 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/internal/go_runtime_metrics.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/metric.go (3 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector.go (2 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.c (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_cgo_darwin.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_nocgo_darwin.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_not_supported.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_procfsenabled.go (2 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_windows.go (2 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go (7 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go (1 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/promhttp/internal/compression.go (2 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/summary.go (4 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/vec.go (5 hunks)
- vendor/github.com/prometheus/client_golang/prometheus/wrap.go (2 hunks)
- vendor/github.com/prometheus/common/expfmt/text_parse.go (1 hunks)
- vendor/github.com/prometheus/common/model/alert.go (1 hunks)
- vendor/github.com/prometheus/common/model/labels.go (2 hunks)
- vendor/github.com/prometheus/common/model/metric.go (2 hunks)
⛔ Files not processed due to max files limit (54)
- vendor/github.com/prometheus/common/model/time.go
- vendor/github.com/prometheus/otlptranslator/.gitignore
- vendor/github.com/prometheus/otlptranslator/.golangci.yml
- vendor/github.com/prometheus/otlptranslator/CODE_OF_CONDUCT.md
- vendor/github.com/prometheus/otlptranslator/LICENSE
- vendor/github.com/prometheus/otlptranslator/MAINTAINERS.md
- vendor/github.com/prometheus/otlptranslator/README.md
- vendor/github.com/prometheus/otlptranslator/SECURITY.md
- vendor/github.com/prometheus/otlptranslator/constants.go
- vendor/github.com/prometheus/otlptranslator/metric_namer.go
- vendor/github.com/prometheus/otlptranslator/metric_type.go
- vendor/github.com/prometheus/otlptranslator/normalize_label.go
- vendor/github.com/prometheus/otlptranslator/strconv.go
- vendor/github.com/prometheus/otlptranslator/unit_namer.go
- vendor/github.com/prometheus/procfs/.golangci.yml
- vendor/github.com/prometheus/procfs/Makefile.common
- vendor/github.com/prometheus/procfs/README.md
- vendor/github.com/prometheus/procfs/arp.go
- vendor/github.com/prometheus/procfs/fs.go
- vendor/github.com/prometheus/procfs/fs_statfs_notype.go
- vendor/github.com/prometheus/procfs/fscache.go
- vendor/github.com/prometheus/procfs/internal/fs/fs.go
- vendor/github.com/prometheus/procfs/internal/util/parse.go
- vendor/github.com/prometheus/procfs/internal/util/sysreadfile.go
- vendor/github.com/prometheus/procfs/mdstat.go
- vendor/github.com/prometheus/procfs/meminfo.go
- vendor/github.com/prometheus/procfs/mountstats.go
- vendor/github.com/prometheus/procfs/net_dev_snmp6.go
- vendor/github.com/prometheus/procfs/net_ip_socket.go
- vendor/github.com/prometheus/procfs/net_protocols.go
- vendor/github.com/prometheus/procfs/net_tcp.go
- vendor/github.com/prometheus/procfs/net_unix.go
- vendor/github.com/prometheus/procfs/proc.go
- vendor/github.com/prometheus/procfs/proc_cgroup.go
- vendor/github.com/prometheus/procfs/proc_io.go
- vendor/github.com/prometheus/procfs/proc_netstat.go
- vendor/github.com/prometheus/procfs/proc_smaps.go
- vendor/github.com/prometheus/procfs/proc_snmp.go
- vendor/github.com/prometheus/procfs/proc_snmp6.go
- vendor/github.com/prometheus/procfs/proc_stat.go
- vendor/github.com/prometheus/procfs/proc_statm.go
- vendor/github.com/prometheus/procfs/proc_status.go
- vendor/github.com/prometheus/procfs/proc_sys.go
- vendor/github.com/prometheus/procfs/softirqs.go
- vendor/go.opencensus.io/plugin/ochttp/client.go
- vendor/go.opencensus.io/plugin/ochttp/client_stats.go
- vendor/go.opencensus.io/plugin/ochttp/doc.go
- vendor/go.opencensus.io/plugin/ochttp/propagation/b3/b3.go
- vendor/go.opencensus.io/plugin/ochttp/propagation/tracecontext/propagation.go
- vendor/go.opencensus.io/plugin/ochttp/route.go
- vendor/go.opencensus.io/plugin/ochttp/server.go
- vendor/go.opencensus.io/plugin/ochttp/span_annotating_client_trace.go
- vendor/go.opencensus.io/plugin/ochttp/stats.go
- vendor/go.opencensus.io/plugin/ochttp/trace.go
💤 Files with no reviewable changes (28)
- vendor/github.com/gorilla/websocket/tls_handshake.go
- vendor/github.com/openzipkin/zipkin-go/model/endpoint.go
- vendor/github.com/gorilla/websocket/tls_handshake_116.go
- vendor/github.com/openzipkin/zipkin-go/model/kind.go
- vendor/github.com/golang/protobuf/proto/registry.go
- vendor/github.com/golang/protobuf/proto/properties.go
- vendor/github.com/openzipkin/zipkin-go/model/traceid.go
- vendor/github.com/golang/protobuf/proto/defaults.go
- vendor/github.com/openzipkin/zipkin-go/model/span.go
- vendor/github.com/golang/protobuf/proto/text_decode.go
- vendor/github.com/openzipkin/zipkin-go/model/span_id.go
- vendor/github.com/golang/protobuf/proto/proto.go
- vendor/github.com/google/gofuzz/fuzz.go
- vendor/github.com/golang/protobuf/proto/deprecated.go
- test/util.go
- vendor/github.com/google/gofuzz/CONTRIBUTING.md
- vendor/github.com/openzipkin/zipkin-go/model/annotation.go
- vendor/github.com/golang/protobuf/proto/extensions.go
- cmd/webhook/main.go
- vendor/github.com/golang/protobuf/proto/wire.go
- vendor/github.com/openzipkin/zipkin-go/LICENSE
- vendor/github.com/golang/protobuf/proto/discard.go
- vendor/github.com/openzipkin/zipkin-go/model/doc.go
- vendor/github.com/golang/protobuf/proto/wrappers.go
- vendor/github.com/golang/protobuf/proto/text_encode.go
- vendor/github.com/gorilla/websocket/x_net_proxy.go
- vendor/github.com/google/gofuzz/.travis.yml
- vendor/github.com/golang/protobuf/proto/buffer.go
✅ Files skipped from review due to trivial changes (2)
- vendor/github.com/go-logr/zapr/.gitignore
- vendor/github.com/prometheus/common/model/metric.go
🚧 Files skipped from review as they are similar to previous changes (24)
- vendor/github.com/go-logr/zapr/.golangci.yaml
- vendor/github.com/prometheus/client_golang/prometheus/go_collector_latest.go
- vendor/github.com/prometheus/client_golang/prometheus/summary.go
- pkg/client/informers/externalversions/pipeline/v1alpha1/run.go
- pkg/client/resolution/informers/externalversions/resolution/v1beta1/resolutionrequest.go
- pkg/client/resolution/informers/externalversions/resolution/v1alpha1/resolutionrequest.go
- config/config-observability.yaml
- pkg/client/informers/externalversions/pipeline/v1/taskrun.go
- vendor/github.com/go-logr/zapr/zapr.go
- pkg/client/clientset/versioned/typed/pipeline/v1beta1/pipeline_client.go
- pkg/client/informers/externalversions/pipeline/v1beta1/task.go
- vendor/github.com/prometheus/client_golang/prometheus/desc.go
- vendor/github.com/go-logr/zapr/LICENSE
- pkg/reconciler/pipelinerun/pipelinerun.go
- vendor/github.com/grafana/regexp/LICENSE
- pkg/client/informers/externalversions/pipeline/v1beta1/pipelinerun.go
- pkg/client/resolution/clientset/versioned/fake/clientset_generated.go
- vendor/github.com/prometheus/client_golang/prometheus/process_collector_mem_nocgo_darwin.go
- vendor/github.com/prometheus/common/model/labels.go
- pkg/client/resolution/clientset/versioned/typed/resolution/v1beta1/resolution_client.go
- vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go
- vendor/github.com/go-logr/zapr/zapr_noslog.go
- vendor/github.com/grafana/regexp/README.md
- pkg/client/informers/externalversions/pipeline/v1beta1/customrun.go
🧰 Additional context used
🧬 Code graph analysis (7)
pkg/client/informers/externalversions/pipeline/v1beta1/stepaction.go (1)
- pkg/client/informers/externalversions/pipeline/v1beta1/interface.go (1): Interface (26-39)
vendor/github.com/prometheus/client_golang/prometheus/metric.go (1)
- vendor/github.com/prometheus/client_golang/prometheus/histogram.go (1): Histogram (249-261)
vendor/github.com/prometheus/client_golang/prometheus/histogram.go (1)
- vendor/github.com/prometheus/client_golang/prometheus/metric.go (1): Metric (33-59)
pkg/pipelinerunmetrics/metrics.go (2)
- pkg/apis/config/metrics.go (4): Metrics (101-109), PipelinerunLevelAtPipelinerun (65-65), PipelinerunLevelAtPipeline (68-68), PipelinerunLevelAtNS (71-71)
- pkg/taskrunmetrics/metrics.go (3): Recorder (46-65), NewRecorder (74-91), OnStore (191-207)
pkg/taskrunmetrics/metrics_test.go (4)
- pkg/taskrunmetrics/metrics.go (2): Recorder (46-65), NewRecorder (74-91)
- pkg/names/generate.go (1): SimpleNameGenerator (48-48)
- pkg/apis/config/metrics.go (1): Metrics (101-109)
- pkg/pod/status.go (2): ReasonExceededResourceQuota (60-60), ReasonExceededNodeResources (64-64)
pkg/taskrunmetrics/metrics.go (5)
- pkg/apis/config/metrics.go (7): Metrics (101-109), TaskrunLevelAtTaskrun (52-52), TaskrunLevelAtTask (54-54), TaskrunLevelAtNS (56-56), PipelinerunLevelAtPipelinerun (65-65), PipelinerunLevelAtPipeline (68-68), PipelinerunLevelAtNS (71-71)
- pkg/pipelinerunmetrics/metrics.go (3): Recorder (47-63), NewRecorder (72-88), OnStore (332-348)
- pkg/apis/pipeline/register.go (4): PipelineTaskLabelKey (47-47), TaskLabelKey (29-29), PipelineLabelKey (38-38), PipelineRunLabelKey (41-41)
- vendor/go.opentelemetry.io/otel/attribute/set.go (1): NewSet (183-186)
- pkg/pod/status.go (2): ReasonExceededResourceQuota (60-60), ReasonExceededNodeResources (64-64)
pkg/pipelinerunmetrics/metrics_test.go (2)
- pkg/pipelinerunmetrics/metrics.go (2): NewRecorder (72-88), Recorder (47-63)
- pkg/apis/config/metrics.go (4): PipelinerunLevelAtPipelinerun (65-65), PipelinerunLevelAtPipeline (68-68), PipelinerunLevelAtNS (71-71), Metrics (101-109)
🪛 ast-grep (0.39.5)
vendor/github.com/gorilla/websocket/client.go
[warning] 501-501: MinVersion is missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported and insecure browsers (such as Internet Explorer 10) may it be necessary to enable TLS 1.0 to provide support. Add 'MinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures
(missing-ssl-minversion-go)
🪛 markdownlint-cli2 (0.18.1)
vendor/github.com/go-logr/zapr/README.md
10-10: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
duration := time.Since(pr.Status.StartTime.Time)
if pr.Status.CompletionTime != nil {
	duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
}
Guard against nil StartTime in DurationAndCount
If StartTime is nil, this panics. Bail out or skip recording.
- duration := time.Since(pr.Status.StartTime.Time)
- if pr.Status.CompletionTime != nil {
- duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
- }
+ if pr.Status.StartTime == nil || pr.Status.StartTime.IsZero() {
+ return nil
+ }
+ duration := time.Since(pr.Status.StartTime.Time)
+ if pr.Status.CompletionTime != nil {
+ duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
+ }
🤖 Prompt for AI Agents
In pkg/pipelinerunmetrics/metrics.go around lines 172-176, the code assumes
pr.Status.StartTime exists and uses pr.Status.StartTime.Time which can be
nil/zero and cause a panic; add a guard that checks if StartTime is nil (if it's
a pointer) or StartTime.IsZero() (if it's a metav1.Time value) and if so bail
out or skip recording the metric, otherwise compute duration as before and
continue; ensure the nil/zero check is performed before any use of
StartTime.Time.
if pr.Status.ChildReferences != nil {
	for _, childRef := range pr.Status.ChildReferences {
		if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" {
			waitingOnTaskCount++
		}
	}
}
Over-counts “waiting on task resolution” per child; count each PipelineRun at most once
Incrementing inside the loop counts multiple TaskRuns per PR. If the intent is “number of PRs waiting…”, gate by a boolean.
- if pr.Status.ChildReferences != nil {
- for _, childRef := range pr.Status.ChildReferences {
- if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" {
- waitingOnTaskCount++
- }
- }
- }
+ if pr.Status.ChildReferences != nil {
+ counted := false
+ for _, cr := range pr.Status.ChildReferences {
+ if cr.Kind == "TaskRun" && cr.PipelineTaskName != "" {
+ counted = true
+ break
+ }
+ }
+ if counted {
+ waitingOnTaskCount++
+ }
+ }
🤖 Prompt for AI Agents
pkg/pipelinerunmetrics/metrics.go around lines 232 to 238: the loop currently
increments waitingOnTaskCount for every TaskRun childReference, which
over-counts when a PipelineRun has multiple TaskRuns; change the logic to count
each PipelineRun at most once by using a boolean flag (e.g., foundWaitingTask)
initialized false before the inner loop, set it true when you encounter the
first childRef with Kind == "TaskRun" and PipelineTaskName != "", break out of
the inner loop immediately, and after the loop increment waitingOnTaskCount only
if the flag is true.
r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status)))
Align TaskRun total counter attributes; respect CountWithReason
Currently only “status” is attached. Include namespace and optionally reason depending on cfg.CountWithReason to avoid uncontrolled cardinality and to match PR totals.
- r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status)))
+ countAttrs := []attribute.KeyValue{
+ attribute.String("namespace", tr.Namespace),
+ attribute.String("status", status),
+ }
+ if r.cfg != nil && r.cfg.CountWithReason {
+ countAttrs = append(countAttrs, attribute.String("reason", reason))
+ }
+ r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics.go around lines 307 to 308, the call to
r.trTotalCounter.Add only attaches the "status" attribute; update it to also
include the TaskRun's namespace (attribute.String("namespace", namespace)) and,
if cfg.CountWithReason is true, include the reason attribute
(attribute.String("reason", reason)); ensure the code reads cfg.CountWithReason
and conditionally appends the reason attribute to the slice passed into
metric.WithAttributes so we avoid uncontrolled cardinality when CountWithReason
is disabled.
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
🧩 Analysis chain
Vendor license/NOTICE verification
This file carries Go Authors BSD header. Confirm LICENSE for this vendored module is included and compliant (BSD 3‑Clause), and repository NOTICE/THIRD_PARTY files updated if required.
🏁 Script executed:
#!/bin/bash
# Check for license files alongside the vendored package and at repo root
fd -a -H -I '^LICENSE(|\.md|\.txt)?$' vendor/github.com/grafana/regexp -x ls -l {} || true
fd -a -H -I 'LICENSE*|NOTICE*|THIRD*PARTY*' -d 1 . -x ls -l {} || true
# Grep for BSD in vendor license (if present)
rg -n --hidden -S 'BSD' vendor/github.com/grafana/regexp || true
Length of output: 583
🏁 Script executed:
#!/bin/bash
# Search entire repo for NOTICE* or THIRD_PARTY* files
fd -H -I 'NOTICE*|THIRD*PARTY*'
Length of output: 622
Add NOTICE reference for vendored grafana/regexp
The module’s BSD-style LICENSE exists at vendor/github.com/grafana/regexp/LICENSE, but there’s no corresponding NOTICE or THIRD_PARTY entry in the repo. Add a NOTICE file under vendor/github.com/grafana/regexp or update the project’s NOTICE/THIRD_PARTY documentation to include grafana/regexp.
🤖 Prompt for AI Agents
In vendor/github.com/grafana/regexp/regexp.go around lines 1 to 3, the vendored
grafana/regexp package is missing a NOTICE/THIRD_PARTY entry; add a NOTICE file
under vendor/github.com/grafana/regexp (or update the repository-level NOTICE or
THIRD_PARTY file) that cites the module name, the BSD-style license and path to
vendor/github.com/grafana/regexp/LICENSE, and any required attribution text per
the license; ensure the new NOTICE includes the package name, copyright holder,
and a pointer to the bundled LICENSE, or alternatively add an entry to the
project-wide THIRD_PARTY/NOTICE listing with the same information.
 func ExponentialBucketsRange(minBucket, maxBucket float64, count int) []float64 {
 	if count < 1 {
 		panic("ExponentialBucketsRange count needs a positive count")
 	}
-	if min <= 0 {
+	if minBucket <= 0 {
 		panic("ExponentialBucketsRange min needs to be greater than 0")
 	}

 	// Formula for exponential buckets.
 	// max = min*growthFactor^(bucketCount-1)

 	// We know max/min and highest bucket. Solve for growthFactor.
-	growthFactor := math.Pow(max/min, 1.0/float64(count-1))
+	growthFactor := math.Pow(maxBucket/minBucket, 1.0/float64(count-1))

 	// Now that we know growthFactor, solve for each bucket.
 	buckets := make([]float64, count)
 	for i := 1; i <= count; i++ {
-		buckets[i-1] = min * math.Pow(growthFactor, float64(i-1))
+		buckets[i-1] = minBucket * math.Pow(growthFactor, float64(i-1))
 	}
ExponentialBucketsRange: missing input validation for max > min (and count == 1 edge).
Without checks, callers can get non‑increasing buckets (later panics) or odd results for count==1 when min!=max. Add explicit guards.
Apply this diff:
func ExponentialBucketsRange(minBucket, maxBucket float64, count int) []float64 {
if count < 1 {
panic("ExponentialBucketsRange count needs a positive count")
}
if minBucket <= 0 {
panic("ExponentialBucketsRange min needs to be greater than 0")
}
+ if count == 1 {
+ if maxBucket != minBucket {
+ panic("ExponentialBucketsRange with count==1 requires max == min")
+ }
+ return []float64{minBucket}
+ }
+ if !(maxBucket > minBucket) {
+ panic("ExponentialBucketsRange requires max > min when count > 1")
+ }
// Formula for exponential buckets.
// max = min*growthFactor^(bucketCount-1)
// We know max/min and highest bucket. Solve for growthFactor.
growthFactor := math.Pow(maxBucket/minBucket, 1.0/float64(count-1))
// Now that we know growthFactor, solve for each bucket.
buckets := make([]float64, count)
for i := 1; i <= count; i++ {
buckets[i-1] = minBucket * math.Pow(growthFactor, float64(i-1))
}
return buckets
}
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/client_golang/prometheus/histogram.go around
lines 339 to 357, add input validation: ensure maxBucket is greater than
minBucket (panic if maxBucket <= minBucket) and handle the count==1 edge
explicitly (if count == 1 then only allow it when minBucket == maxBucket and
return a single-element slice, otherwise panic). Keep the existing checks for
count<1 and minBucket<=0, and perform the new validations before computing
growthFactor so we never compute invalid or non‑increasing buckets.
//go:build !windows && !js && !wasip1 && !darwin
// +build !windows,!js,!wasip1,!darwin
🧩 Analysis chain
Darwin excluded: verify alternate implementation exists.
Excluding darwin here is fine if a darwin-specific file covers Describe/Collect; otherwise mac builds will miss these metrics.
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Process collector variants in vendor:"
fd -t f 'process_collector*' vendor/github.com/prometheus/client_golang/prometheus | sort
Length of output: 726
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== process_collector_darwin.go (first 200 lines) ==="
sed -n '1,200p' vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go
echo
echo "=== Searching for Describe/Collect implementations ==="
rg -nP '\bfunc \(.*ProcessCollector.*\)\s+(Describe|Collect)' -C2 vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go || true
Length of output: 4308
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== process_collector.go (first 200 lines) ==="
sed -n '1,200p' vendor/github.com/prometheus/client_golang/prometheus/process_collector.go
echo
for file in vendor/github.com/prometheus/client_golang/prometheus/process_collector*.go; do
echo "=== In $file ==="
rg -nP 'func \\(.*processCollector.*\\) (Describe|Collect)' -C1 "$file" || true
done
Length of output: 5967
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Searching for describe implementations in all variants ==="
rg -nP 'func \\(c \\*processCollector\\) describe' -C2 vendor/github.com/prometheus/client_golang/prometheus/process_collector*.go || true
Length of output: 62
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== process_collector_procfsenabled.go (first 200 lines) ==="
sed -n '1,200p' vendor/github.com/prometheus/client_golang/prometheus/process_collector_procfsenabled.go
Length of output: 2972
Implement Describe in process_collector_darwin.go: that file defines processCollect but no describe method, so macOS builds won’t emit any descriptors—add a matching func (c *processCollector) describe to expose metrics on Darwin.
🤖 Prompt for AI Agents
In
vendor/github.com/prometheus/client_golang/prometheus/process_collector_darwin.go
around where processCollect is defined (lines near 14–16), there is no Describe
method implemented so Darwin builds won't expose metric descriptors; add a
method with signature func (c *processCollector) Describe(ch chan<-
*prometheus.Desc) that sends each metric descriptor contained in the
processCollector to the provided channel (mirror the Describe implementation
from the non-Darwin variant), ensuring the collector satisfies
prometheus.Collector and that all descriptors used in Collect/ProcessCollect are
emitted.
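A rough sketch of what that could look like, mirroring the non-Darwin variant; the descriptor field names below (cpuTotal, rss, and so on) are assumptions and should be replaced with whatever the shared processCollector struct actually declares:

// Describe emits the collector's metric descriptors so the registry can
// validate and expose them on Darwin as well (sketch only).
func (c *processCollector) Describe(ch chan<- *Desc) {
	ch <- c.cpuTotal
	ch <- c.openFDs
	ch <- c.maxFDs
	ch <- c.vsize
	ch <- c.maxVsize
	ch <- c.rss
	ch <- c.startTime
}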
 		if internal.NewZstdWriter == nil {
 			// The content encoding was not implemented yet.
 			return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
 		}

-		z.Reset(rw)
-		return z, selected, func() { _ = z.Close() }, nil
+		writer, closeWriter, err := internal.NewZstdWriter(rw)
+		return writer, selected, closeWriter, err
 	case "gzip":
Compile-time bug: wrong fmt verb for slice argument.
defaultCompressionFormats() returns []Compression, but the error strings use %s (string), causing a compile error. Use %v or join into a string.
Apply this minimal fix:
- return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+ return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %v", selected, defaultCompressionFormats())
and
- return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+ return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %v", selected, defaultCompressionFormats())
Optionally, for nicer output:
+ // helper to stringify defaults for messages
+ valid := make([]string, 0, len(defaultCompressionFormats()))
+ for _, c := range defaultCompressionFormats() { valid = append(valid, string(c)) }
- return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, defaultCompressionFormats())
+ return nil, "", func() {}, fmt.Errorf("content compression format not recognized: %s. Valid formats are: %s", selected, strings.Join(valid, ", "))
(Remember to import "strings" if using the optional improvement.)
Also applies to: 490-491
🤖 Prompt for AI Agents
In vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go around
lines 475-481 (and similarly at 490-491), the error message uses the %s verb
while defaultCompressionFormats() returns a []Compression slice, causing a
compile error; change the format verb to %v (or convert the slice to a joined
string and import "strings" if you prefer more readable output) so the slice is
printed correctly and update both error calls consistently.
The 'hack/update-codegen.sh' script was failing with 'streaming_list_type_json_tags' API rule violations. This was caused by incorrect JSON tags on list types that prevented them from being correctly interpreted as streaming lists by the Kubernetes API. This commit fixes the following:
- Removes ',omitempty' from the 'Items' field in the v1 and v1beta1 PipelineRunList structs.
- Adds ',omitempty' to the 'ListMeta' field in the v1beta1 ResolutionRequestList struct.
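For reference, a minimal sketch of the tag shape that rule expects; the field layout is assumed from standard Kubernetes list types, and the real definitions live in pkg/apis/pipeline/v1/pipelinerun_types.go and the v1beta1 equivalents:

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// PipelineRunList sketch: metadata keeps omitempty, Items must not carry it,
// so the API machinery can treat the list as streamable.
type PipelineRunList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []PipelineRun `json:"items"`
}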
This isn't supported after knative otel's migration.
Force-pushed from bab5a44 to a2ea129
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
go.mod (1)
23-23
: Do not drop OpenCensus deps before migrating tests
Migrate or remove the OpenCensus imports in test/wait.go (line 54) and test/custom_task_test.go (line 38) (both import go.opencensus.io/trace) before removing go.opencensus.io and contrib.go.opencensus.io from go.mod.
pkg/taskrunmetrics/metrics.go (1)
276-283
: Harden DurationAndCount: nil StartTime/Condition, and align counter attributes. Prevent panics and align counter attributes with config to control cardinality.
- r.mutex.Lock()
- defer r.mutex.Unlock()
-
- duration := time.Since(tr.Status.StartTime.Time)
- if tr.Status.CompletionTime != nil {
-     duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time)
- }
+ r.mutex.Lock()
+ defer r.mutex.Unlock()
+ if tr.Status.StartTime == nil || tr.Status.StartTime.IsZero() {
+     return nil
+ }
+ duration := time.Since(tr.Status.StartTime.Time)
+ if tr.Status.CompletionTime != nil {
+     duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time)
+ }
@@
- cond := tr.Status.GetCondition(apis.ConditionSucceeded)
+ cond := tr.Status.GetCondition(apis.ConditionSucceeded)
+ if cond == nil {
+     return nil
+ }
@@
- attrs := []attribute.KeyValue{
-     attribute.String("namespace", tr.Namespace),
-     attribute.String("status", status),
-     attribute.String("reason", reason),
- }
+ attrs := []attribute.KeyValue{
+     attribute.String("namespace", tr.Namespace),
+     attribute.String("status", status),
+ }
+ if r.cfg != nil && r.cfg.CountWithReason {
+     attrs = append(attrs, attribute.String("reason", reason))
+ }
@@
- r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status)))
+ countAttrs := []attribute.KeyValue{
+     attribute.String("namespace", tr.Namespace),
+     attribute.String("status", status),
+ }
+ if r.cfg != nil && r.cfg.CountWithReason {
+     countAttrs = append(countAttrs, attribute.String("reason", reason))
+ }
+ r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))
Also applies to: 286-297, 307-308
🧹 Nitpick comments (15)
config/300-crds/300-taskrun.yaml (1)
4739-4744
: New Sidecar stopSignal: verify controller support and consider parity
- Please confirm the controller/plumbing maps this field into Pod specs for the targeted Kubernetes version and that any required feature gates are handled in install docs/manifests.
- Consider adding the same field for Step/StepTemplate to avoid API asymmetry.
- Optional: add a schema pattern or enum guidance (e.g., SIGTERM or numeric) to prevent invalid values.
config/300-crds/300-task.yaml (1)
692-697
: New lifecycle.stopSignal: add version/feature-gate note and basic validation hint. The field matches upstream K8s (alpha in v1.33 behind ContainerStopSignals; requires .spec.os.name). Suggest making this explicit in the description to prevent unexpected Pod rejections on clusters without the gate, and optionally add a light format check to catch typos (e.g., “^SIG[A-Z0-9]+$”). (kubernetes.io)
Proposed doc tweak (apply to each stopSignal description below):
- StopSignal defines which signal will be sent to a container when it is being stopped.
- If not specified, the default is defined by the container runtime in use.
- StopSignal can only be set for Pods with a non-empty .spec.os.name
+ StopSignal defines which signal will be sent to a container when it is being stopped.
+ If not specified, the default is defined by the container runtime in use.
+ Requires Kubernetes v1.33+ with the ContainerStopSignals feature gate enabled on apiserver and kubelet,
+ and can only be set for Pods with a non-empty .spec.os.name. On Windows nodes, only SIGTERM or SIGKILL are allowed.
Optional CEL validation (if you want an early guard in the CRD; safe to defer):
+ x-kubernetes-validations:
+   - rule: "self.matches('^SIG[A-Z0-9]+$')"
+     message: "stopSignal must look like a POSIX signal name, e.g., SIGTERM, SIGUSR1"
To check cluster support before surfacing this in Tekton-generated Pods, ensure your controller only forwards stopSignal when the API server’s ContainerStopSignals gate is on. (kubernetes.io)
Also applies to: 2061-2067, 3391-3397, 5123-5127
go.mod (1)
95-96
: Prefer a tagged release for gorilla/websocket over a pseudo-version. Pinning github.com/gorilla/websocket to a commit pseudo-version can hurt reproducibility and vulnerability tracking. Prefer the latest stable tag unless you need an unreleased fix. Check for a suitable tag providing your fix and update if possible.
pkg/pipelinerunmetrics/metrics_test.go (3)
183-189
: Optional: Assert recorded values using a manual reader (like below tests). Here you only check initialization. For stronger coverage, wire a metric.NewManualReader and assert one duration datapoint and counter increment as done in the “AtAllLevels” test.
301-315
: Isolate global state in tests (MeterProvider/recorder).This test mutates package globals (
otel
provider andr
). Addt.Cleanup
to restore the original MeterProvider and reset the global recorder/once to avoid cross‑test bleed.Example:
orig := otel.GetMeterProvider() t.Cleanup(func() { otel.SetMeterProvider(orig) r = nil once = sync.Once{} })Note: import
sync
and ensureonce
is accessible in this package.
364-383
: This test doesn’t assert anything about resolution-wait metrics.It only checks initialization/no error. Consider adding a manual reader and asserting
*_waiting_on_{pipeline,task}_resolution
counters produce expected datapoints.pkg/taskrunmetrics/metrics_test.go (2)
222-227
: Clean up global state in OTEL tests.These tests reset
once
and set a global MeterProvider but don’t restore it. Addt.Cleanup
to restore the previous provider and ensure deterministic behavior across tests.Example:
orig := otel.GetMeterProvider() t.Cleanup(func() { otel.SetMeterProvider(orig) once = sync.Once{} })Also applies to: 318-323
431-449
: Minor: fix typos in field names in table tests.
expetectedPipeline
/expetectedPipelineRun
are misspelled. Consider renaming toexpectedPipeline
/expectedPipelineRun
for clarity.Also applies to: 462-468
config/config-observability.yaml (3)
95-101
: Document allowed values and set a safe default for metrics.running-pipelinerun.levelEmpty string is ambiguous and can surprise users. Recommend documenting allowed values and defaulting to "namespace" to curb cardinality.
- # Tekton-specific metrics configuration + # Tekton-specific metrics configuration + # metrics.running-pipelinerun.level controls attribute granularity for the running gauge. + # Allowed values: "pipelinerun" | "pipeline" | "namespace" + # Default: "namespace" (lowest cardinality) - metrics.running-pipelinerun.level: "" + metrics.running-pipelinerun.level: "namespace"
40-74
: Clarify exporter semantics vs. controller reporting periodThese keys configure OTEL exporter/SDK behavior but not the controller’s internal 30s reporting loop (ReportingPeriod). Add a note to avoid confusion.
metrics-export-interval: "" + # Note: This is the OTEL SDK/export interval, not the Tekton controller's + # internal ReportingPeriod for periodic gauges. Those remain 30s unless + # wired explicitly in code.
95-101
: Add example knob for namespace label on throttle metricsTaskRun code reads cfg.ThrottleWithNamespace, but this example omits it. Add for discoverability.
metrics.count.enable-reason: "false" + # When true, throttle metrics will include the "namespace" attribute. + metrics.throttle.with-namespace: "false"pkg/pipelinerunmetrics/metrics.go (2)
245-254
: Avoid holding the lock while building attr sets; capture level before loopYou only need cfg to decide attribute shape. Read it once, then loop without the mutex.
- switch r.cfg.RunningPipelinerunLevel { + level := "" + if r.cfg != nil { + level = r.cfg.RunningPipelinerunLevel + } + switch level {
292-319
: getPipelineTagName: consider label-based fallbackIf a resolved PipelineRef name is injected via labels (e.g., PipelineLabelKey), you might prefer that over “anonymous” to improve attribution. Optional.
pkg/taskrunmetrics/metrics.go (2)
389-407
: Shrink lock scope in RecordPodLatency (optional)Lock is not needed for computing timestamps/attrs; hold it only for the Record call.
- r.mutex.Lock() - defer r.mutex.Unlock() - scheduledTime := getScheduledTime(pod) @@ - attrs := []attribute.KeyValue{ + attrs := []attribute.KeyValue{ attribute.String("namespace", tr.Namespace), attribute.String("pod", pod.Name), } attrs = append(attrs, r.insertTaskTag(taskName, tr.Name)...) - r.podLatencyHistogram.Record(ctx, float64(latency.Milliseconds()), metric.WithAttributes(attrs...)) + r.mutex.Lock() + r.podLatencyHistogram.Record(ctx, float64(latency.Milliseconds()), metric.WithAttributes(attrs...)) + r.mutex.Unlock()
25-34
: Prefer a single errors package for consistencyFile uses github.com/pkg/errors while PipelineRun metrics uses stdlib errors. Consider standardizing (stdlib is sufficient here).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
cmd/webhook/main.go
(0 hunks)config/300-crds/300-pipelinerun.yaml
(0 hunks)config/300-crds/300-task.yaml
(16 hunks)config/300-crds/300-taskrun.yaml
(7 hunks)config/config-observability.yaml
(2 hunks)go.mod
(8 hunks)pkg/apis/pipeline/v1/openapi_generated.go
(1 hunks)pkg/apis/pipeline/v1/pipelinerun_types.go
(1 hunks)pkg/apis/pipeline/v1/swagger.json
(1 hunks)pkg/apis/pipeline/v1beta1/openapi_generated.go
(1 hunks)pkg/apis/pipeline/v1beta1/pipelinerun_types.go
(1 hunks)pkg/apis/pipeline/v1beta1/swagger.json
(1 hunks)pkg/apis/resolution/v1beta1/resolution_request_types.go
(1 hunks)pkg/pipelinerunmetrics/metrics.go
(3 hunks)pkg/pipelinerunmetrics/metrics_test.go
(5 hunks)pkg/taskrunmetrics/metrics.go
(6 hunks)pkg/taskrunmetrics/metrics_test.go
(8 hunks)test/util.go
(0 hunks)
💤 Files with no reviewable changes (3)
- config/300-crds/300-pipelinerun.yaml
- cmd/webhook/main.go
- test/util.go
🧰 Additional context used
🧬 Code graph analysis (6)
pkg/apis/pipeline/v1beta1/pipelinerun_types.go (1)
pkg/apis/pipeline/v1/pipelinerun_types.go (2)
PipelineRun
(51-60)PipelineRun
(73-75)
pkg/apis/pipeline/v1/pipelinerun_types.go (1)
pkg/apis/pipeline/v1beta1/pipelinerun_types.go (2)
PipelineRun
(50-59)PipelineRun
(72-74)
pkg/pipelinerunmetrics/metrics_test.go (3)
pkg/pipelinerunmetrics/metrics.go (2)
NewRecorder
(72-88)Recorder
(47-63)pkg/names/generate.go (1)
SimpleNameGenerator
(48-48)pkg/apis/config/metrics.go (4)
PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)Metrics
(101-109)
pkg/taskrunmetrics/metrics.go (3)
pkg/apis/config/metrics.go (7)
Metrics
(101-109)TaskrunLevelAtTaskrun
(52-52)TaskrunLevelAtTask
(54-54)TaskrunLevelAtNS
(56-56)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/pipelinerunmetrics/metrics.go (3)
Recorder
(47-63)NewRecorder
(72-88)OnStore
(332-348)pkg/apis/pipeline/register.go (4)
PipelineTaskLabelKey
(47-47)TaskLabelKey
(29-29)PipelineLabelKey
(38-38)PipelineRunLabelKey
(41-41)
pkg/pipelinerunmetrics/metrics.go (3)
pkg/apis/pipeline/v1/pipelinerun_types.go (3)
PipelineRunReasonCancelled
(345-345)PipelineRun
(51-60)PipelineRun
(73-75)pkg/apis/config/metrics.go (4)
Metrics
(101-109)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/taskrunmetrics/metrics.go (3)
Recorder
(46-65)NewRecorder
(74-91)OnStore
(191-207)
pkg/taskrunmetrics/metrics_test.go (3)
pkg/taskrunmetrics/metrics.go (2)
Recorder
(46-65)NewRecorder
(74-91)pkg/apis/config/metrics.go (1)
Metrics
(101-109)pkg/pod/status.go (2)
ReasonExceededResourceQuota
(60-60)ReasonExceededNodeResources
(64-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (16)
pkg/apis/resolution/v1beta1/resolution_request_types.go (1)
54-54
: LGTM:omitempty
matches core K8s list conventions.Making
metadata
optional keeps the JSON shape aligned with Kubernetes list types and avoids emitting empty objects. No further action needed.pkg/apis/pipeline/v1/openapi_generated.go (1)
1253-1254
: Requireitems
in the schemaGood call on marking
items
as required—this mirrors the Go struct (json:"items"
) and matches Kubernetes list conventions so clients now get an accurate contract.config/300-crds/300-taskrun.yaml (3)
4472-4512
: EnvFromSource description broadened: LGTMAccurately reflects ConfigMap/Secret sources; prefix wording matches K8s docs. No schema impact.
5835-5875
: StepTemplate envFrom wording: LGTMClarifies CM/Secret usage and prefix; consistent with core API semantics.
6390-6430
: Steps envFrom wording: LGTMSame clarity improvements as above; consistent and non-breaking.
config/300-crds/300-task.yaml (1)
425-425
: EnvFromSource description tweak reads well; repo-wide consistency confirmed. Found matching descriptions in config/300-crds/300-task.yaml (lines 425, 1790, 3122, 4855, 6218, 6773) and config/300-crds/300-taskrun.yaml (lines 4472, 5835, 6390). No other CRDs include this field.pkg/pipelinerunmetrics/metrics_test.go (1)
265-276
: Tests won’t compile:attribute.Set
is not a valid map key; use a canonical string key.Maps keyed by
attribute.Set
are invalid/unreliable. Convert expected/got tomap[string]float64
using a stable key derived from the attribute set.Apply:
@@ - expected map[attribute.Set]float64 + expected map[string]float64 @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr1")): 1, - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"), attribute.String("pipelinerun", "pr2")): 1, - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"), attribute.String("pipelinerun", "pr3")): 1, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr6")): 1, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr7")): 1, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline3"), attribute.String("pipelinerun", "pr8")): 1, - }, + expected: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr1"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"), attribute.String("pipelinerun", "pr2"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"), attribute.String("pipelinerun", "pr3"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr6"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr7"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline3"), attribute.String("pipelinerun", "pr8"))): 1, + }, @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1")): 1, - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2")): 2, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1")): 2, - attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline3")): 1, - }, + expected: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline2"))): 2, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline1"))): 2, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"), attribute.String("pipeline", "pipeline3"))): 1, + }, @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1")): 3, - attribute.NewSet(attribute.String("namespace", "testns2")): 3, - }, + expected: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"))): 3, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 3, + }, @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(): 6, - }, + expected: map[string]float64{ + attrKey(attribute.NewSet()): 6, + }, @@ - got := make(map[attribute.Set]float64) + got := make(map[string]float64) 
sum, ok := runningPRsMetric.Data.(metricdata.Sum[float64]) if !ok { t.Fatalf("metric data is not a Sum[float64]: %T", runningPRsMetric.Data) } for _, dp := range sum.DataPoints { - got[dp.Attributes] = dp.Value + got[attrKey(dp.Attributes)] = dp.Value }Add helper and imports:
@@ -import ( +import ( + "sort" + "strings" @@ )Place in this file:
// attrKey builds a deterministic key from an attribute set. func attrKey(s attribute.Set) string { kvs := s.ToSlice() sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) }) var b strings.Builder for _, kv := range kvs { b.WriteString(string(kv.Key)) b.WriteByte('=') b.WriteString(kv.Value.Emit()) b.WriteByte('|') } return b.String() }Also applies to: 279-285, 289-293, 296-299, 348-355
pkg/taskrunmetrics/metrics_test.go (2)
292-314
: Tests won’t compile/rely on non-comparableattribute.Set
keys; use canonical strings.Convert expected/got maps to
map[string]float64
and stringify attribute sets.Apply:
@@ - expectedQuota map[attribute.Set]float64 - expectedNode map[attribute.Set]float64 + expectedQuota map[string]float64 + expectedNode map[string]float64 @@ - expectedQuota: map[attribute.Set]float64{ - attribute.NewSet(): 3, - }, + expectedQuota: map[string]float64{ + attrKey(attribute.NewSet()): 3, + }, - expectedNode: map[attribute.Set]float64{ - attribute.NewSet(): 2, - }, + expectedNode: map[string]float64{ + attrKey(attribute.NewSet()): 2, + }, @@ - expectedQuota: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1")): 2, - attribute.NewSet(attribute.String("namespace", "testns2")): 1, - }, + expectedQuota: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"))): 2, + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 1, + }, - expectedNode: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns2")): 1, - attribute.NewSet(attribute.String("namespace", "testns3")): 1, - }, + expectedNode: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 1, + attrKey(attribute.NewSet(attribute.String("namespace", "testns3"))): 1, + }, @@ - gotQuota := make(map[attribute.Set]float64) - gotNode := make(map[attribute.Set]float64) + gotQuota := make(map[string]float64) + gotNode := make(map[string]float64) @@ - gotQuota[dp.Attributes] = dp.Value + gotQuota[attrKey(dp.Attributes)] = dp.Value @@ - gotNode[dp.Attributes] = dp.Value + gotNode[attrKey(dp.Attributes)] = dp.ValueAdd helper/imports (if not already present in this file):
@@ -import ( +import ( + "sort" + "strings" @@ )func attrKey(s attribute.Set) string { kvs := s.ToSlice() sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) }) var b strings.Builder for _, kv := range kvs { b.WriteString(string(kv.Key)) b.WriteByte('=') b.WriteString(kv.Value.Emit()) b.WriteByte('|') } return b.String() }Also applies to: 338-356, 358-364
418-423
: Incorrect assumption aboutRecordPodLatency
signature
The method in pkg/taskrunmetrics/metrics.go is defined asfunc (r *Recorder) RecordPodLatency(ctx context.Context, pod *corev1.Pod, tr *v1.TaskRun) errorso the test calls using
(ctx, pod, taskRun)
already match the signature and require no change.Likely an incorrect or invalid review comment.
go.mod (2)
141-145
: Resolve AWS ECR replace/require divergence.I ran
rg -n 'replace .*aws-sdk-go-v2' go.mod rg -n 'github.com/aws/aws-sdk-go-v2/service/(ecr|ecrpublic)' go.modand found no other conflicts. Please manually verify there are no additional replace or require entries for these modules.
Either:
- Keep the
replace
and remove the explicitrequire
entries forservice/ecr
andservice/ecrpublic
, or- Drop the
replace
once upstreams are compatible and require the intended versions directly.Proposed cleanup (if keeping replace):
require ( @@ - github.com/aws/aws-sdk-go-v2/service/ecr v1.18.11 // indirect - github.com/aws/aws-sdk-go-v2/service/ecrpublic v1.16.2 // indirect + # removed: versions controlled by replace block )Add a short comment citing the upstream issue/PR and set a TODO with a re-evaluation date.
3-3
: Ignore patch version warning for thego
directive
Thego
directive in go.mod supports full release versions including patch (e.g.,go 1.24.0
) (tip.golang.org).Likely an incorrect or invalid review comment.
pkg/pipelinerunmetrics/metrics.go (3)
205-218
: Don’t hold the mutex while calling lister.List(); shorten the critical sectionMove the expensive list outside the critical section to avoid blocking updates/dynamic reconfig.
- r.mutex.Lock() - defer r.mutex.Unlock() - - prs, err := lister.List(labels.Everything()) + prs, err := lister.List(labels.Everything()) if err != nil { return fmt.Errorf("failed to list pipelineruns: %w", err) } + // lock only when reading/updating recorder state (below)
219-238
: Count “waiting on task resolution” at most once per PipelineRunCurrent loop increments per TaskRun child and inflates the count.
- if pr.Status.ChildReferences != nil { - for _, childRef := range pr.Status.ChildReferences { - if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" { - waitingOnTaskCount++ - } - } - } + if pr.Status.ChildReferences != nil { + counted := false + for _, cr := range pr.Status.ChildReferences { + if cr.Kind == "TaskRun" && cr.PipelineTaskName != "" { + counted = true + break + } + } + if counted { + waitingOnTaskCount++ + } + }
261-268: Fix UpDownCounter usage: add deltas, not totals (running + waiting gauges)

Adding absolute totals on each scrape makes gauges monotonically increase. Track the previous snapshot (per-attribute for running) and emit deltas; reset on instrument rebuild.
- // Report running_pipelineruns. - for attrSet, value := range currentCounts { - r.runningPRsGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, waitingOnPipelineCount, metric.WithAttributes()) - r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, waitingOnTaskCount, metric.WithAttributes()) + r.mutex.Lock() + defer r.mutex.Unlock() + + // Emit per-attr deltas and update cache. + if r.runningPrev == nil { + r.runningPrev = make(map[attribute.Set]float64) + } + // union of keys: negative deltas for disappeared keys + seen := make(map[attribute.Set]struct{}) + for set, curr := range currentCounts { + prev := r.runningPrev[set] + delta := curr - prev + if delta != 0 { + r.runningPRsGauge.Add(ctx, delta, metric.WithAttributes(set.ToSlice()...)) + } + r.runningPrev[set] = curr + seen[set] = struct{}{} + } + for set, prev := range r.runningPrev { + if _, ok := seen[set]; !ok && prev != 0 { + // went to zero, emit negative delta + r.runningPRsGauge.Add(ctx, -prev, metric.WithAttributes(set.ToSlice()...)) + delete(r.runningPrev, set) + } + } + + // cluster-wide waiting gauges + if delta := waitingOnPipelineCount - r.waitingOnPipelinePrev; delta != 0 { + r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, delta, metric.WithAttributes()) + r.waitingOnPipelinePrev = waitingOnPipelineCount + } + if delta := waitingOnTaskCount - r.waitingOnTaskPrev; delta != 0 { + r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, delta, metric.WithAttributes()) + r.waitingOnTaskPrev = waitingOnTaskCount + }And extend Recorder + reset on init:
 type Recorder struct {
@@
 	runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter
@@
+	// cached snapshots for delta-based UpDownCounter updates
+	runningPrev           map[attribute.Set]float64
+	waitingOnPipelinePrev float64
+	waitingOnTaskPrev     float64
 }
@@
 func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
@@
 	r.runningPRsWaitingOnTaskResolutionGauge = runningPRsWaitingOnTaskResolutionGauge
+
+	// reset caches on rebuild
+	r.runningPrev = make(map[attribute.Set]float64)
+	r.waitingOnPipelinePrev = 0
+	r.waitingOnTaskPrev = 0

pkg/taskrunmetrics/metrics.go (2)
318-325: Move lister.List() outside the critical section

Reduce lock contention; only lock when reading/updating recorder state.
-	r.mutex.Lock()
-	defer r.mutex.Unlock()
-
-	trs, err := lister.List(labels.Everything())
+	trs, err := lister.List(labels.Everything())
 	if err != nil {
 		return err
 	}
+	// lock later only for shared state updates
326-351: Fix UpDownCounter usage for running/throttled TaskRuns: emit deltas and handle attr-set churn

Totals on each tick inflate gauges. Track previous snapshots per attribute set and add only the delta; reset caches on instrument rebuild.
- runningTrs := 0 - trsThrottledByQuota := make(map[attribute.Set]float64) - trsThrottledByNode := make(map[attribute.Set]float64) + runningTrs := 0 + trsThrottledByQuota := make(map[attribute.Set]float64) + trsThrottledByNode := make(map[attribute.Set]float64) @@ - r.runningTRsGauge.Add(ctx, float64(runningTrs)) - for attrSet, value := range trsThrottledByQuota { - r.runningTRsThrottledByQuotaGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - for attrSet, value := range trsThrottledByNode { - r.runningTRsThrottledByNodeGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } + r.mutex.Lock() + defer r.mutex.Unlock() + // running + deltaRun := float64(runningTrs) - r.lastRunningTRs + if deltaRun != 0 { + r.runningTRsGauge.Add(ctx, deltaRun) + r.lastRunningTRs = float64(runningTrs) + } + // throttled by quota + if r.lastThrottledByQuota == nil { + r.lastThrottledByQuota = make(map[attribute.Set]float64) + } + seen := make(map[attribute.Set]struct{}) + for set, curr := range trsThrottledByQuota { + prev := r.lastThrottledByQuota[set] + if d := curr - prev; d != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, d, metric.WithAttributes(set.ToSlice()...)) + } + r.lastThrottledByQuota[set] = curr + seen[set] = struct{}{} + } + for set, prev := range r.lastThrottledByQuota { + if _, ok := seen[set]; !ok && prev != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, -prev, metric.WithAttributes(set.ToSlice()...)) + delete(r.lastThrottledByQuota, set) + } + } + // throttled by node + if r.lastThrottledByNode == nil { + r.lastThrottledByNode = make(map[attribute.Set]float64) + } + seen = make(map[attribute.Set]struct{}) + for set, curr := range trsThrottledByNode { + prev := r.lastThrottledByNode[set] + if d := curr - prev; d != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, d, metric.WithAttributes(set.ToSlice()...)) + } + r.lastThrottledByNode[set] = curr + seen[set] = struct{}{} + } + for set, prev := range r.lastThrottledByNode { + if _, ok := seen[set]; !ok && prev != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, -prev, metric.WithAttributes(set.ToSlice()...)) + delete(r.lastThrottledByNode, set) + } + }And extend Recorder + reset on init:
 type Recorder struct {
@@
 	podLatencyHistogram metric.Float64Histogram
@@
+	// caches for delta-based updates
+	lastRunningTRs       float64
+	lastThrottledByQuota map[attribute.Set]float64
+	lastThrottledByNode  map[attribute.Set]float64
 }
@@
 func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
@@
 	r.podLatencyHistogram = podLatencyHistogram
+
+	// reset caches on rebuild
+	r.lastRunningTRs = 0
+	r.lastThrottledByQuota = make(map[attribute.Set]float64)
+	r.lastThrottledByNode = make(map[attribute.Set]float64)

Also applies to: 353-360
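To make the UpDownCounter point concrete, here is a minimal, self-contained sketch (not code from this PR; the instrument name and the values are illustrative) of why adding absolute totals on every tick inflates the exported value, while adding deltas keeps it in line with the snapshot:

```go
package main

import (
	"context"

	"go.opentelemetry.io/otel"
)

func main() {
	ctx := context.Background()
	meter := otel.GetMeterProvider().Meter("example")

	// An UpDownCounter accumulates every Add call; it is never "set" to a value.
	gauge, _ := meter.Float64UpDownCounter("running_taskruns_example")

	// Reporting the absolute total each scrape double-counts:
	gauge.Add(ctx, 3) // exported sum: 3
	gauge.Add(ctx, 3) // exported sum: 6, although only 3 TaskRuns are running

	// Reporting the delta against the previous snapshot keeps the sum correct:
	prev, curr := 3.0, 3.0
	if d := curr - prev; d != 0 {
		gauge.Add(ctx, d) // no change, nothing added
	}
}
```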
	duration := time.Since(pr.Status.StartTime.Time)
	if pr.Status.CompletionTime != nil {
		duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
	}
Guard nil StartTime/Condition to avoid panics
Dereferencing StartTime or using a nil Condition can panic. Bail out early or skip until fields are set.
- duration := time.Since(pr.Status.StartTime.Time)
- if pr.Status.CompletionTime != nil {
- duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
- }
+ if pr.Status.StartTime == nil || pr.Status.StartTime.IsZero() {
+ return nil
+ }
+ duration := time.Since(pr.Status.StartTime.Time)
+ if pr.Status.CompletionTime != nil {
+ duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time)
+ }
@@
- cond := pr.Status.GetCondition(apis.ConditionSucceeded)
+ cond := pr.Status.GetCondition(apis.ConditionSucceeded)
+ if cond == nil {
+ return nil
+ }
@@
- attrs := []attribute.KeyValue{
- attribute.String("namespace", pr.Namespace),
- attribute.String("status", status),
- attribute.String("reason", reason),
- }
+ attrs := []attribute.KeyValue{
+ attribute.String("namespace", pr.Namespace),
+ attribute.String("status", status),
+ }
+ if r.cfg != nil && r.cfg.CountWithReason {
+ attrs = append(attrs, attribute.String("reason", reason))
+ }
Also applies to: 179-189
🤖 Prompt for AI Agents
In pkg/pipelinerunmetrics/metrics.go around lines 172-175 and 179-189, the code
assumes pr.Status.StartTime and certain Conditions are non-nil and dereferences
them, which can panic; add guards that check pr.Status.StartTime is set (not
nil/zero) before calling Time/Since/Sub and check the Condition pointer for nil
before accessing fields; if StartTime or the needed Condition is missing, bail
out early or skip emitting/updating metrics for this PipelineRun to avoid
panics.
- Updated `pipelinerunmetrics` and `taskrunmetrics` to use OpenTelemetry instruments (histograms, counters, gauges) for creating and recording metrics.
- Introduced new OpenTelemetry configurations in `config/config-observability.yaml` for exporters and protocols, while marking legacy OpenCensus settings as deprecated.
- Rewrote the test suites for `pipelinerunmetrics` and `taskrunmetrics` to be compatible with the new OpenTelemetry-based implementation.
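As a rough sketch of the instrument-creation pattern this change moves to (assumptions: the global MeterProvider is configured elsewhere, and the attribute values here are made up), the recorders obtain a Meter once, create named instruments from it, and then record observations with attributes:

```go
package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

func main() {
	ctx := context.Background()

	// Meter from the globally registered MeterProvider (a no-op if none is set).
	meter := otel.GetMeterProvider().Meter("tekton_pipelines_controller")

	// Instruments are created once and reused for every observation.
	durationHist, err := meter.Float64Histogram(
		"tekton_pipelines_controller_pipelinerun_duration_seconds",
		metric.WithDescription("The pipelinerun execution time in seconds"),
		metric.WithUnit("s"),
	)
	if err != nil {
		panic(err)
	}

	// Example observation; the namespace/status values are illustrative.
	durationHist.Record(ctx, 42.0,
		metric.WithAttributes(
			attribute.String("namespace", "default"),
			attribute.String("status", "success"),
		))
}
```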
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
go.mod (1)
3-3: Invalid Go version directive

go.mod requires go 1.24 (no patch) or another valid released version. 1.24.0 will be rejected by go tooling.

-go 1.24.0
+go 1.24

pkg/pipelinerunmetrics/metrics.go (1)
172-201: Guard nil StartTime/Condition and gate reason cardinality

Avoid panics when StartTime/Condition are unset; include “reason” only when configured for counters (to control cardinality).
- duration := time.Since(pr.Status.StartTime.Time) - if pr.Status.CompletionTime != nil { - duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time) - } + if pr.Status.StartTime == nil || pr.Status.StartTime.IsZero() { + return nil + } + duration := time.Since(pr.Status.StartTime.Time) + if pr.Status.CompletionTime != nil { + duration = pr.Status.CompletionTime.Sub(pr.Status.StartTime.Time) + } @@ - cond := pr.Status.GetCondition(apis.ConditionSucceeded) + cond := pr.Status.GetCondition(apis.ConditionSucceeded) + if cond == nil { + return nil + } @@ - attrs := []attribute.KeyValue{ - attribute.String("namespace", pr.Namespace), - attribute.String("status", status), - attribute.String("reason", reason), - } + attrs := []attribute.KeyValue{ + attribute.String("namespace", pr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + attrs = append(attrs, attribute.String("reason", reason)) + } @@ - r.prDurationHistogram.Record(ctx, duration.Seconds(), metric.WithAttributes(attrs...)) - r.prTotalCounter.Add(ctx, 1, metric.WithAttributes(attrs...)) + r.prDurationHistogram.Record(ctx, duration.Seconds(), metric.WithAttributes(attrs...)) + // For counters, keep reason optional to bound cardinality. + countAttrs := []attribute.KeyValue{ + attribute.String("namespace", pr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + countAttrs = append(countAttrs, attribute.String("reason", reason)) + } + r.prTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))
🧹 Nitpick comments (7)
go.mod (4)
123-131
: Align OpenTelemetry contrib/exporter versionsContrib is at
v0.62.0
, butexporters/prometheus
isv0.59.1
. Prefer matching contrib minor to avoid subtle API/ABI drift.- go.opentelemetry.io/otel/exporters/prometheus v0.59.1 // indirect + go.opentelemetry.io/otel/exporters/prometheus v0.62.0 // indirectPlease verify downstream compatibility before bumping.
23-23
: Remove stale OpenCensus deps if fully migrated to OTEL
go.opencensus.io
and ocagent/prometheus exporters remain. If no longer referenced, drop them to reduce surface and transitive deps.
- Search and remove unused OC imports/usages; keep only if still required.
Also applies to: 147-149
95-96
: Pin stable websocket release (optional)A pseudo-version pinned to a commit can be brittle. If possible, prefer the latest stable tag (e.g.,
v1.5.x
).
133-133
: Deprecated package
golang.org/x/tools/go/expect v0.1.0-deprecated
is marked deprecated. Confirm it’s intentionally required; otherwise, remove.pkg/taskrunmetrics/metrics_test.go (1)
85-95
: Test isolation: reset package singleton
NewRecorder
uses a package-levelsync.Once
. Reset it at test start to avoid cross-test coupling when run with-shuffle
.func TestUpdateConfig(t *testing.T) { - // Test that the config is updated when it changes, and not when it doesn't. + // Ensure a fresh recorder for this test. + once = sync.Once{} + // Test that the config is updated when it changes, and not when it doesn't.pkg/pipelinerunmetrics/metrics_test.go (1)
374-383: Strengthen assertions (optional)

TestRecordRunningPipelineRunsResolutionWaitCounts only checks init. Consider wiring a ManualReader here too and asserting the two “waiting” gauges produced expected values.
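A minimal sketch of what such a test could look like (assuming it lives in the pipelinerunmetrics test package; driving the recorder through a lister from the fake informer is elided, and the test name is illustrative):

```go
package pipelinerunmetrics

import (
	"context"
	"testing"

	"go.opentelemetry.io/otel"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/metric/metricdata"
)

func TestRunningPipelineRunsWaitGauges(t *testing.T) {
	ctx := context.Background()

	// Back the global MeterProvider with a ManualReader so the test can pull
	// whatever the recorder exports.
	reader := sdkmetric.NewManualReader()
	otel.SetMeterProvider(sdkmetric.NewMeterProvider(sdkmetric.WithReader(reader)))

	r, err := NewRecorder(ctx)
	if err != nil {
		t.Fatalf("NewRecorder: %v", err)
	}
	_ = r // drive the recorder here, e.g. r.RunningPipelineRuns(ctx, <lister from the fake informer>)

	var rm metricdata.ResourceMetrics
	if err := reader.Collect(ctx, &rm); err != nil {
		t.Fatalf("Collect: %v", err)
	}
	// Walk rm.ScopeMetrics and assert the two "waiting" gauges produced the expected values.
}
```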
19-37
: Helper: canonical attribute key (and imports)Add a small helper to stringify attribute sets deterministically.
import ( "context" "errors" "fmt" + "sort" + "strings" "sync" "time" @@ ) + +// attrKey builds a deterministic key from an attribute set. +func attrKey(s attribute.Set) string { + kvs := s.ToSlice() + sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) }) + var b strings.Builder + for _, kv := range kvs { + b.WriteString(string(kv.Key)) + b.WriteByte('=') + b.WriteString(kv.Value.Emit()) + b.WriteByte('|') + } + return b.String() +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- config/config-observability.yaml (2 hunks)
- go.mod (8 hunks)
- pkg/pipelinerunmetrics/metrics.go (3 hunks)
- pkg/pipelinerunmetrics/metrics_test.go (5 hunks)
- pkg/taskrunmetrics/metrics.go (6 hunks)
- pkg/taskrunmetrics/metrics_test.go (8 hunks)
- test/util.go (0 hunks)
💤 Files with no reviewable changes (1)
- test/util.go
🚧 Files skipped from review as they are similar to previous changes (1)
- config/config-observability.yaml
🧰 Additional context used
🧬 Code graph analysis (4)
pkg/pipelinerunmetrics/metrics_test.go (3)
pkg/apis/pipeline/v1/pipelinerun_types.go (4)
PipelineRun
(51-60)PipelineRun
(73-75)PipelineRunSpec
(250-286)PipelineRunStatus
(321-326)pkg/pipelinerunmetrics/metrics.go (2)
NewRecorder
(72-88)Recorder
(47-63)pkg/apis/config/metrics.go (4)
PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)Metrics
(101-109)
pkg/pipelinerunmetrics/metrics.go (2)
pkg/apis/config/metrics.go (4)
Metrics
(101-109)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/taskrunmetrics/metrics.go (3)
Recorder
(46-65)NewRecorder
(74-91)OnStore
(191-207)
pkg/taskrunmetrics/metrics_test.go (5)
pkg/pipelinerunmetrics/metrics.go (2)
Recorder
(47-63)NewRecorder
(72-88)pkg/taskrunmetrics/metrics.go (2)
Recorder
(46-65)NewRecorder
(74-91)pkg/names/generate.go (1)
SimpleNameGenerator
(48-48)pkg/apis/config/metrics.go (1)
Metrics
(101-109)pkg/pod/status.go (2)
ReasonExceededResourceQuota
(60-60)ReasonExceededNodeResources
(64-64)
pkg/taskrunmetrics/metrics.go (4)
pkg/apis/config/metrics.go (7)
Metrics
(101-109)TaskrunLevelAtTaskrun
(52-52)TaskrunLevelAtTask
(54-54)TaskrunLevelAtNS
(56-56)PipelinerunLevelAtPipelinerun
(65-65)PipelinerunLevelAtPipeline
(68-68)PipelinerunLevelAtNS
(71-71)pkg/pipelinerunmetrics/metrics.go (3)
Recorder
(47-63)NewRecorder
(72-88)OnStore
(332-348)pkg/apis/pipeline/register.go (4)
PipelineTaskLabelKey
(47-47)TaskLabelKey
(29-29)PipelineLabelKey
(38-38)PipelineRunLabelKey
(41-41)pkg/pod/status.go (2)
ReasonExceededResourceQuota
(60-60)ReasonExceededNodeResources
(64-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (6)
pkg/taskrunmetrics/metrics_test.go (1)
418-424: LGTM: corrected parameter order for RecordPodLatency

RecordPodLatency(ctx, pod, taskRun) matches the function signature.

pkg/taskrunmetrics/metrics.go (3)
318-325
: Avoid holding the mutex acrosslister.List()
Move the list call outside the critical section to reduce lock contention and avoid blocking metrics updates.
-func (r *Recorder) RunningTaskRuns(ctx context.Context, lister listers.TaskRunLister) error { +func (r *Recorder) RunningTaskRuns(ctx context.Context, lister listers.TaskRunLister) error { if !r.initialized { return errors.New("ignoring the metrics recording, failed to initialize the metrics recorder") } - - r.mutex.Lock() - defer r.mutex.Unlock() - - trs, err := lister.List(labels.Everything()) + trs, err := lister.List(labels.Everything()) if err != nil { return err } + r.mutex.Lock() + defer r.mutex.Unlock()
276-307
: Guard nil StartTime/Condition and align counter attributesPrevent panics and align
trTotalCounter
attributes with namespace and optional reason (per config).- duration := time.Since(tr.Status.StartTime.Time) - if tr.Status.CompletionTime != nil { - duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time) - } + if tr.Status.StartTime == nil || tr.Status.StartTime.IsZero() { + return nil + } + duration := time.Since(tr.Status.StartTime.Time) + if tr.Status.CompletionTime != nil { + duration = tr.Status.CompletionTime.Sub(tr.Status.StartTime.Time) + } @@ - cond := tr.Status.GetCondition(apis.ConditionSucceeded) + cond := tr.Status.GetCondition(apis.ConditionSucceeded) + if cond == nil { + return nil + } @@ - attrs := []attribute.KeyValue{ - attribute.String("namespace", tr.Namespace), - attribute.String("status", status), - attribute.String("reason", reason), - } + attrs := []attribute.KeyValue{ + attribute.String("namespace", tr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + attrs = append(attrs, attribute.String("reason", reason)) + } @@ - r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("status", status))) + countAttrs := []attribute.KeyValue{ + attribute.String("namespace", tr.Namespace), + attribute.String("status", status), + } + if r.cfg != nil && r.cfg.CountWithReason { + countAttrs = append(countAttrs, attribute.String("reason", reason)) + } + r.trTotalCounter.Add(ctx, 1, metric.WithAttributes(countAttrs...))
353-360
: Fix cumulative gauges: use deltas and stable keysAdds full totals each scrape; UpDownCounters require deltas. Also replace
map[attribute.Set]
with stable string keys and retain attrs for reporting.- r.runningTRsGauge.Add(ctx, float64(runningTrs)) - for attrSet, value := range trsThrottledByQuota { - r.runningTRsThrottledByQuotaGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } - for attrSet, value := range trsThrottledByNode { - r.runningTRsThrottledByNodeGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) - } + // running taskruns delta + if delta := float64(runningTrs) - r.prevRunningTRs; delta != 0 { + r.runningTRsGauge.Add(ctx, delta) + r.prevRunningTRs = float64(runningTrs) + } + // throttled-by-quota deltas + for key, cur := range trQuotaCur { + prev := r.prevTrsThrottledByQuota[key] + if d := cur - prev; d != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, d, metric.WithAttributes(trQuotaAttrs[key]...)) + } + } + for key, prev := range r.prevTrsThrottledByQuota { + if _, ok := trQuotaCur[key]; !ok && prev != 0 { + r.runningTRsThrottledByQuotaGauge.Add(ctx, -prev, metric.WithAttributes(r.prevTrsThrottledByQuotaAttrs[key]...)) + } + } + r.prevTrsThrottledByQuota = trQuotaCur + r.prevTrsThrottledByQuotaAttrs = trQuotaAttrs + // throttled-by-node deltas + for key, cur := range trNodeCur { + prev := r.prevTrsThrottledByNode[key] + if d := cur - prev; d != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, d, metric.WithAttributes(trNodeAttrs[key]...)) + } + } + for key, prev := range r.prevTrsThrottledByNode { + if _, ok := trNodeCur[key]; !ok && prev != 0 { + r.runningTRsThrottledByNodeGauge.Add(ctx, -prev, metric.WithAttributes(r.prevTrsThrottledByNodeAttrs[key]...)) + } + } + r.prevTrsThrottledByNode = trNodeCur + r.prevTrsThrottledByNodeAttrs = trNodeAttrsPrecompute current maps earlier:
@@ - trsThrottledByQuota := make(map[attribute.Set]float64) - trsThrottledByNode := make(map[attribute.Set]float64) + trQuotaCur := make(map[string]float64) + trQuotaAttrs := make(map[string][]attribute.KeyValue) + trNodeCur := make(map[string]float64) + trNodeAttrs := make(map[string][]attribute.KeyValue) @@ - attrSet := attribute.NewSet(attrs...) + attrSet := attribute.NewSet(attrs...) + key := attrKey(attrSet) @@ - trsThrottledByQuota[attrSet]++ + trQuotaCur[key]++ + trQuotaAttrs[key] = attrs @@ - trsThrottledByNode[attrSet]++ + trNodeCur[key]++ + trNodeAttrs[key] = attrspkg/pipelinerunmetrics/metrics_test.go (1)
262-299
: Fix: attribute.Set is not a map key; use canonical string keysReplace all
map[attribute.Set]float64
withmap[string]float64
and stringify attribute sets for both expected and got.@@ - expected map[attribute.Set]float64 + expected map[string]float64 @@ - expected: map[attribute.Set]float64{ - attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr1")): 1, - ... - }, + expected: map[string]float64{ + attrKey(attribute.NewSet(attribute.String("namespace", "testns1"), attribute.String("pipeline", "pipeline1"), attribute.String("pipelinerun", "pr1"))): 1, + ... + }, @@ - got := make(map[attribute.Set]float64) + got := make(map[string]float64) @@ - for _, dp := range sum.DataPoints { - got[dp.Attributes] = dp.Value - } + for _, dp := range sum.DataPoints { + got[attrKey(dp.Attributes)] = dp.Value + }Add helper + imports if not present:
@@ -import ( +import ( "context" "testing" "time" + "sort" + "strings" @@ ) @@ +func attrKey(s attribute.Set) string { + kvs := s.ToSlice() + sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) }) + var b strings.Builder + for _, kv := range kvs { + b.WriteString(string(kv.Key)) + b.WriteByte('=') + b.WriteString(kv.Value.Emit()) + b.WriteByte('|') + } + return b.String() +}Also applies to: 348-360
pkg/pipelinerunmetrics/metrics.go (1)
228-238
: Count “waiting on task resolution” once per PipelineRunCurrently increments per child TaskRun; count each PipelineRun at most once.
- if pr.Status.ChildReferences != nil { - for _, childRef := range pr.Status.ChildReferences { - if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" { - waitingOnTaskCount++ - } - } - } + if pr.Status.ChildReferences != nil { + counted := false + for _, cr := range pr.Status.ChildReferences { + if cr.Kind == "TaskRun" && cr.PipelineTaskName != "" { + counted = true + break + } + } + if counted { + waitingOnTaskCount++ + } + }
// Recorder holds OpenTelemetry instruments for PipelineRun metrics
type Recorder struct {
	mutex       sync.Mutex
	initialized bool
	cfg         *config.Metrics

	insertTag func(pipeline,
		pipelinerun string) []tag.Mutator
	meter metric.Meter

	ReportingPeriod time.Duration
	prDurationHistogram metric.Float64Histogram
	prTotalCounter metric.Int64Counter
	runningPRsGauge metric.Float64UpDownCounter
	runningPRsWaitingOnPipelineResolutionGauge metric.Float64UpDownCounter
	runningPRsWaitingOnTaskResolutionGauge metric.Float64UpDownCounter

	insertTag func(pipeline, pipelinerun string) []attribute.KeyValue

	hash string
	ReportingPeriod time.Duration
}
🛠️ Refactor suggestion | 🟠 Major
Cache previous snapshot for deltas
Add fields to support delta computation and attribute retention.
type Recorder struct {
@@
- insertTag func(pipeline, pipelinerun string) []attribute.KeyValue
+ insertTag func(pipeline, pipelinerun string) []attribute.KeyValue
+
+ // Cached snapshot for delta-based UpDownCounter updates.
+ prevRunningPRCounts map[string]float64
+ prevRunningPRAttrs map[string][]attribute.KeyValue
+ prevWaitingOnPipeline float64
+ prevWaitingOnTask float64
func (r *Recorder) initializeInstruments(cfg *config.Metrics) error { | ||
r.mutex.Lock() | ||
defer r.mutex.Unlock() | ||
|
||
var prunTag []tag.Key | ||
r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller") | ||
|
||
switch cfg.PipelinerunLevel { | ||
case config.PipelinerunLevelAtPipelinerun: | ||
prunTag = []tag.Key{pipelinerunTag, pipelineTag} | ||
r.insertTag = pipelinerunInsertTag | ||
case config.PipelinerunLevelAtPipeline: | ||
prunTag = []tag.Key{pipelineTag} | ||
r.insertTag = pipelineInsertTag | ||
case config.PipelinerunLevelAtNS: | ||
prunTag = []tag.Key{} | ||
r.insertTag = nilInsertTag | ||
default: | ||
return errors.New("invalid config for PipelinerunLevel: " + cfg.PipelinerunLevel) | ||
} | ||
|
||
var runningPRTag []tag.Key | ||
switch cfg.RunningPipelinerunLevel { | ||
case config.PipelinerunLevelAtPipelinerun: | ||
runningPRTag = []tag.Key{pipelinerunTag, pipelineTag, namespaceTag} | ||
case config.PipelinerunLevelAtPipeline: | ||
runningPRTag = []tag.Key{pipelineTag, namespaceTag} | ||
case config.PipelinerunLevelAtNS: | ||
runningPRTag = []tag.Key{namespaceTag} | ||
default: | ||
runningPRTag = []tag.Key{} | ||
} | ||
|
||
distribution := view.Distribution(10, 30, 60, 300, 900, 1800, 3600, 5400, 10800, 21600, 43200, 86400) | ||
|
||
if cfg.PipelinerunLevel == config.PipelinerunLevelAtPipelinerun { | ||
distribution = view.LastValue() | ||
} else { | ||
switch cfg.DurationPipelinerunType { | ||
case config.DurationTaskrunTypeHistogram: | ||
case config.DurationTaskrunTypeLastValue: | ||
distribution = view.LastValue() | ||
default: | ||
return errors.New("invalid config for DurationTaskrunType: " + cfg.DurationTaskrunType) | ||
} | ||
} | ||
|
||
if cfg.CountWithReason { | ||
prunTag = append(prunTag, reasonTag) | ||
} | ||
|
||
prDurationView = &view.View{ | ||
Description: prDuration.Description(), | ||
Measure: prDuration, | ||
Aggregation: distribution, | ||
TagKeys: append([]tag.Key{statusTag, namespaceTag}, prunTag...), | ||
} | ||
|
||
prTotalView = &view.View{ | ||
Description: prTotal.Description(), | ||
Measure: prTotal, | ||
Aggregation: view.Count(), | ||
TagKeys: []tag.Key{statusTag}, | ||
} | ||
|
||
runningPRsView = &view.View{ | ||
Description: runningPRs.Description(), | ||
Measure: runningPRs, | ||
Aggregation: view.LastValue(), | ||
TagKeys: runningPRTag, | ||
} | ||
|
||
runningPRsWaitingOnPipelineResolutionView = &view.View{ | ||
Description: runningPRsWaitingOnPipelineResolution.Description(), | ||
Measure: runningPRsWaitingOnPipelineResolution, | ||
Aggregation: view.LastValue(), | ||
} | ||
|
||
runningPRsWaitingOnTaskResolutionView = &view.View{ | ||
Description: runningPRsWaitingOnTaskResolution.Description(), | ||
Measure: runningPRsWaitingOnTaskResolution, | ||
Aggregation: view.LastValue(), | ||
} | ||
|
||
return view.Register( | ||
prDurationView, | ||
prTotalView, | ||
runningPRsView, | ||
runningPRsWaitingOnPipelineResolutionView, | ||
runningPRsWaitingOnTaskResolutionView, | ||
prDurationHistogram, err := r.meter.Float64Histogram( | ||
"tekton_pipelines_controller_pipelinerun_duration_seconds", | ||
metric.WithDescription("The pipelinerun execution time in seconds"), | ||
metric.WithUnit("s"), | ||
metric.WithExplicitBucketBoundaries(10, 30, 60, 300, 900, 1800, 3600, 5400, 10800, 21600, 43200, 86400), | ||
) | ||
} | ||
|
||
func viewUnregister() { | ||
view.Unregister(prDurationView, | ||
prTotalView, | ||
runningPRsView, | ||
runningPRsWaitingOnPipelineResolutionView, | ||
runningPRsWaitingOnTaskResolutionView) | ||
} | ||
|
||
// OnStore returns a function that checks if metrics are configured for a config.Store, and registers it if so | ||
func OnStore(logger *zap.SugaredLogger, r *Recorder) func(name string, | ||
value interface{}) { | ||
return func(name string, value interface{}) { | ||
if name == config.GetMetricsConfigName() { | ||
cfg, ok := value.(*config.Metrics) | ||
if !ok { | ||
logger.Error("Failed to do type insertion for extracting metrics config") | ||
return | ||
} | ||
updated := r.updateConfig(cfg) | ||
if !updated { | ||
return | ||
} | ||
// Update metrics according to configuration | ||
viewUnregister() | ||
err := viewRegister(cfg) | ||
if err != nil { | ||
logger.Errorf("Failed to register View %v ", err) | ||
return | ||
} | ||
} | ||
if err != nil { | ||
return fmt.Errorf("failed to create pipelinerun duration histogram: %w", err) | ||
} | ||
} | ||
r.prDurationHistogram = prDurationHistogram |
🛠️ Refactor suggestion | 🟠 Major
Initialize/reset caches during instrument setup
Ensure snapshot is reset when instruments are rebuilt.
func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
@@
- r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller")
+ r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller")
+ // reset delta caches
+ r.prevRunningPRCounts = make(map[string]float64)
+ r.prevRunningPRAttrs = make(map[string][]attribute.KeyValue)
+ r.prevWaitingOnPipeline = 0
+ r.prevWaitingOnTask = 0
Also applies to: 136-155
🤖 Prompt for AI Agents
In pkg/pipelinerunmetrics/metrics.go around lines 90-116 (and likewise apply the
same change at 136-155), when rebuilding instruments under the mutex you must
reset the recorder's cached snapshot so stale data isn't reused; after
creating/assigning new instruments (e.g. after r.prDurationHistogram =
prDurationHistogram) set r.snapshot to an empty/zero value (for example
r.snapshot = metricsSnapshot{} or nil if the type allows) and clear any related
caches so the recorder starts with a fresh snapshot whenever instruments are
reinitialized.
// RunningPipelineRuns logs the number of PipelineRuns running right now | ||
// returns an error if it fails to log the metrics | ||
func (r *Recorder) RunningPipelineRuns(lister listers.PipelineRunLister) error { | ||
r.mutex.Lock() | ||
defer r.mutex.Unlock() | ||
func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error { | ||
if !r.initialized { | ||
return errors.New("ignoring the metrics recording, failed to initialize the metrics recorder") | ||
return fmt.Errorf("ignoring the metrics recording, failed to initialize the metrics recorder") | ||
} | ||
|
||
r.mutex.Lock() | ||
defer r.mutex.Unlock() | ||
|
||
prs, err := lister.List(labels.Everything()) | ||
if err != nil { | ||
return fmt.Errorf("failed to list pipelineruns while generating metrics : %w", err) | ||
return fmt.Errorf("failed to list pipelineruns: %w", err) | ||
} | ||
|
||
var runningPipelineRuns int | ||
var trsWaitResolvingTaskRef int | ||
var prsWaitResolvingPipelineRef int | ||
countMap := map[string]int{} | ||
// This map holds the new counts for the current interval. | ||
currentCounts := make(map[attribute.Set]float64) | ||
var waitingOnPipelineCount float64 | ||
var waitingOnTaskCount float64 | ||
|
||
for _, pr := range prs { | ||
pipelineName := getPipelineTagName(pr) | ||
pipelineRunKey := "" | ||
mutators := []tag.Mutator{ | ||
tag.Insert(namespaceTag, pr.Namespace), | ||
tag.Insert(pipelineTag, pipelineName), | ||
tag.Insert(pipelinerunTag, pr.Name), | ||
} | ||
if r.cfg != nil { | ||
switch r.cfg.RunningPipelinerunLevel { | ||
case runningPRLevelPipelinerun: | ||
pipelineRunKey = pipelineRunKey + "#" + pr.Name | ||
fallthrough | ||
case runningPRLevelPipeline: | ||
pipelineRunKey = pipelineRunKey + "#" + pipelineName | ||
fallthrough | ||
case runningPRLevelNamespace: | ||
pipelineRunKey = pipelineRunKey + "#" + pr.Namespace | ||
case runningPRLevelCluster: | ||
default: | ||
return fmt.Errorf("RunningPipelineRunLevel value \"%s\" is not valid ", r.cfg.RunningPipelinerunLevel) | ||
if pr.Status.GetCondition(apis.ConditionSucceeded).IsUnknown() { | ||
// This PR is running. | ||
|
||
// Handle waiting metrics (these are cluster-wide, no extra attributes). | ||
if pr.Status.PipelineSpec == nil && pr.Spec.PipelineRef != nil { | ||
waitingOnPipelineCount++ | ||
} | ||
} | ||
ctx_, err_ := tag.New(context.Background(), mutators...) | ||
if err_ != nil { | ||
return err | ||
} | ||
if !pr.IsDone() && !pr.IsPending() { | ||
countMap[pipelineRunKey]++ | ||
metrics.Record(ctx_, runningPRs.M(float64(countMap[pipelineRunKey]))) | ||
runningPipelineRuns++ | ||
succeedCondition := pr.Status.GetCondition(apis.ConditionSucceeded) | ||
if succeedCondition != nil && succeedCondition.Status == corev1.ConditionUnknown { | ||
switch succeedCondition.Reason { | ||
case v1.TaskRunReasonResolvingTaskRef: | ||
trsWaitResolvingTaskRef++ | ||
case v1.PipelineRunReasonResolvingPipelineRef.String(): | ||
prsWaitResolvingPipelineRef++ | ||
if pr.Status.ChildReferences != nil { | ||
for _, childRef := range pr.Status.ChildReferences { | ||
if childRef.Kind == "TaskRun" && childRef.PipelineTaskName != "" { | ||
waitingOnTaskCount++ | ||
} | ||
} | ||
} | ||
} else { | ||
// In case there are no running PipelineRuns for the pipelineRunKey, set the metric value to 0 to ensure | ||
// the metric is set for the key. | ||
if _, exists := countMap[pipelineRunKey]; !exists { | ||
countMap[pipelineRunKey] = 0 | ||
metrics.Record(ctx_, runningPRs.M(0)) | ||
|
||
// Handle running_pipelineruns metric with per-level aggregation. | ||
pipelineName := getPipelineTagName(pr) | ||
var attrs []attribute.KeyValue | ||
|
||
// Build attributes based on configured level. | ||
switch r.cfg.RunningPipelinerunLevel { | ||
case config.PipelinerunLevelAtPipelinerun: | ||
attrs = append(attrs, attribute.String("pipelinerun", pr.Name)) | ||
fallthrough | ||
case config.PipelinerunLevelAtPipeline: | ||
attrs = append(attrs, attribute.String("pipeline", pipelineName)) | ||
fallthrough | ||
case config.PipelinerunLevelAtNS: | ||
attrs = append(attrs, attribute.String("namespace", pr.Namespace)) | ||
} | ||
|
||
attrSet := attribute.NewSet(attrs...) | ||
currentCounts[attrSet]++ | ||
} | ||
} | ||
|
||
ctx, err := tag.New(context.Background()) | ||
if err != nil { | ||
return err | ||
// Report running_pipelineruns. | ||
for attrSet, value := range currentCounts { | ||
r.runningPRsGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...)) | ||
} | ||
metrics.Record(ctx, runningPRsWaitingOnPipelineResolution.M(float64(prsWaitResolvingPipelineRef))) | ||
metrics.Record(ctx, runningPRsWaitingOnTaskResolution.M(float64(trsWaitResolvingTaskRef))) | ||
metrics.Record(ctx, runningPRs.M(float64(runningPipelineRuns))) | ||
|
||
r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, waitingOnPipelineCount, metric.WithAttributes()) | ||
r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, waitingOnTaskCount, metric.WithAttributes()) | ||
|
||
return nil | ||
} |
Fix cumulative gauges: report deltas and avoid map[attribute.Set]
Current code:
- Adds full totals on every scrape to UpDownCounters (cumulative inflation).
- Uses map[attribute.Set]float64 (fragile; Set is not a safe map key).
- Holds the mutex during lister.List().
Address by:
- Computing deltas vs. a cached previous snapshot (per-attribute-key), including emitting negative deltas for disappeared series.
- Moving
lister.List()
outside the critical section. - Keying by a canonical string and retaining attributes for reporting.
@@
-func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error {
+func (r *Recorder) RunningPipelineRuns(ctx context.Context, lister listers.PipelineRunLister) error {
if !r.initialized {
return fmt.Errorf("ignoring the metrics recording, failed to initialize the metrics recorder")
}
-
- r.mutex.Lock()
- defer r.mutex.Unlock()
-
- prs, err := lister.List(labels.Everything())
+ prs, err := lister.List(labels.Everything())
if err != nil {
return fmt.Errorf("failed to list pipelineruns: %w", err)
}
- // This map holds the new counts for the current interval.
- currentCounts := make(map[attribute.Set]float64)
- var waitingOnPipelineCount float64
- var waitingOnTaskCount float64
+ // Current snapshot (keyed by canonical attr key).
+ currentCounts := make(map[string]float64)
+ attrsByKey := make(map[string][]attribute.KeyValue)
+ var waitingOnPipelineCount float64
+ var waitingOnTaskCount float64
@@
- // Handle running_pipelineruns metric with per-level aggregation.
+ // Handle running_pipelineruns metric with per-level aggregation.
pipelineName := getPipelineTagName(pr)
var attrs []attribute.KeyValue
@@
- attrSet := attribute.NewSet(attrs...)
- currentCounts[attrSet]++
+ key := attrKey(attribute.NewSet(attrs...))
+ currentCounts[key]++
+ attrsByKey[key] = attrs
}
}
- // Report running_pipelineruns.
- for attrSet, value := range currentCounts {
- r.runningPRsGauge.Add(ctx, value, metric.WithAttributes(attrSet.ToSlice()...))
- }
-
- r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, waitingOnPipelineCount, metric.WithAttributes())
- r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, waitingOnTaskCount, metric.WithAttributes())
+ r.mutex.Lock()
+ defer r.mutex.Unlock()
+
+ // Emit deltas for running counts (including negatives for vanished series).
+ for key, cur := range currentCounts {
+ prev := r.prevRunningPRCounts[key]
+ if delta := cur - prev; delta != 0 {
+ r.runningPRsGauge.Add(ctx, delta, metric.WithAttributes(attrsByKey[key]...))
+ }
+ }
+ for key, prev := range r.prevRunningPRCounts {
+ if _, still := currentCounts[key]; !still && prev != 0 {
+ // series disappeared -> emit negative delta using previous attrs
+ r.runningPRsGauge.Add(ctx, -prev, metric.WithAttributes(r.prevRunningPRAttrs[key]...))
+ }
+ }
+ r.prevRunningPRCounts = currentCounts
+ r.prevRunningPRAttrs = attrsByKey
+
+ if delta := waitingOnPipelineCount - r.prevWaitingOnPipeline; delta != 0 {
+ r.runningPRsWaitingOnPipelineResolutionGauge.Add(ctx, delta, metric.WithAttributes())
+ r.prevWaitingOnPipeline = waitingOnPipelineCount
+ }
+ if delta := waitingOnTaskCount - r.prevWaitingOnTask; delta != 0 {
+ r.runningPRsWaitingOnTaskResolutionGauge.Add(ctx, delta, metric.WithAttributes())
+ r.prevWaitingOnTask = waitingOnTaskCount
+ }
return nil
}
Add helpers/state (see struct and init comments below).
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In pkg/pipelinerunmetrics/metrics.go around lines 205 to 270, the current
implementation reads the lister under the recorder mutex, uses
map[attribute.Set]float64 as the snapshot key, and calls Add with full totals on
each scrape (causing cumulative inflation and unsafe map keys). Change to: call
lister.List(...) before acquiring r.mutex; build a canonical string key (e.g.,
join sorted key=value pairs) and keep a map[string]struct{attrs
[]attribute.KeyValue; value float64} for the currentCounts; compute deltas
against a recorder-level previousSnapshot (protected by r.mutex) so you call
r.runningPRsGauge.Add(ctx, delta, metric.WithAttributes(attrs...)) and emit
negative deltas for keys that disappeared; keep attribute slices to pass to
metric.WithAttributes and update previousSnapshot after emitting deltas; add
small helper(s) to build the canonical key and to convert stored attrs to metric
options; and only hold r.mutex while reading/updating previousSnapshot (not
during List).
for _, tc := range []struct { | ||
status corev1.ConditionStatus | ||
reason string | ||
nodeCount float64 | ||
quotaCount float64 | ||
waitCount float64 | ||
addNS bool | ||
}{ | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: "", | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: pod.ReasonExceededResourceQuota, | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: pod.ReasonExceededResourceQuota, | ||
addNS: true, | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: pod.ReasonExceededNodeResources, | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: pod.ReasonExceededNodeResources, | ||
addNS: true, | ||
}, | ||
{ | ||
status: corev1.ConditionTrue, | ||
reason: v1.TaskRunReasonResolvingTaskRef, | ||
}, | ||
{ | ||
status: corev1.ConditionFalse, | ||
reason: "", | ||
}, | ||
{ | ||
status: corev1.ConditionFalse, | ||
reason: pod.ReasonExceededResourceQuota, | ||
}, | ||
{ | ||
status: corev1.ConditionFalse, | ||
reason: pod.ReasonExceededNodeResources, | ||
}, | ||
{ | ||
status: corev1.ConditionFalse, | ||
reason: v1.TaskRunReasonResolvingTaskRef, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: "", | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: pod.ReasonExceededResourceQuota, | ||
quotaCount: 3, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: pod.ReasonExceededNodeResources, | ||
nodeCount: 3, | ||
name string | ||
throttleWithNamespace bool | ||
expectedQuota map[attribute.Set]float64 | ||
expectedNode map[attribute.Set]float64 | ||
}{{ | ||
name: "without namespace", | ||
throttleWithNamespace: false, | ||
expectedQuota: map[attribute.Set]float64{ | ||
attribute.NewSet(): 3, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: pod.ReasonExceededResourceQuota, | ||
quotaCount: 3, | ||
addNS: true, | ||
expectedNode: map[attribute.Set]float64{ | ||
attribute.NewSet(): 2, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: pod.ReasonExceededNodeResources, | ||
nodeCount: 3, | ||
addNS: true, | ||
}, { | ||
name: "with namespace", | ||
throttleWithNamespace: true, | ||
expectedQuota: map[attribute.Set]float64{ | ||
attribute.NewSet(attribute.String("namespace", "testns1")): 2, | ||
attribute.NewSet(attribute.String("namespace", "testns2")): 1, | ||
}, | ||
{ | ||
status: corev1.ConditionUnknown, | ||
reason: v1.TaskRunReasonResolvingTaskRef, | ||
waitCount: 3, | ||
expectedNode: map[attribute.Set]float64{ | ||
attribute.NewSet(attribute.String("namespace", "testns2")): 1, | ||
attribute.NewSet(attribute.String("namespace", "testns3")): 1, | ||
}, | ||
} { | ||
unregisterMetrics() | ||
ctx, _ := ttesting.SetupFakeContext(t) | ||
informer := faketaskruninformer.Get(ctx) | ||
for range multiplier { | ||
tr := &v1.TaskRun{ | ||
ObjectMeta: metav1.ObjectMeta{Name: names.SimpleNameGenerator.RestrictLengthWithRandomSuffix("taskrun-"), Namespace: "test"}, | ||
Status: v1.TaskRunStatus{ | ||
Status: duckv1.Status{ | ||
Conditions: duckv1.Conditions{{ | ||
Type: apis.ConditionSucceeded, | ||
Status: tc.status, | ||
Reason: tc.reason, | ||
}}, | ||
}, | ||
}, | ||
}} { |
Fix: attribute.Set is not a map key; use canonical string keys
attribute.Set
isn’t comparable; map[attribute.Set]float64
won’t compile. Convert attribute sets to deterministic strings for map keys in both expected and got.
@@
- expectedQuota map[attribute.Set]float64
- expectedNode map[attribute.Set]float64
+ expectedQuota map[string]float64
+ expectedNode map[string]float64
@@
- expectedQuota: map[attribute.Set]float64{
- attribute.NewSet(): 3,
- },
+ expectedQuota: map[string]float64{
+ attrKey(attribute.NewSet()): 3,
+ },
@@
- expectedNode: map[attribute.Set]float64{
- attribute.NewSet(): 2,
- },
+ expectedNode: map[string]float64{
+ attrKey(attribute.NewSet()): 2,
+ },
@@
- expectedQuota: map[attribute.Set]float64{
- attribute.NewSet(attribute.String("namespace", "testns1")): 2,
- attribute.NewSet(attribute.String("namespace", "testns2")): 1,
- },
+ expectedQuota: map[string]float64{
+ attrKey(attribute.NewSet(attribute.String("namespace", "testns1"))): 2,
+ attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 1,
+ },
@@
- expectedNode: map[attribute.Set]float64{
- attribute.NewSet(attribute.String("namespace", "testns2")): 1,
- attribute.NewSet(attribute.String("namespace", "testns3")): 1,
- },
+ expectedNode: map[string]float64{
+ attrKey(attribute.NewSet(attribute.String("namespace", "testns2"))): 1,
+ attrKey(attribute.NewSet(attribute.String("namespace", "testns3"))): 1,
+ },
@@
- gotQuota := make(map[attribute.Set]float64)
- gotNode := make(map[attribute.Set]float64)
+ gotQuota := make(map[string]float64)
+ gotNode := make(map[string]float64)
@@
- case "tekton_pipelines_controller_running_taskruns_throttled_by_quota":
+ case "tekton_pipelines_controller_running_taskruns_throttled_by_quota":
for _, dp := range sum.DataPoints {
- gotQuota[dp.Attributes] = dp.Value
+ gotQuota[attrKey(dp.Attributes)] = dp.Value
}
case "tekton_pipelines_controller_running_taskruns_throttled_by_node":
for _, dp := range sum.DataPoints {
- gotNode[dp.Attributes] = dp.Value
+ gotNode[attrKey(dp.Attributes)] = dp.Value
}
Add helper + imports:
@@
import (
"context"
"sync"
"testing"
"time"
+ "sort"
+ "strings"
@@
)
@@
+// attrKey builds a deterministic key from an attribute set.
+func attrKey(s attribute.Set) string {
+ kvs := s.ToSlice()
+ sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) })
+ var b strings.Builder
+ for _, kv := range kvs {
+ b.WriteString(string(kv.Key))
+ b.WriteByte('=')
+ b.WriteString(kv.Value.Emit())
+ b.WriteByte('|')
+ }
+ return b.String()
+}
Also applies to: 336-365
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics_test.go around lines 289 to 314 (and also apply
same change to lines 336–365), the test currently declares maps using
map[attribute.Set]float64 which fails because attribute.Set is not comparable;
change those maps to map[string]float64 and convert attribute.Set keys to a
deterministic canonical string. Add a small helper function (and necessary
imports) to serialize an attribute.Set into a stable string (e.g., sort
attributes and format as key=value pairs) and use that helper when building both
expectedQuota/expectedNode and when constructing the actual result keys before
comparing maps.
// Recorder holds OpenTelemetry instruments for TaskRun metrics
type Recorder struct {
	mutex       sync.Mutex
	initialized bool
	cfg         *config.Metrics

	ReportingPeriod time.Duration
	meter metric.Meter

	insertTaskTag func(task,
		taskrun string) []tag.Mutator
	trDurationHistogram metric.Float64Histogram
	prTRDurationHistogram metric.Float64Histogram
	trTotalCounter metric.Int64Counter
	runningTRsGauge metric.Float64UpDownCounter
	runningTRsThrottledByQuotaGauge metric.Float64UpDownCounter
	runningTRsThrottledByNodeGauge metric.Float64UpDownCounter
	podLatencyHistogram metric.Float64Histogram

	insertPipelineTag func(pipeline,
		pipelinerun string) []tag.Mutator
	insertTaskTag func(task, taskrun string) []attribute.KeyValue
	insertPipelineTag func(pipeline, pipelinerun string) []attribute.KeyValue

	hash string
	ReportingPeriod time.Duration
}
🛠️ Refactor suggestion | 🟠 Major
Add delta caches to Recorder
type Recorder struct {
@@
podLatencyHistogram metric.Float64Histogram
@@
ReportingPeriod time.Duration
+
+ // Delta caches
+ prevRunningTRs float64
+ prevTrsThrottledByQuota map[string]float64
+ prevTrsThrottledByQuotaAttrs map[string][]attribute.KeyValue
+ prevTrsThrottledByNode map[string]float64
+ prevTrsThrottledByNodeAttrs map[string][]attribute.KeyValue
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Recorder holds OpenTelemetry instruments for TaskRun metrics
type Recorder struct {
	mutex       sync.Mutex
	initialized bool
	cfg         *config.Metrics
	ReportingPeriod time.Duration
	meter metric.Meter
	insertTaskTag func(task,
		taskrun string) []tag.Mutator
	trDurationHistogram metric.Float64Histogram
	prTRDurationHistogram metric.Float64Histogram
	trTotalCounter metric.Int64Counter
	runningTRsGauge metric.Float64UpDownCounter
	runningTRsThrottledByQuotaGauge metric.Float64UpDownCounter
	runningTRsThrottledByNodeGauge metric.Float64UpDownCounter
	podLatencyHistogram metric.Float64Histogram
	insertPipelineTag func(pipeline,
		pipelinerun string) []tag.Mutator
	insertTaskTag func(task, taskrun string) []attribute.KeyValue
	insertPipelineTag func(pipeline, pipelinerun string) []attribute.KeyValue
	hash string
	ReportingPeriod time.Duration
}

// Recorder holds OpenTelemetry instruments for TaskRun metrics
type Recorder struct {
	mutex       sync.Mutex
	initialized bool
	cfg         *config.Metrics
	meter metric.Meter
	trDurationHistogram metric.Float64Histogram
	prTRDurationHistogram metric.Float64Histogram
	trTotalCounter metric.Int64Counter
	runningTRsGauge metric.Float64UpDownCounter
	runningTRsThrottledByQuotaGauge metric.Float64UpDownCounter
	runningTRsThrottledByNodeGauge metric.Float64UpDownCounter
	podLatencyHistogram metric.Float64Histogram
	insertTaskTag func(task, taskrun string) []attribute.KeyValue
	insertPipelineTag func(pipeline, pipelinerun string) []attribute.KeyValue
	ReportingPeriod time.Duration

	// Delta caches
	prevRunningTRs float64
	prevTrsThrottledByQuota map[string]float64
	prevTrsThrottledByQuotaAttrs map[string][]attribute.KeyValue
	prevTrsThrottledByNode map[string]float64
	prevTrsThrottledByNodeAttrs map[string][]attribute.KeyValue
}
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics.go around lines 45 to 66, the Recorder struct
needs delta caches added; add fields to store deltas and protect them (e.g.,
deltaCache map[string]float64 and deltaCacheMu sync.Mutex or similar typed cache
and mutex for thread-safety), and ensure callers that construct Recorder
initialize the cache (make(map[string]float64)) and use the mutex when
reading/updating the cache.
func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
	r.mutex.Lock()
	defer r.mutex.Unlock()

	var prunTag []tag.Key
	r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller")

	switch cfg.TaskrunLevel {
	case config.TaskrunLevelAtTaskrun:
		r.insertTaskTag = taskrunInsertTag
	case config.TaskrunLevelAtTask:
		r.insertTaskTag = taskInsertTag
	case config.TaskrunLevelAtNS:
		r.insertTaskTag = nilInsertTag
	default:
		return errors.New("invalid config for TaskrunLevel: " + cfg.TaskrunLevel)
	}
🛠️ Refactor suggestion | 🟠 Major
Initialize/reset delta caches and add attrKey helper/imports
Reset caches when instruments are (re)initialized and add the key helper.
import (
"context"
"fmt"
+ "sort"
+ "strings"
"sync"
"time"
@@
func (r *Recorder) initializeInstruments(cfg *config.Metrics) error {
@@
r.meter = otel.GetMeterProvider().Meter("tekton_pipelines_controller")
+ // reset delta caches
+ r.prevRunningTRs = 0
+ r.prevTrsThrottledByQuota = make(map[string]float64)
+ r.prevTrsThrottledByQuotaAttrs = make(map[string][]attribute.KeyValue)
+ r.prevTrsThrottledByNode = make(map[string]float64)
+ r.prevTrsThrottledByNodeAttrs = make(map[string][]attribute.KeyValue)
@@
r.podLatencyHistogram = podLatencyHistogram
return nil
}
+
+// attrKey builds a deterministic key from an attribute set.
+func attrKey(s attribute.Set) string {
+ kvs := s.ToSlice()
+ sort.Slice(kvs, func(i, j int) bool { return string(kvs[i].Key) < string(kvs[j].Key) })
+ var b strings.Builder
+ for _, kv := range kvs {
+ b.WriteString(string(kv.Key))
+ b.WriteByte('=')
+ b.WriteString(kv.Value.Emit())
+ b.WriteByte('|')
+ }
+ return b.String()
+}
Also applies to: 121-188
🤖 Prompt for AI Agents
In pkg/taskrunmetrics/metrics.go around lines 93 to 109 (also apply same changes
to 121-188), when instruments are initialized or re-initialized you must reset
the delta caches and add the attrKey helper/imports: clear or reassign the maps
used to store deltaTaskrunCounts, deltaTaskCounts (and any other per-interval
cache fields) so they start empty on re-init, and add a small helper function
attrKey that converts label names/values to the attribute.Key/attribute.KeyValue
form used by Otel along with the required import
(go.opentelemetry.io/otel/attribute); ensure the cache reset code is placed
inside the same mutex-protected block after meter setup so re-init is
thread-safe and duplicate the same cache-reset and helper-import usage in the
other instrument initialization block at lines 121-188.
Changes
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
/kind <type>
. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tepRelease Notes
Summary by CodeRabbit
New Features
Improvements
Bug Fixes