Skip to content

Conversation

christopherzli
Copy link

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

pending test

@christopherzli christopherzli requested review from a team, jnyi, hczhu-db and yuchen-db and removed request for a team August 28, 2024 00:09
Copy link
Collaborator

@hczhu-db hczhu-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

}
f.lruCache.Add(queryExpressionNormalized, queryExpressionRangeLength)
}
f.lruCache.Add(queryExpressionNormalized, queryExpressionRangeLength)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm Add() is thread-safe? Multiple goroutines from the HTTP server may call this function here concurrently.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it uses a mutex

// TODO(hc.zhu): add a flag for the threshold
// The current gateway timeout is 5 minutes, so we cache the failed query running longer than 5 minutes - 10 seconds.
if queryResponseTime > time.Second * (60 * 5 - 10) {
if queryResponseTime > time.Second*(60*5-10) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make the magic number a flag since you are touching this part? :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didnt, it was due to go format

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do it in a separate PR

func TestNewFailedQueryCache(t *testing.T) {
reg := prometheus.NewRegistry()
cache, err := NewFailedQueryCache(2, reg)
cache := NewFailedQueryCache(2, 0, reg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add test cases where the TTL is not 0?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does 0 TTL mean? Never expire?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 means no ttl

QueryStatsEnabled bool `yaml:"query_stats_enabled"`
LogFailedQueries bool `yaml:"log_failed_queries"`
FailedQueryCacheCapacity int `yaml:"failed_query_cache_capacity"`
FailedQueryTTL time.Duration `yaml:"failed_query_ttl"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add a cmd line flag for this config.
image

}

func NewFailedQueryCache(capacity int, reg prometheus.Registerer) (*FailedQueryCache, error) {
func NewFailedQueryCache(capacity int, ttlDuration time.Duration, reg prometheus.Registerer) *FailedQueryCache {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the eventual goal of a TTL is to cache expensive queries as well. If a query fetches too many time series, add it to the cache.
Can you rename the cache like ExpensiveQueryCache and make it cache both failed queries (usually expensive ones), long-running ones (> 4 minutes), and the ones fetching too many time series?

Copy link
Collaborator

@hczhu-db hczhu-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address the comments.

@yuchen-db yuchen-db force-pushed the db_main branch 8 times, most recently from 64fbb3a to 5276dd1 Compare June 5, 2025 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants