Skip to content

Conversation

@erikjohnston
Copy link
Member

This reduces the size of metrics by ~80%. Responding with the metrics takes significant amounts of time.

This reduces the size of metrics by ~80%. Responding with the metrics
takes significant amounts of time.
@erikjohnston erikjohnston marked this pull request as ready for review November 4, 2025 10:55
@erikjohnston erikjohnston requested a review from a team as a code owner November 4, 2025 10:55
origin_entity = "*client*"
else:
origin_type = "remote"
origin_entity = get_domain_from_id(event.sender)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 We don't need to track the origin_entity server name of remote events (high cardinality).

Comment on lines 377 to -382
if context.app_service:
origin_type = "local"
origin_entity = context.app_service.id
elif self.hs.is_mine_id(event.sender):
origin_type = "local"
origin_entity = "*client*"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

origin_entity still used in contrib/

origin_entity is still used in a few metrics (contrib/prometheus/synapse-v2.rules and contrib/grafana/synapse.json):

  • synapse_storage_events_persisted_by_source_type used in the Events/s Local vs Remote graph.
  • synapse_storage_events_persisted_by_event_type used in the Events/s by Type graph

I understand those are contrib/ things that we don't necessarily maintain but do we at-least have a plan for how we're going to proceed with our own graphs (for example matrix.org)? Seems like adjusting our matrix.org graphs would overlap with fixing contrib/ anyway.

Comment on lines 377 to -382
if context.app_service:
origin_type = "local"
origin_entity = context.app_service.id
elif self.hs.is_mine_id(event.sender):
origin_type = "local"
origin_entity = "*client*"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Losing information on application service (bridges)

From contrib/prometheus/synapse-v2.rules, to replace the functionality, it looks like we just need a label to distinguish local client vs bridge (appservice).

From the metrics in Grafana, it looks like we aren't running application services on matrix.org. But for some homeserver using application services, it could be nice to have the context.app_service.id available to distinguish where load is coming from. It's not necessarily high cardinality and not user controlled.

We could add a origin_app_service label which would be context.app_service.id or None.

@@ -0,0 +1 @@
Reduce cardinality of `synapse_storage_events_persisted_events_sep_total` metric by removing `origin_type` label.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

origin_type or origin_entity?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants