Prometheus Metrics
Prometheus and Alertmanager deployment scraping all platform services, evaluating alerting rules, and routing alerts to PagerDuty and Slack.
Software Subsystem Application active
Prometheus Metrics runs two Prometheus replicas (HA pair) with the Prometheus Operator managing ServiceMonitors. Alertmanager deduplicates and groups alerts before routing to PagerDuty (critical) and Slack (warning). Metrics retention is 15 days in-cluster; longer retention is handled by Thanos sidecar to GCS.
Relationships
Composes outgoing 2
Part of incoming 2
Owned by incoming 1
Served by incoming 1
Architecture Context
Diagrams
Not yet referenced in any diagram
Properties
Type Software Subsystem
Layer Application
Domain Platform Operations
Status active
Owner Platform Operations Team
Additional Metadata
Catalog Id SUB-OPS-001
Environments production, staging
Owns Data Aggregates Service Health Metric Aggregate
Served By Cloud Services Operations GKE Cluster
Archimate Type application-component
C4 Type Container
Togaf Type Application Component
Meta Model
Business
Organization
Application current
Technology