The Prometheus Metrics subsystem runs two Prometheus replicas as an HA pair, with the Prometheus Operator managing ServiceMonitors. Alertmanager deduplicates and groups alerts, then routes critical alerts to PagerDuty and warning alerts to Slack. Metrics are retained for 15 days in-cluster; longer retention is handled by a Thanos sidecar that uploads to GCS.
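
The deduplicate, group, and route flow described above can be illustrated with a minimal Python sketch. It is not Alertmanager's implementation, only a model of the behavior. The receiver names, the `group_by` default, the alert dictionary shape, and the choice to drop alerts with other severities are all assumptions for illustration. It also assumes both HA replicas emit alerts with identical label sets.

```python
from collections import defaultdict

# Hypothetical receiver names; the catalog entry only states that
# critical alerts go to PagerDuty and warning alerts go to Slack.
RECEIVERS = {"critical": "pagerduty", "warning": "slack"}


def fingerprint(alert):
    """Identify an alert by its label set, regardless of which replica sent it."""
    return tuple(sorted(alert["labels"].items()))


def deduplicate(alerts):
    """Keep the first copy of each alert; both HA replicas fire the same alerts."""
    unique = {}
    for alert in alerts:
        unique.setdefault(fingerprint(alert), alert)
    return list(unique.values())


def group_and_route(alerts, group_by=("alertname",)):
    """Return {receiver: {group_key: [alerts]}} for the deduplicated alerts."""
    routed = defaultdict(lambda: defaultdict(list))
    for alert in deduplicate(alerts):
        labels = alert["labels"]
        receiver = RECEIVERS.get(labels.get("severity"))
        if receiver is None:
            # Assumption: severities other than critical/warning are not routed.
            continue
        key = tuple(labels.get(name, "") for name in group_by)
        routed[receiver][key].append(alert)
    return {receiver: dict(groups) for receiver, groups in routed.items()}
```

Passing the same critical alert twice (once per replica) together with one warning alert yields a single entry under `pagerduty` and a single entry under `slack`, each keyed by its `alertname`.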

Relationships

Composes (outgoing, 2)
Target Element | Element Type
Monitoring Stack | Software System
Metrics Query Endpoint | API Endpoint

Part of (incoming, 2)
Source Element | Element Type
Monitoring Stack | Software System
Metrics Query Endpoint | API Endpoint

Owned by (incoming, 1)
Source Element | Element Type
Service Health Metric Aggregate | Data Aggregate

Served by (incoming, 1)
Source Element | Element Type
Operations GKE Cluster | Cloud Service

Architecture Context

Diagrams

Not yet referenced in any diagram

Properties

Type: Software Subsystem
Layer: Application
Domain: Platform Operations
Status: active
Owner: Platform Operations Team

Additional Metadata

Catalog ID: SUB-OPS-001
Environments: production, staging
Owns Data Aggregates: Service Health Metric Aggregate
Served By Cloud Services: Operations GKE Cluster
ArchiMate Type: application-component
C4 Type: Container
TOGAF Type: Application Component

