prometheus-configuration

Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.



Prometheus Configuration - Comprehensive Monitoring Configuration Guide

Skill Overview


Prometheus Configuration is an end-to-end guide to configuring Prometheus, from installation to production deployment. It covers metrics collection, recording and alerting rules, and service discovery.

Use Cases

1. Kubernetes Cluster Monitoring


Use Helm to deploy Prometheus quickly. With Kubernetes service discovery, automatically collect Pod and Node metrics, and use alerting rules to monitor overall cluster health.

2. Application Performance Monitoring


Configure Prometheus to scrape the application's metrics endpoint. Use recording rules to precompute common queries (e.g., P95 latency, error rate), then use alerting rules to detect performance anomalies promptly.

3. Infrastructure Monitoring


Deploy Node Exporter to collect server metrics. Configure static targets or file-based service discovery to monitor CPU, memory, disk, and other host resources.
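As a minimal sketch of the two target styles mentioned above, the fragment below defines one job with inline static targets and one that reads targets from files. The hostnames, the env label, and the file path are illustrative assumptions; 9100 is Node Exporter's default port.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  # Static targets: hosts are listed inline (hostnames are hypothetical).
  - job_name: node-static
    static_configs:
      - targets:
          - "node1.example.internal:9100"   # 9100 is Node Exporter's default port
          - "node2.example.internal:9100"
        labels:
          env: production                   # assumed label, attached to both targets

  # File-based discovery: targets live in separate files that Prometheus
  # re-reads when they change, so no restart or reload is needed.
  - job_name: node-file-sd
    file_sd_configs:
      - files:
          - "/etc/prometheus/targets/node-*.yml"   # assumed path
        refresh_interval: 5m
```

Each file matched by the glob holds a list of entries with the same targets and labels keys used under static_configs, which makes it easy to generate from an inventory or a configuration management tool.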

Core Features

Scrape Configurations


Scrape targets can be defined in several ways, including static target configuration, file-based service discovery, and Kubernetes service discovery, so both application and infrastructure metrics can be collected. With relabel_configs, you can dynamically add labels, filter targets, and rewrite the metrics path (__metrics_path__) before a target is scraped.
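The three relabeling uses named above (filtering targets, adding labels, rewriting the metrics path) might look like the sketch below. The job name, file path, label names and values, and the /internal/metrics path are assumptions chosen for illustration.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: app                       # hypothetical job
    file_sd_configs:
      - files:
          - "/etc/prometheus/targets/app-*.yml"   # assumed path
    relabel_configs:
      # Filter: keep only targets whose discovered "env" label is "production".
      - source_labels: [env]
        regex: production
        action: keep

      # Add a label: with no source_labels, the default "replace" action
      # writes the static replacement value into target_label.
      - target_label: team
        replacement: payments           # assumed value

      # Rewrite the path Prometheus scrapes (the default is /metrics).
      - target_label: __metrics_path__
        replacement: /internal/metrics  # assumed path
```

Rules run in order, and a target dropped by a keep or drop action is not processed by the rules that follow it.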

Recording Rules


Precompute results for high-frequency queries to reduce query load. Supports rule definitions for API metrics (request rates, error rates, latency quantiles) as well as resource metrics (CPU, memory, disk usage).
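A rule file for the API metrics described above could be sketched as follows. The metric names http_requests_total and http_request_duration_seconds_bucket and the status label are assumptions about how the application is instrumented; the recorded names follow the common level:metric:operations naming pattern.

```yaml
# rules/api_recording.yml (hypothetical file, referenced from rule_files in prometheus.yml)
groups:
  - name: api_recording_rules
    rules:
      # Request rate per job over the last 5 minutes.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))

      # Fraction of requests that returned a 5xx status.
      - record: job:http_request_errors:ratio_rate5m
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
          /
          sum by (job) (rate(http_requests_total[5m]))

      # 95th percentile latency, computed from histogram buckets.
      - record: job:http_request_duration_seconds:p95_5m
        expr: |
          histogram_quantile(
            0.95,
            sum by (job, le) (rate(http_request_duration_seconds_bucket[5m]))
          )
```

Dashboards and alerts can then query the recorded series by name instead of re-evaluating the full expression on every refresh.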

Alert Rules


Define alert conditions using PromQL expressions. Rules can carry a severity label (for example critical or warning) and templated annotations that include label values and the measured value. Includes common alert templates for service availability, error rate, latency, and resource usage.
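The sketch below shows one error-rate condition at two severity levels, with annotations that use alert templating. The metric name, the status label, and the 5% and 20% thresholds are assumptions, not recommended values.

```yaml
# rules/api_alerts.yml (hypothetical file)
groups:
  - name: api_alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
          /
          sum by (job) (rate(http_requests_total[5m]))
          > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High error rate on {{ $labels.job }}"
          description: "5xx ratio is {{ $value | humanizePercentage }} over the last 5 minutes."

      - alert: VeryHighErrorRate
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
          /
          sum by (job) (rate(http_requests_total[5m]))
          > 0.20
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Very high error rate on {{ $labels.job }}"
          description: "5xx ratio is {{ $value | humanizePercentage }} over the last 5 minutes."
```

The severity label is a convention rather than a built-in field; it only takes effect when something downstream, such as Alertmanager routing, matches on it.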

Common Questions

What is Prometheus’s default scrape interval? How can it be adjusted?


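Prometheus's built-in default for both scrape_interval and evaluation_interval is 1 minute. The example prometheus.yml shipped with Prometheus sets both to 15 seconds (scrape_interval: 15s, evaluation_interval: 15s), which is why 15 seconds is often quoted as the default. You can adjust both in the global section of prometheus.yml and override scrape_interval per job. In production, choose intervals based on how quickly you need to see changes and how much load and storage you can afford; values between 15 and 60 seconds are typical.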
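A minimal sketch of a global setting with a per-job override is shown below. The 30s and 60s values, the second job name, and its target address are assumptions for illustration.

```yaml
# prometheus.yml (fragment)
global:
  scrape_interval: 30s       # applies to every job that does not override it
  evaluation_interval: 30s   # how often recording and alerting rules are evaluated

scrape_configs:
  # Inherits the global 30s interval.
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  # Overrides the interval for a slow or expensive exporter (hypothetical job).
  - job_name: slow-batch-exporter
    scrape_interval: 60s
    scrape_timeout: 30s      # must not exceed scrape_interval
    static_configs:
      - targets: ["batch.example.internal:9102"]   # assumed address
```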

How do I configure Prometheus service discovery in Kubernetes?


Use kubernetes_sd_configs to set up Kubernetes service discovery, using the role parameter to specify the discovery type (pod, service, node, endpoints, etc.). Combined with relabel_configs, you can filter targets based on annotations and set the metrics path and port. For example, a widely used convention is the annotation prometheus.io/scrape: "true" to mark Pods that should be scraped; Prometheus does not interpret this annotation itself, so it only takes effect through matching relabel rules.
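A sketch of Pod discovery driven by annotations might look like the following. The prometheus.io/scrape, prometheus.io/path, and prometheus.io/port annotations are a naming convention assumed here, and the job name is arbitrary; the __meta_kubernetes_* labels are the ones Prometheus attaches for the pod role.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only Pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"

      # If prometheus.io/path is set, use it as the metrics path.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        regex: (.+)
        target_label: __metrics_path__

      # If prometheus.io/port is set, scrape that port instead of the discovered one.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: $1:$2
        target_label: __address__

      # Copy namespace and Pod name into regular labels for querying.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```

When Prometheus runs inside the cluster, it uses the Pod's service account for discovery, so that account needs permission to list and watch the discovered resources.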

What does the for parameter mean in Prometheus alert rules?


The for parameter specifies how long the alert expression must hold continuously before the alert fires. Until that time has elapsed the alert is in the pending state; if the condition stops holding at any rule evaluation, the timer resets. For example, for: 5m means the expression must be true at every evaluation for 5 minutes before the alert moves from pending to firing. This helps avoid false positives caused by short-term fluctuations. For severe alerts such as service downtime, it’s common to set it to 1 minute; for alerts related to resource usage, it’s typically set to 5 minutes.
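The two durations mentioned above can be sketched as a pair of rules. The 90% memory threshold is an assumption; up is the series Prometheus records for every scrape, and the node_memory_* metrics are exposed by Node Exporter on Linux.

```yaml
# rules/availability.yml (hypothetical file)
groups:
  - name: availability_and_resources
    rules:
      # Short "for": a failed scrape is serious, so wait only 1 minute.
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} (job {{ $labels.job }}) has been down for 1 minute"

      # Longer "for": memory usage spikes briefly, so require 5 sustained minutes.
      - alert: HighMemoryUsage
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Memory usage above 90% on {{ $labels.instance }}"
```

Because the condition is only checked at each rule evaluation, the effective delay before firing is the for duration plus up to one evaluation_interval.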

Skill Boundary


This skill focuses on Prometheus configuration and does not cover the following:
  • Grafana dashboard design (available via the grafana-dashboards skill)

  • Application code instrumentation and metric exposure

  • Alertmanager routing and notification configuration details

  • Deep configuration of long-term storage solutions such as Thanos/Cortex