Spring Boot Actuator: Production Monitoring with Prometheus and Grafana

Spring Boot Actuator exposes production-ready operational endpoints — health checks, metrics, environment info, thread dumps — out of the box. Combined with Prometheus and Grafana, you get a full monitoring stack with minimal configuration.

This guide covers everything from initial setup to Kubernetes health probes, custom metrics, and securing your management endpoints.


Setup

Dependencies

<dependencies>
    <!-- Actuator -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

    <!-- Micrometer Prometheus registry -->
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
        <scope>runtime</scope>
    </dependency>
</dependencies>

Basic configuration

# application.yaml
management:
  endpoints:
    web:
      exposure:
        include: health, info, metrics, prometheus, env, loggers
  endpoint:
    health:
      show-details: when-authorized   # never show details publicly
      probes:
        enabled: true                 # enable K8s liveness/readiness probes
  metrics:
    tags:
      application: ${spring.application.name}  # tag all metrics with app name

Key Endpoints

Endpoint     URL                           Purpose
Health       /actuator/health              Is the app healthy?
Liveness     /actuator/health/liveness     Is the process alive? (K8s)
Readiness    /actuator/health/readiness    Ready to receive traffic? (K8s)
Metrics      /actuator/metrics             List all metric names
Prometheus   /actuator/prometheus          Metrics in Prometheus format
Info         /actuator/info                App version, git commit
Loggers      /actuator/loggers             View/change log levels at runtime
Env          /actuator/env                 Environment properties
ThreadDump   /actuator/threaddump          JVM thread dump
Heapdump     /actuator/heapdump            JVM heap dump (binary)

Health Endpoint

Auto-configured health indicators

Spring Boot automatically adds health indicators for:

  • Database (db): checks JDBC connection
  • Redis (redis): checks Redis ping
  • Kafka (kafka): checks broker connectivity
  • Disk (diskSpace): checks available disk space
  • Elasticsearch (elasticsearch)

When details are shown to the caller (see show-details above), the response looks like this:

curl http://localhost:8080/actuator/health | jq
{
  "status": "UP",
  "components": {
    "db": { "status": "UP", "details": { "database": "PostgreSQL", "validationQuery": "isValid()" } },
    "diskSpace": { "status": "UP", "details": { "total": 500000000000, "free": 350000000000 } },
    "redis": { "status": "UP" }
  }
}
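The top-level status is derived from the component statuses. The following is a minimal sketch of that roll-up in plain Java, assuming the default severity order DOWN, OUT_OF_SERVICE, UP, UNKNOWN. The class and method names are hypothetical and this is not Spring Boot's own implementation.

```java
import java.util.Collection;
import java.util.List;

/** Sketch of how component statuses could roll up into one overall status. */
class StatusRollup {

    // Assumed severity order: the first entry found among the components wins.
    static final List<String> ORDER = List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    static String aggregate(Collection<String> componentStatuses) {
        for (String candidate : ORDER) {
            if (componentStatuses.contains(candidate)) {
                return candidate;
            }
        }
        return "UNKNOWN"; // nothing reported, or only statuses outside the assumed order
    }

    public static void main(String[] args) {
        System.out.println(aggregate(List.of("UP", "UP", "UP")));   // UP
        System.out.println(aggregate(List.of("UP", "DOWN", "UP"))); // DOWN
    }
}
```

Under this ordering a single DOWN component is enough to turn the whole endpoint DOWN, which is why a flaky optional dependency in the health check can take a pod out of rotation.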

Custom health indicator

@Component
public class ExternalApiHealthIndicator implements HealthIndicator {

    private final PaymentClient paymentClient;

    public ExternalApiHealthIndicator(PaymentClient paymentClient) {
        this.paymentClient = paymentClient;
    }

    @Override
    public Health health() {
        try {
            paymentClient.ping();
            return Health.up()
                .withDetail("url", paymentClient.getBaseUrl())
                .build();
        } catch (Exception e) {
            return Health.down()
                .withDetail("error", e.getMessage())
                .build();
        }
    }
}

Result:

{
  "status": "UP",
  "components": {
    "externalApi": {
      "status": "UP",
      "details": { "url": "https://api.payment-provider.com" }
    }
  }
}

Kubernetes Health Probes

Spring Boot’s liveness and readiness probes map directly to Kubernetes probe endpoints:

# application.yaml
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true

Liveness (/actuator/health/liveness): Is the JVM process functional? Returns UP unless the app is in a broken internal state that requires a restart.

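Readiness (/actuator/health/readiness): Is the app ready to serve requests? Reports OUT_OF_SERVICE (HTTP 503) during startup and whenever the app is refusing traffic. Downstream checks such as db count toward readiness only if you add them to the readiness health group (management.endpoint.health.group.readiness.include).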

Kubernetes deployment

apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: app
          image: myapp:latest
          ports:
            - containerPort: 8080

          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3

          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3

          startupProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            failureThreshold: 30  # 30 * 10s = 5 min startup window
            periodSeconds: 10

Why three probes?

  • startupProbe prevents the liveness probe from killing slow-starting apps
  • livenessProbe restarts containers in deadlock/broken state
  • readinessProbe removes pods from load balancer rotation without restarting them
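The probe settings above translate into time windows. As a rough sketch, the tolerated failure window is failureThreshold multiplied by periodSeconds; this ignores probe timeouts and scheduling jitter, and the class name is hypothetical.

```java
/** Back-of-the-envelope probe timing, ignoring probe timeouts and scheduling jitter. */
class ProbeWindows {

    /** Seconds of consecutive failures tolerated before Kubernetes acts on the probe. */
    static int failureWindowSeconds(int failureThreshold, int periodSeconds) {
        return failureThreshold * periodSeconds;
    }

    public static void main(String[] args) {
        // Values taken from the deployment manifest above.
        System.out.println("startup:   " + failureWindowSeconds(30, 10) + "s"); // 300s
        System.out.println("liveness:  " + failureWindowSeconds(3, 10) + "s");  // 30s
        System.out.println("readiness: " + failureWindowSeconds(3, 5) + "s");   // 15s
    }
}
```

With these numbers the app gets about five minutes to start, is restarted after roughly 30 seconds of failed liveness checks, and leaves the load balancer after roughly 15 seconds of failed readiness checks.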

Controlling readiness from code

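During graceful shutdown, Spring Boot automatically switches readiness to REFUSING_TRAFFIC, so Kubernetes stops routing new requests before the shutdown sequence begins. You can trigger the same state change yourself, for example for a maintenance mode, by publishing an AvailabilityChangeEvent: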

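@Component
public class MaintenanceModeService {

    private final ApplicationContext context;

    // Constructor injection; the final field must be assigned for the class to compile
    public MaintenanceModeService(ApplicationContext context) {
        this.context = context;
    }

    public void enableMaintenanceMode() {
        // Marks the app as not ready; K8s stops sending traffic
        AvailabilityChangeEvent.publish(
            context, ReadinessState.REFUSING_TRAFFIC
        );
    }

    public void disableMaintenanceMode() {
        // Marks the app as ready again; K8s resumes sending traffic
        AvailabilityChangeEvent.publish(
            context, ReadinessState.ACCEPTING_TRAFFIC
        );
    }
}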

Prometheus Integration

Prometheus scrape configuration

Add your Spring Boot service as a Prometheus scrape target:

# prometheus.yml
scrape_configs:
  - job_name: 'spring-boot-app'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 15s
    static_configs:
      - targets: ['app:8080']
        labels:
          env: production

What metrics are available

Spring Boot auto-configures metrics for:

  • JVM: heap, GC, threads, classes (jvm.*)
  • HTTP server: request count and latency (http.server.requests)
  • Tomcat: thread pool, connection stats (tomcat.*)
  • HikariCP: pool size, active connections, wait time (hikaricp.*)
  • Spring Cache: hit/miss rates, size (cache.*)
  • Kafka clients and listeners (kafka.consumer.*, kafka.producer.*, spring.kafka.*)

Check available metrics:

curl http://localhost:8080/actuator/metrics | jq '.names[]'

Drill into a metric:

curl "http://localhost:8080/actuator/metrics/http.server.requests?tag=uri:/api/orders&tag=status:200"
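The names listed by /actuator/metrics are dotted, while the same meters show up in /actuator/prometheus in snake case with type-specific suffixes. The sketch below is a simplified approximation of that renaming, not Micrometer's actual naming code; the class and method names are hypothetical, and it ignores character sanitizing.

```java
/** Simplified approximation of how dotted meter names appear in Prometheus output. */
class PrometheusNames {

    static String snakeCase(String meterName) {
        return meterName.replace('.', '_');
    }

    /** Counters get a _total suffix. */
    static String counter(String meterName) {
        String name = snakeCase(meterName);
        return name.endsWith("_total") ? name : name + "_total";
    }

    /** Timers are reported in seconds, then split into series such as _count, _sum, _bucket. */
    static String timer(String meterName, String series) {
        String name = snakeCase(meterName);
        if (!name.endsWith("_seconds")) {
            name += "_seconds";
        }
        return name + "_" + series;
    }

    /** Gauges get their base unit appended when they have one (pass null for none). */
    static String gauge(String meterName, String baseUnit) {
        String name = snakeCase(meterName);
        if (baseUnit == null || name.endsWith("_" + baseUnit)) {
            return name;
        }
        return name + "_" + baseUnit;
    }

    public static void main(String[] args) {
        System.out.println(timer("http.server.requests", "count")); // http_server_requests_seconds_count
        System.out.println(gauge("jvm.memory.used", "bytes"));      // jvm_memory_used_bytes
        System.out.println(counter("orders.created"));              // orders_created_total
    }
}
```

When a query returns nothing, comparing the name in your PromQL against the raw /actuator/prometheus output is usually the fastest check.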

Custom Metrics

Counter

@Service
public class OrderService {

    private final Counter ordersCreated;
    private final Counter ordersFailed;

    public OrderService(MeterRegistry registry) {
        ordersCreated = Counter.builder("orders.created")
            .description("Total orders created")
            .register(registry);
        ordersFailed = Counter.builder("orders.failed")
            .description("Total orders failed")
            .register(registry);
    }

    public Order createOrder(CreateOrderRequest request) {
        try {
            Order order = processOrder(request);
            ordersCreated.increment();
            return order;
        } catch (Exception e) {
            ordersFailed.increment();
            throw e;
        }
    }
}

Gauge (for current values)

@Component
public class QueueMetrics {

    public QueueMetrics(MeterRegistry registry, OrderQueue orderQueue) {
        Gauge.builder("orders.queue.size", orderQueue, OrderQueue::size)
            .description("Current order queue depth")
            .register(registry);
    }
}

Timer (for latency)

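@Service
public class PaymentService {

    private final Timer paymentTimer;
    // PaymentGateway stands for the application's own gateway client (type name assumed)
    private final PaymentGateway paymentGateway;

    public PaymentService(MeterRegistry registry, PaymentGateway paymentGateway) {
        this.paymentGateway = paymentGateway;
        paymentTimer = Timer.builder("payment.processing.time")
            .description("Time to process a payment")
            // Percentiles computed in the app; they cannot be aggregated across instances
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(registry);
    }

    public PaymentResult processPayment(PaymentRequest request) {
        // record() times the supplier and returns its result
        return paymentTimer.record(() -> paymentGateway.charge(request));
    }
}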

@Timed annotation (simpler)

@Timed(value = "orders.creation.time", percentiles = {0.5, 0.95, 0.99})
public Order createOrder(CreateOrderRequest request) {
    // timed automatically
}

Requires a TimedAspect bean and Spring AOP on the classpath (for example via spring-boot-starter-aop):

@Bean
public TimedAspect timedAspect(MeterRegistry registry) {
    return new TimedAspect(registry);
}

Grafana Dashboard

Essential panels for every Spring Boot service

1. Request rate and error rate

# Request rate
rate(http_server_requests_seconds_count{application="my-app"}[1m])

# Error rate (5xx only)
rate(http_server_requests_seconds_count{application="my-app",status=~"5.."}[1m])

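# Error percentage (aggregate both sides first; dividing the raw series
# only matches each 5xx series against itself and always yields 100)
100 * sum(rate(http_server_requests_seconds_count{application="my-app",status=~"5.."}[1m]))
    / sum(rate(http_server_requests_seconds_count{application="my-app"}[1m]))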
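To make the arithmetic behind these queries concrete, here is a small worked sketch. It treats rate() as the per-second increase between two counter samples, which is a simplification (Prometheus also handles counter resets and extrapolation). The sample values and names are made up for illustration.

```java
/** Worked arithmetic for request rate and error percentage from two counter samples. */
class ErrorRate {

    /** Per-second increase between two samples of a monotonically increasing counter. */
    static double perSecondRate(double earlier, double later, double elapsedSeconds) {
        return (later - earlier) / elapsedSeconds;
    }

    /** Share of failing requests, as a percentage of all requests. */
    static double errorPercentage(double errorRate, double totalRate) {
        return totalRate == 0 ? 0 : 100.0 * errorRate / totalRate;
    }

    public static void main(String[] args) {
        // Hypothetical samples taken 60 seconds apart.
        double totalRate = perSecondRate(1000, 1600, 60); // 10.0 requests/s
        double errorRate = perSecondRate(20, 50, 60);     // 0.5 errors/s
        System.out.println(errorPercentage(errorRate, totalRate)); // 5.0
    }
}
```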

2. Response time (P95, P99)

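# The _bucket series exist only when histogram buckets are enabled, e.g.:
#   management.metrics.distribution.percentiles-histogram.http.server.requests: true

# P95 latency (sum by le aggregates across URIs and instances)
histogram_quantile(0.95,
    sum by (le) (rate(http_server_requests_seconds_bucket{application="my-app"}[5m]))
)

# P99 latency
histogram_quantile(0.99,
    sum by (le) (rate(http_server_requests_seconds_bucket{application="my-app"}[5m]))
)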
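The quantiles above are estimates interpolated from cumulative bucket counts, not exact measurements. The sketch below shows the idea in plain Java, assuming linear interpolation inside the bucket that contains the requested rank and 0 < q <= 1. It is an illustration of the technique with hypothetical names, not Prometheus source code.

```java
/** Estimates a quantile from cumulative histogram buckets by linear interpolation. */
class HistogramQuantile {

    /**
     * @param q                quantile, assumed to satisfy 0 < q <= 1
     * @param upperBounds      bucket upper bounds (the le label), ascending, last one +Infinity
     * @param cumulativeCounts number of observations <= the matching upper bound
     */
    static double quantile(double q, double[] upperBounds, double[] cumulativeCounts) {
        int last = upperBounds.length - 1;
        double total = cumulativeCounts[last];
        if (total == 0) {
            return Double.NaN; // no observations
        }
        double rank = q * total;

        // Find the first bucket whose cumulative count reaches the rank.
        int i = 0;
        while (i < last && cumulativeCounts[i] < rank) {
            i++;
        }
        if (i == last) {
            // The quantile falls into the +Inf bucket: report the highest finite bound.
            return upperBounds[last - 1];
        }

        double lowerBound = (i == 0) ? 0.0 : upperBounds[i - 1];
        double countBelow = (i == 0) ? 0.0 : cumulativeCounts[i - 1];
        double countInBucket = cumulativeCounts[i] - countBelow;
        return lowerBound + (upperBounds[i] - lowerBound) * (rank - countBelow) / countInBucket;
    }

    public static void main(String[] args) {
        double[] le = {0.1, 0.5, 1.0, Double.POSITIVE_INFINITY};
        double[] counts = {50, 90, 100, 100};
        System.out.println(quantile(0.95, le, counts)); // about 0.75 seconds
    }
}
```

Because the result depends on where the bucket boundaries sit, a P95 of 0.75 s here only means "somewhere between 0.5 s and 1.0 s"; finer buckets give tighter estimates.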

3. JVM heap usage

# Heap used
jvm_memory_used_bytes{area="heap", application="my-app"}

# Heap max
jvm_memory_max_bytes{area="heap", application="my-app"}

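# Heap usage % (summed across heap pools; pools without a limit may report max = -1)
100 * sum(jvm_memory_used_bytes{area="heap", application="my-app"})
    / sum(jvm_memory_max_bytes{area="heap", application="my-app"})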

4. HikariCP connection pool

# Active connections
hikaricp_connections_active{application="my-app"}

# Pending (waiting for connection)
hikaricp_connections_pending{application="my-app"}

# Pool utilization %
100 * hikaricp_connections_active / hikaricp_connections_max

5. GC pressure

# GC pause time rate
rate(jvm_gc_pause_seconds_sum[1m])

# GC pause count
rate(jvm_gc_pause_seconds_count[1m])

Import pre-built dashboards

Grafana has community dashboards for Spring Boot:

  • JVM Micrometer dashboard (ID: 4701) — JVM metrics
  • Spring Boot Statistics dashboard (ID: 6756) — HTTP, Tomcat, HikariCP

Import via Grafana UI → Dashboards → Import → paste dashboard ID.


Securing Actuator Endpoints

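Never expose all actuator endpoints publicly. In production, move them to a separate management port and expose only what you need. Kubernetes probes and the Prometheus scrape target must then point at that port (8081 here) instead of 8080: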

# application.yaml — production
management:
  server:
    port: 8081         # separate port for management — not exposed to the internet
  endpoints:
    web:
      exposure:
        include: health, prometheus  # expose only what monitoring needs
  endpoint:
    health:
      show-details: never  # no details in the public health endpoint

Spring Security on management port

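@Configuration
public class ActuatorSecurityConfig {

    // IpAddressMatcher: org.springframework.security.web.util.matcher
    private static final IpAddressMatcher PROMETHEUS_RANGE = new IpAddressMatcher("10.0.0.0/8");

    @Bean
    @Order(1)  // on the bean method, so this chain is checked before the application's own
    public SecurityFilterChain managementFilterChain(HttpSecurity http) throws Exception {
        return http
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            .authorizeHttpRequests(auth -> auth
                .requestMatchers(EndpointRequest.to(HealthEndpoint.class)).permitAll()
                // authorizeHttpRequests has no hasIpAddress(), so check the remote address directly
                .requestMatchers(EndpointRequest.to("prometheus"))
                    .access((authentication, context) -> new AuthorizationDecision(
                        PROMETHEUS_RANGE.matches(context.getRequest())))  // Prometheus server IP range only
                .anyRequest().hasRole("ADMIN")
            )
            // Assumption: admins authenticate with HTTP Basic; use whatever mechanism you already have
            .httpBasic(Customizer.withDefaults())
            .build();
    }
}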
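The 10.0.0.0/8 range in the configuration above is a CIDR block: the first 8 bits of the client address must equal those of 10.0.0.0. A minimal sketch of that comparison using only the JDK follows. The class name is hypothetical, it expects literal IP addresses rather than host names, and it is not how Spring Security implements the check.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Checks whether a literal IP address falls inside a CIDR block. */
class CidrMatch {

    static boolean matches(String cidr, String ip) {
        String[] parts = cidr.split("/");
        int prefixBits = Integer.parseInt(parts[1]);
        byte[] network = parse(parts[0]);
        byte[] candidate = parse(ip);
        if (network.length != candidate.length) {
            return false; // IPv4 block against IPv6 address, or the reverse
        }
        // Compare the leading prefixBits bits, one bit at a time.
        for (int bit = 0; bit < prefixBits; bit++) {
            int mask = 0x80 >> (bit % 8);
            if ((network[bit / 8] & mask) != (candidate[bit / 8] & mask)) {
                return false;
            }
        }
        return true;
    }

    private static byte[] parse(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not an IP address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(matches("10.0.0.0/8", "10.42.7.9")); // true
        System.out.println(matches("10.0.0.0/8", "11.0.0.1"));  // false
    }
}
```

Keep in mind that behind a load balancer or ingress the address the app sees may be the proxy's, so an IP rule is only as good as the network path in front of it.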

Changing Log Levels at Runtime

Without restarting the app:

# View current level for a package
curl http://localhost:8080/actuator/loggers/com.example.service

# Change to DEBUG
curl -X POST http://localhost:8080/actuator/loggers/com.example.service \
     -H "Content-Type: application/json" \
     -d '{"configuredLevel": "DEBUG"}'

# Reset to default
curl -X POST http://localhost:8080/actuator/loggers/com.example.service \
     -H "Content-Type: application/json" \
     -d '{"configuredLevel": null}'

This is invaluable for debugging production issues without restarting.
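The same calls can be made from Java tooling instead of curl. The sketch below only builds the request with the JDK's java.net.http API; the class and method names are hypothetical, and the base URL and logger name are the ones used in the curl examples.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds the POST request that changes (or resets) a logger's level. */
class LoggerLevelRequest {

    /** JSON body; a null level resets the logger to its inherited default. */
    static String body(String level) {
        return level == null
            ? "{\"configuredLevel\": null}"
            : "{\"configuredLevel\": \"" + level + "\"}";
    }

    static HttpRequest build(String baseUrl, String loggerName, String level) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/actuator/loggers/" + loggerName))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body(level)))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:8080", "com.example.service", "DEBUG");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To send it, pass the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding()) and add credentials if the endpoint is secured.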


Spring Boot 3.5: SSL Metrics

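Spring Boot 3.5 added expiry metrics for the certificate chains in configured SSL bundles. The meter is named ssl.chain.expiry and reports the seconds remaining, so its Prometheus name should be ssl_chain_expiry_seconds; confirm the exact name in your /actuator/prometheus output:

# Days until the certificate chain expires
ssl_chain_expiry_seconds / 86400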

Alert when this drops below 30 days. No more surprise certificate expirations.
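The alert rule is a unit conversion plus a threshold. As a small worked sketch, assuming the metric reports seconds remaining (class and method names are hypothetical):

```java
/** Converts seconds-until-expiry into days and applies an alert threshold. */
class CertExpiry {

    static final double SECONDS_PER_DAY = 86_400;

    static double daysRemaining(double secondsRemaining) {
        return secondsRemaining / SECONDS_PER_DAY;
    }

    /** True once fewer than thresholdDays remain. */
    static boolean shouldAlert(double secondsRemaining, int thresholdDays) {
        return daysRemaining(secondsRemaining) < thresholdDays;
    }

    public static void main(String[] args) {
        System.out.println(daysRemaining(2_592_000));   // 30.0
        System.out.println(shouldAlert(2_505_600, 30)); // true, 29 days left
    }
}
```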


Quick Reference

# application.yaml — production baseline
management:
  server:
    port: 8081
  endpoints:
    web:
      exposure:
        include: health, prometheus, loggers
  endpoint:
    health:
      probes:
        enabled: true
      show-details: never
  metrics:
    tags:
      application: ${spring.application.name}

# Key Prometheus queries
rate(http_server_requests_seconds_count[1m])                                           # req/s
histogram_quantile(0.99, sum by (le) (rate(http_server_requests_seconds_bucket[5m])))  # P99
sum(jvm_memory_used_bytes{area="heap"}) / sum(jvm_memory_max_bytes{area="heap"})       # heap ratio (0 to 1)
hikaricp_connections_pending                                                           # DB pool wait

# Change log level at runtime
curl -X POST http://localhost:8081/actuator/loggers/com.example \
     -H "Content-Type: application/json" \
     -d '{"configuredLevel": "DEBUG"}'

Summary

Spring Boot Actuator with Prometheus and Grafana gives you production observability with minimal setup. Enable only the endpoints you need, put management on a separate port, and set show-details: never on anything public. For Kubernetes, enable probes and wire them to the liveness, readiness, and startup probes. Build Grafana dashboards around four signals: request rate, error rate, latency (P95/P99), and saturation (HikariCP pool usage here). Add custom metrics with MeterRegistry for business-level visibility.

Abhay Pratap Singh

DevOps Engineer passionate about automation, cloud infrastructure, and self-hosted tools. I write about Kubernetes, Terraform, DNS, and everything in between.