How do you implement custom health checks and readiness/liveness probes for Kubernetes?


Quick Answer

A custom health check is a Spring bean that implements HealthIndicator (or the reactive ReactiveHealthIndicator). Its health() method reports UP or DOWN, plus optional details, for one specific dependency such as a downstream service or a critical cache, and Actuator automatically aggregates every registered indicator into the overall /actuator/health result. Spring Boot's built-in 'liveness' and 'readiness' health groups map directly onto Kubernetes' liveness and readiness probes. Liveness answers 'should this pod be restarted?' (is the app in a broken, unrecoverable state), while readiness answers 'should this pod receive traffic right now?' (is it fully initialized and are its dependencies reachable). Each group is exposed as its own dedicated endpoint that Kubernetes can poll directly.

Detailed Answer

Custom health indicators let you report the health of application-specific dependencies beyond Spring Boot's automatically-detected defaults (database, disk space):

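// Package names below are those used by Spring Boot 2.x and 3.x.
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
class PaymentGatewayHealthIndicator implements HealthIndicator {
    private final PaymentGatewayClient client;

    // Constructor injection: the final field must be assigned for the class to compile.
    PaymentGatewayHealthIndicator(PaymentGatewayClient client) {
        this.client = client;
    }

    @Override
    public Health health() {
        try {
            client.ping(); // PaymentGatewayClient is the application's own client type
            return Health.up().withDetail("gateway", "reachable").build();
        } catch (Exception e) {
            // down(e) attaches the exception to the health details
            return Health.down(e).withDetail("gateway", "unreachable").build();
        }
    }
}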

Actuator automatically discovers and aggregates every registered HealthIndicator bean into the overall /actuator/health response — if any indicator reports DOWN, the aggregated overall status is DOWN by default.
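
The "worst status wins" rule described above can be sketched in plain Java. This is a simplified model, not Actuator's own code: SketchStatus and StatusAggregation are hypothetical names, and the severity order (DOWN, OUT_OF_SERVICE, UP, UNKNOWN) is assumed to match Actuator's default ordering.

```java
import java.util.Map;

// Hypothetical stand-in for Actuator's status values, declared from most
// to least severe so that ordinal() can be used as a severity rank.
enum SketchStatus {
    DOWN, OUT_OF_SERVICE, UP, UNKNOWN
}

final class StatusAggregation {
    private StatusAggregation() {}

    // Returns the most severe status among the contributors,
    // or UNKNOWN when no contributor is registered.
    static SketchStatus aggregate(Map<String, SketchStatus> contributors) {
        SketchStatus worst = SketchStatus.UNKNOWN;
        for (SketchStatus status : contributors.values()) {
            if (status.ordinal() < worst.ordinal()) {
                worst = status;
            }
        }
        return worst;
    }
}
```

Under this model, a map holding "db" as UP and "paymentGateway" as DOWN aggregates to DOWN, which is why a single failing indicator flips the overall result.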

Kubernetes liveness vs. readiness — and why they're genuinely different questions:

  • Liveness asks: "is this instance in a broken state that a restart would fix?" Examples are a deadlock, an infinite loop, or any other inability to make forward progress. A liveness probe that keeps failing causes Kubernetes to kill the container and restart it.
  • Readiness asks: "is this instance ready to receive traffic right now?" Has it finished its startup sequence, warmed up its caches, and can it currently reach its required dependencies? A failed readiness probe doesn't restart anything. It tells Kubernetes to stop routing traffic to the pod until it reports ready again, which is useful during startup or during a transient dependency outage that will resolve on its own.

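Spring Boot Actuator supports exactly this distinction through health groups. The probe groups are enabled automatically when the application detects that it is running on Kubernetes; in any other environment (local testing, for example) they can be switched on explicitly: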

management.endpoint.health.probes.enabled=true
management.health.livenessstate.enabled=true
management.health.readinessstate.enabled=true

This exposes two dedicated endpoints:

  • /actuator/health/liveness: reflects the application's internal liveness state (Spring Boot's LivenessState). Application code can explicitly mark that state as BROKEN, by publishing an AvailabilityChangeEvent, if it detects an unrecoverable internal condition.
  • /actuator/health/readiness: reflects the application's ReadinessState (has the app fully started and is it accepting traffic). By default this group contains only the readinessState indicator, so dependency checks are not part of it automatically. A custom indicator like the PaymentGatewayHealthIndicator above has to be added to the group explicitly through the management.endpoint.health.group.readiness.include property. Readiness is the right group for it, since "can't reach the payment gateway" is a reason to stop routing traffic here, not a reason to restart the pod.

# Kubernetes deployment manifest
livenessProbe:
  httpGet: { path: /actuator/health/liveness, port: 8080 }
readinessProbe:
  httpGet: { path: /actuator/health/readiness, port: 8080 }
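
For the readiness endpoint to reflect the payment gateway check, the indicator has to be listed in the readiness group. A minimal sketch of that configuration follows, assuming the bean is the PaymentGatewayHealthIndicator shown earlier, in which case Actuator derives the id paymentGateway by dropping the HealthIndicator suffix from the bean name:

```properties
# Assumption: the indicator id is "paymentGateway" (bean name minus the "HealthIndicator" suffix)
management.endpoint.health.group.readiness.include=readinessState,paymentGateway
# Liveness stays limited to internal state, so a gateway outage cannot cause restarts
management.endpoint.health.group.liveness.include=livenessState
```

Whether an external dependency belongs in readiness at all is a judgment call: if every instance shares the same gateway, an outage takes all of them out of rotation at once.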

Why the distinction matters in practice: conflating the two (for example, pointing both probes at the same generic /actuator/health) can cause a genuinely bad outcome. A transient, self-recovering dependency outage should only affect readiness and pause traffic briefly. With a shared endpoint, the same outage also fails the liveness probe and triggers a full restart, needlessly discarding the instance's warmed-up state. That can make an already-degraded situation worse by cycling pods that were never actually broken.
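
The effect can be illustrated with a small, self-contained simulation. It is a deliberately simplified model rather than real kubelet behavior: ProbeSimulation and its parameters are hypothetical, readiness reacts on the first failure, and only the liveness probe applies a failure threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of probe handling over a series of probe periods.
final class ProbeSimulation {
    private ProbeSimulation() {}

    // dependencyUp holds one entry per probe period (true = dependency reachable).
    // livenessChecksDependency = true models both probes sharing one endpoint.
    static List<String> run(boolean[] dependencyUp, boolean livenessChecksDependency,
                            int failureThreshold) {
        List<String> events = new ArrayList<>();
        int livenessFailures = 0;
        boolean receivingTraffic = true;
        for (int tick = 0; tick < dependencyUp.length; tick++) {
            boolean depOk = dependencyUp[tick];

            // Readiness follows the dependency: pause or resume traffic on change.
            if (!depOk && receivingTraffic) {
                receivingTraffic = false;
                events.add("t" + tick + ": traffic paused");
            } else if (depOk && !receivingTraffic) {
                receivingTraffic = true;
                events.add("t" + tick + ": traffic resumed");
            }

            // Liveness fails during the outage only when it shares the dependency check.
            boolean livenessOk = !livenessChecksDependency || depOk;
            livenessFailures = livenessOk ? 0 : livenessFailures + 1;
            if (livenessFailures >= failureThreshold) {
                events.add("t" + tick + ": restart triggered");
                livenessFailures = 0;
            }
        }
        return events;
    }
}
```

With an outage lasting three probe periods and a failure threshold of 3, separate probes only pause and resume traffic, while the shared-endpoint wiring additionally records a restart in the final period of the outage.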