How do you implement custom health checks and readiness/liveness probes for Kubernetes?

Detailed Answer

Custom health indicators let you report the health of application-specific dependencies beyond the defaults that Spring Boot detects automatically (database, disk space):

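// Package names as in Spring Boot 2.x/3.x
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
class PaymentGatewayHealthIndicator implements HealthIndicator {
    private final PaymentGatewayClient client;

    // Constructor injection: the final field must be assigned for the class to compile.
    PaymentGatewayHealthIndicator(PaymentGatewayClient client) {
        this.client = client;
    }

    @Override
    public Health health() {
        try {
            client.ping(); // assumed to throw if the gateway cannot be reached
            return Health.up().withDetail("gateway", "reachable").build();
        } catch (Exception e) {
            // Health.down(e) records the exception in the health details
            return Health.down(e).withDetail("gateway", "unreachable").build();
        }
    }
}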

Actuator automatically discovers and aggregates every registered HealthIndicator bean into the overall /actuator/health response — if any indicator reports DOWN, the aggregated overall status is DOWN by default.
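The "any DOWN makes the whole response DOWN" rule can be illustrated with a small plain-Java model. This is only a sketch, not Spring Boot's own code: the class name HealthAggregationSketch and its Status enum are made up for illustration, and the severity order (DOWN, OUT_OF_SERVICE, UP, UNKNOWN) is assumed to match Spring Boot's default.

```java
import java.util.Arrays;
import java.util.Collection;

// Simplified model of worst-status-wins aggregation. These are NOT Spring Boot's classes.
class HealthAggregationSketch {

    // Declared from most to least severe, so a lower ordinal means "worse".
    enum Status { DOWN, OUT_OF_SERVICE, UP, UNKNOWN }

    // Returns the most severe status present; UNKNOWN when there is nothing to aggregate.
    static Status aggregate(Collection<Status> statuses) {
        Status worst = Status.UNKNOWN;
        for (Status s : statuses) {
            if (s.ordinal() < worst.ordinal()) {
                worst = s;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Hypothetical indicators: database UP, disk space UP, payment gateway DOWN.
        System.out.println(aggregate(Arrays.asList(Status.UP, Status.UP, Status.DOWN))); // prints DOWN
        System.out.println(aggregate(Arrays.asList(Status.UP, Status.UP)));              // prints UP
    }
}
```

A single failing indicator is enough to pull the aggregate down, which is why the choice of endpoint for each Kubernetes probe matters.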

Kubernetes liveness vs. readiness — and why they're genuinely different questions:

Liveness asks: "is this instance in a broken state that a restart would fix?" For example, it is deadlocked, stuck in an infinite loop, or otherwise unable to make forward progress. A failed liveness probe causes the kubelet to kill and restart the pod's container.
Readiness asks: "is this instance ready to receive traffic right now?" For example, has it finished its startup sequence, warmed up its caches, and can it currently reach its required dependencies? A failed readiness probe doesn't restart anything; Kubernetes just stops routing Service traffic to the pod until it reports ready again (useful during startup, or during a transient dependency outage that will self-resolve).

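Spring Boot Actuator has built-in support for exactly this distinction via health groups. They are auto-enabled when Spring Boot detects a Kubernetes environment, and can be switched on explicitly elsewhere (for example, when running locally):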

management.endpoint.health.probes.enabled=true
management.health.livenessstate.enabled=true
management.health.readinessstate.enabled=true

This exposes two dedicated endpoints:

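/actuator/health/liveness: reflects the application's internal liveness state (Spring Boot's LivenessState, which application code can explicitly mark as BROKEN, by publishing an AvailabilityChangeEvent, if it detects an unrecoverable internal condition).
/actuator/health/readiness: reflects the readiness health group. By default this group contains only Spring Boot's internal ReadinessState (has the app fully started and is it accepting traffic); custom indicators are not included automatically. To make the PaymentGatewayHealthIndicator above count toward readiness, add it to the group explicitly, e.g. management.endpoint.health.group.readiness.include=readinessState,paymentGateway (the indicator name is the bean name without the HealthIndicator suffix). Readiness is the right place for it, since "can't reach the payment gateway" is a reason to stop routing traffic here, not a reason to restart the pod.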
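To make the group idea concrete, here is a minimal plain-Java sketch of how a group's status depends only on the indicators it includes. It is a model for illustration, not Actuator's implementation: HealthGroupSketch, groupStatus, and sampleIndicators are hypothetical names, and the indicator name paymentGateway assumes the usual convention of dropping the HealthIndicator suffix from the bean name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Simplified model of health groups. These are NOT Spring Boot's classes.
class HealthGroupSketch {

    // Declared from most to least severe, so a lower ordinal means "worse".
    enum Status { DOWN, OUT_OF_SERVICE, UP, UNKNOWN }

    // A group's status is the most severe status among the indicators it includes.
    static Status groupStatus(Map<String, Status> indicators, Set<String> included) {
        Status worst = Status.UNKNOWN;
        for (String name : included) {
            Status s = indicators.get(name);
            if (s != null && s.ordinal() < worst.ordinal()) {
                worst = s;
            }
        }
        return worst;
    }

    // Hypothetical snapshot: the app itself is fine, but the payment gateway is unreachable.
    static Map<String, Status> sampleIndicators() {
        Map<String, Status> indicators = new LinkedHashMap<>();
        indicators.put("livenessState", Status.UP);
        indicators.put("readinessState", Status.UP);
        indicators.put("paymentGateway", Status.DOWN);
        return indicators;
    }

    public static void main(String[] args) {
        Map<String, Status> indicators = sampleIndicators();
        Set<String> liveness = new HashSet<>(Arrays.asList("livenessState"));
        Set<String> readiness = new HashSet<>(Arrays.asList("readinessState", "paymentGateway"));

        System.out.println("liveness:  " + groupStatus(indicators, liveness));
        System.out.println("readiness: " + groupStatus(indicators, readiness));
    }
}
```

With the gateway down, the liveness group still reports UP while the readiness group reports DOWN, so the instance would be taken out of rotation without being restarted.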

# Kubernetes deployment manifest
livenessProbe:
  httpGet: { path: /actuator/health/liveness, port: 8080 }
readinessProbe:
  httpGet: { path: /actuator/health/readiness, port: 8080 }

Why the distinction matters in practice: conflating the two (e.g., pointing both probes at the same generic /actuator/health) can cause a genuinely bad outcome. A transient, self-recovering dependency outage should only affect readiness, pausing traffic briefly. With a shared endpoint, the same outage also fails the liveness probe and triggers a restart, needlessly discarding the instance's warmed-up state and potentially making an already-degraded situation worse by cycling pods that weren't actually broken.
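The difference between the two wirings can be sketched as a small decision model. This is a deliberate simplification under stated assumptions: ProbeOutcomeSketch and its methods are hypothetical, the readiness group is assumed to include the gateway indicator, and real Kubernetes only acts after failureThreshold consecutive probe failures, which the sketch ignores.

```java
// Simplified model of how Kubernetes reacts to probe results. Not a Kubernetes API.
class ProbeOutcomeSketch {

    enum Action { NONE, REMOVE_FROM_SERVICE, RESTART_CONTAINER }

    // A failing liveness probe leads to a restart; a failing readiness probe only pauses traffic.
    static Action react(boolean livenessOk, boolean readinessOk) {
        if (!livenessOk) {
            return Action.RESTART_CONTAINER;
        }
        if (!readinessOk) {
            return Action.REMOVE_FROM_SERVICE;
        }
        return Action.NONE;
    }

    // Split wiring: liveness sees only internal state; readiness also sees the gateway.
    static Action splitProbes(boolean internalStateOk, boolean gatewayReachable) {
        return react(internalStateOk, internalStateOk && gatewayReachable);
    }

    // Conflated wiring: both probes hit one aggregate endpoint that includes the gateway.
    static Action conflatedProbes(boolean internalStateOk, boolean gatewayReachable) {
        boolean overallUp = internalStateOk && gatewayReachable;
        return react(overallUp, overallUp);
    }

    public static void main(String[] args) {
        // Scenario: the application is healthy, but the gateway is temporarily unreachable.
        System.out.println("split:     " + splitProbes(true, false));
        System.out.println("conflated: " + conflatedProbes(true, false));
    }
}
```

In the outage scenario the split wiring only removes the pod from service, while the conflated wiring restarts a container that was never broken.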
