How do you implement custom health checks and readiness/liveness probes for Kubernetes?
Quick Answer
A custom health check is a Spring bean implementing HealthIndicator (or the reactive ReactiveHealthIndicator) whose health() method reports a status such as UP or DOWN, plus optional details, for a specific dependency (a downstream service, a critical cache). Actuator automatically aggregates all registered health indicators into the overall /actuator/health result. Spring Boot's built-in 'liveness' and 'readiness' health groups map directly onto Kubernetes' liveness and readiness probes: liveness answers 'should this pod be restarted?' (is the app in a broken, unrecoverable state), while readiness answers 'should this pod receive traffic right now?' (is it fully initialized and able to serve requests). Each group is exposed as its own dedicated endpoint that Kubernetes can poll directly.
Detailed Answer
Custom health indicators let you report the health of application-specific dependencies beyond Spring Boot's automatically-detected defaults (database, disk space):
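// Package names as of Spring Boot 2.x/3.x
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
class PaymentGatewayHealthIndicator implements HealthIndicator {
    private final PaymentGatewayClient client;

    // Constructor injection: Spring supplies the PaymentGatewayClient bean
    PaymentGatewayHealthIndicator(PaymentGatewayClient client) {
        this.client = client;
    }

    @Override
    public Health health() {
        try {
            client.ping(); // any exception is treated as "gateway unreachable"
            return Health.up().withDetail("gateway", "reachable").build();
        } catch (Exception e) {
            // Health.down(e) records the exception in the health details
            return Health.down(e).withDetail("gateway", "unreachable").build();
        }
    }
}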
Actuator automatically discovers and aggregates every registered HealthIndicator bean into the overall /actuator/health response — if any indicator reports DOWN, the aggregated overall status is DOWN by default.
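The "any DOWN makes the whole result DOWN" rule can be modelled without the framework. The sketch below is a simplified plain-Java illustration, not Spring's actual aggregator class: the class name StatusAggregationSketch and its aggregate method are hypothetical, and the severity order is assumed to match Actuator's documented default (DOWN, OUT_OF_SERVICE, UP, UNKNOWN).

```java
import java.util.List;

// Simplified, framework-free model of "most severe status wins" aggregation.
// Hypothetical names; this is not Spring Boot's own implementation.
class StatusAggregationSketch {

    // Severity order, most severe first (assumed to mirror Actuator's default order).
    static final List<String> ORDER = List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    // Returns the most severe known status, or UNKNOWN if none is recognised.
    static String aggregate(List<String> statuses) {
        return statuses.stream()
                .filter(ORDER::contains)
                .min((a, b) -> Integer.compare(ORDER.indexOf(a), ORDER.indexOf(b)))
                .orElse("UNKNOWN");
    }

    public static void main(String[] args) {
        // One failing indicator is enough to pull the aggregate down.
        System.out.println(aggregate(List.of("UP", "UP", "DOWN"))); // prints DOWN
        System.out.println(aggregate(List.of("UP", "UP")));         // prints UP
    }
}
```

This is why a single dependency-level indicator can flip the generic /actuator/health endpoint to DOWN even when the application itself is otherwise fine.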
Kubernetes liveness vs. readiness — and why they're genuinely different questions:
- Liveness asks: "is this instance in a broken state that a restart would fix?" — e.g., deadlocked, stuck in an infinite loop, or otherwise unable to make forward progress. A failed liveness probe causes Kubernetes to kill and restart the pod.
- Readiness asks: "is this instance ready to receive traffic right now?" — e.g., has it finished its startup sequence, warmed up its caches, and can it currently reach its required dependencies? A failed readiness probe doesn't restart the pod — it just tells Kubernetes to stop routing traffic to it temporarily until it reports ready again (useful during startup, or a transient dependency outage that will self-resolve).
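The two questions above lead to different consequences, which can be captured in a tiny decision table. This is only an illustrative plain-Java model under simplifying assumptions: real Kubernetes evaluates the two probes independently and applies configurable thresholds, and the names ProbeOutcomeSketch, Action, and decide are hypothetical.

```java
// Framework-free sketch of what each probe result leads to.
// Hypothetical names; a simplification of Kubernetes' actual behaviour.
class ProbeOutcomeSketch {

    enum Action { RESTART, STOP_ROUTING_TRAFFIC, ROUTE_TRAFFIC }

    static Action decide(boolean livenessOk, boolean readinessOk) {
        if (!livenessOk) {
            return Action.RESTART;              // broken state: a restart is the remedy
        }
        if (!readinessOk) {
            return Action.STOP_ROUTING_TRAFFIC; // alive but not ready: pause traffic, no restart
        }
        return Action.ROUTE_TRAFFIC;            // alive and ready
    }

    public static void main(String[] args) {
        // A transient dependency outage should only affect readiness.
        System.out.println(decide(true, false)); // prints STOP_ROUTING_TRAFFIC
    }
}
```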
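Spring Boot Actuator has built-in support for exactly this distinction via health groups. The probes are enabled automatically when Spring Boot detects that it is running on Kubernetes; elsewhere (for example, when testing locally) they can be switched on explicitly: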
management.endpoint.health.probes.enabled=true
management.health.livenessstate.enabled=true
management.health.readinessstate.enabled=true
This exposes two dedicated endpoints:
- /actuator/health/liveness reflects the application's internal liveness state (Spring Boot's LivenessState, which application code can explicitly mark as BROKEN if it detects an unrecoverable internal condition).
- /actuator/health/readiness reflects whether the application is ready to accept traffic (Spring Boot's ReadinessState). By default this group contains only that internal state, so a custom indicator like the PaymentGatewayHealthIndicator above contributes to it only once it has been added to the readiness group explicitly. That is where such an indicator would typically be classified, since "can't reach the payment gateway" is a reason to stop routing traffic here, not a reason to restart the pod.
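A custom indicator is not part of the readiness group until it is listed there. The fragment below is a minimal sketch of that configuration. It assumes the indicator bean keeps its default name, in which case Actuator derives the identifier paymentGateway by dropping the HealthIndicator suffix from PaymentGatewayHealthIndicator.

```properties
# Readiness = internal readiness state + the custom payment gateway indicator
# (the "paymentGateway" identifier is assumed from the default bean name)
management.endpoint.health.group.readiness.include=readinessState,paymentGateway
```

The liveness group is deliberately left alone here, so an unreachable gateway pauses traffic rather than triggering a restart.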
# Kubernetes deployment manifest (container spec excerpt)
livenessProbe:
  httpGet: { path: /actuator/health/liveness, port: 8080 }
readinessProbe:
  httpGet: { path: /actuator/health/readiness, port: 8080 }
Why the distinction matters in practice: conflating the two (e.g., pointing both probes at the same generic /actuator/health) can cause a genuinely bad outcome. A transient, self-recovering dependency outage, which should only affect readiness and pause traffic briefly, instead fails the liveness probe and triggers a full pod restart. That needlessly discards the instance's warmed-up state and can make an already-degraded situation worse by cycling pods that weren't actually broken.