Microservices & Spring Cloud

Splitting a monolithic application into many independently deployable microservices brings real organizational and scaling benefits (independent deployability, team autonomy, technology heterogeneity, targeted scaling), but it also replaces simple in-process method calls with network calls, which introduces a whole category of problems that didn't exist before:

  1. Service discovery — with a monolith, calling another module is just a method call; with microservices, Service A needs to know Service B's actual network location (host/port), which changes constantly as instances scale up/down, restart, or move between hosts in a dynamic environment (e.g., Kubernetes, an auto-scaling group).

  2. Centralized, consistent configuration — dozens of independently-deployed services each need their own configuration, and keeping them consistent (and updatable without a redeploy) across many services is much harder than editing one monolith's config file.

  3. Resilience against partial failure — in a monolith, "the database is slow" is one problem; in microservices, a slow or failing downstream service can cause calling services to pile up waiting threads/connections, and that failure can cascade upstream through a whole chain of dependent services if nothing stops it.

  4. Observability across service boundaries — a single user request might touch five different services; understanding what happened (and where something went wrong or was slow) requires being able to trace that request's path across service boundaries, not just within one process's logs.

  5. Inter-service communication patterns — deciding between synchronous (REST) and asynchronous (messaging) communication, and handling the different failure modes each implies.

Spring Cloud is a set of libraries, built on top of Spring Boot, that addresses these concerns: Eureka or Consul for service discovery, Spring Cloud Config for centralized configuration, Resilience4j (integrated via Spring Cloud Circuit Breaker) for resilience patterns, Micrometer Tracing for distributed tracing, Spring Cloud Gateway for routing, and OpenFeign for inter-service calls. Each targets a distributed-systems concern that a single-process monolith never had to deal with.

In a dynamic microservices environment, service instances are constantly starting, stopping, scaling, and moving between hosts — a static configuration file listing "Order Service is at 10.0.0.5:8080" would go stale almost immediately. Service discovery solves this by letting services find each other by logical name, resolved dynamically at call time.

How it typically works (using Eureka as the concrete example):

  1. Registration: each service instance, on startup, registers itself with the Eureka server (the discovery registry), reporting its host, port, and service name (order-service), and sends periodic heartbeats to confirm it's still alive.
@SpringBootApplication
@EnableDiscoveryClient
class OrderServiceApplication { ... } // registers with Eureka on startup

# application.yml (Eureka client settings)
eureka:
  client:
    service-url:
      defaultZone: http://eureka-server:8761/eureka/
  2. Discovery: when Service A wants to call order-service, instead of a hardcoded URL, it asks the Eureka client (a local cache, periodically refreshed from the Eureka server) for the current set of healthy order-service instances:
@Autowired DiscoveryClient discoveryClient;
List<ServiceInstance> instances = discoveryClient.getInstances("order-service");

More commonly, this is done transparently via a load-balanced RestTemplate/WebClient or OpenFeign client, which resolves the logical service name to an actual instance (and load-balances across multiple healthy instances) automatically, without the calling code ever handling raw addresses itself.

  3. Health/liveness tracking: if an instance stops sending heartbeats (crashed, network partition, scaled down), Eureka eventually evicts it from the registry, so calling services stop being routed to a dead instance.
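
The three steps above (registration with heartbeats, lookup by logical name, eviction of silent instances) plus client-side load balancing can be sketched in plain Java. This is a minimal illustration under stated assumptions, not how Eureka is actually implemented: SimpleRegistry, RoundRobinChooser, the 30-second lease, and the "host:port" address strings are all hypothetical, and time is passed in explicitly so the behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory registry for illustration; not Eureka's implementation.
final class SimpleRegistry {
    // serviceName -> (instance address "host:port" -> time of last heartbeat in ms)
    private final Map<String, Map<String, Long>> leases = new ConcurrentHashMap<>();
    private final long leaseMillis;

    SimpleRegistry(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Step 1: registration and heartbeat renewal both record "seen at nowMillis".
    void heartbeat(String serviceName, String address, long nowMillis) {
        leases.computeIfAbsent(serviceName, k -> new ConcurrentHashMap<>()).put(address, nowMillis);
    }

    // Step 3: drop every instance whose last heartbeat is older than the lease.
    void evictExpired(long nowMillis) {
        for (Map<String, Long> instances : leases.values()) {
            instances.values().removeIf(last -> nowMillis - last > leaseMillis);
        }
    }

    // Step 2: resolve a logical service name to the currently registered addresses.
    List<String> getInstances(String serviceName) {
        List<String> result = new ArrayList<>(leases.getOrDefault(serviceName, Map.of()).keySet());
        Collections.sort(result); // stable order so callers can rotate predictably
        return result;
    }

    public static void main(String[] args) {
        SimpleRegistry registry = new SimpleRegistry(30_000);
        registry.heartbeat("order-service", "10.0.0.5:8080", 0);
        registry.heartbeat("order-service", "10.0.0.6:8080", 0);
        registry.heartbeat("order-service", "10.0.0.5:8080", 25_000); // only this instance renews
        registry.evictExpired(40_000);
        System.out.println(registry.getInstances("order-service")); // prints [10.0.0.5:8080]
    }
}

// Client-side load balancing: rotate through whatever instances were resolved.
final class RoundRobinChooser {
    private final AtomicInteger counter = new AtomicInteger();

    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances available");
        }
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

In a real Spring Cloud application none of this is written by hand; the discovery client and the load-balanced RestTemplate/WebClient or OpenFeign client play these roles.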

Alternatives: Consul offers similar discovery capabilities (plus additional features like key-value config storage and health checking) and integrates with Spring Cloud similarly. In Kubernetes-based deployments, it's increasingly common to rely on Kubernetes' own built-in DNS-based service discovery (a Kubernetes Service resource provides a stable DNS name resolving to healthy pod IPs automatically) rather than adding a separate Eureka server, since the container orchestrator is already solving essentially the same problem at the infrastructure level.

Why this matters: without service discovery, scaling a service up/down, replacing an unhealthy instance, or moving services between hosts would require manually updating every other service's configuration to match — service discovery makes the whole system able to reconfigure itself dynamically as instances come and go.

In a single monolith, configuration is just one application.yml. Across dozens of independently-deployed microservices, configuration management gets meaningfully harder: shared settings (a common database connection pool size convention, feature flags, third-party API endpoints) need to stay consistent, environment-specific overrides (dev/staging/prod) need to apply uniformly, and configuration changes ideally shouldn't require rebuilding and redeploying every affected service just to update one value.

Spring Cloud Config Server centralizes this: it's a small Spring Boot application that serves configuration to other services from a single source of truth, most commonly a Git repository (though other backends — a plain filesystem, Vault for secrets — are also supported):

# config-server's own application.yml
spring:
  cloud:
    config:
      server:
        git:
          uri: https://github.com/myorg/config-repo

# layout of the Git repository
config-repo/
├── application.yml           # shared across all services
├── order-service.yml         # specific to the "order-service" application
└── order-service-prod.yml    # specific to order-service AND the "prod" profile

Client services fetch their configuration from the Config Server at startup instead of relying purely on their own local application.yml:

spring:
  config:
    import: "configserver:http://config-server:8888"
  application:
    name: order-service # tells the Config Server which config file(s) to serve
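
How the files in the repository combine for one client can be sketched as a simple overlay: sources are applied from least to most specific, and a later source overrides an earlier one for the same key. The sketch below is a plain-Java illustration of that precedence only, not Config Server code; the ConfigOverlay class and the property names (pool.size, payment.url) and their values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating overlay precedence; not part of Spring Cloud Config.
final class ConfigOverlay {
    // Merge property sources listed from least to most specific; later sources win.
    static Map<String, String> resolve(List<Map<String, String>> sourcesLeastSpecificFirst) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> source : sourcesLeastSpecificFirst) {
            effective.putAll(source);
        }
        return effective;
    }

    public static void main(String[] args) {
        // Stand-ins for application.yml, order-service.yml, and order-service-prod.yml.
        Map<String, String> shared = Map.of("pool.size", "10", "payment.url", "https://pay.example.test");
        Map<String, String> orderService = Map.of("pool.size", "20");
        Map<String, String> orderServiceProd = Map.of("payment.url", "https://pay.example.com");

        Map<String, String> effective = resolve(List.of(shared, orderService, orderServiceProd));
        System.out.println(effective.get("pool.size"));   // prints 20
        System.out.println(effective.get("payment.url")); // prints https://pay.example.com
    }
}
```

Without the "prod" profile active, the third source would simply be left out of the list and payment.url would keep its shared value.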

Benefits of centralizing configuration this way:

  • Single source of truth — no risk of subtly divergent copies of the same shared setting drifting across services.
  • Auditability — since it's backed by Git, every configuration change has a full history (who changed what, when, and why, via commit messages) — much stronger than editing a value directly on a server or in an untracked properties file.
  • Consistent environment handling — profile-specific overlays (order-service-prod.yml) work the same familiar way Spring profiles already do locally, just centrally managed.
  • Runtime refresh without redeployment: @RefreshScope-annotated beans can pick up a configuration change without a full redeploy or restart of the affected service. The refresh is triggered either by a manual POST to /actuator/refresh on an instance, or by Spring Cloud Bus, which broadcasts a refresh event to all instances via a message broker. This is valuable for things like adjusting a feature flag or a rate limit on the fly.

Trade-off worth noting: this introduces the Config Server itself as a new, critical piece of infrastructure other services now depend on at startup — its own availability and resilience (e.g., client-side caching of last-known-good configuration) become an operational concern in their own right.

In a microservices architecture with many independently deployed services, having every client (a mobile app, a web frontend, a third-party integration) talk directly to each individual backend service creates real problems: clients need to know every service's address, cross-cutting concerns (authentication, rate limiting, logging, CORS) get duplicated across every service, and there's no single place to apply a system-wide policy change.

An API Gateway sits in front of the whole collection of services as a single, unified entry point: clients talk only to the gateway, which routes each request to the appropriate backend service and centralizes those cross-cutting concerns in one place instead of duplicating them everywhere.

Spring Cloud Gateway is Spring's implementation of this pattern, built on the reactive, non-blocking Spring WebFlux stack (chosen specifically because a gateway needs to efficiently handle a large volume of concurrent, often I/O-bound "just route this elsewhere" requests, which suits a non-blocking model well).

Core concepts:

  • Routes — map an incoming request (matched by path, header, or other predicates) to a destination:
spring:
  cloud:
    gateway:
      routes:
        - id: order-service-route
          uri: lb://order-service        # "lb://" — load-balanced via service discovery
          predicates:
            - Path=/api/orders/**
          filters:
            - StripPrefix=1              # drops the first path segment: /api/orders/42 is forwarded as /orders/42
  • Predicates — conditions that determine whether a route matches a given request (path, header, method, query parameter, time-based, etc.).
  • Filters — modify the request before it's forwarded, or the response before it's returned to the client — used for cross-cutting concerns applied centrally rather than per-service: adding/removing headers, rate limiting (RequestRateLimiter), authentication checks, request/response logging, and integrating a circuit breaker filter around a downstream call.
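
To make the route, predicate, and filter roles concrete, here is a minimal plain-Java sketch of what the example route does to a request path: the predicate decides whether the route applies, and the StripPrefix filter rewrites the path before forwarding. It is a simplification for illustration, not Spring Cloud Gateway code; GatewayRouteSketch and its two methods are hypothetical, and the prefix check only approximates Path=/api/orders/** matching.

```java
// Hypothetical sketch of predicate matching and a StripPrefix-style filter.
final class GatewayRouteSketch {
    // Simplified stand-in for Path=<prefix>/**: the prefix itself or anything beneath it.
    static boolean matchesPathPrefix(String requestPath, String prefix) {
        return requestPath.equals(prefix) || requestPath.startsWith(prefix + "/");
    }

    // Mimics StripPrefix=N: drop the first N path segments before forwarding.
    static String stripPrefix(String requestPath, int parts) {
        StringBuilder forwarded = new StringBuilder();
        int skipped = 0;
        for (String segment : requestPath.split("/")) {
            if (segment.isEmpty()) {
                continue; // leading "/" produces an empty first element
            }
            if (skipped < parts) {
                skipped++; // this segment is stripped
                continue;
            }
            forwarded.append('/').append(segment);
        }
        return forwarded.length() == 0 ? "/" : forwarded.toString();
    }

    public static void main(String[] args) {
        String path = "/api/orders/42";
        if (matchesPathPrefix(path, "/api/orders")) {
            // The backend service sees the path without the "/api" prefix.
            System.out.println(stripPrefix(path, 1)); // prints /orders/42
        }
    }
}
```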

Why centralize this at the gateway rather than in each service:

  • One place to change a cross-cutting policy — e.g., rolling out a new rate-limiting rule or authentication requirement across the whole API surface doesn't require touching every individual backend service.
  • Simplifies clients — a single, stable entry point and URL structure, regardless of how backend services are actually organized, scaled, or renamed internally.
  • Consistent enforcement — a security or rate-limiting rule enforced at the gateway can't be accidentally skipped by one service that forgot to implement it locally.

Trade-off: the gateway becomes a critical, shared piece of infrastructure — it needs to be highly available and carefully monitored, since every request now passes through it, making it both a natural place to centralize concerns and a potential single point of failure if not deployed and scaled appropriately.

When Service A calls a downstream Service B that's failing, slow, or timing out, naively retrying every request makes the problem worse in two ways: it keeps tying up Service A's own threads/connections waiting on a dependency that's unlikely to respond in time, and it adds even more load onto an already-struggling Service B — potentially delaying its recovery, or causing the failure to cascade further upstream as Service A itself becomes slow/unresponsive to its own callers.

The circuit breaker pattern (named by analogy to an electrical circuit breaker) addresses this by tracking recent call outcomes and, once failures cross a threshold, stopping further calls entirely for a while — "tripping open" — instead of continuing to try and fail:

States:

  • Closed (normal operation): calls pass through normally; the breaker tracks the recent success/failure rate.
  • Open: once the failure rate crosses a configured threshold, the breaker "trips" — further calls are short-circuited immediately (typically routed to a fallback) without even attempting the real call, for a configured cooldown period. This protects both the struggling downstream service (no more load piling on) and the calling service (no more threads tied up waiting on a call unlikely to succeed).
  • Half-Open: after the cooldown, the breaker allows a small number of trial calls through to test whether the dependency has actually recovered — if they succeed, it transitions back to Closed; if they still fail, it goes back to Open for another cooldown period.
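
The three states can be sketched as a small state machine in plain Java. This is an illustrative simplification, not Resilience4j's implementation: SimpleCircuitBreaker is a hypothetical class, the failure rate is evaluated only once the count-based window is full, a single trial call decides the outcome of the Half-Open state, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical circuit breaker sketch; not the Resilience4j implementation.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;
    private final double failureRateThreshold; // e.g. 0.5 means 50%
    private final long openMillis;             // cooldown spent in OPEN
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAtMillis;

    SimpleCircuitBreaker(int windowSize, double failureRateThreshold, long openMillis) {
        this.windowSize = windowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.openMillis = openMillis;
    }

    // Ask before each call; false means "short-circuit to the fallback".
    synchronized boolean allowCall(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAtMillis >= openMillis) {
            state = State.HALF_OPEN; // cooldown elapsed: let a trial call through
        }
        return state != State.OPEN;
    }

    // Report the outcome of a call that was allowed through.
    synchronized void recordResult(boolean success, long nowMillis) {
        if (state == State.OPEN) {
            return; // late result from before the trip; ignore it
        }
        if (state == State.HALF_OPEN) {
            if (success) {
                state = State.CLOSED; // dependency recovered
                window.clear();
            } else {
                trip(nowMillis);      // still failing: another cooldown
            }
            return;
        }
        window.addLast(!success);
        if (window.size() > windowSize) {
            window.removeFirst();
        }
        if (window.size() == windowSize) {
            long failures = window.stream().filter(failed -> failed).count();
            if ((double) failures / windowSize >= failureRateThreshold) {
                trip(nowMillis);
            }
        }
    }

    private void trip(long nowMillis) {
        state = State.OPEN;
        openedAtMillis = nowMillis;
        window.clear();
    }

    synchronized State state() {
        return state;
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(4, 0.5, 10_000);
        breaker.recordResult(true, 0);
        breaker.recordResult(false, 1);
        breaker.recordResult(true, 2);
        breaker.recordResult(false, 3);                 // 2 of 4 failed: threshold reached
        System.out.println(breaker.state());            // prints OPEN
        System.out.println(breaker.allowCall(5_000));   // prints false
        System.out.println(breaker.allowCall(10_003));  // prints true
        breaker.recordResult(true, 10_003);             // trial call succeeded
        System.out.println(breaker.state());            // prints CLOSED
    }
}
```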

Resilience4j is the modern, lightweight library Spring Boot integrates with for this (having succeeded Netflix Hystrix, which is now in maintenance mode) — typically applied declaratively via annotations:

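import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestClient;

@Service
class InventoryClient {
    private final RestClient restClient; // assumed to be configured with the inventory service's base URL

    InventoryClient(RestClient restClient) {
        this.restClient = restClient;
    }

    @CircuitBreaker(name = "inventoryService", fallbackMethod = "fallbackStock")
    public int getStockLevel(String productId) {
        return restClient.get().uri("/inventory/{id}", productId).retrieve().body(Integer.class);
    }

    // Fallback: same parameters as the protected method, plus the exception that triggered it
    int fallbackStock(String productId, Throwable t) {
        return -1; // or a cached/default value: some sensible degraded behavior instead of an exception
    }
}

# application.yml (circuit breaker settings)
resilience4j:
  circuitbreaker:
    instances:
      inventoryService:
        failure-rate-threshold: 50       # trip open once 50% of recent calls fail
        wait-duration-in-open-state: 10s # cooldown before trying half-open
        sliding-window-size: 20          # evaluate failure rate over the last 20 calls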

Complementary patterns Resilience4j also provides, often combined with a circuit breaker: retry (with backoff, for transient failures), rate limiter (cap outgoing call rate), bulkhead (isolate resources for one dependency so its failure can't starve threads needed for calls to other, healthy dependencies), and time limiter (enforce a call timeout).
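
As one example of these complementary patterns, retry with exponential backoff can be sketched in plain Java as follows. This is a hand-rolled illustration, not the Resilience4j Retry API: RetryWithBackoff, its method names, and the delay values are hypothetical, and a production setup would normally rely on the library's own retry configuration instead.

```java
import java.util.function.Supplier;

// Hypothetical retry helper for illustration; not the Resilience4j Retry module.
final class RetryWithBackoff {
    // Delay before retry number `retry` (1-based): base * 2^(retry - 1), capped at maxMillis.
    static long backoffMillis(long baseMillis, long maxMillis, int retry) {
        int exponent = Math.min(Math.max(retry - 1, 0), 20); // clamp to avoid overflow
        return Math.min(baseMillis << exponent, maxMillis);
    }

    // Run the action up to maxAttempts times, sleeping with growing delays between attempts.
    static <T> T call(Supplier<T> action, int maxAttempts, long baseMillis, long maxMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure; rethrown if all attempts are used up
                if (attempt < maxAttempts) {
                    sleep(backoffMillis(baseMillis, maxMillis, attempt));
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted during backoff", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(backoffMillis(100, 1_000, 1)); // prints 100
        System.out.println(backoffMillis(100, 1_000, 3)); // prints 400
        System.out.println(backoffMillis(100, 1_000, 5)); // prints 1000
    }
}
```

Retries suit transient failures only; bounding the number of attempts and combining retry with a circuit breaker keeps them from piling extra load onto a dependency that is already down.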

The key mental model: a circuit breaker's goal isn't to make a failing call succeed — it's to fail fast once failure is likely, protecting both sides of the call and giving the struggling dependency room to recover, rather than compounding the problem with an unbounded pile of hung, retrying requests.
