General & Behavioral

Deciding whether a new feature belongs in an existing service or in a new microservice is a genuinely important architectural judgment call, and the "default to microservices" instinct is often wrong. Splitting out a service carries a real, non-refundable cost (added network calls, independent deployment/monitoring overhead, new distributed-systems failure modes) that should be justified by a concrete benefit, not applied reflexively as a best practice.

Reasons that genuinely justify a new, separate service:

  • A meaningfully different scaling profile — e.g., an image-processing feature that's CPU-intensive and bursty deserves to scale independently from a mostly I/O-bound, steady-traffic order API; bundling them means scaling one forces over-provisioning the other.
  • A different release cadence or ownership boundary — a separate team that needs to deploy on its own schedule, without coordinating releases with an unrelated team's service, benefits genuinely from an independent deployable unit.
  • A different technology requirement — a feature that's a much better fit for a different language/runtime (e.g., a machine-learning inference service in Python) obviously can't just be "added to" an existing Spring Boot service.
  • A genuine domain boundary (Domain-Driven Design's "bounded context"): a feature that represents a conceptually distinct business capability, with its own data model and vocabulary, and that would blur an existing service's domain model if it were folded into it.

Reasons that usually don't justify splitting it out:

  • "It felt cleaner as its own thing" — cleanliness is achievable with good package/module structure inside an existing service, without paying the network/deployment tax of a separate service.
  • "Microservices are the modern best practice" — applied without any of the concrete pressures above, this mainly adds complexity (service discovery, inter-service auth, distributed tracing, eventual consistency) without a corresponding, specific benefit.
  • A small team maintaining the resulting sprawl of many tiny services often ends up worse off than a well-organized modular monolith, purely from the added operational overhead of many independently deployed pieces.

A practical approach in an interview answer: start by keeping a new feature inside the existing, appropriately-bounded service by default; only split it out once one of the concrete pressures above (scaling, ownership, technology, or a real domain boundary) actually materializes — and be ready to justify which specific pressure is driving the decision, rather than treating "more microservices" as an unconditional good.

Diagnosing a slow Spring Boot endpoint calls for a structured, evidence-first approach rather than guessing at likely causes:

1. Start with existing observability data, not guesswork. If Micrometer/Actuator metrics are already being collected, http.server.requests broken down by endpoint/status immediately shows whether the slowness is isolated to one specific endpoint or system-wide, and whether it correlates with a deployment, a traffic spike, or a specific time window. If distributed tracing is available, a slow trace's span breakdown often points directly at which specific downstream call or database query is actually responsible, rather than needing to guess across the whole request path.

2. Check the highest-probability culprits first, since a handful of causes account for the overwhelming majority of real-world Spring Boot slow-endpoint incidents:

  • N+1 query problems — enable SQL logging temporarily (or check existing APM tooling) to see if a single request is issuing far more queries than it logically should.
  • A missing or ineffective database index on a column the query filters/joins on — check the query's execution plan (EXPLAIN ANALYZE) if the database itself seems to be the bottleneck.
  • A degraded downstream dependency: if the endpoint calls another service, is that service currently slow? (This is exactly the scenario distributed tracing is meant to make immediately visible, and a circuit breaker is meant to contain.)
  • Excessive allocation / GC pressure — a sudden change in GC pause frequency/duration (visible in JVM metrics) correlating with the slowdown points at a memory/allocation-pattern regression rather than a query or downstream-call problem.

3. Reproduce in a safer environment when possible, rather than only ever experimenting directly against production — a staging environment with representative data volume/load lets you iterate on a hypothesis (add an index, batch a set of calls, adjust a connection pool size) without further risking the production system.

4. Form a specific, falsifiable hypothesis from the evidence gathered, rather than making several speculative changes simultaneously — e.g., "the trace shows 80% of this endpoint's latency is inside the inventory-service call, and that service's own metrics show its p99 latency spiked at the same time" is a specific, testable claim, not a vague guess.

5. Verify the fix using the same signal that originally surfaced the problem — if the diagnosis came from http.server.requests latency percentiles, confirm the fix actually moved that same metric, not just that the code change "looks correct."
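The N+1 pattern from step 2 is easier to recognize once its shape is familiar. The following is a minimal sketch in plain Java, not Spring Data code: CountingCustomerRepository and NPlusOneDemo are hypothetical names, and the "queries" are simulated by a counter standing in for what SQL logging would show. The initial query that loads the orders themselves (the "+1") is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a repository; it only counts simulated queries.
class CountingCustomerRepository {
    private final Map<Long, String> namesById = new HashMap<>();
    int queryCount = 0;

    CountingCustomerRepository() {
        namesById.put(1L, "Ada");
        namesById.put(2L, "Grace");
        namesById.put(3L, "Linus");
    }

    // One simulated query per call, similar to a lazy association load.
    String findNameById(long id) {
        queryCount++;
        return namesById.get(id);
    }

    // One simulated query for the whole batch, similar to a fetch join or an IN (...) query.
    Map<Long, String> findNamesByIds(List<Long> ids) {
        queryCount++;
        Map<Long, String> result = new HashMap<>();
        for (Long id : ids) {
            result.put(id, namesById.get(id));
        }
        return result;
    }
}

class NPlusOneDemo {
    // N lookups for N orders: in SQL logs this appears as the same SELECT repeated N times.
    static List<String> namesOneByOne(CountingCustomerRepository repo, List<Long> customerIds) {
        List<String> names = new ArrayList<>();
        for (Long id : customerIds) {
            names.add(repo.findNameById(id));
        }
        return names;
    }

    // A single batched lookup that returns the same data.
    static List<String> namesBatched(CountingCustomerRepository repo, List<Long> customerIds) {
        Map<Long, String> byId = repo.findNamesByIds(customerIds);
        List<String> names = new ArrayList<>();
        for (Long id : customerIds) {
            names.add(byId.get(id));
        }
        return names;
    }
}
```

Both methods return the same names, but the query count grows with the number of orders in the first and stays constant in the second, which is the difference a per-request query count makes visible.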

What this communicates in an interview: a methodical, data-driven approach (use existing observability first, form a specific hypothesis, verify with the same signal) rather than jumping straight to plausible-sounding guesses — and genuine familiarity with the actual tools (Micrometer, distributed tracing, SQL logging/EXPLAIN) rather than just naming them abstractly.
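"Verify with the same signal" can be made concrete with a small before/after comparison. This is a sketch only, assuming latency samples in milliseconds have already been exported from whatever metrics backend is in use; LatencyPercentiles, the nearest-rank method, and the minRelativeDrop threshold are illustrative choices, not part of Micrometer.

```java
import java.util.Arrays;

class LatencyPercentiles {
    // Nearest-rank percentile: the smallest sample such that at least p percent
    // of all samples are less than or equal to it.
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    // True only if the chosen percentile dropped by at least the given fraction
    // (0.5 means "at least 50% lower than before").
    static boolean improved(long[] before, long[] after, double p, double minRelativeDrop) {
        long b = percentile(before, p);
        long a = percentile(after, p);
        return a <= b * (1.0 - minRelativeDrop);
    }
}
```

If the incident was detected through tail latency, the comparison should use that same percentile. A fix that removes two slow outliers can leave the median unchanged while cutting the 90th percentile sharply, so checking the median alone would wrongly suggest that nothing changed.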

There are two common top-level principles for organizing a Spring Boot application's packages, with a fairly clear winner for anything beyond a small application:

Package-by-layer (a common starting point, but scales poorly):

com.example.app
├── controller/
│   ├── OrderController.java
│   ├── CustomerController.java
│   └── ProductController.java
├── service/
│   ├── OrderService.java
│   ├── CustomerService.java
│   └── ProductService.java
└── repository/
    ├── OrderRepository.java
    ├── CustomerRepository.java
    └── ProductRepository.java

The problem as an application grows: each package becomes a large, unrelated grab-bag of classes belonging to entirely different business features — finding "everything related to orders" means jumping across three separate, increasingly large packages, and it's easy for classes belonging to genuinely unrelated features to end up more coupled than they should be, simply because nothing about the package structure discourages it.

Package-by-feature (or package-by-domain) — group everything related to one business capability together:

com.example.app
├── orders/
│   ├── OrderController.java
│   ├── OrderService.java
│   ├── OrderRepository.java
│   └── Order.java
├── customers/
│   ├── CustomerController.java
│   ├── CustomerService.java
│   ├── CustomerRepository.java
│   └── Customer.java
└── products/
    ├── ProductController.java
    ├── ProductService.java
    ├── ProductRepository.java
    └── Product.java

Why this scales better: everything related to one feature lives in one place, package-private visibility can be used to genuinely enforce that other features only interact with a feature through its intended public surface (rather than reaching directly into its repository, for instance), and — importantly — it's much easier to eventually extract a feature into its own separate microservice later if a real justification for doing so emerges (see the microservices-decision question), since its code is already cleanly bounded rather than smeared across three layer-wide packages.
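The package-private point can be sketched as follows. This is an illustration under stated assumptions: in a real project each type would live in its own file under com.example.app.orders and OrderService would be declared public; here everything is kept in one compilation unit, so the intended modifiers are noted in comments. The method names placeOrder and findOrder are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Domain type that other features are allowed to see (would be `public` in its own file).
final class Order {
    final long id;
    final String item;

    Order(long id, String item) {
        this.id = id;
        this.item = item;
    }
}

// Deliberately NOT public: with package-private visibility, code in
// customers/ or products/ cannot reference this type at all.
class OrderRepository {
    private final Map<Long, Order> store = new HashMap<>();

    void save(Order order) {
        store.put(order.id, order);
    }

    Optional<Order> findById(long id) {
        return Optional.ofNullable(store.get(id));
    }
}

// The feature's intended surface (would be `public` in its own file).
// Other features go through this class and never touch the repository.
class OrderService {
    private final OrderRepository repository = new OrderRepository();

    Order placeOrder(long id, String item) {
        Order order = new Order(id, item);
        repository.save(order);
        return order;
    }

    Optional<Order> findOrder(long id) {
        return repository.findById(id);
    }
}
```

With this layout the compiler, rather than a code-review convention, rejects a CustomerService that tries to call OrderRepository directly.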

Within each feature package, the layered separation still matters — a controller should still only talk to its own feature's service, which talks to its own repository, not the other way around; package-by-feature changes the top-level organizing axis, not the underlying layered-architecture discipline itself.

Practical guidance for an interview answer: default to package-by-feature for anything beyond a genuinely small application, note that layering discipline (controller → service → repository, each depending only downward) still applies within each feature package, and mention that this structure also happens to make a later "should this become its own microservice" decision much easier to execute cleanly if that need ever arises.

A combination of a few habits tends to work well specifically for staying current with Spring:

Official sources:

  • The Spring blog (spring.io/blog) publishes detailed release announcements for every Spring Framework/Spring Boot/Spring Security release, usually explaining not just what changed but why — genuinely useful for understanding the reasoning behind a change, not just its mechanics.
  • Migration guides for major version upgrades (e.g., the Spring Boot 2 → 3 migration guide, which covered the javax.* → jakarta.* namespace change) are typically thorough, and reading one even without an immediate upgrade planned is a good way to understand where the ecosystem is heading.
  • Release notes and the project's own GitHub milestones/issues for a closer look at what's landing in an upcoming minor/patch release.

Community engagement:

  • SpringOne (and recorded talks from it) is the primary conference for the ecosystem — talks from the actual Spring team members building a feature are often the clearest source for understanding a new capability's intended use and trade-offs.
  • Following well-known contributors and the Spring team's own communication channels for early context on where things are headed.

Hands-on practice:

  • Trying an upgrade early, on a side project or a lower-stakes internal service, before it's needed on a critical production system — this surfaces real friction points (a changed default, a removed deprecated API) with time to actually understand and address them, rather than discovering them under pressure during a mandatory production upgrade.
  • Reading a new minor version's changelog specifically looking for new features relevant to a current problem you're working on — often the fastest way a new capability actually gets adopted, versus reading changelogs in the abstract with no immediate application.

In an interview context, similar to the general Java-ecosystem version of this question, a concrete example (a specific migration you navigated, a new feature you evaluated and did/didn't adopt, and why) demonstrates genuine engagement far more convincingly than a generic list of resources.

Breaking a contract other teams depend on is one of the more disruptive things an API-owning team can do if handled carelessly — a deliberate, communicative process matters as much as the technical versioning mechanism itself.

1. Prefer additive, non-breaking changes whenever the requirement genuinely allows it. Adding a new optional field to a response, adding a new endpoint, or adding a new optional request parameter with a sensible default typically requires zero coordination with consuming teams at all, since well-behaved clients (deserializing into a DTO that ignores unknown fields, for instance) are unaffected. This should always be the first option considered before reaching for a breaking change and a new version.

2. When a genuine breaking change is unavoidable (removing a field, changing a field's type or meaning, changing an endpoint's behavior in an incompatible way), introduce it behind a new API version (see the API-versioning question — URI versioning being the most common, most discoverable approach) rather than modifying the existing, already-depended-upon contract in place.

3. Communicate proactively, well before the change ships. Consuming teams should hear about a planned breaking change and its migration deadline before it happens, not discover it via a failing integration test or a production incident — this might mean an internal API changelog, a direct message to known consumer teams, or (for external/public APIs) a formal deprecation notice with a concrete sunset date.

4. Run both versions in parallel during a genuine migration window. The old version should stay fully functional (not degraded or "soft-deprecated" in some broken state) for long enough that consuming teams can realistically plan and execute their own migration on their own schedule, not an artificially rushed one dictated purely by the API team's convenience.

5. Monitor actual usage of the deprecated version (request counts/logs tagged by API version, or a deprecation warning header returned to callers still using the old version) to know, with real data, whether it's actually safe to retire — rather than guessing or assuming migration is complete.

6. Only remove the old version once consumers have genuinely migrated, or the previously-communicated deadline has passed with clear, repeated advance notice — removing a still-actively-used old version on an arbitrary internal timeline, regardless of what was actually communicated, is exactly the kind of action that erodes other teams' trust in the API team's reliability going forward.
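Step 1 depends on clients that tolerate fields they do not know about. A minimal sketch of such a "tolerant reader" in plain Java follows; OrderSummary, its fields, and the trackingUrl field are hypothetical, and the payload is modeled as an already-parsed Map rather than raw JSON. With Jackson the same effect is usually achieved with @JsonIgnoreProperties(ignoreUnknown = true) on the DTO.

```java
import java.util.Map;

// Hypothetical client-side DTO for an order response.
final class OrderSummary {
    final long id;
    final String status;

    OrderSummary(long id, String status) {
        this.id = id;
        this.status = status;
    }

    // Reads only the fields this client knows about; any other key is ignored.
    // A missing or mistyped required field is still an error, which is why
    // removing or retyping a field is a breaking change while adding one is not.
    static OrderSummary fromPayload(Map<String, Object> payload) {
        Object id = payload.get("id");
        Object status = payload.get("status");
        if (!(id instanceof Number) || !(status instanceof String)) {
            throw new IllegalArgumentException("payload is missing a required field");
        }
        return new OrderSummary(((Number) id).longValue(), (String) status);
    }
}
```

A response that gains a new optional field parses exactly as before, while a response that drops status fails immediately. That asymmetry is the practical reason additive changes need no coordination and removals need a new version.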

What this communicates in an interview: treating API compatibility as a cross-team communication and trust problem, not just a technical versioning mechanism — the versioning scheme itself (covered in the API-versioning question) is necessary but not sufficient without the surrounding process.
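The usage monitoring in step 5 can be sketched as a per-version request counter that also returns a warning header to callers of the deprecated version. This is an assumption-laden illustration: ApiVersionUsage is a hypothetical class, the /api/v1/... path layout assumes URI versioning, and the Sunset header (defined in RFC 8594) is one possible choice of warning header. In a Spring application the same logic would more likely sit in a filter or interceptor and feed a metrics counter tagged by version.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class ApiVersionUsage {
    private final Map<String, LongAdder> requestsByVersion = new ConcurrentHashMap<>();
    private final String deprecatedVersion;
    private final String sunsetHttpDate;

    // sunsetHttpDate is the previously communicated retirement date, as an HTTP-date string.
    ApiVersionUsage(String deprecatedVersion, String sunsetHttpDate) {
        this.deprecatedVersion = deprecatedVersion;
        this.sunsetHttpDate = sunsetHttpDate;
    }

    // Extracts "v1" from "/api/v1/orders"; returns "unversioned" if no segment looks like a version.
    static String versionOf(String path) {
        for (String segment : path.split("/")) {
            if (segment.matches("v\\d+")) {
                return segment;
            }
        }
        return "unversioned";
    }

    // Counts the request and returns any extra response headers the caller should receive.
    Map<String, String> record(String path) {
        String version = versionOf(path);
        requestsByVersion.computeIfAbsent(version, v -> new LongAdder()).increment();
        Map<String, String> headers = new LinkedHashMap<>();
        if (version.equals(deprecatedVersion)) {
            headers.put("Sunset", sunsetHttpDate);
        }
        return headers;
    }

    // Number of requests seen so far for the given version.
    long count(String version) {
        LongAdder adder = requestsByVersion.get(version);
        return adder == null ? 0 : adder.sum();
    }
}
```

Retiring the old version then becomes a question the data can answer: once count("v1") has stayed at zero over a full reporting window, or the communicated date has passed, removal is defensible.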