— Case study
Java 21 Virtual Threads: 83% less report time, 75% fewer threads
Upgraded the report scheduler from Java 11 to Java 21 with virtual threads. Cut peak thread usage by 75% and report generation time from 12 min to 2 min.
- Role: Product Engineer
- Period: Q4 2025
- Stack: Java 21 · Spring Boot · Virtual Threads · GCP
Context
The report scheduler was the most thread-hungry service we ran. Concurrency peaked at end-of-day settlement: hundreds of jobs, each blocking on database I/O and SFTP delivery. On Java 11 with platform threads, we kept hitting GC pressure spikes that bled into the tail latency of otherwise-healthy services on the same node.
The upgrade had been sitting in the backlog for a quarter. I wrote the technical RFC, validated the path with a load-test harness, and got it through review.
Decisions
Tuning the platform-thread pool would have masked the symptom rather than fixing it, so the upgrade went straight to Java 21 and virtual threads. The workload was I/O-bound: jobs spent most of their time waiting on database calls and SFTP. What it needed was threads cheap enough to run one per job, each releasing its carrier thread while blocked instead of holding an OS thread idle.
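A minimal sketch of the thread-per-job shape this decision points to, not the scheduler's actual code. The class name, `runJob`, and the 100 ms sleep standing in for database and SFTP waits are all assumptions made for illustration; only `Executors.newVirtualThreadPerTaskExecutor()` is the standard Java 21 API.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VirtualThreadJobs {

    // Hypothetical stand-in for one report job: the sleep simulates
    // blocking on a database call or an SFTP transfer.
    static String runJob(int jobId) throws InterruptedException {
        Thread.sleep(Duration.ofMillis(100));
        return "report-" + jobId;
    }

    // Runs jobCount jobs, one virtual thread per job, and returns the
    // results in submission order.
    static List<String> runAll(int jobCount) {
        List<String> results = new ArrayList<>();
        // close() (via try-with-resources) waits for submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < jobCount; i++) {
                final int jobId = i;
                futures.add(executor.submit(() -> runJob(jobId)));
            }
            for (Future<String> future : futures) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("job failed", e.getCause());
        }
        return results;
    }

    public static void main(String[] args) {
        // Hundreds of blocking jobs, no pool sizing to tune.
        List<String> results = runAll(500);
        System.out.println(results.size() + " jobs finished");
    }
}
```

While a job sleeps or waits on I/O, its virtual thread is unmounted and the small set of carrier threads moves on to other jobs, which is why the platform-thread count stops tracking the job count.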
The virtual threads upgrade shipped Q4 2025. The API delivery channel came separately in Q1 2026 — once the scheduler was stable, adding HTTP alongside email and SFTP was a contained change. Bundling it into the original upgrade would have widened the blast radius; keeping it separate meant the Q4 cutover stayed focused.
Cutover was a dual-run, not a big-bang. Both schedulers ran against the same job catalog for two weeks with the new one in shadow mode, and we diffed outputs daily before flipping traffic.
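The daily diff can be pictured roughly as follows. This is a sketch under assumptions, not the tooling we ran: it supposes each scheduler's output has already been reduced to a map of job ID to content checksum, and the class name, job IDs, and checksum values are made up for the example.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

public class ShadowDiff {

    // Returns the job IDs whose output differs between the two schedulers,
    // including jobs that only one side produced.
    static Set<String> mismatches(Map<String, String> legacy, Map<String, String> shadow) {
        Set<String> allJobIds = new TreeSet<>(legacy.keySet());
        allJobIds.addAll(shadow.keySet());

        Set<String> differing = new TreeSet<>();
        for (String jobId : allJobIds) {
            // A job missing on one side yields null there and counts as a mismatch.
            if (!Objects.equals(legacy.get(jobId), shadow.get(jobId))) {
                differing.add(jobId);
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        // Hypothetical job IDs and checksums.
        Map<String, String> legacy = Map.of(
                "settlement-eu", "a1f3",
                "settlement-us", "9c2e");
        Map<String, String> shadow = Map.of(
                "settlement-eu", "a1f3",
                "settlement-us", "0000",
                "settlement-apac", "77b0");

        // prints [settlement-apac, settlement-us]
        System.out.println(mismatches(legacy, shadow));
    }
}
```

Treating a job that exists on only one side as a mismatch matters in shadow mode: a job the new scheduler silently skipped is as much a regression as one it rendered differently.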
Results
- Peak thread usage down 75%.
- Report generation time down 83% (12 min → 2 min, end-of-day settlement window).
- GC pressure spikes eliminated; co-located services no longer affected.
- API delivery channel shipped Q1 2026; adopted by 3 internal consumers within the first month.
Stack notes
The least obvious cost was in the JDBC layer. A few connection pools we used pinned virtual threads to their carrier threads while blocking, which would have made the upgrade worse, not better, under load. (On Java 21, a virtual thread that blocks inside a synchronized block cannot unmount from its carrier, so the carrier sits idle with it.) Catching that in the load-test harness before production was the real win. The version bump itself was a one-line change.
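For readers who have not run into pinning, here is an illustrative sketch of the general pattern on Java 21. It does not reproduce the pools we used, and the write-up does not record which mechanism caused the pinning in our case; the `synchronized` cause shown is the common one and is an assumption here. The class and method names are hypothetical, and the sleep stands in for a blocking database call. Running a Java 21 load test with the JDK system property `-Djdk.tracePinnedThreads=full` prints a stack trace when a virtual thread blocks while pinned, which is one way to surface this kind of problem.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class PinningSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicInteger served = new AtomicInteger();

    // Pinning-prone on Java 21: blocking inside a synchronized method keeps
    // the carrier thread occupied for the whole wait.
    synchronized void borrowPinned() throws InterruptedException {
        Thread.sleep(Duration.ofMillis(5)); // stand-in for a blocking DB call
        served.incrementAndGet();
    }

    // Virtual-thread-friendly: a ReentrantLock lets the virtual thread
    // unmount from its carrier while it waits or blocks.
    void borrowUnpinned() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(Duration.ofMillis(5)); // stand-in for a blocking DB call
            served.incrementAndGet();
        } finally {
            lock.unlock();
        }
    }

    // Runs the given number of tasks on virtual threads through the
    // lock-based path and returns how many completed.
    static int runUnpinned(int tasks) {
        PinningSketch pool = new PinningSketch();
        // close() (via try-with-resources) waits for all tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    pool.borrowUnpinned();
                    return null;
                });
            }
        }
        return pool.served.get();
    }

    public static void main(String[] args) {
        System.out.println(runUnpinned(20) + " borrows completed");
    }
}
```

Both methods serialize access in the same way; the difference is only in what happens to the carrier thread while the caller is blocked, which is why the problem tends to show up under load rather than in functional tests.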