In Part 1 we measured what web backends cost before serving a single request — idle memory ranging from 3 MB (Rust) to 500 MB (Spring Boot). In Part 2 we measured what happens when traffic arrives — throughput, tail latency, and the surprising effect of adding a real database.
Now comes the question developers actually need answered: given all this data, how do you choose?
The honest answer is that backend weight is not a flaw to minimize — it’s a trade-off to understand. Heavier runtimes invest their overhead in infrastructure that lighter alternatives either don’t have or require you to assemble yourself. The goal isn’t the lightest stack. It’s the right one.
## The Full Weight Spectrum
Here’s every major framework ranked by weight class, combining the idle cost data from Part 1 with throughput and concurrency data from Part 2:
| Weight Class | Framework | Idle RAM | Startup | Req/s (JSON) | Docker Image | Concurrency Model |
|---|---|---|---|---|---|---|
| Ultralight | Rust (Actix/Axum) | 3–15 MB | <5 ms | ~165,000 | 5–10 MB | Async tasks (tokio) |
| Ultralight | Go (net/http) | 8–15 MB | <10 ms | ~132,000 | 5–12 MB | Goroutine per request |
| Light | C# (Native AOT) | 17–23 MB | 14–17 ms | High | 18–90 MB | Async I/O, native binary |
| Light | Elixir (Phoenix) | 24–70 MB | 1–3 s | ~4,375* | 25–80 MB | BEAM processes (~2 KB each) |
| Light | PHP (plain FPM) | 5–15 MB/worker | <5 ms/req | ~7,000 | 51–80 MB | Process per request |
| Medium | Flask (Gunicorn) | 30–50 MB | 0.3–1 s | ~3,000 | 50–70 MB | Pre-forked workers |
| Medium | FastAPI (Uvicorn) | 40–60 MB | 0.5–1.5 s | ~4,800** | 50–80 MB | Async event loop |
| Medium | Node.js (Express) | 50–55 MB | 200–500 ms | ~13,000 | 100–180 MB | Single-thread event loop |
| Medium | C# (ASP.NET Core) | 40–80 MB | 70–80 ms | ~118,000 | 120–216 MB | Async I/O, JIT compiled |
| Medium | Java (Spring Native) | 50–80 MB | 30–90 ms | Moderate | ~136 MB | Native binary, no JVM |
| Heavy | Django (Gunicorn) | 70–130 MB | 2–5 s | ~950 | 80–120 MB | Pre-forked workers |
| Heavy | Ruby (Rails/Puma) | 80–150 MB | 3–8 s | ~2,340 | 180–300 MB | Forked workers + threads |
| Heavy | PHP (Laravel/FPM) | 30–60 MB/worker | 50–200 ms/req | ~299 | 100–150 MB | Process per request |
| Heavyweight | Java (Spring Boot/JVM) | 250–500 MB | 2.5–5 s | ~18,500 | 250–430 MB | JVM thread pool (200 default) |
\* Phoenix’s HTTP throughput is moderate, but it handles 2M+ concurrent WebSocket connections — best for connection-dense workloads.

\*\* FastAPI with asyncpg + ujson + 8 workers. The default single worker is ~1,185 req/s.
## What Heavy Actually Buys You
Before writing off the heavyweights, it’s worth understanding what that overhead purchases.
Spring Boot’s 300 MB baseline isn’t waste — it’s investment in runtime infrastructure:
JIT optimizations unavailable to ahead-of-time-compiled languages. The JVM’s C2 compiler can inline virtual method calls based on observed call targets, perform speculative optimizations informed by actual runtime data, and eliminate heap allocations through escape analysis. After 15–60 seconds of warm-up, a Java REST API can reach 50,000–100,000 requests per second — competitive with Go for long-running server processes with stable traffic patterns.
Ecosystem depth you don’t have to build. Spring Boot includes out-of-the-box: OAuth2/SAML/LDAP security, JPA/JDBC/MongoDB/Redis data access, Kafka/RabbitMQ messaging, distributed tracing, Actuator observability, batch processing, and transaction management across multiple data sources. In a lightweight ecosystem, you assemble equivalent functionality from scattered packages and maintain compatibility yourself.
Production diagnostic tooling. Java Flight Recorder, async-profiler, heap dump analysis, GC logging — the JVM has the most mature production diagnostics of any runtime. Go has pprof and Rust has perf, but neither matches JVM depth for production troubleshooting.
## The Concurrency Model Is the Biggest Variable
When choosing a backend, the concurrency model matters more than raw throughput numbers. It determines how your application behaves when load exceeds your expectations.
| Framework | Behavior Under Overload |
|---|---|
| Go, Rust, Elixir | Graceful degradation. New goroutines/tasks/processes are cheap. Latency rises gradually. |
| Node.js | Event loop slows. CPU-bound work blocks everything. No cliff, but a hard single-thread ceiling. |
| Java (Virtual Threads, JDK 21+) | Similar to Go — graceful scaling. This is the modern answer to Java’s threading problem. |
| Java (Traditional Tomcat) | Cliff at thread pool exhaustion. When all 200 threads are busy, requests queue. Latency spikes. |
| PHP-FPM | Cliff at worker pool exhaustion. Fixed worker count means a hard concurrency limit. |
| Rails/Puma | Moderate degradation. GVL limits parallelism within each worker. Queue builds when all threads are busy. |
| Python (Gunicorn sync) | Worker-bounded. Each worker handles one request at a time. |
The frameworks with “cliffs” — traditional Java Tomcat and PHP-FPM — are not fatally flawed, but they require explicit capacity planning. You need to know your concurrency ceiling before you hit it in production.
## Framework Overhead vs. Language Overhead
One of the most common mistakes is conflating language weight with framework weight. They are separate variables:
| Framework | Language Baseline | Framework Adds | Total |
|---|---|---|---|
| Spring Boot | JVM ~50–180 MB | +100–200 MB (autoconfiguration, Tomcat, DI) | 250–400 MB |
| Rails | Ruby ~20–30 MB | +300–400 MB (ActiveRecord, gems, metaprogramming) | 400–600 MB |
| Django | Python ~20–30 MB | +100–120 MB (ORM, admin, middleware) | 120–140 MB |
| Laravel | PHP ~5–15 MB | +50–80 MB (Eloquent, queues, auth) | 60–100 MB |
| Phoenix | BEAM VM ~30–50 MB | +minimal | 30–50 MB |
| Express | V8 ~20–30 MB | +10–20 MB (routing + middleware) | 30–50 MB |
| Gin | Go runtime ~5–7 MB | +5–10 MB (HTTP router) | 10–15 MB |
| Actix Web | None | ~5 MB total | ~5 MB |
Phoenix is the outlier: a batteries-included framework on an ultralight runtime. The BEAM VM’s per-process isolation keeps it lean despite providing real-time channels, PubSub, and an ORM.
## How Many Instances Fit on 8 GB?
For teams running multiple services or planning Kubernetes deployments, memory density directly maps to infrastructure cost. With ~7 GB available after OS overhead:
| Framework | Memory/Instance | Instances on 7 GB |
|---|---|---|
| Rust (Actix) | ~10–30 MB | 230–700 |
| Go (Gin) | ~25–70 MB | 100–280 |
| Elixir (Phoenix) | ~50–100 MB | 70–140 |
| Node.js (Express) | ~40–80 MB | 87–175 |
| C# (ASP.NET Core) | ~100–300 MB | 23–70 |
| Python (Flask, 4 workers) | ~200–300 MB | 23–35 |
| Java (Spring Boot, tuned) | ~128–256 MB | 27–55 |
| Java (Spring Boot, default) | ~256–512 MB | 13–27 |
| Ruby (Rails, 4 workers) | ~400–600 MB | 11–17 |
The cloud cost implication is real: 50 microservices at 512 MB (Java default) = 25 GB of memory. The same 50 services at 50 MB (Go) = 2.5 GB. That’s a 10x difference in infrastructure costs.
But before optimizing for memory density, check whether you actually need 50 microservices — or whether a single well-tuned JVM monolith would serve those 50 concerns more efficiently with less operational overhead.
## The Database Equalizer
The most important nuance in this entire series: most web applications are I/O-bound, not CPU-bound.
When benchmarks include real database queries, the performance gaps that look enormous in pure HTTP tests collapse dramatically:
| Framework | JSON (no DB) | With DB Queries | What Happened |
|---|---|---|---|
| Go (Gin) | ~132,000 req/s | ~7,517 req/s | 18x gap… |
| Spring Boot (Java) | ~18,500 req/s | ~7,886 req/s | …becomes ~1x |
| FastAPI (Python) | ~4,800 req/s | ~4,831 req/s | Already DB-bound at baseline |
| Express (Node.js) | ~13,000 req/s | ~4,145 req/s | Converges with Java |
When your service spends 95% of its time waiting on a database query, the difference between Go and Java in CPU efficiency is noise. The bottleneck is your database, not your language. Language weight only dominates the discussion when you’ve already optimized your data layer.
## Choosing the Right Weight
With all of the above in mind, here’s a practical decision framework:
| Scenario | Recommended | Why |
|---|---|---|
| Serverless / edge computing | Ultralight (Go, Rust) | Cold start is everything; JVM warm-up never pays off |
| Microservices at scale (many instances) | Light–Medium (Go, Node.js, Elixir) | Memory density and scaling speed matter |
| Enterprise applications | Heavy (Java/Spring, C#/.NET) | Ecosystem depth, tooling, and long-term maintainability |
| Rapid prototyping / startups | Heavy framework, light runtime (Laravel, Django) | Developer velocity over server cost |
| Real-time / WebSocket-heavy | Light (Elixir/Phoenix) | 2M+ connections, per-process garbage collection |
| AI/ML service backends | Medium (Python/FastAPI) | Python ML ecosystem is unmatched |
| High-throughput APIs (compute-bound) | Ultralight (Rust, Go) | When CPU efficiency genuinely matters |
| Long-running, compute-heavy workloads | Heavyweight JVM (Java) | JIT optimizations compound over hours of runtime |
## When Weight Genuinely Doesn’t Matter
There are three situations where this entire analysis becomes irrelevant:
Developer productivity vs. server cost. For a team of 10 engineers at $150K+, saving $500/month on cloud infrastructure by choosing Go over Java is a rounding error if Spring Boot saves each developer two hours per week. The human cost almost always dominates the infrastructure cost at team scale.
Monolithic deployments. The JVM overhead is paid once and shared across all endpoints. A single well-tuned JVM monolith can be more memory-efficient than a fleet of 20 lightweight microservices, each with its own container overhead, sidecar proxy, and health check process.
I/O-bound applications. When the database is the bottleneck, language weight is noise. Optimize your query patterns, add indexes, and tune your connection pool before considering a rewrite in a lighter language.
## The Mental Model
Think of backend weight as layers, each adding overhead — and each buying something:
| Layer | Java | Ruby | Node.js | Go |
|---|---|---|---|---|
| 5. Framework | Spring: +200 MB | Rails: +400 MB | Express: +20 MB | Gin: +8 MB |
| 4. Standard library | Large | Large | Moderate | Moderate |
| 3. Concurrency model | Threads: ~1 MB each | GVL-limited | Event loop | Goroutines: ~4 KB each |
| 2. Memory management | JVM GC: % of heap | Ruby GC | V8 GC | Go GC: low-pause |
| 1. Execution engine | JVM: ~100 MB | Ruby: ~25 MB | V8: ~25 MB | Go: ~5 MB |
| 0. Operating system | Shared by all | — | — | — |
Java is heavyweight because overhead accumulates at every layer. Go is lightweight because layers 1–3 are minimal. Rust is ultralight because layers 1–4 are essentially zero. Elixir is the outlier: moderate at the VM layer, but with the most efficient concurrency model available for connection-dense workloads.
Every layer of overhead exists to provide a capability. The JIT compiler buys peak throughput that can exceed AOT-compiled code. The garbage collector buys freedom from manual memory management. The thick framework buys developer productivity. OS threads buy a simple synchronous programming model.
The question isn’t which stack is lightest. It’s which capabilities you actually need — and whether you’re willing to pay the cost to get them.
Data sources: TechEmpower Framework Benchmarks (Round 22/23), Sharkbench, Phoenix Road to 2M WebSocket Connections, and 80+ framework-specific benchmarks and documentation sources compiled in February 2026.
This is Part 3 of the Web Stack Weight series. Read Part 1: What Your Backend Costs at Rest and Part 2: What Your Backend Costs Under Load.