The Weight of Your Web Stack, Part 3: Choosing the Right Backend for the Job

In Part 1 we measured what web backends cost before serving a single request — idle memory ranging from 3 MB (Rust) to 500 MB (Spring Boot). In Part 2 we measured what happens when traffic arrives — throughput, tail latency, and the surprising effect of adding a real database.

Now comes the question developers actually need answered: given all this data, how do you choose?

The honest answer is that backend weight is not a flaw to minimize — it’s a trade-off to understand. Heavier runtimes invest their overhead in infrastructure that lighter alternatives either don’t have or require you to assemble yourself. The goal isn’t the lightest stack. It’s the right one.

The Full Weight Spectrum

Here’s every major framework ranked by weight class, combining the idle cost data from Part 1 with throughput and concurrency data from Part 2:

| Weight Class | Framework | Idle RAM | Startup | Req/s (JSON) | Docker Image | Concurrency Model |
|---|---|---|---|---|---|---|
| Ultralight | Rust (Actix/Axum) | 3–15 MB | <5 ms | ~165,000 | 5–10 MB | Async tasks (tokio) |
| Ultralight | Go (net/http) | 8–15 MB | <10 ms | ~132,000 | 5–12 MB | Goroutine per request |
| Light | C# (Native AOT) | 17–23 MB | 14–17 ms | High | 18–90 MB | Async I/O, native binary |
| Light | Elixir (Phoenix) | 24–70 MB | 1–3 s | ~4,375* | 25–80 MB | BEAM processes (~2 KB each) |
| Light | PHP (plain FPM) | 5–15 MB/worker | <5 ms/req | ~7,000 | 51–80 MB | Process per request |
| Medium | Flask (Gunicorn) | 30–50 MB | 0.3–1 s | ~3,000 | 50–70 MB | Pre-forked workers |
| Medium | FastAPI (Uvicorn) | 40–60 MB | 0.5–1.5 s | ~4,800** | 50–80 MB | Async event loop |
| Medium | Node.js (Express) | 50–55 MB | 200–500 ms | ~13,000 | 100–180 MB | Single-thread event loop |
| Medium | C# (ASP.NET Core) | 40–80 MB | 70–80 ms | ~118,000 | 120–216 MB | Async I/O, JIT compiled |
| Medium | Java (Spring Native) | 50–80 MB | 30–90 ms | Moderate | ~136 MB | Native binary, no JVM |
| Heavy | Django (Gunicorn) | 70–130 MB | 2–5 s | ~950 | 80–120 MB | Pre-forked workers |
| Heavy | Ruby (Rails/Puma) | 80–150 MB | 3–8 s | ~2,340 | 180–300 MB | Forked workers + threads |
| Heavy | PHP (Laravel/FPM) | 30–60 MB/worker | 50–200 ms/req | ~299 | 100–150 MB | Process per request |
| Heavyweight | Java (Spring Boot/JVM) | 250–500 MB | 2.5–5 s | ~18,500 | 250–430 MB | JVM thread pool (200 default) |

* Phoenix’s HTTP throughput is moderate, but it handles 2M+ concurrent WebSocket connections — best for connection-dense workloads. ** FastAPI with asyncpg + ujson + 8 workers. Default single worker is ~1,185 req/s.

What Heavy Actually Buys You

Before writing off the heavyweights, it’s worth understanding what that overhead purchases.

Spring Boot’s 300 MB baseline isn’t waste — it’s investment in runtime infrastructure:

JIT optimizations unavailable to ahead-of-time-compiled languages. The JVM’s C2 compiler can inline virtual method calls based on observed call targets, perform speculative optimizations on actual runtime data, and eliminate heap allocations through escape analysis. After 15–60 seconds of warm-up, a Java REST API can reach 50,000–100,000 requests per second, making it competitive for long-running server processes with stable traffic patterns.

Ecosystem depth you don’t have to build. Spring Boot includes out-of-the-box: OAuth2/SAML/LDAP security, JPA/JDBC/MongoDB/Redis data access, Kafka/RabbitMQ messaging, distributed tracing, Actuator observability, batch processing, and transaction management across multiple data sources. In a lightweight ecosystem, you assemble equivalent functionality from scattered packages and maintain compatibility yourself.

Production diagnostic tooling. Java Flight Recorder, async-profiler, heap dump analysis, GC logging — the JVM has the most mature production diagnostics of any runtime. Go has pprof and Rust has perf, but neither matches JVM depth for production troubleshooting.

The Concurrency Model Is the Biggest Variable

When choosing a backend, the concurrency model matters more than raw throughput numbers. It determines how your application behaves when load exceeds your expectations.

| Framework | Behavior Under Overload |
|---|---|
| Go, Rust, Elixir | Graceful degradation. New goroutines/tasks/processes are cheap. Latency rises gradually. |
| Node.js | Event loop slows. CPU-bound work blocks everything. No cliff, but a hard single-thread ceiling. |
| Java (Virtual Threads, JDK 21+) | Similar to Go — graceful scaling. This is the modern answer to Java’s threading problem. |
| Java (Traditional Tomcat) | Cliff at thread pool exhaustion. When all 200 threads are busy, requests queue. Latency spikes. |
| PHP-FPM | Cliff at worker pool exhaustion. Fixed worker count means a hard concurrency limit. |
| Rails/Puma | Moderate degradation. GVL limits parallelism within each worker. Queue builds when all threads are busy. |
| Python (Gunicorn sync) | Worker-bounded. Each worker handles one request at a time. |

The frameworks with “cliffs” — traditional Java Tomcat and PHP-FPM — are not fatally flawed, but they require explicit capacity planning. You need to know your concurrency ceiling before you hit it in production.

Framework Overhead vs. Language Overhead

One of the most common mistakes is conflating language weight with framework weight. They are separate variables:

| Framework | Language Baseline | Framework Adds | Total |
|---|---|---|---|
| Spring Boot | JVM ~50–180 MB | +100–200 MB (autoconfiguration, Tomcat, DI) | 250–400 MB |
| Rails | Ruby ~20–30 MB | +300–400 MB (ActiveRecord, gems, metaprogramming) | 400–600 MB |
| Django | Python ~20–30 MB | +100–120 MB (ORM, admin, middleware) | 120–140 MB |
| Laravel | PHP ~5–15 MB | +50–80 MB (Eloquent, queues, auth) | 60–100 MB |
| Phoenix | BEAM VM ~30–50 MB | +minimal | 30–50 MB |
| Express | V8 ~20–30 MB | +10–20 MB (routing + middleware) | 30–50 MB |
| Gin | Go runtime ~5–7 MB | +5–10 MB (HTTP router) | 10–15 MB |
| Actix Web | None | ~5 MB total | ~5 MB |

Phoenix is the outlier: a batteries-included framework on an ultralight runtime. The BEAM VM’s per-process isolation keeps it lean despite providing real-time channels, PubSub, and an ORM.

How Many Instances Fit on 8 GB?

For teams running multiple services or planning Kubernetes deployments, memory density directly maps to infrastructure cost. With ~7 GB available after OS overhead:

| Framework | Memory/Instance | Instances on 7 GB |
|---|---|---|
| Rust (Actix) | ~10–30 MB | 230–700 |
| Go (Gin) | ~25–70 MB | 100–280 |
| Elixir (Phoenix) | ~50–100 MB | 70–140 |
| Node.js (Express) | ~40–80 MB | 87–175 |
| C# (ASP.NET Core) | ~100–300 MB | 23–71 |
| Python (Flask, 4 workers) | ~200–300 MB | 23–35 |
| Java (Spring Boot, tuned) | ~128–256 MB | 28–56 |
| Java (Spring Boot, default) | ~256–512 MB | 14–28 |
| Ruby (Rails, 4 workers) | ~400–600 MB | 11–17 |

The cloud cost implication is real: 50 microservices at 512 MB (Java default) = 25 GB of memory. The same 50 services at 50 MB (Go) = 2.5 GB. That’s a 10x difference in infrastructure costs.

But before optimizing for memory density, check whether you actually need 50 microservices — or whether a single well-tuned JVM monolith would serve those 50 concerns more efficiently with less operational overhead.

The Database Equalizer

The most important nuance in this entire series: most web applications are I/O-bound, not CPU-bound.

When benchmarks include real database queries, the performance gaps that look enormous in pure HTTP tests collapse dramatically:

| Framework | JSON (no DB) | With DB Queries | What Happened |
|---|---|---|---|
| Go (Gin) | ~132,000 req/s | ~7,517 req/s | 18x gap… |
| Spring Boot (Java) | ~18,500 req/s | ~7,886 req/s | …becomes ~1x |
| FastAPI (Python) | ~4,800 req/s | ~4,831 req/s | Already DB-bound at baseline |
| Express (Node.js) | ~13,000 req/s | ~4,145 req/s | Converges with Java |

When your service spends 95% of its time waiting on a database query, the difference between Go and Java in CPU efficiency is noise. The bottleneck is your database, not your language. Language weight only dominates the discussion when you’ve already optimized your data layer.

Choosing the Right Weight

With all of the above in mind, here’s a practical decision framework:

| Scenario | Recommended | Why |
|---|---|---|
| Serverless / edge computing | Ultralight (Go, Rust) | Cold start is everything; JVM warm-up never pays off |
| Microservices at scale (many instances) | Light–Medium (Go, Node.js, Elixir) | Memory density and scaling speed matter |
| Enterprise applications | Heavy (Java/Spring, C#/.NET) | Ecosystem depth, tooling, and long-term maintainability |
| Rapid prototyping / startups | Heavy framework, light runtime (Laravel, Django) | Developer velocity over server cost |
| Real-time / WebSocket-heavy | Light (Elixir/Phoenix) | 2M+ connections, per-process garbage collection |
| AI/ML service backends | Medium (Python/FastAPI) | Python ML ecosystem is unmatched |
| High-throughput APIs (compute-bound) | Ultralight (Rust, Go) | When CPU efficiency genuinely matters |
| Long-running, compute-heavy workloads | Heavyweight JVM (Java) | JIT optimizations compound over hours of runtime |

When Weight Genuinely Doesn’t Matter

There are three situations where this entire analysis becomes irrelevant:

Developer productivity vs. server cost. For a team of 10 engineers at $150K+, saving $500/month on cloud infrastructure by choosing Go over Java is a rounding error if Spring Boot saves each developer two hours per week. The human cost almost always dominates the infrastructure cost at team scale.

Monolithic deployments. The JVM overhead is paid once and shared across all endpoints. A single well-tuned JVM monolith can be more memory-efficient than a fleet of 20 lightweight microservices, each with its own container overhead, sidecar proxy, and health check process.

I/O-bound applications. When the database is the bottleneck, language weight is noise. Optimize your query patterns, add indexes, and tune your connection pool before considering a rewrite in a lighter language.

The Mental Model

Think of backend weight as layers, each adding overhead — and each buying something:

```
Layer 5: Framework           Spring: +200MB    Rails: +400MB    Express: +20MB    Gin: +8MB
Layer 4: Standard Library    Java: large       Ruby: large      Node: moderate    Go: moderate
Layer 3: Concurrency Model   Threads: 1MB ea   GVL-limited      Event loop        Goroutines: 4KB
Layer 2: Memory Management   JVM GC: % heap    Ruby GC          V8 GC             Go GC: low-pause
Layer 1: Execution Engine    JVM: ~100MB       Ruby: ~25MB      V8: ~25MB         Go: ~5MB
Layer 0: Operating System    [Shared by all]
```

Java is heavyweight because overhead accumulates at every layer. Go is lightweight because layers 1–3 are minimal. Rust is ultralight because layers 1–4 are essentially zero. Elixir is the outlier: moderate at the VM layer, but with the most efficient concurrency model available for connection-dense workloads.

Every layer of overhead exists to provide a capability. The JIT compiler buys peak throughput that can rival or exceed AOT-compiled binaries on stable workloads. The garbage collector buys freedom from manual memory management. The thick framework buys developer productivity. OS threads buy a simple synchronous programming model.

The question isn’t which stack is lightest. It’s which capabilities you actually need — and whether you’re willing to pay the cost to get them.


Data sources: TechEmpower Framework Benchmarks (Round 22/23), Sharkbench, Phoenix Road to 2M WebSocket Connections, and 80+ framework-specific benchmarks and documentation sources compiled in February 2026.

This is Part 3 of the Web Stack Weight series. Read Part 1: What Your Backend Costs at Rest and Part 2: What Your Backend Costs Under Load.
