Last week we looked at the top web programming languages and the frameworks developers are choosing in 2026. But there’s a question those rankings don’t answer: what does each of those choices actually cost your server?
Not in dollars (though we’ll get there), but in memory, startup time, and disk space. Before your application serves a single request, it has already claimed resources just by existing. Some frameworks claim 3 MB. Others claim 500 MB. That difference matters when you’re running dozens of services, scaling in Kubernetes, or paying per-gigabyte on cloud infrastructure.
We dug into benchmark data from Sharkbench, TechEmpower, framework documentation, and 40+ independent benchmark studies to quantify the standing cost of 14 popular web backend combinations.
This is Part 1 of a two-part series. This post covers what your backend costs at rest — idle memory, startup time, Docker image sizes, and concurrency models. Part 2 covers what happens when traffic arrives.
The Standing Cost: Every Framework Ranked
Here’s what each framework consumes when it’s running but idle — no traffic, just waiting for requests. All measurements use default configurations unless otherwise noted.
| Framework | Idle RAM | Startup Time | Docker Image | Weight Class |
|---|---|---|---|---|
| Rust (Actix/Axum) | 3-15 MB | <5 ms | 5-10 MB | Ultralight |
| Go (net/http) | 8-15 MB | <10 ms | 5-12 MB | Ultralight |
| C# (Native AOT) | 17-23 MB | 14-17 ms | 18-90 MB | Light |
| Elixir (Phoenix) | 24-70 MB | 1-3 s | 25-80 MB | Light |
| PHP plain (FPM) | 5-15 MB/worker | <5 ms/req | 51-80 MB | Light |
| Python (Flask) | 30-50 MB | 0.3-1 s | 50-70 MB | Medium |
| Python (FastAPI) | 40-60 MB | 0.5-1.5 s | 50-80 MB | Medium |
| C# (ASP.NET Core) | 40-80 MB | 70-80 ms | 120-216 MB | Medium |
| Node.js (Express) | 50-55 MB | 200-500 ms | 100-180 MB | Medium |
| Java (Spring Native) | 50-80 MB | 30-90 ms | ~136 MB | Medium |
| Python (Django) | 70-130 MB | 2-5 s | 80-120 MB | Heavy |
| Ruby (Rails/Puma) | 80-150 MB | 3-8 s | 180-300 MB | Heavy |
| PHP (Laravel/FPM) | 30-60 MB/worker | 50-200 ms/req | 100-150 MB | Heavy |
| Java (Spring Boot/JVM) | 250-500 MB | 2.5-5 s | 250-430 MB | Heavyweight |
Sources: Sharkbench Web Framework Benchmark, Markaicode Rust Framework Benchmark 2025, Thinktecture: Native AOT with ASP.NET Core, Baeldung: Spring Boot Default Memory Settings, DeployHQ: Ruby Application Servers 2025
The spread is enormous. A Rust Axum binary sits at 3 MB of RAM in a 5 MB Docker image. A default Spring Boot application with embedded Tomcat claims 250-500 MB of RAM in a 250-430 MB Docker image. That's a difference of one to two orders of magnitude in idle resource consumption.
But as we’ll see, those numbers don’t tell the full story.
What You’re Actually Paying For
The idle cost isn’t waste — it’s a pre-investment in different things depending on the runtime. Understanding what each layer buys you explains why some frameworks are deliberately heavy.
The Weight Stack
Think of backend overhead as layers, each adding cost for a reason:
| Layer | Component | Typical Cost by Runtime |
|---|---|---|
| 5 | Framework | Spring: +200 MB, Rails: +400 MB, Express: +20 MB, Gin: +8 MB |
| 4 | Standard library | Java: large, Ruby: large, Node: moderate, Go: moderate |
| 3 | Concurrency model | Java threads: ~1 MB each, Ruby: GVL-limited, Node: event loop, Go goroutines: ~4 KB |
| 2 | Memory management | JVM GC, Ruby GC, V8 GC, Go GC (low-pause) |
| 1 | Execution engine | JVM: ~100 MB, Ruby: ~25 MB, V8: ~25 MB, Go: ~5 MB |
| 0 | Operating system | Shared by all runtimes |
Java is heavyweight because it has substantial overhead at every layer. Go is lightweight because layers 1 through 3 are minimal. Rust is ultralight because layers 1 through 4 are essentially zero — there’s no runtime, no garbage collector, no virtual machine. Elixir is a special case: the BEAM VM adds moderate overhead at layer 1, but its concurrency model at layer 3 is the most memory-efficient available.
Every layer of overhead exists to provide a capability:
| Overhead | What It Buys You |
|---|---|
| JIT compiler (JVM, V8) | Peak throughput that can exceed ahead-of-time compiled code for long-running processes |
| Garbage collector | Freedom from manual memory management and entire classes of bugs |
| Class loading / reflection | Runtime flexibility, dependency injection frameworks |
| Thick framework (Spring, Rails) | Developer productivity, convention over configuration, batteries included |
| OS threads | Simple synchronous programming model that’s easy to reason about |
Source: Red Hat: How the JVM Uses and Allocates Memory
The Three Weight Classes, Explained
Ultralight: Rust and Go (3-15 MB)
Rust with Actix Web or Axum produces a single static binary with no runtime, no garbage collector, and no virtual machine. Idle memory is 3-15 MB. Docker images built on scratch are 5-10 MB. Startup is effectively instant — under 5 milliseconds.
Go is nearly as lean. The Go runtime adds a small scheduler and garbage collector (about 5-7 MB), but goroutines cost only 2-4 KB each compared to the ~1 MB per OS thread in traditional Java. Docker images are 5-12 MB with a static binary.
Both languages compile to native binaries that need nothing else to run. No JDK, no interpreter, no node_modules. This is why they dominate in serverless and edge computing where cold start time is everything.
Best for: Microservices at scale, serverless functions, Kubernetes sidecars, edge computing, any environment where you’re paying per-MB of memory.
Sources: Sharkbench Rust Benchmarks (16.6 MB Actix, 8.5 MB Axum under load), Sharkbench Go Benchmarks (13.4 MB FastHTTP, 16.7 MB Gin under load)
Medium: Node.js, Python, C#, Elixir (24-80 MB)
This tier uses lightweight runtimes — V8, CPython, CoreCLR, BEAM — that add moderate baseline overhead but provide significant developer productivity benefits.
Node.js with Express claims 50-55 MB at idle. The V8 engine is 20-30 MB alone, with the rest going to the event loop infrastructure and compiled bytecode cache. The single-threaded event loop model means one process handles many concurrent connections via callbacks and promises, keeping memory scaling modest. TypeScript adds a compilation step but no runtime overhead once compiled.
Python with FastAPI sits at 40-60 MB for a single Uvicorn worker. FastAPI’s async model (built on ASGI) handles concurrent I/O via coroutines, but CPU-bound work blocks the event loop. Real deployments typically run 2-8 workers, multiplying memory to 120-400 MB.
C# with ASP.NET Core is 40-80 MB with the standard CLR. But Microsoft’s Native AOT compilation drops this to 17-23 MB with 14-17 ms startup — competitive with Go. The catch is that Native AOT doesn’t support all .NET features (reflection, dynamic loading), so it’s not a drop-in replacement for every application.
Elixir with Phoenix starts at 24-70 MB. The BEAM VM pre-allocates one scheduler per CPU core and maintains the OTP supervision tree, which adds overhead. But Phoenix’s lightweight processes (starting at ~0.5 KB each) mean it handles massive concurrency without significant additional memory per connection. The BEAM’s per-process garbage collection avoids the stop-the-world pauses that plague other runtimes.
Sources: Node.js: Understanding and Tuning Memory, Better Stack: Flask vs FastAPI, Thinktecture: Native AOT with ASP.NET Core, Phoenix: The Road to 2 Million WebSocket Connections
Heavy and Heavyweight: Rails, Django, Laravel, Spring Boot (70-500 MB)
These frameworks trade server resources for developer productivity. The overhead isn’t accidental — it’s the cost of having an ORM, admin interface, middleware stack, security framework, and template engine loaded and ready.
Ruby on Rails is the heaviest framework-plus-language combination at 80-150 MB for a single Puma process. In cluster mode with 4 workers, that's 200-300 MB (copy-on-write sharing of the preloaded application keeps the total below a naive 4x multiple). Rails eager-loads application code, Active Record, the middleware stack, and the asset pipeline at startup, which takes 3-8 seconds. The payoff is the most productive CRUD development experience available — a single rails generate scaffold builds a full REST resource with model, controller, views, and migrations.
Python with Django claims 70-130 MB. Django’s ORM, admin interface, middleware stack, and template engine all load at startup. With 2-3 Gunicorn workers, you’re looking at 130-200 MB.
PHP with Laravel is a special case we’ll discuss below. Per-worker memory is moderate (30-60 MB), but the per-request bootstrap cost of 50-200 ms is hidden overhead that doesn’t show up in idle measurements.
Java with Spring Boot is the canonical heavyweight at 250-500 MB. The JVM alone claims ~100 MB. Spring’s auto-configuration, embedded Tomcat, dependency injection container, and a default thread pool of 200 threads (each consuming ~1 MB of stack space) add the rest. Startup takes 2.5-5 seconds for a simple application, longer for complex ones.
But Spring Boot’s weight buys the deepest ecosystem of any web framework: Spring Security (OAuth2, SAML, LDAP), Spring Data (JPA, MongoDB, Redis, Elasticsearch), Spring Cloud (service discovery, circuit breakers, distributed tracing), Spring Batch, and diagnostic tooling (Java Flight Recorder, async-profiler) that no other runtime matches for production troubleshooting.
Sources: Baeldung: Spring Boot Memory Usage Optimization, DeployHQ: Ruby Application Servers 2025, Kevin Dees: PHP-FPM Scalable Configuration
The PHP Paradox
PHP deserves its own section because its resource model is fundamentally different from every other language on this list.
PHP-FPM with pm=ondemand can be the most idle-efficient framework on this list — zero workers run when there’s no traffic, and the master process claims just ~20 MB. But every single request pays a bootstrap cost that other frameworks pay once at startup. For plain PHP, that’s under 5 ms per request. For Laravel, it’s 50-200 ms per request as the framework rebuilds its service container, loads configuration, and registers middleware — every time.
This shared-nothing architecture means PHP is simultaneously the most idle-efficient and least request-efficient framework in this comparison. Opcache eliminates re-compilation of PHP bytecode, but it cannot eliminate the re-execution of framework bootstrap logic.
Laravel Octane (with Swoole or FrankenPHP) changes this equation by keeping the framework warm between requests, similar to how Node.js or Go work. Benchmarks show a 5x improvement over traditional PHP-FPM, but it requires rethinking how you handle state in PHP code — a significant shift for a language built around the shared-nothing model.
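The gap between the two models is easy to demonstrate. The sketch below is purely illustrative, with Python standing in for PHP: the hypothetical `bootstrap()` stands in for the work Laravel repeats on every request (building the service container, loading configuration, registering middleware), which a warm runtime like Octane performs once.

```python
import time

def bootstrap():
    # Stand-in for framework bootstrap work: build a routing table and
    # "load" configuration. Purely synthetic, for illustration only.
    routes = {f"/route/{i}": (lambda i=i: i) for i in range(2_000)}
    config = {f"option_{i}": i for i in range(1_000)}
    return routes, config

def shared_nothing(request_ids):
    # PHP-FPM style: every request rebuilds the framework from scratch.
    total = 0
    for rid in request_ids:
        routes, _config = bootstrap()
        total += routes[f"/route/{rid}"]()
    return total

def warm_process(request_ids):
    # Octane / Node / Go style: bootstrap once, reuse for every request.
    routes, _config = bootstrap()
    return sum(routes[f"/route/{rid}"]() for rid in request_ids)

request_ids = list(range(100))

start = time.perf_counter()
cold_result = shared_nothing(request_ids)
cold = time.perf_counter() - start

start = time.perf_counter()
warm_result = warm_process(request_ids)
warm = time.perf_counter() - start

print(f"shared-nothing: {cold * 1000:.1f} ms for 100 requests")
print(f"warm process:   {warm * 1000:.1f} ms for 100 requests")
```

The responses are identical either way; the shared-nothing loop is slower roughly in proportion to the number of requests, because the bootstrap cost is paid every time instead of once.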
The GraalVM Compromise
Java developers have a middle path between the JVM’s heavyweight overhead and losing access to the JVM ecosystem entirely: GraalVM Native Image.
| Metric | Traditional JVM | GraalVM Native Image |
|---|---|---|
| Idle Memory | 250-500 MB | 50-80 MB |
| Startup Time | 2.5-5 seconds | 30-90 ms |
| Docker Image | 250-430 MB | ~136 MB |
| Peak Throughput | Higher (after warmup) | ~80% of JVM peak |
| Build Time | ~25 seconds | 5-15 minutes |
GraalVM compiles Java bytecode ahead of time into a native binary, eliminating the JVM’s runtime overhead. The trade-off is clear: you lose about 20% of peak throughput (because the JIT compiler’s runtime optimizations aren’t available) and gain a 5-10x reduction in memory and a 30-100x reduction in startup time.
A Spring PetClinic benchmark by Vincenzo Racca measured this directly: the JVM version peaked at 12,800 req/s while the Native Image version peaked at 10,249 req/s (80% of JVM), but the Native Image used just 694 MB of RSS versus the JVM’s 1,751 MB, and started in 0.22 seconds versus 7.18 seconds.
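Those measurements reduce to the ratios quoted above. The figures below are taken directly from the cited benchmark; note that the memory saving under load (~2.5x) is smaller than the 5-10x idle reduction:

```python
# Numbers from the Spring PetClinic benchmark cited above
jvm_rps, native_rps = 12_800, 10_249       # peak requests/second
jvm_rss_mb, native_rss_mb = 1_751, 694     # resident memory under load
jvm_start_s, native_start_s = 7.18, 0.22   # startup time, seconds

throughput_ratio = native_rps / jvm_rps        # ~0.80: native keeps ~80% of JVM peak
memory_saving = jvm_rss_mb / native_rss_mb     # ~2.5x less resident memory
startup_speedup = jvm_start_s / native_start_s # ~33x faster startup

print(f"throughput: {throughput_ratio:.0%} of JVM peak, "
      f"memory: {memory_saving:.1f}x smaller, "
      f"startup: {startup_speedup:.0f}x faster")
```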
For long-running monoliths, the traditional JVM is usually the right choice — you pay the startup cost once and the JIT compiler’s optimizations compound over hours of runtime. For microservices, serverless, or Kubernetes environments where instances scale up and down frequently, Native Image makes the standing cost far more palatable.
Sources: Vincenzo Racca: Spring Boot vs GraalVM, Java Code Geeks: GraalVM Native Image vs Traditional JVM, GraalVM Performance (InfoQ)
How Concurrency Models Affect Scaling Cost
The biggest factor in how memory grows under load isn’t the idle footprint — it’s how the framework handles concurrent connections. This determines whether adding 10,000 users costs you kilobytes or gigabytes.
| Concurrency Model | Memory Per Connection | Max Practical Connections | Used By |
|---|---|---|---|
| BEAM processes | ~0.5-2.5 KB | Millions | Elixir/Erlang |
| Async tasks (tokio) | ~few hundred bytes | Millions | Rust |
| Goroutines | ~2-4 KB | Millions | Go |
| Java Virtual Threads (21+) | ~few KB | Millions | Java (modern) |
| Event loop callbacks | ~few hundred bytes | 100K+ | Node.js |
| OS threads (classic) | ~1 MB | ~10,000 | Java (traditional Tomcat) |
| Pre-forked workers | ~16-60 MB/worker | Worker count | PHP-FPM, Gunicorn, Puma |
The difference is striking. Go handling 100,000 concurrent connections adds about 200-400 MB of goroutine memory. Traditional Java with OS threads would need 100 GB for the same number of connections. This is why Java’s thread-per-request model with Tomcat has a hard ceiling, and why Java 21’s Virtual Threads were such a significant addition — they bring Java’s per-connection cost in line with Go’s goroutines.
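The arithmetic behind those figures, using decimal units and the per-connection costs from the table:

```python
CONNECTIONS = 100_000

goroutine_kb = 4       # upper end of Go's ~2-4 KB initial goroutine stack
os_thread_kb = 1_000   # ~1 MB default stack per OS thread (classic Java/Tomcat)

go_total_mb = CONNECTIONS * goroutine_kb / 1_000
java_total_gb = CONNECTIONS * os_thread_kb / 1_000_000

print(f"Go goroutines: {go_total_mb:.0f} MB")    # 400 MB
print(f"OS threads:    {java_total_gb:.0f} GB")  # 100 GB
```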
Pre-forked worker models (PHP-FPM, Gunicorn, Puma) are the most expensive. Each worker is a full copy of the application in a separate OS process. Adding concurrency means adding workers, and each worker costs 16-60 MB depending on the framework. This is why PHP and Ruby applications often scale out to more servers rather than scaling up connections per server.
Sources: Why you can have millions of Goroutines but only thousands of Java Threads, Phoenix: Road to 2 Million WebSocket Connections, Hauleth: BEAM Process Memory Usage, Go: Managing 10K+ Concurrent Connections
What This Means for Your Cloud Bill
Let’s make this concrete. On an 8 GB server (leaving ~1 GB for the OS), how many instances of each framework can you run?
| Framework | Memory Per Instance | Instances on 7 GB | Relative Density |
|---|---|---|---|
| Rust (Actix) | ~10-30 MB | 230-700 | 100x |
| Go (Gin) | ~25-70 MB | 100-280 | 40x |
| Elixir (Phoenix) | ~50-100 MB | 70-140 | 20x |
| Node.js (Express) | ~40-80 MB | 87-175 | 25x |
| C# (ASP.NET Core) | ~100-300 MB | 23-70 | 10x |
| Python (Flask, 4 workers) | ~200-300 MB | 23-35 | 5x |
| PHP-FPM (10 workers) | ~200-300 MB | 23-35 | 5x |
| Java (Spring Boot, tuned) | ~128-256 MB | 27-54 | 8x |
| Java (Spring Boot, default) | ~256-512 MB | 13-27 | 4x |
| Ruby (Rails, 4 workers) | ~400-600 MB | 11-17 | 2x |
The infrastructure cost multiplier: 50 microservices at 512 MB (Java default) = 25 GB of memory. The same 50 services at 50 MB (Go) = 2.5 GB. That’s a 10x difference in infrastructure costs — potentially thousands of dollars per month in cloud spend for a large microservices deployment.
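Both the density table and the fleet multiplier come from simple division. A sketch of the math (the results land within rounding of the table's figures, which use slightly looser usable-memory assumptions):

```python
USABLE_MB = 7 * 1024  # 8 GB server minus ~1 GB reserved for the OS

def instances_per_server(instance_mb):
    # How many copies of a service fit in the usable memory?
    return USABLE_MB // instance_mb

print(instances_per_server(30))   # Rust at the top of its 10-30 MB range
print(instances_per_server(512))  # Spring Boot default at the top of its range

# Fleet totals for 50 microservices at each footprint
java_fleet_gb = 50 * 512 / 1024  # 25.0 GB
go_fleet_gb = 50 * 50 / 1024     # ~2.4 GB (the 2.5 GB above, rounded)

print(f"50 services: Java {java_fleet_gb:.1f} GB vs Go {go_fleet_gb:.1f} GB")
```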
But this comparison has a crucial asterisk: one Java monolith running all 50 services shares the JVM overhead once. The JVM’s per-instance cost is high, but a well-tuned Java monolith serving the same workload as 50 Go microservices may actually use less total memory. Architecture decisions matter as much as language decisions.
When the Standing Cost Doesn’t Matter
Before you rewrite everything in Rust, consider the scenarios where idle resource cost is irrelevant:
Developer cost vs. server cost. For a team of 10 engineers at $150K+ each, saving $500/month on cloud infrastructure by choosing Go over Spring Boot is noise if Spring saves each developer 2 hours per week. Developer time is almost always more expensive than server time.
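Putting rough numbers on that trade-off, as a back-of-the-envelope sketch using the figures from this paragraph (2,080 working hours per year is an assumed full-time schedule):

```python
engineers = 10
annual_salary = 150_000              # the example figure above
hourly_rate = annual_salary / 2_080  # ~2,080 working hours/year -> ~$72/hour

hours_saved_per_week = 2             # per engineer, from the example
dev_value_per_month = engineers * hours_saved_per_week * 52 * hourly_rate / 12

infra_saving_per_month = 500

print(f"developer time recovered: ${dev_value_per_month:,.0f}/month")
print(f"infrastructure saving:    ${infra_saving_per_month}/month")
```

Under these assumptions the recovered developer time is worth over ten times the infrastructure saving, which is the point of the paragraph above.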
Monolithic deployments. The JVM’s overhead is paid once and shared across all endpoints. A single well-tuned JVM monolith can be more efficient than a fleet of lightweight microservices, each with their own routing, logging, and connection pools.
I/O-bound applications. When your application spends 95% of its time waiting on database queries, external APIs, or file systems, the difference between a 15 MB Go process and a 300 MB Spring Boot process is a rounding error in your total infrastructure cost. The database server dwarfs everything else.
Low-traffic applications. If you’re serving 100 requests per minute, every language on this list handles it trivially. Optimize for developer productivity and time to market, not for server efficiency.
Coming Next: What Happens Under Load
The standing cost tells you what you’re paying to keep the lights on. But it says nothing about what happens when traffic arrives. Some frameworks with high idle costs become remarkably efficient under load (the JVM’s JIT compiler is a prime example). Others with low idle costs hit scaling walls that the heavier frameworks sail past.
In Part 2, we’ll cover warmed-up throughput at 100, 1,000, and 10,000 concurrent connections. We’ll answer the question every Java developer asks: does the JVM’s warmup investment actually pay off? And we’ll show the data on what we call “the database equalizer” — the surprising way that adding a real database compresses the performance gap between all of these languages.
Want to try any of these languages? Every language linked in this article has a dedicated page on CodeArchaeology with Hello World tutorials and Docker images to get you running in minutes. Browse our complete collection of 70+ languages.
Sources
Primary Benchmarks
- TechEmpower Framework Benchmarks (Round 22/23)
- Sharkbench Web Framework Benchmark
- Markaicode: Rust Web Frameworks Performance Benchmark 2025
Java / Spring Boot
- Baeldung: Spring Boot Default Memory Settings
- Baeldung: Spring Boot Memory Usage Optimization
- Vincenzo Racca: Spring Boot vs GraalVM Performance
- Java Code Geeks: GraalVM Native Image vs Traditional JVM
- GraalVM Performance (InfoQ)
- Red Hat: How the JVM Uses and Allocates Memory
ASP.NET Core / .NET
- Thinktecture: Native AOT with ASP.NET Core
- Microsoft Learn: Memory Management in ASP.NET Core
- Microsoft Learn: Native AOT Deployment
Python
- Better Stack: Flask vs FastAPI
PHP / Laravel
- Kevin Dees: PHP-FPM Scalable Configuration
- Laravel Octane: Drivers, Benchmarks & Safe Adoption
- Platform.sh: PHP-FPM Sizing
Go / Rust / Elixir
- Go: Managing 10K+ Concurrent Connections
- Phoenix: Road to 2 Million WebSocket Connections
- Why you can have millions of Goroutines but only thousands of Java Threads
- Hauleth: BEAM Process Memory Usage
Node.js
- Node.js: Understanding and Tuning Memory
Ruby
- DeployHQ: Ruby Application Servers 2025