Last week we looked at the top web programming languages and the frameworks developers are choosing in 2026. But there’s a question those rankings don’t answer: what does each of those choices actually cost your server?
Not in dollars (though we’ll get there), but in memory, startup time, and disk space. Before your application serves a single request, it has already claimed resources just by existing. Some frameworks claim 3 MB. Others claim 500 MB. That difference matters when you’re running dozens of services, scaling in Kubernetes, or paying per-gigabyte on cloud infrastructure.
We dug into benchmark data from Sharkbench, TechEmpower, framework documentation, and 40+ independent benchmark studies to quantify the standing cost of 14 popular web backend combinations.
This is Part 1 of a two-part series. This post covers what your backend costs at rest — idle memory, startup time, Docker image sizes, and concurrency models. Part 2 covers what happens when traffic arrives.
The Standing Cost: Every Framework Ranked
Here’s what each framework consumes when it’s running but idle — no traffic, just waiting for requests. All measurements use default configurations unless otherwise noted.
| Framework | Idle RAM | Startup Time | Docker Image | Weight Class |
|---|---|---|---|---|
| Rust (Actix/Axum) | 3-15 MB | <5 ms | 5-10 MB | Ultralight |
| Go (net/http) | 8-15 MB | <10 ms | 5-12 MB | Ultralight |
| C# (Native AOT) | 17-23 MB | 14-17 ms | 18-90 MB | Light |
| Elixir (Phoenix) | 24-70 MB | 1-3 s | 25-80 MB | Light |
| PHP plain (FPM) | 5-15 MB/worker | <5 ms/req | 51-80 MB | Light |
| Python (Flask) | 30-50 MB | 0.3-1 s | 50-70 MB | Medium |
| Python (FastAPI) | 40-60 MB | 0.5-1.5 s | 50-80 MB | Medium |
| C# (ASP.NET Core) | 40-80 MB | 70-80 ms | 120-216 MB | Medium |
| Node.js (Express) | 50-55 MB | 200-500 ms | 100-180 MB | Medium |
| Java (Spring Native) | 50-80 MB | 30-90 ms | ~136 MB | Medium |
| Python (Django) | 70-130 MB | 2-5 s | 80-120 MB | Heavy |
| Ruby (Rails/Puma) | 80-150 MB | 3-8 s | 180-300 MB | Heavy |
| PHP (Laravel/FPM) | 30-60 MB/worker | 50-200 ms/req | 100-150 MB | Heavy |
| Java (Spring Boot/JVM) | 250-500 MB | 2.5-5 s | 250-430 MB | Heavyweight |
Sources: Sharkbench Web Framework Benchmark, Markaicode Rust Framework Benchmark 2025, Thinktecture: Native AOT with ASP.NET Core, Baeldung: Spring Boot Default Memory Settings, DeployHQ: Ruby Application Servers 2025
The spread is enormous. A Rust Axum binary sits at 3 MB of RAM in a 5 MB Docker image. A default Spring Boot application with embedded Tomcat claims 250-500 MB of RAM in a 250-430 MB Docker image. That's a difference of one to two orders of magnitude in idle resource consumption.
But as we’ll see, those numbers don’t tell the full story.
What You’re Actually Paying For
The idle cost isn’t waste — it’s a pre-investment in different things depending on the runtime. Understanding what each layer buys you explains why some frameworks are deliberately heavy.
The Weight Stack
Think of backend overhead as layers, each adding cost for a reason:
| Layer | Component | Typical Cost by Runtime |
|---|---|---|
| 5 | Framework | Spring: +200 MB, Rails: +400 MB, Express: +20 MB, Gin: +8 MB |
| 4 | Standard library | Java: large, Ruby: large, Node: moderate, Go: moderate |
| 3 | Concurrency model | Java threads: ~1 MB each, Ruby: GVL-limited, Node: event loop, Go goroutines: ~4 KB |
| 2 | Memory management | JVM GC, Ruby GC, V8 GC, Go GC (low-pause) |
| 1 | Execution engine | JVM: ~100 MB, Ruby: ~25 MB, V8: ~25 MB, Go: ~5 MB |
| 0 | Operating system | Shared by all runtimes |
Java is heavyweight because it has substantial overhead at every layer. Go is lightweight because layers 1 through 3 are minimal. Rust is ultralight because layers 1 through 4 are essentially zero — there’s no runtime, no garbage collector, no virtual machine. Elixir is a special case: the BEAM VM adds moderate overhead at layer 1, but its concurrency model at layer 3 is the most memory-efficient available.
Every layer of overhead exists to provide a capability:
| Overhead | What It Buys You |
|---|---|
| JIT compiler (JVM, V8) | Peak throughput that can exceed ahead-of-time compiled code for long-running processes |
| Garbage collector | Freedom from manual memory management and entire classes of bugs |
| Class loading / reflection | Runtime flexibility, dependency injection frameworks |
| Thick framework (Spring, Rails) | Developer productivity, convention over configuration, batteries included |
| OS threads | Simple synchronous programming model that’s easy to reason about |
Source: Red Hat: How the JVM Uses and Allocates Memory
The Three Weight Classes, Explained
Ultralight: Rust and Go (3-15 MB)
Rust with Actix Web or Axum produces a single static binary with no runtime, no garbage collector, and no virtual machine. Idle memory is 3-15 MB. Docker images built on scratch are 5-10 MB. Startup is effectively instant — under 5 milliseconds.
Go is nearly as lean. The Go runtime adds a small scheduler and garbage collector (about 5-7 MB), but goroutines cost only 2-4 KB each compared to the ~1 MB per OS thread in traditional Java. Docker images are 5-12 MB with a static binary.
Both languages compile to native binaries that need nothing else to run. No JDK, no interpreter, no node_modules. This is why they dominate in serverless and edge computing where cold start time is everything.
Best for: Microservices at scale, serverless functions, Kubernetes sidecars, edge computing, any environment where you’re paying per-MB of memory.
Sources: Sharkbench Rust Benchmarks (16.6 MB Actix, 8.5 MB Axum under load), Sharkbench Go Benchmarks (13.4 MB FastHTTP, 16.7 MB Gin under load)
Medium: Node.js, Python, C#, Elixir (24-80 MB)
This tier uses lightweight runtimes — V8, CPython, CoreCLR, BEAM — that add moderate baseline overhead but provide significant developer productivity benefits.
Node.js with Express claims 50-55 MB at idle. The V8 engine is 20-30 MB alone, with the rest going to the event loop infrastructure and compiled bytecode cache. The single-threaded event loop model means one process handles many concurrent connections via callbacks and promises, keeping memory scaling modest. TypeScript adds a compilation step but no runtime overhead once compiled.
Python with FastAPI sits at 40-60 MB for a single Uvicorn worker. FastAPI’s async model (built on ASGI) handles concurrent I/O via coroutines, but CPU-bound work blocks the event loop. Real deployments typically run 2-8 workers, multiplying memory to 120-400 MB.
C# with ASP.NET Core is 40-80 MB with the standard CLR. But Microsoft’s Native AOT compilation drops this to 17-23 MB with 14-17 ms startup — competitive with Go. The catch is that Native AOT doesn’t support all .NET features (reflection, dynamic loading), so it’s not a drop-in replacement for every application.
Elixir with Phoenix starts at 24-70 MB. The BEAM VM pre-allocates one scheduler per CPU core and maintains the OTP supervision tree, which adds overhead. But Phoenix’s lightweight processes (starting at ~0.5 KB each) mean it handles massive concurrency without significant additional memory per connection. The BEAM’s per-process garbage collection avoids the stop-the-world pauses that plague other runtimes.
Sources: Node.js: Understanding and Tuning Memory, Better Stack: Flask vs FastAPI, Thinktecture: Native AOT with ASP.NET Core, Phoenix: The Road to 2 Million WebSocket Connections
Heavy and Heavyweight: Rails, Django, Laravel, Spring Boot (70-500 MB)
These frameworks trade server resources for developer productivity. The overhead isn’t accidental — it’s the cost of having an ORM, admin interface, middleware stack, security framework, and template engine loaded and ready.
Ruby on Rails is the heaviest framework-plus-language combination at 80-150 MB for a single Puma process. In cluster mode with 4 workers, that's 200-300 MB (copy-on-write sharing of the preloaded application keeps the total below a naive 4x multiple). Rails eager-loads application code, Active Record, the middleware stack, and the asset pipeline at startup, which takes 3-8 seconds. The payoff is the most productive CRUD development experience available — a single rails generate scaffold builds a full REST resource with model, controller, views, and migrations.
Python with Django claims 70-130 MB. Django’s ORM, admin interface, middleware stack, and template engine all load at startup. With 2-3 Gunicorn workers, you’re looking at 130-200 MB.
PHP with Laravel is a special case we’ll discuss below. Per-worker memory is moderate (30-60 MB), but the per-request bootstrap cost of 50-200 ms is hidden overhead that doesn’t show up in idle measurements.
Java with Spring Boot is the canonical heavyweight at 250-500 MB. The JVM alone claims ~100 MB. Spring’s auto-configuration, embedded Tomcat, dependency injection container, and a default thread pool of 200 threads (each consuming ~1 MB of stack space) add the rest. Startup takes 2.5-5 seconds for a simple application, longer for complex ones.
But Spring Boot’s weight buys the deepest ecosystem of any web framework: Spring Security (OAuth2, SAML, LDAP), Spring Data (JPA, MongoDB, Redis, Elasticsearch), Spring Cloud (service discovery, circuit breakers, distributed tracing), Spring Batch, and diagnostic tooling (Java Flight Recorder, async-profiler) that no other runtime matches for production troubleshooting.
Sources: Baeldung: Spring Boot Memory Usage Optimization, DeployHQ: Ruby Application Servers 2025, Kevin Dees: PHP-FPM Scalable Configuration
The PHP Paradox
PHP deserves its own section because its resource model is fundamentally different from every other language on this list.
PHP-FPM with pm=ondemand can be the most idle-efficient framework on this list — zero workers run when there’s no traffic, and the master process claims just ~20 MB. But every single request pays a bootstrap cost that other frameworks pay once at startup. For plain PHP, that’s under 5 ms per request. For Laravel, it’s 50-200 ms per request as the framework rebuilds its service container, loads configuration, and registers middleware — every time.
This shared-nothing architecture means PHP is simultaneously the most idle-efficient and least request-efficient framework in this comparison. Opcache eliminates re-compilation of PHP bytecode, but it cannot eliminate the re-execution of framework bootstrap logic.
Laravel Octane (with Swoole or FrankenPHP) changes this equation by keeping the framework warm between requests, similar to how Node.js or Go work. Benchmarks show a 5x improvement over traditional PHP-FPM, but it requires rethinking how you handle state in PHP code — a significant shift for a language built around the shared-nothing model.
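The gap between the two models is easy to demonstrate. The sketch below is purely illustrative, with Python standing in for PHP: the hypothetical `bootstrap()` stands in for the work Laravel repeats on every request (building the service container, loading configuration, registering middleware), which a warm runtime like Octane performs once.

```python
import time

def bootstrap():
    # Stand-in for framework bootstrap work: build a routing table and
    # "load" configuration. Purely synthetic, for illustration only.
    routes = {f"/route/{i}": (lambda i=i: i) for i in range(2_000)}
    config = {f"option_{i}": i for i in range(1_000)}
    return routes, config

def shared_nothing(request_ids):
    # PHP-FPM style: every request rebuilds the framework from scratch.
    total = 0
    for rid in request_ids:
        routes, _config = bootstrap()
        total += routes[f"/route/{rid}"]()
    return total

def warm_process(request_ids):
    # Octane / Node / Go style: bootstrap once, reuse for every request.
    routes, _config = bootstrap()
    return sum(routes[f"/route/{rid}"]() for rid in request_ids)

request_ids = list(range(100))

start = time.perf_counter()
cold_result = shared_nothing(request_ids)
cold = time.perf_counter() - start

start = time.perf_counter()
warm_result = warm_process(request_ids)
warm = time.perf_counter() - start

print(f"shared-nothing: {cold * 1000:.1f} ms for 100 requests")
print(f"warm process:   {warm * 1000:.1f} ms for 100 requests")
```

The responses are identical either way; the shared-nothing loop is slower roughly in proportion to the number of requests, because the bootstrap cost is paid every time instead of once.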
The GraalVM Compromise
Java developers have a middle path between the JVM’s heavyweight overhead and losing access to the JVM ecosystem entirely: GraalVM Native Image.
| Metric | Traditional JVM | GraalVM Native Image |
|---|---|---|
| Idle Memory | 250-500 MB | 50-80 MB |
| Startup Time | 2.5-5 seconds | 30-90 ms |
| Docker Image | 250-430 MB | ~136 MB |
| Peak Throughput | Higher (after warmup) | ~80% of JVM peak |
| Build Time | ~25 seconds | 5-15 minutes |
GraalVM compiles Java bytecode ahead of time into a native binary, eliminating the JVM’s runtime overhead. The trade-off is clear: you lose about 20% of peak throughput (because the JIT compiler’s runtime optimizations aren’t available) and gain a 5-10x reduction in memory and a 30-100x reduction in startup time.
A Spring PetClinic benchmark by Vincenzo Racca measured this directly: the JVM version peaked at 12,800 req/s while the Native Image version peaked at 10,249 req/s (80% of JVM), but the Native Image used just 694 MB of RSS versus the JVM’s 1,751 MB, and started in 0.22 seconds versus 7.18 seconds.
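Those measurements reduce to the ratios quoted above. The figures below are taken directly from the cited benchmark; note that the memory saving under load (~2.5x) is smaller than the 5-10x idle reduction:

```python
# Numbers from the Spring PetClinic benchmark cited above
jvm_rps, native_rps = 12_800, 10_249       # peak requests/second
jvm_rss_mb, native_rss_mb = 1_751, 694     # resident memory under load
jvm_start_s, native_start_s = 7.18, 0.22   # startup time, seconds

throughput_ratio = native_rps / jvm_rps        # ~0.80: native keeps ~80% of JVM peak
memory_saving = jvm_rss_mb / native_rss_mb     # ~2.5x less resident memory
startup_speedup = jvm_start_s / native_start_s # ~33x faster startup

print(f"throughput: {throughput_ratio:.0%} of JVM peak, "
      f"memory: {memory_saving:.1f}x smaller, "
      f"startup: {startup_speedup:.0f}x faster")
```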
For long-running monoliths, the traditional JVM is usually the right choice — you pay the startup cost once and the JIT compiler’s optimizations compound over hours of runtime. For microservices, serverless, or Kubernetes environments where instances scale up and down frequently, Native Image makes the standing cost far more palatable.
Sources: Vincenzo Racca: Spring Boot vs GraalVM, Java Code Geeks: GraalVM Native Image vs Traditional JVM, GraalVM Performance (InfoQ)
How Concurrency Models Affect Scaling Cost
The biggest factor in how memory grows under load isn’t the idle footprint — it’s how the framework handles concurrent connections. This determines whether adding 10,000 users costs you kilobytes or gigabytes.
| Concurrency Model | Memory Per Connection | Max Practical Connections | Used By |
|---|---|---|---|
| BEAM processes | ~0.5-2.5 KB | Millions | Elixir/Erlang |
| Async tasks (tokio) | ~few hundred bytes | Millions | Rust |
| Goroutines | ~2-4 KB | Millions | Go |
| Java Virtual Threads (21+) | ~few KB | Millions | Java (modern) |
| Event loop callbacks | ~few hundred bytes | 100K+ | Node.js |
| OS threads (classic) | ~1 MB | ~10,000 | Java (traditional Tomcat) |
| Pre-forked workers | ~16-60 MB/worker | Worker count | PHP-FPM, Gunicorn, Puma |
The difference is striking. Go handling 100,000 concurrent connections adds about 200-400 MB of goroutine memory. Traditional Java with OS threads would need 100 GB for the same number of connections. This is why Java’s thread-per-request model with Tomcat has a hard ceiling, and why Java 21’s Virtual Threads were such a significant addition — they bring Java’s per-connection cost in line with Go’s goroutines.
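The arithmetic behind those figures, using decimal units and the per-connection costs from the table:

```python
CONNECTIONS = 100_000

goroutine_kb = 4       # upper end of Go's ~2-4 KB initial goroutine stack
os_thread_kb = 1_000   # ~1 MB default stack per OS thread (classic Java/Tomcat)

go_total_mb = CONNECTIONS * goroutine_kb / 1_000
java_total_gb = CONNECTIONS * os_thread_kb / 1_000_000

print(f"Go goroutines: {go_total_mb:.0f} MB")    # 400 MB
print(f"OS threads:    {java_total_gb:.0f} GB")  # 100 GB
```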
Pre-forked worker models (PHP-FPM, Gunicorn, Puma) are the most expensive. Each worker is a full copy of the application in a separate OS process. Adding concurrency means adding workers, and each worker costs 16-60 MB depending on the framework. This is why PHP and Ruby applications often scale out to more servers rather than scaling up connections per server.
Sources: Why you can have millions of Goroutines but only thousands of Java Threads, Phoenix: Road to 2 Million WebSocket Connections, Hauleth: BEAM Process Memory Usage, Go: Managing 10K+ Concurrent Connections
What This Means for Your Cloud Bill
Let’s make this concrete. On an 8 GB server (leaving ~1 GB for the OS), how many instances of each framework can you run?
| Framework | Memory Per Instance | Instances on 7 GB | Relative Density |
|---|---|---|---|
| Rust (Actix) | ~10-30 MB | 230-700 | 100x |
| Go (Gin) | ~25-70 MB | 100-280 | 40x |
| Elixir (Phoenix) | ~50-100 MB | 70-140 | 20x |
| Node.js (Express) | ~40-80 MB | 87-175 | 25x |
| C# (ASP.NET Core) | ~100-300 MB | 23-70 | 10x |
| Python (Flask, 4 workers) | ~200-300 MB | 23-35 | 5x |
| PHP-FPM (10 workers) | ~200-300 MB | 23-35 | 5x |
| Java (Spring Boot, tuned) | ~128-256 MB | 27-54 | 8x |
| Java (Spring Boot, default) | ~256-512 MB | 13-27 | 4x |
| Ruby (Rails, 4 workers) | ~400-600 MB | 11-17 | 2x |
The infrastructure cost multiplier: 50 microservices at 512 MB (Java default) = 25 GB of memory. The same 50 services at 50 MB (Go) = 2.5 GB. That’s a 10x difference in infrastructure costs — potentially thousands of dollars per month in cloud spend for a large microservices deployment.
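Both the density table and the fleet multiplier come from simple division. A sketch of the math (the results land within rounding of the table's figures, which use slightly looser usable-memory assumptions):

```python
USABLE_MB = 7 * 1024  # 8 GB server minus ~1 GB reserved for the OS

def instances_per_server(instance_mb):
    # How many copies of a service fit in the usable memory?
    return USABLE_MB // instance_mb

print(instances_per_server(30))   # Rust at the top of its 10-30 MB range
print(instances_per_server(512))  # Spring Boot default at the top of its range

# Fleet totals for 50 microservices at each footprint
java_fleet_gb = 50 * 512 / 1024  # 25.0 GB
go_fleet_gb = 50 * 50 / 1024     # ~2.4 GB (the 2.5 GB above, rounded)

print(f"50 services: Java {java_fleet_gb:.1f} GB vs Go {go_fleet_gb:.1f} GB")
```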
But this comparison has a crucial asterisk: one Java monolith running all 50 services shares the JVM overhead once. The JVM’s per-instance cost is high, but a well-tuned Java monolith serving the same workload as 50 Go microservices may actually use less total memory. Architecture decisions matter as much as language decisions.
When the Standing Cost Doesn’t Matter
Before you rewrite everything in Rust, consider the scenarios where idle resource cost is irrelevant:
Developer cost vs. server cost. For a team of 10 engineers at $150K+ each, saving $500/month on cloud infrastructure by choosing Go over Spring Boot is noise if Spring saves each developer 2 hours per week. Developer time is almost always more expensive than server time.
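Putting rough numbers on that trade-off, as a back-of-the-envelope sketch using the figures from this paragraph (2,080 working hours per year is an assumed full-time schedule):

```python
engineers = 10
annual_salary = 150_000              # the example figure above
hourly_rate = annual_salary / 2_080  # ~2,080 working hours/year -> ~$72/hour

hours_saved_per_week = 2             # per engineer, from the example
dev_value_per_month = engineers * hours_saved_per_week * 52 * hourly_rate / 12

infra_saving_per_month = 500

print(f"developer time recovered: ${dev_value_per_month:,.0f}/month")
print(f"infrastructure saving:    ${infra_saving_per_month}/month")
```

Under these assumptions the recovered developer time is worth over ten times the infrastructure saving, which is the point of the paragraph above.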
Monolithic deployments. The JVM’s overhead is paid once and shared across all endpoints. A single well-tuned JVM monolith can be more efficient than a fleet of lightweight microservices, each with their own routing, logging, and connection pools.
I/O-bound applications. When your application spends 95% of its time waiting on database queries, external APIs, or file systems, the difference between a 15 MB Go process and a 300 MB Spring Boot process is a rounding error in your total infrastructure cost. The database server dwarfs everything else.
Low-traffic applications. If you’re serving 100 requests per minute, every language on this list handles it trivially. Optimize for developer productivity and time to market, not for server efficiency.
Coming Next: What Happens Under Load
The standing cost tells you what you’re paying to keep the lights on. But it says nothing about what happens when traffic arrives. Some frameworks with high idle costs become remarkably efficient under load (the JVM’s JIT compiler is a prime example). Others with low idle costs hit scaling walls that the heavier frameworks sail past.
In Part 2, we’ll cover warmed-up throughput at 100, 1,000, and 10,000 concurrent connections. We’ll answer the question every Java developer asks: does the JVM’s warmup investment actually pay off? And we’ll show the data on what we call “the database equalizer” — the surprising way that adding a real database compresses the performance gap between all of these languages.
Want to try any of these languages? Every language linked in this article has a dedicated page on CodeArchaeology with Hello World tutorials and Docker images to get you running in minutes. Browse our complete collection of 70+ languages.
Sources
Primary Benchmarks
- TechEmpower Framework Benchmarks (Round 22/23)
- Sharkbench Web Framework Benchmark
- Markaicode: Rust Web Frameworks Performance Benchmark 2025
Java / Spring Boot
- Baeldung: Spring Boot Default Memory Settings
- Baeldung: Spring Boot Memory Usage Optimization
- Vincenzo Racca: Spring Boot vs GraalVM Performance
- Java Code Geeks: GraalVM Native Image vs Traditional JVM
- GraalVM Performance (InfoQ)
- Red Hat: How the JVM Uses and Allocates Memory
ASP.NET Core / .NET
- Thinktecture: Native AOT with ASP.NET Core
- Microsoft Learn: Memory Management in ASP.NET Core
- Microsoft Learn: Native AOT Deployment
Python
- Better Stack: Flask vs FastAPI
PHP / Laravel
- Kevin Dees: PHP-FPM Scalable Configuration
- Laravel Octane: Drivers, Benchmarks & Safe Adoption
- Platform.sh: PHP-FPM Sizing
Go / Rust / Elixir
- Go: Managing 10K+ Concurrent Connections
- Phoenix: Road to 2 Million WebSocket Connections
- Why you can have millions of Goroutines but only thousands of Java Threads
- Hauleth: BEAM Process Memory Usage
Node.js
- Node.js: Understanding and Tuning Memory
Ruby
- DeployHQ: Ruby Application Servers 2025