26.2 From single to multithreaded
The single-threaded server (lesson 26.1) serves one visitor at a time, and we proved a slow request blocks everyone behind it. The fix is concurrency, the chapter-22 material, now applied to a real problem. But there's a design decision to make first, and thinking it through is as valuable as the code. This short lesson is the conversation a careful engineer has before writing the multithreaded version: what are the options, why is the obvious one wrong, and what should we build instead?
The naive fix: a thread per request
The most obvious approach is to spawn a new thread for every connection (lesson 22.1). Each request gets its own thread, so a slow one no longer blocks the others; they all run concurrently.
for stream in listener.incoming() {
    let stream = stream.unwrap();
    thread::spawn(|| {
        handle_connection(stream);
    });
}
This works, and for a toy server it's even fine. With this change, /sleep in one tab no longer blocks / in another: each runs on its own thread. If concurrency were the only goal, we'd be done. So why does the chapter keep going?
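To see the effect without opening browser tabs, here is a minimal sketch that simulates the two requests with sleeps instead of real connections. The names handle_request and completion_order are hypothetical stand-ins, not part of the server, and the 300 ms delay is an arbitrary choice.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for handle_connection: sleeps to simulate work,
// then reports which request finished.
fn handle_request(name: &'static str, work: Duration, done: mpsc::Sender<&'static str>) {
    thread::sleep(work);
    done.send(name).unwrap();
}

// Starts a slow request first and a fast one second, each on its own thread,
// and returns the request names in the order they finished.
fn completion_order() -> Vec<&'static str> {
    let (tx, rx) = mpsc::channel();
    let requests = vec![
        ("/sleep", Duration::from_millis(300)),
        ("/", Duration::from_millis(0)),
    ];

    let mut handles = Vec::new();
    for (name, work) in requests {
        let tx = tx.clone();
        // One thread per request, as in the loop above.
        handles.push(thread::spawn(move || handle_request(name, work, tx)));
    }
    drop(tx); // only the threads' clones remain, so the channel closes when they finish

    for handle in handles {
        handle.join().unwrap();
    }
    rx.iter().collect()
}

fn main() {
    println!("{:?}", completion_order());
}
```

Although "/sleep" is started first, "/" should be reported first, because neither request waits for the other.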
Why thread-per-request doesn't scale
Because of what you learned in lesson 23.1: threads are heavy. Each one reserves a sizable stack and burdens the OS scheduler. Spawning one per request is fine at ten requests a second and catastrophic at ten thousand. Worse, it's an open invitation to denial of service: an attacker who opens a flood of connections makes your server spawn a flood of threads, until it exhausts memory and falls over. Unbounded thread creation means unbounded resource use driven by outside input, which is never acceptable in a server. We need concurrency, but we need it bounded: a fixed, controlled number of threads, no matter how many requests arrive.
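A rough back-of-the-envelope calculation makes "heavy" concrete. The sketch below assumes 2 MiB of stack per thread, which is the current default for threads spawned through Rust's standard library on Tier-1 platforms; the real figure is platform-dependent and configurable, and it is reserved address space rather than memory that is necessarily in use. The function names are hypothetical.

```rust
// Assumption: 2 MiB of stack reserved per spawned thread (the standard
// library's current default on Tier-1 platforms; it can differ elsewhere).
const ASSUMED_STACK_BYTES: u64 = 2 * 1024 * 1024;

// Total stack space reserved if `threads` threads are alive at once.
fn reserved_stack_bytes(threads: u64) -> u64 {
    threads * ASSUMED_STACK_BYTES
}

// Converts bytes to whole mebibytes for printing.
fn to_mib(bytes: u64) -> u64 {
    bytes / (1024 * 1024)
}

fn main() {
    // A pool of 4, a quiet server, and a flood of 10,000 connections.
    for &threads in &[4u64, 10, 10_000] {
        println!(
            "{:>6} threads -> {:>6} MiB of stack reserved",
            threads,
            to_mib(reserved_stack_bytes(threads))
        );
    }
}
```

Under that assumption, ten thousand simultaneous threads reserve about 20,000 MiB of stack, while a pool of four reserves 8 MiB regardless of how many requests arrive. The key difference is who controls the number: with thread-per-request it is the client, with a pool it is you.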
(The other scalable answer is async, from chapter 23: many connections served by a few threads. That's a great fit for a real server. We're building a thread pool instead because it exercises the chapter-21/22 tools directly and keeps the capstone dependency-free; the async version is a natural variation once you've seen this one.)
The thread pool
The solution is a thread pool: a fixed group of worker threads, created once at startup, that sit waiting for work. When a request arrives, instead of spawning a new thread, we hand the work to an idle worker in the pool. If all workers are busy, the request waits in a queue until one frees up. The number of threads is capped (say, at four or eight), so no matter how many requests flood in, you never have more than that many worker threads running. Concurrency, bounded.
Picture it concretely: four workers and a shared queue of jobs. Each worker loops forever: take a job from the queue, run it, take the next. The main thread, on each connection, just drops a job ("handle this connection") into the queue and moves on to accept the next connection. The workers and the queue do the rest. This is one of the oldest patterns in concurrent programming, and building it ourselves will use nearly everything from chapters 21 and 22.
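The picture above can be sketched in a few dozen lines. This is only an illustration of the idea, not the ThreadPool type the next lesson builds: run_on_workers, counting_jobs, and the Job alias are hypothetical names, and the sketch shuts down as soon as the queue is drained rather than running for the life of a server.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A job is a boxed closure that can be sent to another thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

// Runs `jobs` on exactly `workers` threads that share one queue.
// Returns how many jobs each worker ended up running.
fn run_on_workers(workers: usize, jobs: Vec<Job>) -> Vec<usize> {
    let (sender, receiver) = mpsc::channel::<Job>();
    // The receiving end is shared, so it sits behind a mutex.
    let receiver = Arc::new(Mutex::new(receiver));

    let handles: Vec<thread::JoinHandle<usize>> = (0..workers)
        .map(|_| {
            let receiver = Arc::clone(&receiver);
            thread::spawn(move || {
                let mut ran = 0;
                loop {
                    // The lock is released at the end of this statement,
                    // so it is not held while the job itself runs.
                    let next = receiver.lock().unwrap().recv();
                    match next {
                        Ok(job) => {
                            job();
                            ran += 1;
                        }
                        Err(_) => break, // sender dropped and queue drained
                    }
                }
                ran
            })
        })
        .collect();

    // The "main thread" side: drop each job into the queue and move on.
    for job in jobs {
        sender.send(job).unwrap();
    }
    drop(sender); // closing the queue lets the workers exit their loops

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

// Builds `n` trivial jobs that each bump a shared counter.
fn counting_jobs(n: usize, counter: &Arc<AtomicUsize>) -> Vec<Job> {
    (0..n)
        .map(|_| {
            let counter = Arc::clone(counter);
            Box::new(move || {
                counter.fetch_add(1, Ordering::SeqCst);
            }) as Job
        })
        .collect()
}

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));
    let per_worker = run_on_workers(4, counting_jobs(20, &counter));
    println!("jobs run by each worker: {:?}", per_worker);
    println!("jobs run in total: {}", counter.load(Ordering::SeqCst));
}
```

However the twenty jobs happen to be distributed, only four threads ever exist to run them. How the work splits between workers is up to the scheduler and will vary from run to run.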
Why we design the interface first
We're going to write the pool's usage before its implementation, a technique called compiler-driven development. We'll write the main we wish we had, with a ThreadPool type that doesn't exist yet, let it fail to compile, and then build exactly what the errors demand. Designing the interface from the caller's side first tends to produce a cleaner API than building bottom-up and hoping it's pleasant to use. It's the same instinct as writing a test first (lesson 20.6): state what you want, then make it real.
The interface we want
Here's the main we're aiming for. It should read almost like the thread-per-request version, but with a pool in place of raw spawning:
fn main() {
    let listener = TcpListener::bind("127.0.0.1:7878").unwrap();
    let pool = ThreadPool::new(4); // four workers, created once
    for stream in listener.incoming() {
        let stream = stream.unwrap();
        pool.execute(|| {
            handle_connection(stream);
        });
    }
}
Two methods define the whole interface. ThreadPool::new(4) creates a pool with four worker threads up front. pool.execute(closure) hands a closure to the pool to be run by some worker; it deliberately has the same shape as thread::spawn so that it feels familiar. Notice what execute takes: a closure (chapter 19) that the pool will run on another thread. That single requirement, "store a closure now, run it on a different thread later", is what makes the next lesson a tour of chapters 19, 21, and 22 at once: it needs Send (to cross threads), it needs a heap-allocated Box<dyn FnOnce> (because every closure has its own unnameable type, so a queue of different closures has to hold trait objects), and it needs a channel and a mutex to get the closure from main to a waiting worker. The interface is four lines; honoring it is the whole next lesson.
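As a preview of where the compiler errors lead, here is a hedged sketch of the smallest stub that would let the desired main type-check. It is an assumption about a reasonable first step, not the next lesson's implementation: the struct has no workers yet, and execute simply runs the closure on the caller's thread as a placeholder. The trait bounds on F mirror the ones thread::spawn places on its closure.

```rust
// Hypothetical first stub: just enough surface for the call sites to compile.
pub struct ThreadPool;

impl ThreadPool {
    // Will eventually create `size` workers; for now it only validates the size.
    pub fn new(size: usize) -> ThreadPool {
        assert!(size > 0, "a pool needs at least one worker");
        ThreadPool
    }

    // Same bounds as thread::spawn's closure: callable once, sendable to
    // another thread, and not borrowing anything short-lived.
    pub fn execute<F>(&self, f: F)
    where
        F: FnOnce() + Send + 'static,
    {
        // Placeholder only: runs the job right here on the calling thread,
        // so this stub provides no concurrency at all.
        f();
    }
}

fn main() {
    let pool = ThreadPool::new(4);
    let message = String::from("hello from a job");
    pool.execute(move || println!("{}", message));
}
```

The value of the stub is that the signature is now settled. Everything that follows is a change behind execute, and callers such as main should not need to change again.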
Quiz time
Question #1
Spawning a thread per request actually fixes the blocking problem. Why isn't it good enough for a real server?
Solution
Because threads are heavyweight (each reserves a large stack and burdens the scheduler), spawning one per request doesn't scale. It's also unbounded: the number of threads is driven by incoming requests, so a flood of connections (accidental or a deliberate denial-of-service attack) spawns a flood of threads until the server exhausts memory and crashes. A server needs bounded concurrency: a capped number of threads regardless of load.
Question #2
What is a thread pool, and how does it bound resource use?
Solution
A thread pool is a fixed group of worker threads, created once at startup, that wait for work. Incoming jobs are handed to idle workers (or briefly queued if all are busy) rather than spawning a new thread each time. Because the number of threads is capped (e.g. 4), no matter how many requests arrive, the server never runs more than that many threads, bounding memory and scheduling cost. It provides concurrency without unbounded thread creation.
Question #3
What two methods make up the ThreadPool interface we're building toward, and what does each do?
Solution
ThreadPool::new(size) creates a pool with size worker threads up front. pool.execute(closure) hands a closure to the pool to be run by one of its workers (mirroring thread::spawn's shape). The execute requirement, "store a closure now and run it on a different thread later", is what forces the implementation to use Send, a boxed dyn FnOnce, and a channel plus a mutex to deliver the closure to a waiting worker.
We've decided what to build and designed its interface. The next lesson (26.3) implements it, and it's the chapter's centerpiece: workers, a channel to send jobs (chapter 22), and the Box<dyn FnOnce + Send> type that pulls chapters 16, 19, and 21 together in a single signature.