HTTP - Servers and Protocols

July 21, 2025 · 8 min read

High Performance Developer

Notes on HTTP2 and web server threading models. Covers servlets, web server/container architecture (Jetty), HTTP2, HTTP streams (prioritization, flow control) and SSE.

asciisunset

What are Servlets?

A server side Java API/programming model for handling HTTP requests and sending back responses. A lot Java based frameworks use Servlets under the hood.

A Servlet container is created to encapsulate Servlets and decouple them from the web server implementation - you can attach a servlet container to a web server.

You can annotate a class with @WebServlet (jakarta.servlet.annotation) to specify the path. Here are some examples for a GET and a POST:

GET:

@WebServlet("/product")
public class ProductServlet extends HttpServlet {

@Override
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        // do stuff
    }

}

POST:

@WebServlet("/login")
public class LoginServlet extends HttpServlet {


@Override
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    // do stuff
}

}

In some implementations, there is some delegation involved in the implementation, for example:

 public void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        this.doGet(request, response);
    }

ServletContext

Created at startup, used to register Servlets, Filters and Listeners. Annotation configurable init() method called on startup.

This context is shared across all requests and all sessions.

Requests and Responses

The Servlet container is responsible for the instantiation/construction (there is also reuse of requests and responses) of HttpServletRequest and HttpServletResponse objects and passing them through the filter chain via doFilter().

The service() method is called for Servlets which determines the specific do.*() method to call based on the HTTP verb.

Conceptually only for the lifetime of the request and the completion of the response.

HTTPSession

Shared across all requests in the same session (see HTTP protocol details below). Created for the first time when a client connects/visits your page.

Threading

Servlets, Filters and Listeners are shared across all requests and sessions.

Multiple threads can be calling your Servlet instances.

You should not have state or assign instance vars:

public class ExampleServlet extends HttpServlet {

private Object thisIsNOTThreadSafe;

protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
Object thisIsThreadSafe;

thisIsNOTThreadSafe = request.getParameter("foo"); // BAD!! Shared among all requests!
thisIsThreadSafe = request.getParameter("foo"); // OK, this is thread safe.
} 
}

Web Server Architecture

Here we'll take the Eclipse Jetty project as an example.

It's also a good example of a pretty nicely modularized Maven project.

Jetty Component Architecture

HTTP clients and servers are made by composing various classes

alt text

Apps are built as component trees - components have lifecycle and event listeners.

Jetty Threading Architecture

alt text

Jetty produces Tasks which can be a wrapper around an NIO SelectionEvent or a wrapper around blocking application code (servlet being invoked to handle a HTTP Request).

Tasks are run by a threadpool - handle request bursts, processing of IO failures.

The other thing to consider here is that multiplexed protocol implementations must be able to write in order to progress reads.

A Task can be consumed in 1 of 3 ways:

Produce-Consume

Producer thread loops to produce a Task which gets run on that same Producer thread.

Thread per selector mode under NIO.

Cache efficiency - same CPU
Head of line blocking - Tasks being processed can delay subsequent tasks

For scenarios where you can handle head of line blocking well semantically (domain based).

Produce-Execute-Consume

Producer thread loops to produce Task which are submitted to an Executor - the task is queued and then executed by a worker thread like a CompleteableFuture.

Cache inefficient - task is accessed and processed by a non-producing thread, so won't have data in it's caches.
Unbounded Task queue
Queuing latency

Execute-Produce-Consume

Producer loops to produce a Task. The production of tasks is then handed off to another thread via an Executor, whilst the current thread (the original producer) actually executes the task.

Execution of the task occurs on the producing thread, so more cache efficient.

Jetty IO Architecture

Jetty uses a wrapper around an NIO Selector:

alt text

Create SocketChannel
Pass SocketChannel to SelectorManager
Pass SocketChannel to ManagedSelector
Register NIO Selector

Endpoints and Connections

SocketChannels get wrapped into Endpoints and Connections.

Endpoint - abstraction for reading or writing
Connection - abstraction for reading from an Endpoint and deserializing it. Read and parse incoming bytes into a particular protocol.

alt text

However your writing abstraction is implemented, eventually bytes are written out via the Endpoint.

Endpoints and Connections can be chained to implement more complex functionality e.g. TLS:

alt text

Content Source

A non-blocking read model where a read returns a content.chunk - can be a normal or a failure chunk.

A content sink provides an offer() bytes abstraction.

HTTP2

Builds upon the core concepts of HTTP1 (verbs, status codes, header fields etc) to enable:

Multiplexing of requests and responses
Compression of HTTP header fields (some standardized tables from Huffman encoding and a shared compression context at the connection level) - HPACK
Server push (PUSH_PROMISE)
Request priorities
Framed binary encoding of messages in transit vs. newline delimited plaintext as in HTTP1
Persistent, single TCP connection reuse

alt text Source: HPBN

Streams, Messages and Frames

Messages can belong to a particular stream. These are encoded into frames which are sent over a single TCP connection which allows for bidi communication.

Each stream is identified by it's ID and has optional priority information.

Each message is a logical HTTP message - a request or a response for example.

Binary framing allows for multiplexing components of multiple messages across multiple streams across the same TCP connection.

alt text Source: HPBN

Stream Prioritization

Each stream can be assigned a priority and dependencies on other streams - this allows for the construction of a prioritization tree which dictates and guides the order in which frames are interleaved and delivered.

These are a preference, not a requirement with guarantees.

Browsers can prioritize requests based on several factors:

Type of asset e.g. JS, HTML, CSS
Location on the page
Previous history or usage

This also impacts things like FCP (first contentful paint) and ILS (incremental layout shift).

Stream Flow Control

Flow control can occur at the stream level not just the TCP level. It uses a windowing technique, based on bytes as opposed to messages. The window is reduced on DATA frames. The window is incremented on WINDOW_UPDATE frames which are sent by the receiver.

A SETTINGS frame is used on connection establishment to set the window size in both directions.

Server Push

The server can provide multiple responses for a request. Initiated via PUSH_PROMISE frames from the server to indicate intent. This has to occur before the actual response is sent as DATA frames.

Each resource being pushed is a stream - that means it can be individually prioritized, multiplexed and processed by the client.

When a client gets a PUSH_PROMISE is can decline via RST_STREAM (e.g. it has cached the resource already).

Frame Types

DATA - Used to transport HTTP message bodies

HEADERS - Used to communicate header fields for a stream

PRIORITY - Used to communicate sender-advised priority of a stream

RST_STREAM - Used to signal termination of a stream

SETTINGS - Used to communicate configuration parameters for the connection

PUSH_PROMISE - Used to signal a promise to serve the referenced resource

PING - Used to measure the roundtrip time and perform "liveness" checks

GOAWAY - Used to inform the peer to stop creating streams for current connection

WINDOW_UPDATE - Used to implement flow stream and connection flow control

CONTINUATION - Used to continue a sequence of header block fragments

Stream ID Collisions

Client initiated streams - odd numbered stream IDs
Server initiated streams - even numbered stream IDs

SSE

Streaming text over HTTP - uses a content-type of "text/event-stream". Implemented on the client side using EventSource API. Good for unidirectional server to client communication e.g. streaming data updates, LLMs and MCP servers etc.

The EventSource interface is meant to allow you to receive push notifications as DOM events. Provides auto-reconnect and tracking of last seen message - can advertise the last seen sequence on reconnects with the server.

Efficient, cross-browser impl of XHR streaming. See XHR for more details.

Events sent from the server can have optional id or event fields. Newlines demarcate event boundaries.

alt text Source: HPBN

Please note: I think the above image is a bit out of date, SSE now also uses HTTP2 not 1.1

HTTP3

Uses UDP (QUIC) instead of TCP.

Reduces resource consumption overall as opposed to latency. Latency reduction was not really a primary goal.

Links

https://hpbn.co/http2/ - Ilya Grigorik (Web Performance Engineer at Google, Co-Chair W3C Performance Working Group), High Performance Browser Networking

https://jetty.org/docs/jetty/12/programming-guide/arch/bean.html - Jetty Architecture documentation

https://stackoverflow.com/questions/3106452/how-do-servlets-work-instantiation-sessions-shared-variables-and-multithreadi - great shares from Bauke Scholtz

https://stackoverflow.com/questions/8011138/servlet-seems-to-handle-multiple-concurrent-browser-requests-synchronously

What are Servlets?​

ServletContext​

Requests and Responses​

HTTPSession​

Threading​

Web Server Architecture​

Jetty Component Architecture​

Jetty Threading Architecture​

Produce-Consume​

Produce-Execute-Consume​

Execute-Produce-Consume​

Jetty IO Architecture​

Endpoints and Connections​

Content Source​

HTTP2​

Streams, Messages and Frames​

Stream Prioritization​

Stream Flow Control​

Server Push​

Frame Types​

Stream ID Collisions​

SSE​

HTTP3​

Links​

What are Servlets?

ServletContext

Requests and Responses

HTTPSession

Threading

Web Server Architecture

Jetty Component Architecture

Jetty Threading Architecture

Produce-Consume

Produce-Execute-Consume

Execute-Produce-Consume

Jetty IO Architecture

Endpoints and Connections

Content Source

HTTP2

Streams, Messages and Frames

Stream Prioritization

Stream Flow Control

Server Push

Frame Types

Stream ID Collisions

SSE

HTTP3

Links