Computer Networking — Part Two
Understanding backend communication is a crucial aspect of mastering network programming. Backend communication generally follows a few common patterns that shape the ways in which data is transferred across different systems. These patterns play an integral role in the performance, scalability, and reliability of the systems we use daily. Today, we will delve into the nuances of these patterns, explain when to use them, and provide examples for a more concrete understanding.
Request-Response Pattern
The Request-Response model is the most ubiquitous pattern. It’s straightforward and it’s everywhere, from web browsing and database queries to remote procedure calls (RPC).
A client sends a Request to a server, which is parsed and processed by the server. The server then sends back a Response, which is further parsed and consumed by the client. In a nutshell, the process flows like this:
Client ---- Request ----> Server   (server parses & processes)
Client <--- Response ---- Server   (client parses & consumes)
The anatomy of a request and its response is defined by the protocol and its message format, for example HTTP’s method, path, headers, and body.
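As a minimal sketch, here is one request/response cycle from the client’s side, assuming a Node.js 18+ runtime with the built-in fetch API (the URL and JSON shape are placeholders, not a real service):

// Client side of a single request/response cycle (Node.js 18+, built-in fetch).
async function getUser(id: string): Promise<{ id: string; name: string }> {
  // 1. Build and send the request; the protocol (HTTP) defines its anatomy:
  //    method, path, headers, and an optional body.
  const response = await fetch(`https://api.example.com/users/${id}`, {
    method: "GET",
    headers: { Accept: "application/json" },
  });

  // 2. The server parses the request, processes it, and sends a response back.
  // 3. The client parses and consumes the response (status, headers, body).
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return (await response.json()) as { id: string; name: string };
}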
Case Study: Building an Upload Image Service
In a scenario where you want to build an image upload service, you can either send the whole image in one large request (simple, but a failure means starting over) or split the image into chunks and send a request per chunk (more work, but resumable).
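A rough sketch of the resumable variant, assuming a hypothetical https://api.example.com/uploads endpoint that accepts one chunk per request, identified by an upload ID and byte offset:

import { open } from "node:fs/promises";

// Upload a file in fixed-size chunks so a failed transfer can resume from the
// last acknowledged offset. The /uploads/:id/chunks endpoint is hypothetical.
async function uploadInChunks(path: string, uploadId: string, chunkSize = 1024 * 1024) {
  const file = await open(path, "r");
  try {
    const { size } = await file.stat();
    for (let offset = 0; offset < size; offset += chunkSize) {
      const length = Math.min(chunkSize, size - offset);
      const buffer = Buffer.alloc(length);
      await file.read(buffer, 0, length, offset);

      // One request per chunk; the server records which offsets it has seen,
      // so the client can retry or resume from the first missing chunk.
      await fetch(`https://api.example.com/uploads/${uploadId}/chunks?offset=${offset}`, {
        method: "PUT",
        headers: { "Content-Type": "application/octet-stream" },
        body: buffer,
      });
    }
  } finally {
    await file.close();
  }
}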
However, the Request-Response pattern isn’t suitable for every use case. It falls short in scenarios like a notification service, a chat application, or long-running requests. We’ll explore alternatives for these cases later in this article.
Synchronous vs Asynchronous
In terms of request processing, operations can be classified as either synchronous or asynchronous.
Synchronous I/O
In synchronous operations, the caller (or client) sends a request and then waits (or “blocks”) for the response. The caller cannot execute any other code in the meantime. The receiver (or server) processes the request and sends a response back, which “unblocks” the caller.
An example of operating system (OS) synchronous I/O is a program asking the OS to read from disk: the program’s main thread is taken off the CPU until the read operation completes, after which the program resumes execution.
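In Node.js terms, a minimal sketch of the synchronous variant looks like this (the file path is a placeholder):

import { readFileSync } from "node:fs";

// Blocking read: the calling thread is suspended until the OS completes the
// disk read, so no other JavaScript runs in this process in the meantime.
const config = readFileSync("/etc/app/config.json", "utf8");
console.log("read", config.length, "bytes"); // only reached after the read finishes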
Asynchronous I/O
On the contrary, in asynchronous operations, the caller sends a request and then continues to work on other tasks without waiting for the response. The caller can either periodically check if the response is ready or establish a callback function that the receiver can invoke when it’s done.
For instance, in Node.js, the runtime hands a disk read to a worker thread (which the OS blocks), allowing the main thread to continue executing other code. Once the worker thread finishes reading, it notifies the main thread through the event loop.
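The asynchronous counterpart, sketched with Node.js’s promise-based fs API (the file path is again a placeholder):

import { readFile } from "node:fs/promises";

// Non-blocking read: the main thread hands the work off and keeps executing.
const pending = readFile("/etc/app/config.json", "utf8");
console.log("this line runs before the file contents are available");

// The await/then point is where the result is consumed once it is ready.
pending.then((config) => console.log("read", config.length, "bytes"));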
Asynchronous operations show up in many places: asynchronous programming (promises/futures), backend processing, asynchronous commits in PostgreSQL, asynchronous I/O in Linux (epoll, io_uring), asynchronous replication, and OS writes that are buffered in the filesystem cache until an fsync.
Synchronous vs Asynchronous in Real Life
To put this in context, consider the act of asking a question in a meeting (synchronous).
You pose your question and then wait for the response, during which time you do not engage in other activities. This is analogous to a synchronous operation.
On the other hand, consider sending an email with a question (asynchronous). You do not wait for the recipient to reply before continuing with your other tasks. The response to your question comes later, and you can handle it when it arrives. This is akin to an asynchronous operation.
Publish-Subscribe Pattern
The Publish-Subscribe model (often abbreviated as Pub-Sub) is a messaging pattern where senders, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers. Instead, the messages are categorized into classes, and published without the knowledge of which subscribers, if any, there may be.
Subscribers express interest in one or more classes, and only receive messages that are of interest, without knowledge of which publishers, if any, there are. This pattern is prevalent in real-time systems where latency is a significant factor.
The flow of the Pub-Sub model can be described as follows:
Publisher ---- Publish Message ----> Broker
Broker ---- Distribute Message ----> Subscriber
Case Study: Building a News Distribution Service
Consider a news distribution service where publishers (news channels) publish news, and subscribers (users) subscribe to various categories (like sports, politics, etc.) of news.
The service will use a broker (or message queue) that receives published news and forwards it to all interested subscribers. The news is pushed to subscribers as soon as it’s available, allowing real-time distribution.
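To make the broker’s role concrete, here is a minimal in-process sketch (a real deployment would use a broker such as Kafka or RabbitMQ; the topic names and handlers are illustrative):

// Tiny in-memory broker: publishers and subscribers only know topic names,
// never each other.
type Handler = (message: string) => void;

class Broker {
  private topics = new Map<string, Handler[]>();

  subscribe(topic: string, handler: Handler): void {
    const handlers = this.topics.get(topic) ?? [];
    handlers.push(handler);
    this.topics.set(topic, handlers);
  }

  publish(topic: string, message: string): void {
    for (const handler of this.topics.get(topic) ?? []) {
      handler(message); // push the message to every interested subscriber
    }
  }
}

const broker = new Broker();
broker.subscribe("sports", (msg) => console.log("sports reader got:", msg));
broker.subscribe("politics", (msg) => console.log("politics reader got:", msg));
broker.publish("sports", "Local team wins the cup"); // only the sports subscriber sees this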
Publish-Subscribe Timeline
The time elapsed from publishing the message to the subscriber receiving it can be depicted like so:
Publisher --(t0)-- Publish --> Broker --(t2)
Broker --(t3)-- Distribute --> Subscriber --(t5)

t0 to t2 is the publish leg (publisher to broker), and t3 to t5 is the distribution leg (broker to subscriber).
Push-Pull Pattern
The Push-Pull model is a pattern in which producers (the push side) push work into a shared queue or buffer, and consumers (the pull side) pull from it when they are ready to process more. Because the consumer controls the rate of data flow, this pattern is well suited to load balancing work across consumers in distributed systems.
The flow of the Push-Pull model can be described as follows:
Producer ---- Push Message ----> Queue/Buffer
Queue/Buffer ---- Pull Message ----> Consumer
Case Study: Building a Job Queue System
Consider a job queue system where producers create jobs, and consumers process them. The producers will push the jobs into a queue/buffer, and consumers will pull the jobs when they are ready.
This pattern ensures that no consumer is overloaded with jobs, as they can control the pace of their work. Furthermore, if a consumer crashes or slows down, the jobs can be redirected to other consumers, ensuring high availability and fault tolerance.
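A minimal in-memory sketch of the pattern (a production system would use Redis, RabbitMQ, SQS, or similar; the job shape and delays are illustrative):

// In-memory job queue: producers push, consumers pull at their own pace.
type Job = { id: number; payload: string };

const queue: Job[] = [];

function produce(job: Job): void {
  queue.push(job); // the producer never waits for a specific consumer
}

async function consume(workerName: string): Promise<void> {
  for (;;) {
    const job = queue.shift(); // pull the next job only when this worker is free
    if (!job) {
      await new Promise((r) => setTimeout(r, 100)); // idle briefly, then check again
      continue;
    }
    console.log(`${workerName} processing job ${job.id}: ${job.payload}`);
    await new Promise((r) => setTimeout(r, 500)); // simulate work
  }
}

produce({ id: 1, payload: "resize image" });
produce({ id: 2, payload: "send email" });
consume("worker-A");
consume("worker-B"); // two workers share the load without coordinating with each other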
Push-Pull Timeline
The time elapsed from pushing the message to pulling the message can be depicted like so:
Producer --(t0)-- Push --> Queue/Buffer --(t2)
Queue/Buffer --(t3)-- Pull --> Consumer --(t5)

t0 to t2 is the push leg (producer to queue), and t3 to t5 is the pull leg (queue to consumer).
Polling vs Long Polling
Polling and long polling are two techniques used to get information from a server in real-time.
In polling, the client periodically sends requests to the server to check for new data. This is simple to implement, but it’s inefficient because the client is continually making requests, even when there’s no new data. This could lead to unnecessary network traffic and server load.
In long polling, the client sends a request to the server, just like in polling. However, if there’s no new data available, instead of sending an empty response, the server holds onto the request and waits until new data becomes available or until a certain period has elapsed. Once the data is available, it sends the response back to the client. The client, upon receiving the response, immediately sends another long-poll request. This gives the illusion of a real-time connection, with fewer unnecessary requests than traditional polling.
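A sketch of the client side of long polling, assuming a hypothetical /poll endpoint that either returns data or responds with 204 after a server-side timeout:

// Long-polling client: each request stays open until the server has data or
// times out, and a new request is issued immediately after each response.
async function longPoll(url: string, onMessage: (data: unknown) => void): Promise<void> {
  for (;;) {
    try {
      const response = await fetch(url); // the server holds this open until data arrives
      if (response.status === 204) continue; // timed out with no data: poll again
      onMessage(await response.json());
    } catch {
      await new Promise((r) => setTimeout(r, 1000)); // back off briefly on network errors
    }
  }
}

longPoll("https://api.example.com/poll", (data) => console.log("update:", data));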
Kafka uses long polling as it allows consumers to fetch data as soon as it is available. Kafka consumers can specify a timeout value for long polling, which allows them to wait for data to become available on the broker. This way, consumers can get data almost immediately after it is produced into Kafka, while reducing the network overhead that would result from constant polling.
Server-Sent Events (SSE)
Server-Sent Events (SSE) is a standard that allows a web server to push updates to a client whenever new data is available. It uses HTTP as the protocol for communication. The key characteristic of SSE is that it’s a one-way communication channel from the server to the client.
In SSE, the client initiates a connection with the server and the server keeps this connection open, using it to send data to the client when it’s available. The connection will remain open until explicitly closed by the client or due to some network failure or server decision. This allows real-time updates from the server to the client without any polling or long polling from the client side.
Browsers like Chrome cap the number of concurrent HTTP/1.1 connections per domain (typically six). SSE fits comfortably within that budget because a single long-lived connection carries many events, whereas a polling-based solution issues a new request every time the client checks for updates.
To make use of SSE, your server must be able to handle multiple open connections simultaneously, and the clients must support the EventSource API, which is built into most modern web browsers. SSE is particularly useful for applications where real-time updates are essential, such as live news updates, social media feeds, or real-time analytics.
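A minimal sketch of an SSE endpoint in Node.js, together with the matching browser-side EventSource usage (the payloads and port are illustrative):

import { createServer } from "node:http";

// Server: keep the response open and write events in the text/event-stream format.
createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  const timer = setInterval(() => {
    res.write(`data: ${JSON.stringify({ time: Date.now() })}\n\n`); // one event per write
  }, 1000);
  req.on("close", () => clearInterval(timer)); // stop when the client disconnects
}).listen(3000);

// Browser client: EventSource reconnects automatically if the connection drops.
// const source = new EventSource("http://localhost:3000/");
// source.onmessage = (event) => console.log("server said:", event.data);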
Multiplexing vs Demultiplexing
Multiplexing is the process of combining multiple signals into one, so they can be sent across a single channel, reducing the number of separate connections needed. Demultiplexing, on the other hand, is the opposite process, where one signal is split into multiple separate signals at the destination.
In the context of HTTP/2, this is key to one of its main improvements over HTTP/1.1.
In HTTP/1.1, a connection can carry only one request/response exchange at a time, so browsers open several parallel TCP connections to fetch a page. This results in higher overhead, especially for websites with many resources to load (images, scripts, etc.).
HTTP/2 introduces multiplexing, allowing multiple request/response pairs to be sent over a single TCP connection simultaneously. This reduces the overhead and latency and improves performance.
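A client-side sketch with Node.js’s built-in http2 module: several requests ride the same TCP connection as independent streams (the host and paths are placeholders):

import { connect } from "node:http2";

// One TCP (and TLS) connection; several concurrent streams multiplexed over it.
const session = connect("https://example.com");

for (const path of ["/styles.css", "/app.js", "/logo.png"]) {
  const stream = session.request({ ":path": path }); // each request is a separate stream
  stream.on("response", (headers) => {
    console.log(path, "status:", headers[":status"]);
  });
  stream.resume(); // discard the body in this sketch
  stream.on("end", () => stream.close());
}
// session.close() would be called once all streams are done (omitted here).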
However, there might be considerations in a complex backend architecture. For instance, multiplexing HTTP/2 requests on the backend might lead to higher CPU usage because the server needs to handle multiple requests in one connection simultaneously. Alternatively, using a proxy server (like Envoy) to demultiplex the HTTP/2 connection into multiple HTTP/1.1 connections can reduce CPU usage on the backend servers but might also decrease the throughput due to the overhead of managing more connections.
Connection Pooling
Connection pooling is a technique used to manage and maintain a cache of database connections to handle multiple concurrent requests efficiently. It minimizes the cost of establishing new connections by reusing existing ones, leading to a significant performance improvement.
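A sketch using node-postgres (pg), whose Pool keeps a bounded set of connections open and reuses them instead of opening a new one per query (the connection string and table are placeholders):

import { Pool } from "pg";

// A bounded pool: at most 10 physical connections are opened and then reused.
const pool = new Pool({
  connectionString: "postgres://user:password@localhost:5432/appdb", // placeholder
  max: 10,
});

async function getOrderCount(): Promise<number> {
  // query() borrows an idle connection, runs the statement, and returns it to
  // the pool, avoiding the TCP + TLS + auth handshake cost of a fresh connection.
  const result = await pool.query("SELECT count(*) AS n FROM orders");
  return Number(result.rows[0].n);
}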
HTTP/2, QUIC, MPTCP
HTTP/2 is the second major version of the HTTP protocol. It introduces binary framing, header compression, and request/response multiplexing over a single TCP connection, among other improvements.
QUIC (Quick UDP Internet Connections) is a transport-layer protocol originally developed by Google and since standardized by the IETF. It runs over UDP and combines the reliable delivery associated with TCP with low connection-setup latency and built-in TLS 1.3 encryption; it is also the transport underneath HTTP/3.
MPTCP (Multipath TCP) is an extension to the TCP protocol that enables TCP connections to use multiple paths to maximize resource usage and increase redundancy.
Stateless vs Stateful
Stateless and stateful describe whether a server stores information about a client between different requests.
A stateful server stores data about a client’s session and can refer to this data in subsequent interactions with the client. In contrast, a stateless server does not store session data; each request from a client is treated independently, without regard to prior requests or future ones.
While stateful systems can offer advantages like richer interactions and improved performance, they also introduce issues like scalability limitations and increased complexity. Stateless systems, on the other hand, are more scalable and simple but might require more requests or larger payloads to send necessary information with each interaction.
The discussion extends to protocols too. For example, TCP is a stateful protocol (keeps track of connection state, sequence numbers), while UDP is generally considered stateless.
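A small sketch contrasting the two approaches to identifying a user across requests (the session store and token check are simplified placeholders):

// Stateful: the server remembers the session; later requests must reach the
// same server (or a shared session store) for the lookup to succeed.
const sessions = new Map<string, { userId: string }>();

function handleStateful(sessionId: string): string | undefined {
  return sessions.get(sessionId)?.userId; // depends on server-side memory
}

// Stateless: everything needed to handle the request travels with the request,
// e.g. a signed token; any server can process it without shared session state.
function handleStateless(token: string): string | undefined {
  const [userId, signature] = token.split(".");
  return verify(userId, signature) ? userId : undefined;
}

function verify(payload: string, signature: string): boolean {
  // Placeholder: a real implementation would check an HMAC or JWT signature.
  return payload.length > 0 && signature.length > 0;
}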
Sidecar Pattern
The sidecar design pattern involves attaching an additional service to the main application (the “sidecar”) to extend or enhance its functionality. This pattern is beneficial in microservices architectures, allowing developers to abstract out certain features, such as monitoring, networking, and security, away from the main application.
When it comes to handling protocol communications, sidecar proxies are useful in decoupling the main application from the specific communication protocols, making the code cleaner and more maintainable. Changes to the communication protocols (e.g., upgrading from HTTP/1.1 to HTTP/3) can be handled in the proxy, without affecting the main application. A common example of a sidecar proxy is Envoy.
However, introducing sidecar proxies can add an extra hop in the network path, potentially increasing latency. They also add more components to manage, increasing the system’s complexity.