System Design of Collaborative Editing Tool

Sanket Saxena
Jun 17, 2023


The widespread shift towards remote and collaborative work necessitates advanced tools that can handle simultaneous operations from multiple users. These applications, like Google Docs, Microsoft Office 365, and Apple Notes, should be capable of supporting thousands or even millions of users while ensuring data consistency, avoiding versioning conflicts, and offering a seamless user experience.

In this article, we’ll dive into the technical aspects of designing a collaborative editing tool. Our focus will be on handling concurrent edits, lock mechanisms, versioning and conflict resolution, Operational Transformation, Differential Synchronisation, and maintaining an edit history. Additionally, we will discuss how these platforms can be scaled to support millions of users.

Concurrency

Concurrency refers to the ability of a system to handle multiple operations at the same time. In a collaborative editing tool, it is necessary to manage concurrent edits from multiple users on the same document, and this poses a significant challenge due to the complexity of tracking all changes and maintaining data consistency.

Optimistic vs Pessimistic Locking

Locking mechanisms play a critical role in handling concurrency. The two major types of locking mechanisms used are optimistic and pessimistic locking.

Pessimistic locking assumes conflicts will occur and locks the shared resource when any user accesses it. This approach ensures data consistency but can limit the collaborative aspect of the tool as only one user can modify a resource at a given time.

In contrast, optimistic locking is more permissive and allows multiple users to read and modify a shared resource simultaneously. Conflicts are checked only when an update is written back, typically by comparing the version the edit was based on with the current version. This enhances collaboration, but the system must then detect and resolve the conflicts that do occur.
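A minimal sketch of the optimistic approach, assuming each document carries a version counter that is compared at write time. The `DocumentStore` class and its method names are hypothetical, and a real system would perform the check atomically in the database rather than in memory.

```python
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


@dataclass
class Document:
    text: str
    version: int = 0


class DocumentStore:
    """In-memory stand-in for a database row with a version column."""

    def __init__(self, text: str) -> None:
        self._doc = Document(text)

    def read(self) -> Document:
        # Return a copy so callers cannot mutate the stored document directly.
        return Document(self._doc.text, self._doc.version)

    def write(self, new_text: str, base_version: int) -> int:
        # The conflict check happens only here, at update time.
        if base_version != self._doc.version:
            raise VersionConflict(
                f"based on v{base_version}, current is v{self._doc.version}"
            )
        self._doc = Document(new_text, self._doc.version + 1)
        return self._doc.version


store = DocumentStore("apple")
alice = store.read()
bob = store.read()

store.write(alice.text + " pie", alice.version)   # succeeds, version becomes 1
try:
    store.write(bob.text + " tart", bob.version)  # stale: still based on version 0
    bob_conflicted = False
except VersionConflict:
    bob_conflicted = True                         # Bob must re-read and retry
```

Neither user ever blocks the other; the cost is that the losing writer has to re-read the document and merge or retry.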

Versioning and Conflict Resolution

Versioning and conflict resolution strategies are vital to maintaining data consistency in collaborative editing tools. When multiple users edit a document simultaneously, conflict resolution is needed to determine the final state of the document.

Version control systems generally use either a state-based or operation-based approach. The state-based approach, like the one used in Git, stores and manages different versions of the document. In contrast, the operation-based approach tracks and stores the sequence of operations performed on the document.

Conflict resolution can be automatic, where the system decides the final state, or manual, where the users involved in the conflict decide the final state.

Operational Transformation

Operational Transformation (OT) is an algorithm for synchronizing shared state in real-time collaborative applications. It’s a concurrency control mechanism that enables multiple users to perform operations independently and concurrently on shared data, and it ensures data consistency across all replicas.

Here’s a simple example of OT. Suppose two users, User1 and User2, are editing a shared document that initially contains the word “apple”. User1 deletes the second ‘p’ (the character at index 2, counting from 0), which on its own gives “aple”. User2 changes the ‘e’ (index 4) to ‘y’, which on its own gives “apply”. Both operations start concurrently, and each is expressed against the original text.

Without OT, the outcome depends on arrival order. If User2’s operation is applied first and then User1’s, we get “aply”. If User1’s operation is applied first, the document shrinks to “aple”, so User2’s operation now points at index 4, which no longer exists: depending on the implementation it fails or lands on the wrong character. Replicas that see the operations in different orders therefore diverge.

With OT, the system transforms each incoming operation against the concurrent operations that have already been applied. Here, User2’s operation is shifted from index 4 to index 3 once User1’s deletion has been applied, while User1’s deletion needs no adjustment because a replacement does not move any characters. Regardless of the order in which the operations arrive, every replica converges on “aply”, which preserves both users’ intent.
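The transformation step can be sketched for single-character operations on “apple”: a deletion of the second ‘p’ (index 2) and a replacement of ‘e’ (index 4) with ‘y’. The `Delete`, `Replace`, `apply`, and `transform` names are illustrative assumptions, not taken from any particular OT library, and the sketch omits the tie-break rule that two concurrent replacements of the same character would need.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Delete:
    pos: int


@dataclass(frozen=True)
class Replace:
    pos: int
    char: str


Op = Union[Delete, Replace]


def apply(text: str, op: Optional[Op]) -> str:
    if op is None:  # the operation was cancelled out by a concurrent one
        return text
    if isinstance(op, Delete):
        return text[:op.pos] + text[op.pos + 1:]
    return text[:op.pos] + op.char + text[op.pos + 1:]


def transform(op: Op, against: Op) -> Optional[Op]:
    """Rewrite `op` so it can be applied after `against` has been applied."""
    if isinstance(against, Replace):
        return op  # a replacement does not shift any positions
    # `against` is a Delete from here on.
    if op.pos < against.pos:
        return op  # the deletion happened to the right of `op`
    if op.pos == against.pos:
        return None  # the target character is already gone
    shifted = op.pos - 1  # everything right of the deletion moved left by one
    return Delete(shifted) if isinstance(op, Delete) else Replace(shifted, op.char)


doc = "apple"
op1 = Delete(2)        # User1: remove the second 'p'
op2 = Replace(4, "y")  # User2: turn 'e' into 'y'

# Site A applies op1 first, then op2 transformed against op1.
site_a = apply(apply(doc, op1), transform(op2, op1))
# Site B applies op2 first, then op1 transformed against op2.
site_b = apply(apply(doc, op2), transform(op1, op2))
```

Both `site_a` and `site_b` end up as “aply”, even though the two sites applied the operations in opposite orders.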

Differential Synchronisation

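Differential Synchronisation (DiffSync) is another algorithm for synchronizing shared data in real-time collaborative applications. It was developed by Neil Fraser at Google and is implemented in his MobWrite project; the current Google Docs editor, by contrast, is generally described as being built on Operational Transformation.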

In DiffSync, each side keeps its working text plus a ‘shadow’ copy, and the server additionally keeps a ‘backup’ of its shadow. The shadow is the last version known to be shared with the other side, while the backup holds the previously synced shadow so the server can recover if a message is lost. Changes are calculated as ‘diffs’ between the current document and its shadow; these diffs are sent to the other side, which applies them as patches to both its shadow and its own text.

Now, why do servers in DiffSync have to maintain a snapshot before applying the patch, and why should this process be isolated? This is because when the server receives a patch from a client, it has to make sure that the patch applies correctly to the shadow copy. If other changes occur simultaneously, it could result in a state where the received patch cannot be applied correctly, leading to inconsistencies. To avoid this, the server takes a snapshot of the document and applies the patch in isolation.
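One half of a DiffSync cycle (client to server) might look like the sketch below. It uses Python's standard `difflib` for the diff and a strict positional patch; the `diff` and `patch` helpers are hypothetical, and the sketch assumes the server text has not been changed by anyone else in the meantime. Real DiffSync applies patches to the server text on a fuzzy, best-effort basis, which this example does not attempt.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# (start, end, replacement): replace old[start:end] with `replacement`.
Edit = Tuple[int, int, str]


def diff(old: str, new: str) -> List[Edit]:
    """Edits that turn `old` into `new`, expressed as slices of `old`."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def patch(text: str, edits: List[Edit]) -> str:
    """Apply edits right to left so that earlier offsets stay valid."""
    for start, end, replacement in sorted(edits, reverse=True):
        text = text[:start] + replacement + text[end:]
    return text


# Both sides start in sync.
client_text = client_shadow = "The cat sat"
server_text = server_shadow = "The cat sat"

client_text = "The black cat sat down"      # the user types locally

# Client: diff the current text against the shadow, then let the shadow catch up.
edits = diff(client_shadow, client_text)
client_shadow = client_text

# Server: these two steps must run as one isolated unit per client.
server_shadow = patch(server_shadow, edits)  # exact: the shadows were identical
server_text = patch(server_text, edits)      # best effort in real DiffSync
```

If another request modified `server_shadow` between the two server steps, the offsets in `edits` would no longer line up, which is why the server-side patching has to be isolated.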

Maintaining Edit History

Maintaining an edit history is an important feature for any collaborative editing tool. It allows users to understand the evolution of the document and to revert changes if required.

This can be implemented by storing every change or operation made on a document. Each operation includes information such as the type of operation (add, delete, modify), the content of the operation, the timestamp, and the author. Using this information, we can reconstruct the state of the document at any point in time.
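As a rough illustration, the sketch below stores each change as an operation record and rebuilds the document by replaying the log up to a chosen timestamp. The `Operation` fields mirror the ones listed above; the field names and the `replay` function are assumptions, and a "modify" is modelled here as a delete followed by an add to keep the example short.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Operation:
    kind: str         # "add" or "delete"
    pos: int          # character offset the operation applies at
    content: str      # inserted text, or the text that was removed
    timestamp: float
    author: str


def replay(history: List[Operation], until: float, initial: str = "") -> str:
    """Rebuild the document as it looked at time `until`."""
    text = initial
    for op in sorted(history, key=lambda o: o.timestamp):
        if op.timestamp > until:
            break
        if op.kind == "add":
            text = text[:op.pos] + op.content + text[op.pos:]
        elif op.kind == "delete":
            text = text[:op.pos] + text[op.pos + len(op.content):]
        else:
            raise ValueError(f"unknown operation kind: {op.kind}")
    return text


history = [
    Operation("add", 0, "Hello", 1.0, "alice"),
    Operation("add", 5, " world", 2.0, "bob"),
    Operation("delete", 0, "Hello", 3.0, "alice"),
    Operation("add", 0, "Goodbye", 4.0, "alice"),
]

at_t2 = replay(history, until=2.0)  # state after Bob's edit
latest = replay(history, until=4.0)  # state after all four operations
```

Replaying from the beginning gets slower as the log grows, so a practical design would likely combine the log with periodic snapshots and replay only the operations recorded after the nearest one.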

Scaling Collaborative Tools

Scalability is a major consideration in the design of collaborative editing tools. To handle millions of users, the system must be able to scale both vertically (adding more resources to a single node in the system) and horizontally (adding more nodes to the system).

Horizontal scalability can be achieved through sharding, where the data is partitioned and distributed across multiple servers. Each shard handles a subset of users, reducing the load on any single server.
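A simple way to pick a shard is to hash a stable key and take the result modulo the number of shards. The sketch below keys on a user ID to match the description above; the `shard_for` function and the ID format are hypothetical, and a document ID would work the same way.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in the range [0, shard_count)."""
    # A cryptographic hash is used only because it is stable across
    # processes and machines, unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


user_ids = [f"user-{n}" for n in range(1000)]
assignments = [shard_for(user_id, 4) for user_id in user_ids]
```

Plain modulo hashing moves most keys to a different shard whenever `shard_count` changes; consistent hashing is a common alternative when shards are added or removed frequently.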

Load balancing is another strategy used to distribute the workload evenly across servers, preventing any single server from becoming a bottleneck. This can be done using techniques such as Round Robin, Least Connections, and IP Hashing.
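Two of these strategies can be sketched in a few lines. The class names and server labels below are illustrative assumptions; production load balancers also track health checks and timeouts, which are left out here.

```python
import itertools
from typing import Dict, Iterator, List


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers: List[str]) -> None:
        self._cycle: Iterator[str] = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Pick the server that currently has the fewest open connections."""

    def __init__(self, servers: List[str]) -> None:
        self.active: Dict[str, int] = {server: 0 for server in servers}

    def acquire(self) -> str:
        # Ties go to the server that was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1


rr = RoundRobin(["a", "b", "c"])
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnections(["a", "b", "c"])
first_three = [lc.acquire() for _ in range(3)]
lc.release("b")          # the connection on "b" closes
next_pick = lc.acquire()  # "b" is now the least loaded server
```

Least Connections tends to suit long-lived connections better than Round Robin, because the number of open connections per server can drift apart over time.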

In summary, designing a collaborative editing tool involves several complex considerations. By understanding and properly implementing concepts like concurrency, locking mechanisms, Operational Transformation, Differential Synchronisation, versioning, conflict resolution, and scalability, we can develop robust and efficient tools that support seamless collaboration.

Components of a Collaborative Editing Tool

Building a collaborative editing tool involves several components that work together to deliver a smooth user experience. Let’s look at the critical components in detail.

API Gateway

An API Gateway acts as a facade for all the incoming requests and routes them to the appropriate internal services. It handles security by providing features like authentication and authorization, rate-limiting to prevent abuse of the system, and circuit breaking to prevent system failure when a service is overwhelmed.
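Rate limiting at the gateway is often described in terms of a token bucket. The sketch below is one possible version, with a hypothetical `TokenBucket` class that takes the current time as an argument so the behaviour is easy to follow; a real gateway would keep one bucket per client or API key and read the time from a monotonic clock.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = now

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would typically respond with HTTP 429


bucket = TokenBucket(capacity=2, refill_per_second=1.0)
burst = [bucket.allow(now=0.0) for _ in range(3)]  # third request is rejected
after_wait = bucket.allow(now=1.0)                  # one token has been refilled
```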

Authentication Service

An Authentication service handles the identity of the user. It ensures that the user is who they claim to be and provides the user with a token that can be used to verify their identity in subsequent requests. This is essential to ensure that only authorized users can access and modify documents.
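The token flow can be sketched with an HMAC-signed token built from the standard library. This is a simplified stand-in under stated assumptions, not a full standard such as JWT: the `SECRET` value, the claim names, and the `issue_token` and `verify_token` functions are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-secret-change-me"  # hypothetical; load from a secret store in practice


def issue_token(user_id: str, ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Return '<payload>.<signature>' for an already authenticated user."""
    issued_at = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": issued_at + ttl_seconds})
    body = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    signature = hmac.new(SECRET, body.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user ID if the token is authentic and unexpired, else None."""
    try:
        body, signature = token.rsplit(".", 1)
    except ValueError:
        return None  # no separator: not one of our tokens
    expected = hmac.new(SECRET, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    return claims["sub"] if claims["exp"] > current else None


token = issue_token("alice", ttl_seconds=3600, now=1000.0)
valid_user = verify_token(token, now=1001.0)
expired_user = verify_token(token, now=1000.0 + 3601)
tampered = token[:-1] + ("0" if token[-1] != "0" else "1")
tampered_user = verify_token(tampered, now=1001.0)
```

Because the signature covers the payload, other services can verify a token with the shared secret alone, without calling back to the Authentication service on every request.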

Comment Service

The Comment service handles all operations related to comments in the document. This includes adding new comments, editing or deleting existing comments, and retrieving comments. Each comment is linked to a specific user and a specific part of the document.

App Server APIs

App Server APIs handle various functions related to document management, such as importing and exporting documents. They can convert the document into different formats (like .docx, .pdf, .txt) and also provide previews of the document.

Session Server

A Session Server, often backed by a time-series database, keeps track of the edit history of a document. Each edit action on a document is stored as a data point in the time series. This allows the system to reconstruct the state of the document at any point in time.

Operation Queue

The Operation Queue is an essential component that holds operations to be executed on the documents. It helps manage concurrency, resolves conflicts between operations, and ensures operations are idempotent (repeated operations have the same effect as a single operation).
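One way to get idempotency is to give every operation a unique ID and have the queue drop IDs it has already accepted, so a client retry after a network timeout does not apply the same edit twice. The `QueuedOp` and `OperationQueue` names below are assumptions, and the sketch handles insert operations only.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Set


@dataclass(frozen=True)
class QueuedOp:
    op_id: str  # unique per client operation, reused on retries
    pos: int
    text: str


class OperationQueue:
    def __init__(self) -> None:
        self._pending: Deque[QueuedOp] = deque()
        self._seen: Set[str] = set()

    def enqueue(self, op: QueuedOp) -> bool:
        """Accept an operation once; return False for duplicates."""
        if op.op_id in self._seen:
            return False
        self._seen.add(op.op_id)
        self._pending.append(op)
        return True

    def drain(self, document: str) -> str:
        """Apply all pending operations in arrival order."""
        while self._pending:
            op = self._pending.popleft()
            document = document[:op.pos] + op.text + document[op.pos:]
        return document


queue = OperationQueue()
accepted = queue.enqueue(QueuedOp("client1-1", 0, "Hi"))
retried = queue.enqueue(QueuedOp("client1-1", 0, "Hi"))  # duplicate delivery
queue.enqueue(QueuedOp("client2-1", 2, "!"))
result = queue.drain("")
```

In a long-running service the set of seen IDs would need to be bounded, for example by expiring entries once the client has acknowledged the result.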

WebSocket with Node.js

WebSockets provide a full-duplex communication channel over a single TCP connection. This gives the client and the server a persistent connection over which either side can push edits as they happen, which is necessary for real-time collaborative editing. Node.js is a popular choice for implementing WebSocket servers because its event-driven, non-blocking I/O model lets a single process hold many mostly idle connections efficiently.

By understanding these components and how they interact, we can better design a collaborative editing tool that effectively manages concurrent edits, ensures data consistency, provides a responsive user experience, and scales to support millions of users.
