@chriszarate
Contributor

What?

A proof-of-concept exploring a default Yjs provider based on long-polling (server-sent events) and state stored in the WordPress database. See #74085

Why?

The current default provider is based on WebRTC and is unreliable in certain network conditions. Making it reliable requires centralized infrastructure that is probably unattainable.

An alternative default transport could use long-polling against an internal endpoint with state stored in the WordPress database. This approach would face performance issues on medium-to-large sites but would allow users to explore collaborative editing under some protective limits (e.g., a maximum of two simultaneous collaborators). Moving beyond these protective limits would require a more robust host-provided transport such as WebSockets.

How?

  1. Implement a new Yjs provider: HttpSseProvider
  2. Register a new REST API endpoint: /sync/v1/messages. The new provider will connect to this endpoint.
    • Messages (Yjs document updates) are sent to this endpoint via POST requests.
    • Messages are consumed via EventSource connections to this endpoint.
      • If a client connects and no other clients have connected recently, the server closes the connection and the client will reconnect after a short time. This is a naïve implementation of "lazy connections" that will reduce the number of consumed connections overall.
      • If other clients have recently connected, then the connection is kept open and new messages are sent as server-sent events.
  3. There is no PHP Yjs library. Without writing one, we have no ability to apply updates from connected clients onto a central document stored on the server. The server can only naïvely replay the messages it receives to other connected clients.
  4. Not all WordPress installations implement a shared object cache. PHP processes are isolated and there is no universal method for cooperative communication. Therefore, we store messages in transients, which are persisted in the WordPress database—or persistent object cache, if configured.
  5. The server cannot indefinitely store messages for replay; long-running sessions would eventually exhaust memory and storage. Nor can the server reliably determine when a user has left the session.
    • As a workaround, we ask clients to periodically send a "snapshot" of their local document and the ID of the last message applied to it. Upon receiving this snapshot, the server can (a) confirm that it is the "latest" available snapshot, (b) store that snapshot as an update, and (c) delete messages older than the indicated last message.
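
The server-side behavior described in steps 2 and 5 can be modeled roughly as follows. This is a minimal in-memory sketch, not the PR's PHP implementation: the `RoomLog` class, its method names, the numeric per-room message IDs, and the "recent connection" window are all assumptions made for illustration. Payloads are treated as opaque strings because the server cannot decode Yjs updates.

```typescript
// Hypothetical in-memory model of one room's message log.
// The PR persists this state in transients; names here are illustrative only.
interface SyncMessage {
  id: number; // assumed to increase monotonically per room
  payload: string; // opaque, encoded Yjs update; never decoded by the server
}

class RoomLog {
  private messages: SyncMessage[] = [];
  private nextId = 1;
  private snapshotUpTo = 0; // last message ID covered by an accepted snapshot
  private lastSeen: Record<string, number> = {}; // clientId -> last connect time (ms)

  // POST: store an update and return its ID.
  append(payload: string): number {
    const id = this.nextId++;
    this.messages.push({ id, payload });
    return id;
  }

  // EventSource: replay everything newer than the client's last message ID.
  since(lastMessageId: number): SyncMessage[] {
    return this.messages.filter((m) => m.id > lastMessageId);
  }

  // "Lazy connections": keep the stream open only if some OTHER client
  // connected within the window; otherwise the server closes the connection.
  shouldHoldOpen(clientId: string, nowMs: number, windowMs: number): boolean {
    this.lastSeen[clientId] = nowMs;
    return Object.keys(this.lastSeen).some(
      (other) => other !== clientId && nowMs - this.lastSeen[other] <= windowMs
    );
  }

  // Snapshot handling: (a) reject stale or impossible snapshots,
  // (c) drop messages the snapshot already contains, (b) store the snapshot
  // as an ordinary update. Assumption: the message with ID lastAppliedId is
  // dropped too, since the snapshot includes it.
  compact(snapshot: string, lastAppliedId: number): boolean {
    if (lastAppliedId <= this.snapshotUpTo || lastAppliedId >= this.nextId) {
      return false;
    }
    this.messages = this.messages.filter((m) => m.id > lastAppliedId);
    this.append(snapshot);
    this.snapshotUpTo = lastAppliedId;
    return true;
  }
}
```

Storing the snapshot as a new message means a client that is already ahead of it may receive it again. That should be harmless because Yjs updates are idempotent and commutative, but it does cost bandwidth.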

Limitations and considerations

  1. There are likely some state bugs or race conditions in this new client-server interchange. This is simply a proof-of-concept for discussion.
  2. The sync manager creates a new provider instance for each entity being synced. Currently, the HttpSseProvider opens a new EventSource connection for each instance. As we provide support for additional entity syncing, this will consume more and more HTTP connections, overwhelming lower-resourced hosts.
    • If we move forward with this approach, we will probably want to reuse a single EventSource connection, which will require separately tracking the rooms and last_message_ids for each client.
  3. Yjs providers are meshable. We could ship multiple default providers that provide progressive enhancement depending on the host's configuration and resources.
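
To illustrate item 2, here is one way a single shared connection could track rooms and last message IDs separately. This is a sketch under assumptions, not code from the PR: `SharedStream`, `RoomEvent`, and the `room=id` query format are hypothetical, and the actual `EventSource` wiring is left out so the bookkeeping can be read on its own.

```typescript
// Hypothetical bookkeeping for one shared connection serving many rooms.
type RoomHandler = (payload: string) => void;

interface RoomEvent {
  room: string;
  id: number; // assumed to increase monotonically per room
  payload: string; // opaque, encoded Yjs update
}

class SharedStream {
  private handlers: Record<string, RoomHandler> = {};
  private lastMessageIds: Record<string, number> = {};

  // Each provider instance registers its room instead of opening a connection.
  subscribe(room: string, handler: RoomHandler): void {
    this.handlers[room] = handler;
    if (!(room in this.lastMessageIds)) {
      this.lastMessageIds[room] = 0;
    }
  }

  unsubscribe(room: string): void {
    delete this.handlers[room];
    delete this.lastMessageIds[room];
  }

  // Query string the single connection would send when (re)connecting,
  // so the server can resume each room from the right position.
  resumeQuery(): string {
    return Object.keys(this.lastMessageIds)
      .sort()
      .map((room) => `${encodeURIComponent(room)}=${this.lastMessageIds[room]}`)
      .join("&");
  }

  // Route an incoming event to its room and advance that room's position.
  // Returns false for unknown rooms and for already-seen message IDs.
  dispatch(event: RoomEvent): boolean {
    const handler = this.handlers[event.room];
    const last = this.lastMessageIds[event.room] ?? 0;
    if (!handler || event.id <= last) {
      return false;
    }
    this.lastMessageIds[event.room] = event.id;
    handler(event.payload);
    return true;
  }
}
```

With this shape, adding another synced entity costs one more map entry rather than one more HTTP connection.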

Testing Instructions

  1. Check out this PR.
  2. Enable the collaborative editing experiment (Gutenberg > Experiments).
  3. Open a post for editing in two browsers.

Testing Instructions for Keyboard

n/a

Screenshots or screencast

sse-sync.mov

@chriszarate added the [Feature] Real-time Collaboration and [Type] Experimental labels on Jan 2, 2026
@github-actions

github-actions bot commented Jan 2, 2026

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: chriszarate <czarate@git.wordpress.org>
Co-authored-by: maxschmeling <maxschmeling@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@maxschmeling

How will this handle the situation where the transient is deleted early? My initial impression is that it would stop being able to sync until a full snapshot is sent again? What if the transient storage is under pressure and being deleted frequently? Do we need to worry about that scenario?

Everyone seems to misunderstand how transient expiration works, so the long and short of it is: transient expiration times are a maximum time. There is no minimum age. Transients might disappear one second after you set them, or 24 hours, but they will never be around after the expiration time.

https://developer.wordpress.org/apis/transients/
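
To make those expiration semantics concrete, here is a toy model. It is not the WordPress Transients API: `ToyTransients`, `evictEarly`, and `readBacklog` are hypothetical names, and time is passed in explicitly so the behavior is deterministic. The point is only that a read can miss at any moment before the expiration time, so callers have to treat a miss as a normal outcome.

```typescript
// Toy model of transient-like storage: the TTL is a maximum lifetime,
// and an entry may vanish at any time before it.
class ToyTransients<T> {
  private entries: Record<string, { value: T; expiresAt: number }> = {};

  set(key: string, value: T, ttlMs: number, nowMs: number): void {
    this.entries[key] = { value, expiresAt: nowMs + ttlMs };
  }

  // Returns undefined once the TTL has passed OR if the entry was evicted early.
  get(key: string, nowMs: number): T | undefined {
    const entry = this.entries[key];
    if (!entry) {
      return undefined;
    }
    if (nowMs >= entry.expiresAt) {
      delete this.entries[key];
      return undefined;
    }
    return entry.value;
  }

  // Simulates deletion under storage pressure, outside the application's control.
  evictEarly(key: string): void {
    delete this.entries[key];
  }
}

// A miss is not an error: the backlog may simply be gone, so fall back to empty.
function readBacklog(
  store: ToyTransients<string[]>,
  room: string,
  nowMs: number
): string[] {
  return store.get(room, nowMs) ?? [];
}
```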

@chriszarate
Contributor Author

chriszarate commented Jan 5, 2026

How will this handle the situation where the transient is deleted early? My initial impression is that it would stop being able to sync until a full snapshot is sent again?

Sync would continue to function, but individual peers may not have an up-to-date representation of the Yjs document state until they receive a more complete snapshot from another peer.

What if the transient storage is under pressure and being deleted frequently?

Great point. Because eviction is not controlled by the application, neither transients nor the object cache are ideal persistence layers for sync data. Under severe pressure, where sync data cannot survive longer than 30s, I'd guess that syncing would stop functioning reliably.

This PR is really just to show that it's possible to implement a Yjs provider backed by the WordPress database (and not some other network service), and therefore provide a sync transport that works (in theory) on every WordPress installation. Instead of transients, maybe we should target a new built-in post type and manage evictions manually? Or perhaps a better idea will emerge.


Labels

[Feature] Real-time Collaboration, [Package] Sync, [Type] Experimental
