Hi, suppose you had 3 highly available web servers being served by a load balancer,

Storage agnostic implementation about yjs HOT 9 CLOSED

yjs commented on May 18, 2024

Storage agnostic implementation

from yjs.

Comments (9)

dmonad commented on May 18, 2024 2

If you intend to handle persistence yourself, you can use the y-memory storage adapter on the server side, because it is much faster. @Joeao persists the shared document to the database at regular intervals. If there is no Yjs server available, he populates the Yjs instance with the current database version of the shared document.
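A minimal sketch of that "persist at intervals, restore from the database" pattern, for illustration only. Every name here (`SnapshotStore`, `InMemoryDatabase`, `SharedDocument`, `openDocument`, `persistIfDirty`, `startPeriodicPersistence`) is hypothetical and none of them are Yjs APIs. The document is modelled as a plain string, and the database is a `Map` standing in for a real one.

```typescript
// Hypothetical stand-ins for illustration; these are not Yjs APIs.
interface SnapshotStore {
  load(room: string): string | undefined;
  save(room: string, snapshot: string): void;
}

// A Map pretending to be the application database.
class InMemoryDatabase implements SnapshotStore {
  private rows = new Map<string, string>();
  load(room: string): string | undefined {
    return this.rows.get(room);
  }
  save(room: string, snapshot: string): void {
    this.rows.set(room, snapshot);
  }
}

// The shared document held in memory on the server.
class SharedDocument {
  readonly room: string;
  private content: string;
  private dirty = false;

  constructor(room: string, content: string) {
    this.room = room;
    this.content = content;
  }
  edit(next: string): void {
    this.content = next;
    this.dirty = true;
  }
  snapshot(): string {
    return this.content;
  }
  isDirty(): boolean {
    return this.dirty;
  }
  markClean(): void {
    this.dirty = false;
  }
}

// Populate the in-memory document from the last persisted version, if any.
function openDocument(room: string, store: SnapshotStore): SharedDocument {
  return new SharedDocument(room, store.load(room) ?? "");
}

// Write the current state to the database only if it changed since the last save.
function persistIfDirty(doc: SharedDocument, store: SnapshotStore): boolean {
  if (!doc.isDirty()) return false;
  store.save(doc.room, doc.snapshot());
  doc.markClean();
  return true;
}

// Run persistIfDirty on a timer; the returned function stops the timer.
function startPeriodicPersistence(
  doc: SharedDocument,
  store: SnapshotStore,
  intervalMs: number
): () => void {
  const timer = setInterval(() => persistIfDirty(doc, store), intervalMs);
  return () => clearInterval(timer);
}
```

In this sketch, any edit made after the last timer tick is lost if the server stops before the next one, so the interval bounds how much work can go missing.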

Just get back to me if you need that "special connector". I'm also available on Gitter if you have a quick question.

dmonad commented on May 18, 2024

For load balancing I'd recommend partitioning the connections based on the room name (the data to share). A simple example: all connections to rooms starting with A-H go to server 1, and I-Z go to server 2. But there are probably more sophisticated methods.
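As a rough illustration of room-based partitioning, the sketch below shows the A-H / I-Z rule and one possible hash-based alternative. Both function names are hypothetical. Sending names that do not start with A-H (digits, symbols, empty names) to server 2 is an assumption of this sketch, as is the choice of hash.

```typescript
// Range rule from the example: rooms starting with A-H go to server 1,
// everything else (assumed here to include non-letters) goes to server 2.
function serverForRoomByLetter(roomName: string): number {
  const first = roomName.charAt(0).toUpperCase();
  return first >= "A" && first <= "H" ? 1 : 2;
}

// One possible "more sophisticated" rule: hash the whole room name and take
// it modulo the number of servers. Returns a 0-based server index.
function serverForRoomByHash(roomName: string, serverCount: number): number {
  if (!Number.isInteger(serverCount) || serverCount < 1) {
    throw new RangeError("serverCount must be a positive integer");
  }
  let hash = 0;
  for (let i = 0; i < roomName.length; i++) {
    // Simple polynomial hash, kept within unsigned 32-bit range.
    hash = (hash * 31 + roomName.charCodeAt(i)) >>> 0;
  }
  return hash % serverCount;
}
```

Either rule keeps all clients of one room on the same server, which is the point of partitioning by room name. Neither helps if that server goes down, and changing `serverCount` in the hash variant remaps most rooms.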

It is possible to write your own database adapter. You could write your own MongoDB database adapter. But it will probably take some time to do that (and you would have to keep up with my modifications to the API).

I'd recommend using one of the existing database adapters (y-leveldb on the server). The database populated by Yjs is full of cryptic nonsense, so you are not supposed to read from it anyway.

If you need to read or modify the data server-side, I'd recommend starting a new Yjs instance and closing it when you are finished.

Does this answer your question?

jfstephe commented on May 18, 2024

Thanks for your response. This doesn't really solve the availability/resilience issue but it does address scalability to a point. If server 1 (serving A-H) went down we would still be toast.

I suppose what would be great (from my point of view at least ;-) ), would be to separate the storage completely. Have something that raises events on the server, and then the implementation could decide to persist this or not. Maybe that's what you mean when you talk about writing the MongoDB adapter, but I don't see any documentation on how to do this (there's a lot of good stuff and I could have missed it!). Could you give me pointers as to where to look in the docs or code?

Could the above ever possibly work if the database was a source of change, e.g. a client connected to server 1 makes a change to the DB, server 2 picks up the change via DB triggers and propagates this change to all its clients?

Tip: If you aren't using semver for your API version management you (or at least your API consumers) will greatly benefit from it :-).

dmonad commented on May 18, 2024

If you want to implement your own DB adapter I can only offer you the source code as documentation (the existing DB adapters). If you really want to do that I can assist you, and improve the documentation. But I recommend not using MongoDB. Usually I would agree with your statement about separating storage and application. But for this specific use case I think an embedded database is better: Yjs modifies the database in a very specific way, and you will experience a huge performance loss compared to LevelDB.

Also I wouldn't recommend working with database triggers. If even one database change is not propagated to all clients, the shared document can no longer be reconstructed. In my experience TCP connections are too fragile for this.

But there is a solution that comes pretty close to what you want. You can configure the servers to exchange updates with each other using a special connector implementation. The servers share data with each other in a totally connected network, and clients connect to any of the servers. You would end up with the following network topology:

In this picture the servers (blue, light blue, orange) are connected in a totally connected network, and clients (light-orange) connect to one of the servers. As long as one server is running, there is a copy of the shared data. This would require implementing a special connector. I thought about implementing this for some time, but haven't had the need for it. Would this fit your expectations?
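To make the topology concrete, here is a small in-process simulation of a totally connected server network. It is not the "special connector" and it uses no Yjs APIs: `MeshServer`, `buildFullMesh`, and the `Update` shape are hypothetical, and updates are opaque records identified by an `id`.

```typescript
// Hypothetical update record; a real system would carry CRDT operations.
type Update = { id: string; payload: string };

class MeshServer {
  readonly name: string;
  private peers: MeshServer[] = [];
  private seen = new Set<string>();
  private log: Update[] = [];
  private online = true;

  constructor(name: string) {
    this.name = name;
  }

  // Link two servers in both directions (no-op if already linked).
  connectTo(peer: MeshServer): void {
    if (peer === this || this.peers.indexOf(peer) !== -1) return;
    this.peers.push(peer);
    peer.peers.push(this);
  }

  // Entry point for a client connected to this server.
  submit(update: Update): void {
    if (!this.receive(update)) return;
    // In a full mesh every peer is one hop away, so one forward is enough.
    for (const peer of this.peers) peer.receive(update);
  }

  // Store an update once; offline servers and duplicates are ignored.
  receive(update: Update): boolean {
    if (!this.online || this.seen.has(update.id)) return false;
    this.seen.add(update.id);
    this.log.push(update);
    return true;
  }

  shutDown(): void {
    this.online = false;
  }

  updates(): readonly Update[] {
    return this.log;
  }
}

// Create the servers and connect every pair.
function buildFullMesh(names: string[]): MeshServer[] {
  const servers = names.map((name) => new MeshServer(name));
  servers.forEach((a, i) => servers.slice(i + 1).forEach((b) => a.connectTo(b)));
  return servers;
}
```

With three servers, an update submitted to any one of them ends up on all three, and a surviving server keeps accepting updates after the others shut down. The sketch leaves out the hard part: a server that was offline has missed updates and would need to resynchronize before serving clients again.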

dmonad commented on May 18, 2024

FYI: I'm using semver.

jfstephe commented on May 18, 2024

Re: Semver - cool - I should have checked |:-)

Re: special connector - looks good - thanks for taking the time to explain. I'm trying to fit it all together in my head and I think it would work. I do want to persist the final objects to my own DB so I can query them, but I suppose I could set a client node up on the (MongoDB) server and persist changes when they occur or something like that. Feels like it could get fruity as I suppose the DB client node would need to be involved in every edit session.

Regarding separating storage, I suppose I see LevelDB as an implementation detail, and for me it could be in memory (in the diagram you previously posted). All the cool stuff that I'm sure is going on with regard to getting CRDT working in the DB, as a consumer I don't care about (too much). The resulting final objects could be published for something else to persist as they see fit, but I suppose this could be a client node as already mentioned. I've only touched on this stuff so far and will bow to your superior knowledge. Just thought I'd mention it :-).

WinstonFassett commented on May 18, 2024

Just saw this, but wanted to mention that y-leveldb uses levelup, which is just an interface, so y-leveldb is already somewhat storage agnostic and could support any valid levelup backend. https://github.com/Level/levelup/wiki/Modules.

I'm not sure if this is something @dmonad ever envisioned for y-leveldb, but I've tried several backends and most of them worked. I did have to fork y-leveldb, mainly to switch out storage backends, which is just a line of code, but also to make a couple of optimizations for running the os database in memory and mirroring writes out to the remote database.

I'm actually using MongoDB right now and it's been working well (sorry @dmonad ;), although it is a lightly customized version of y-leveldb.

My implementation currently uses mongodown to connect to a free tier MongoDB instance on Heroku. Before that I was using sqldown to store the data in PostgreSQL, but it was too many rows for the free tier. It was so easy to switch backends that it was fun to experiment. Redis is also possible (via redisdown). DynamoDB is theoretically possible via dynamo-down but I haven't tried it.

I did encounter the issues @dmonad alluded to when he said, "for this specific use-case I think an embedded database is better: Yjs modifies the database in a very specific way, and you will experience a huge performance loss compared to LevelDB."

While I've continued to use y-leveldb, the real performance issues are with the data being remote, so I saw some real slowdown and also data loss when using remote levelup back-ends. However, I hadn't seen this thread, ha, so I worked around it and the approach has worked pretty well for several months.

Basically, I added an in-memory layer specifically for the os database where, instead of going direct to mongodown (or sqldown etc.), I used memdown to run an in-memory instance of the database and then used level-live-stream to stream changes out to mongodown.
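The shape of that workaround can be sketched without any of the level packages. The code below is an illustration under stated assumptions, not the fork described above: `KeyValueBackend`, `MapBackend`, and `MirroredStore` are hypothetical names, the remote backend is synchronous for brevity (real levelup backends are asynchronous), and an explicit `flush()` stands in for the continuous streaming that level-live-stream provides.

```typescript
// Hypothetical minimal backend interface; real levelup stores are async.
interface KeyValueBackend {
  put(key: string, value: string): void;
  get(key: string): string | undefined;
}

// A Map pretending to be the slow remote database.
class MapBackend implements KeyValueBackend {
  private data = new Map<string, string>();
  put(key: string, value: string): void {
    this.data.set(key, value);
  }
  get(key: string): string | undefined {
    return this.data.get(key);
  }
}

// Reads and writes hit memory; writes are queued and mirrored to the remote later.
class MirroredStore {
  private memory = new Map<string, string>();
  private pending: Array<[string, string]> = [];
  private remote: KeyValueBackend;

  constructor(remote: KeyValueBackend) {
    this.remote = remote;
  }

  put(key: string, value: string): void {
    this.memory.set(key, value);
    this.pending.push([key, value]);
  }

  // Served from memory, so the remote's latency never affects reads.
  get(key: string): string | undefined {
    return this.memory.get(key);
  }

  // Replay queued writes to the remote in order; returns how many were sent.
  flush(): number {
    const batch = this.pending;
    this.pending = [];
    for (const [key, value] of batch) this.remote.put(key, value);
    return batch.length;
  }

  pendingCount(): number {
    return this.pending.length;
  }
}
```

The trade-off is visible in `pendingCount()`: anything still queued when the process dies never reaches the remote database.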

Frankly it was an experiment I didn't expect to work, but surprisingly I haven't had any problems with it.

I'm looking forward to trying the new persistence stuff that is coming in v13. Maybe I can get rid of the in-memory optimizations, in which case, it might be pretty trivial for anyone to swap out levelup backends for yjs. @dmonad I have no idea if this is something you've considered, so just wanted to let you know that it's possible and can even work well.

dmonad commented on May 18, 2024

You made it work? That's pretty impressive! I also envisioned mirroring the databases. There is actually a small problem with that: if the server goes down while writing to the database, the database could be left in an inconsistent state. But that's pretty rare, I guess.

Also thanks for the info about the levelup modules. Didn't know about that!

dmonad commented on May 18, 2024

Also thanks for sharing your story :-)
