Comments (3)

aphyr commented on August 11, 2024

That's a good question, and it's not something I have a great answer for--there are lots of easy ways we could build node specialization, but I'm not sure which way would be easiest and most general for folks going forward. I also know that later on we're going to want some notion of "what datacenter does this node belong to", and that might be communicated in the same way as node roles. One option is to add datacenter and roles fields to the init message, or add specialized init_datacenters, init_roles messages... but before going there, we have to think about what node roles mean, and how they'll be used.

Node roles are tricky because they have semantic meaning not only for the nodes themselves but also for the workload! In your BookKeeper test, you want to have clients only send requests to specific roles. But different algorithms are going to have different ideas about what kinds of roles there are. It'd be weird to, say, hardcode BookKeeper-specific roles into a general-purpose ledger workload.

We might consider having two separate-but-overlapping concepts of roles, where Maelstrom has some concept of, say, "nodes which can write", "nodes which can read", "nodes which accept no requests", and then the implementation is free to extend that structure from there, but it's sort of... tricky, right? How many of each should you assign? How should they be distributed across datacenters? We need some way to specify that sort of thing, and I'm not sure what the right shape is.

So... in light of this uncertainty, I have two possible paths that you might want to follow.

One is that if Maelstrom's workloads are working for you as-is, you just... do the proxying. It's like ten lines of code, and that frees you to assign node roles internally, as opposed to having to inform Maelstrom of how to assign and interpret those roles. Note that you can reply to a client's message from ANY node, so it's perfectly valid to wrap up the whole request you receive in a type: proxy body, send it to whatever other node you like, and have it evaluate the proxied message and reply directly to the Maelstrom client. I think this is your best bet.
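A minimal sketch of the proxying approach described above, in Python. It assumes Maelstrom's usual JSON message shape (`src`, `dest`, and a `body` carrying `type`, `msg_id`, and `in_reply_to`). The `proxy` body layout, its `client` and `request` fields, and the helper names are hypothetical choices made for illustration, not part of Maelstrom; the echo request is only a stand-in for whatever the workload sends.

```python
def wrap_as_proxy(msg, target):
    """Wrap a client's request in a hypothetical `proxy` body addressed to `target`.

    `msg` is the message this node received from a Maelstrom client.
    The original sender is kept so the target can reply to it directly.
    """
    return {
        "src": msg["dest"],          # the node that received the request
        "dest": target,              # the node chosen to actually serve it
        "body": {
            "type": "proxy",         # hypothetical message type
            "client": msg["src"],    # who to reply to in the end
            "request": msg["body"],  # the untouched client request body
        },
    }


def handle_proxy(msg, node_id, evaluate):
    """Evaluate a proxied request and build a reply aimed at the original client.

    `evaluate` takes the client's request body and returns a reply body
    without `in_reply_to`; that field is filled in here from the client's msg_id.
    """
    request = msg["body"]["request"]
    reply_body = dict(evaluate(request))
    reply_body["in_reply_to"] = request["msg_id"]
    return {"src": node_id, "dest": msg["body"]["client"], "body": reply_body}


def evaluate_echo(body):
    """Stand-in request handler: answers an echo request."""
    return {"type": "echo_ok", "echo": body["echo"]}
```

Here the node that receives the client's request calls `wrap_as_proxy` and sends the result to the node holding the relevant role; that node calls `handle_proxy` and sends the returned message straight to the client, so the first node never has to relay the response.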

The other option is that you modify Maelstrom to define a workload specifically for testing BookKeeper, and follow the broadcast workload's approach. You compute the BookKeeper-specific roles you'd like at the start of the test, and send a message to each node informing it of the roles you computed, and then generate operations based on those roles. That strongly couples the workload to BookKeeper itself, so it's not necessarily a workload you can reuse for other algorithms, but it might be advantageous for other reasons--you might want to, say, design a checker that verifies particular role-related invariants, and knowing what the BookKeeper roles are in the workload would let you do that.
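The three steps of that second option (compute roles, tell every node, pick request targets by role) can be sketched roughly as follows. This is Python for illustration only; Maelstrom's workloads are not written this way, and the `roles` message type, the round-robin policy, the role names, and every function name here are assumptions rather than anything Maelstrom defines.

```python
def assign_roles(node_ids, role_names):
    """Assign roles round-robin over the node list (an illustrative policy only)."""
    return {
        node: role_names[i % len(role_names)]
        for i, node in enumerate(node_ids)
    }


def role_messages(roles, src):
    """Build one hypothetical `roles` message per node, sent at the start of the test.

    Every node receives the full role map, much as the broadcast workload
    sends each node a topology before operations begin.
    """
    return [
        {
            "src": src,
            "dest": node,
            "body": {"type": "roles", "msg_id": i + 1, "roles": roles},
        }
        for i, node in enumerate(sorted(roles))
    ]


def targets_for(roles, role):
    """Nodes the operation generator may send requests to for a given role."""
    return sorted(node for node, r in roles.items() if r == role)
```

For example, `assign_roles(["n1", "n2", "n3"], ["writer", "reader"])` marks `n1` and `n3` as writers and `n2` as a reader, and a generator would then restrict write operations to `targets_for(roles, "writer")`. A role-aware checker could consume the same map.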

from maelstrom.

Vanlightly commented on August 11, 2024

Indeed, generalising node roles is not an easy thing to design well. I think for now I will play with using existing workloads and proxying. BookKeeper is, after all, log storage, and I can project anything on top of a log, like a KV store. In terms of checkers, there are things like checking that metadata, bookies and clients do not diverge, but any divergence that I care about (data loss and ordering) would be visible from existing checkers.

aphyr commented on August 11, 2024

I've thought a little more about this--I think that for model-checking purposes, it might be nice to be able to require Maelstrom itself as a library, and hook in your own custom workload somewhere. I'm not totally sure what that's going to look like yet, but I'm keeping it on the back burner.
