Giter VIP home page Giter VIP logo

dmutex's Introduction

Dmutex

Go Report Card GoDoc

Dmutex is a distributed mutex package written in Go. It takes a quorum based approach to managing locks across n distributed nodes.

Overview

Dmutex implements the Agarwal-El Abbadi quorum-based algorithm. A good overview of the algorithm can be found here starting on page 50.

Example

Dmutex is initialized with the local node's address, the addresses of the entire cluster, and a timeout for the gRPC calls. Optional file locations of a TLS certificate and key can be passed in order to secure cluster traffic.

import (
  "github.com/bparli/dmutex"
)

func dLockTest() {
  nodes := []string{"192.168.1.12", "192.168.1.13", "192.168.1.14", "192.168.1.15", "192.168.1.16"}
  dm := dmutex.NewDMutex("192.168.1.12", nodes, 3*time.Second, "server.crt", "server.key")

  if err := dm.Lock(); err != nil {
    log.Errorln("Lock error", err)
  } else {
    log.Debugln("Acquired Lock")
    time.Sleep(time.Duration(100) * time.Millisecond)
    dm.UnLock()
    log.Debugln("Released Lock")
    }
  }
}

Note the timeout setting is used solely for acquiring the lock and not for holding the lock. A supplemental method for sanity/health checking the lock owner is still alive and means to still hold the lock. Its important to set the timeout for longer than the maximum time the lock is expected to be held for.

Approach

Lock process

The lock process largely follows the Agarwal-El Abbadi quorum-based algorithm

The steps in the lock process are as follows:

  • broadcast lock Request message to all peers in the quorums
  • collect all responses within certain time-out window
    • responses from all peers in the local node's quorums are expected
  • if messages to some of the nodes in the quorum failed
    • calculate replacement paths based on the algorithm
    • generate unique substitute nodes and send the lock Request to them
    • if substitutes are not possible (according to the algorithm)
      • fallback to the naive approach and broadcast the lock Request to all remaining nodes in the cluster
      • if quorum met (at least n/2 + 1 responded) then grant lock
  • each node sends a Reply to the request at the head of its queue, indicating consent to the lock
  • if requests arrive out of order (according to timestamps)
    • an Inquire message is sent to the head of the queue
    • when a node receives an Inquire message it:
      • blocks on the Inquire if it has already been fully granted the lock
      • Yields the lock if it has not yet been fully granted the lock
  • if the lock was not granted within the timeout window Relinquish any lock Replies

Unlock process

To unlock a Relinquish message is sent to all nodes that were originally sent lock Requests. This is ideally only peers in the local node's quorums, but in a degraded state could be more, or even all of the nodes in the cluster (as described above).

Dealing with Stale Locks

A 'stale' lock is a lock that has been granted to a node, but for some reason that node has not released it in a reasonable timeframe. This could happen for any number of reasons; the node crashed, the process paused, the network, etc.

Each time a new lock request arrives the node sends a Validate message to the node at the head of its request queue since thats the one it believes has the lock. If the grpc message fails or the reply indicates that node no longer owns the lock (for whatever reason), the lock request is purged from the head of the queue and the next in line is processed.

Again, note there is no timeout for holding the lock. The Validate message is meant to clear stale locks

Logging

Log level is set by the environment variable LOG_LEVEL

dmutex's People

Contributors

bparli avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.