Giter VIP home page Giter VIP logo

memento's Introduction

memento

Fairly simple hashmap implementation built on top of an epoll TCP server. A toy project just for learning purpose, fascinated by the topic, i decided to try to make a simpler version of redis/memcached to better understand how it works. The project is a minimal Redis-like implementation with a text based protocol, and like Redis, can be used as key-value in-memory store.

Currently supports some basic operations, all commands are case insensitive, so not strictly uppercase.

Command Args Description
SET <key> <value> Sets <key> to <value>
GET <key> Get the value identified by <key>
DEL <key> <key2> .. <keyN> Delete values identified by <key>..<keyN>
INC <key> <qty> Increment by <qty> the value idenfied by <key>, if no <qty> is specified increment by 1
DEC <key> <qty> Decrement by <qty> the value idenfied by <key>, if no <qty> is specified decrement by 1
INCF <key> <qty> Increment by float <qty> the value identified by <key>, if no <qty> is specified increment by 1.0
DECF <key> <qty> Decrement by <qty> the value identified by <key>, if no <qty> is specified decrement by 1.0
GETP <key> Get all information of a key-value pair represented by <key>, like key, value, creation time and expire time
APPEND <key> <value> Append <value> to <key>
PREPEND <key> <value> Prepend <value> to <key>
FLUSH Delete all maps stored inside partitions
QUIT/EXIT Close connection

Distribution

Still in development and certainly bugged, it currently supports some commands in a distributed context across a cluster of machines. There isn't a replica system yet, planning to add it soon, currently the database can be started in cluster mode with the -c flag and by creating a configuration file on each node. As default behaviour, memento search for a ~/.memento file in $HOME, by the way it is also possible to define a different path.

The configuration file shall strictly follow some rules, a sample could be:

    # configuration file
    127.0.0.1   8181    node1   0
    127.0.0.1   8182    node2   1
    127.0.0.1   8183    node3   0

every line define a node, with an IP address, a port, his id-name and a flag to define if the IP-PORT-ID refers to self node (ie the machine where the file reside). PORT value refer to the actual listen port of the instance plus 100. So 8082 means a bus port 8182.

It is possible to generate basic configurations using the helper script build_cluster.py, it accepts just the address and port of every node, so to create 3 configuration file as the previous example just run (all on the same machine, but can be easily used to generate configurations for different nodes in a LAN):

$ python build_cluster.py 127.0.0.1:8081 127.0.0.1:8082 127.0.0.1:8083

This instruction will generate files node0.conf, node1.conf and node2.conf containing each the right configuration. It can be renamed to .memento and dropped in the $HOME path of every node, or can be passed to the program as argument using -f option. Every node of the cluster, will try to connect to other, if all goes well, every node should print a log message like this:

$ [INF][YYYY-mm-dd hh-mm-ss] - Cluster successfully formed

From now on the system is ready to receive connections and commands.

The distribution of keys and values follows a classic hashing addressing in a keyspace divided in equal buckets through the cluster. The total number of slots is 8192, so if the cluster is formed of 2 nodes, every node will get at most 4096 keys.

Build

To build the source just run make. A memento executable will be generated into a bin directory that can be started to listen on a defined hostname, ready to receive commands from any TCP client

$ ./bin/memento -a <hostname> -p <port>

-c for a cluster mode start

$ ./bin/memento -a <hostname> -p <port> -c

Parameter <hostname> and <port> fallback to 127.0.0.1 and 6737, memento accept also a -f parameter, it allows to specify a configuration file for cluster context.

$ ./bin/memento -a <hostname> -p <port> -c -f ./conf/node0.conf

The name of the node can be overridden with the -i option

$ ./bin/memento -a <hostname> -p <port> -c -f <path-to-conf> -i <name-id>

It is also possible to stress-test the application by using memento-benchmark, previously generating it with make memento-benchmark command

$ ./bin/memento-benchmark <hostname> <port>

To build memento-cli just make memento-cli and run it like the following:

$ ./bin/memento-cli <hostname> <port>

Experimental

The system is already blazing fast, testing it in a simple manner give already good results. Using a Linux box with an Intel(R) Core(TM) i5-4210U CPU @ 1.70Ghz and 4 GB ram I get

 Nr. operations 100000

 1) SET keyone valueone
 2) GET keyone
 3) SET with growing keys
 4) All previous operations in 10 threads, each one performing 8000 requests per type

 [SET] - Elapsed time: 0.871047 s  Op/s: 114804.37
 [GET] - Elapsed time: 0.889411 s  Op/s: 112433.95
 [SET] - Elapsed time: 0.923016 s  Op/s: 108340.48

Performance improvements

Still experimenting different paths, the main bottleneck is the server interface where clients are served; initially I rewrote the network interface as a simple epoll main thread that handle EPOLLIN and EPOLLOUT events, using a thread-safe queue to dispatch incoming data on descriptor to a pool of reader threads. After being processed, their responsibility was to set the descriptor to EPOLLOUT, in this case the main thread enqueued the descriptor to a write queue, and a pool of writer threads constantly polling it, had the responsibility to send out data.

Later I decided to adopt a model more nginx oriented, using a pool of threads to spin their own event loop, and using the main thread only for accepting incoming connection, demanding them to worker reader threads, this time, allowing the main thread to also handle internal communication between other nodes, semplifying a bit the process. Still a pool of writer threads is used to send out data to clients, this approach seems better, but it is still in development and testing.

Changelog

See the CHANGELOG file.

TODO

  • Implementation of a real consistent-hashing ring
  • Runtime nodes joining and rebalance of the cluster
  • Implementation of a basic gossip protocol to spread cluster state
  • Implementation of redundancy (e.g. slave nodes)

License

See the LICENSE file for license rights and limitations (MIT).

memento's People

Contributors

codepr avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

memento's Issues

Unrecognized command occurs when requests a key in remote node.

I will work on it.
Node A
[INF][2017-09-30 16:05:01] - Cluster succesfully formed
[DBG][2017-09-30 16:06:06] - Client connected 127.0.0.1:47766 72
[DBG][2017-09-30 16:06:11] - Command : SET
[DBG][2017-09-30 16:06:11] - Key : a
[DBG][2017-09-30 16:06:11] - Destination node: 1100 for key a
[DBG][2017-09-30 16:06:11] - Redirect toward cluster member 2735fb03b69348bbade3aae4674e390b
[DBG][2017-09-30 16:06:11] - Scheduled write to peer fd: 70
[DBG][2017-09-30 16:06:11] - Send to fd 70
[DBG][2017-09-30 16:06:11] - Handling request from 70
[DBG][2017-09-30 16:06:12] - Received data from peer node, message: mand)

[DBG][2017-09-30 16:06:12] - Answer to another node, processing inst: mand)

Node B
[INF][2017-09-30 16:05:03] - Cluster succesfully formed
[DBG][2017-09-30 16:06:11] - Unrecognized command
[DBG][2017-09-30 16:06:11] - Scheduled write to client
[DBG][2017-09-30 16:06:11] - Answering to client from worker 7
[DBG][2017-09-30 16:08:45] - Unrecognized command
[DBG][2017-09-30 16:08:45] - Scheduled write to client

can't make for ubuntu 14.04

mkdir -p ./release && gcc -std=gnu99 -Wall -lrt -lpthread -O3 -pedantic src/map.c src/util.c src/commands.c src/persistence.c src/networking.c src/serializer.c src/hashing.h src/cluster.c src/queue.c src/event.c src/list.c src/memento.c -o ./release/memento
/tmp/ccCxgVrQ.o: In function main': memento.c:(.text.startup+0x30d): undefined reference to pthread_create'
/tmp/ccN8pVI2.o: In function async_write': persistence.c:(.text+0x88): undefined reference to aio_write'
persistence.c:(.text+0x9c): undefined reference to aio_error' /tmp/ccjBougi.o: In function start_loop':
networking.c:(.text+0x6ec): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
make: *** [memento] Error 1

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.