
dynomite's Introduction


Dynomite

Build Status · Dev chat at https://gitter.im/Netflix/dynomite · Apache V2 License

Dynomite, inspired by the Dynamo whitepaper, is a thin, distributed Dynamo layer for different storage engines and protocols. Currently these include Redis and Memcached. Dynomite supports multi-datacenter replication and is designed for high availability.

The ultimate goal of Dynomite is to provide high availability and cross-datacenter replication on storage engines that do not inherently provide that functionality. The implementation is efficient, avoids complexity (few moving parts), and is highly performant.

Workflow

Every branch numbered like v0.5.9, v0.5.8, etc. is stable and safe to use in production unless marked as a pre-release. The dev branch is the unstable development branch. Over time the master branch has fallen behind and is not maintained. We will eventually delete it, and may or may not recreate it.

For questions or contributions, please consider reading CONTRIBUTING.md.

Build

To build Dynomite from source with debug logs and assertions enabled:

$ git clone git@github.com:Netflix/dynomite.git
$ cd dynomite
$ autoreconf -fvi
$ ./configure --enable-debug=yes
$ make
$ src/dynomite -h

To build Dynomite in debug mode:

$ git clone git@github.com:Netflix/dynomite.git
$ cd dynomite
$ autoreconf -fvi
$ CFLAGS="-ggdb3 -O0" ./configure --enable-debug=full
$ make
$ sudo make install

Help

Usage: dynomite [-?hVdDt] [-v verbosity level] [-o output file]
                  [-c conf file] [-p pid file] 

Options:
  -h, --help              : this help
  -V, --version           : show version and exit
  -t, --test-conf         : test configuration for syntax errors and exit
  -g, --gossip            : enable gossip (default: disabled)
  -d, --daemonize         : run as a daemon
  -D, --describe-stats    : print stats description and exit
  -v, --verbosity=N       : set logging level (default: 5, min: 0, max: 11)
  -o, --output=S          : set logging file (default: stderr)
  -c, --conf-file=S       : set configuration file (default: conf/dynomite.yml)
  -p, --pid-file=S        : set pid file (default: off)
  -x, --admin-operation=N : set size of admin operation (default: 0)
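
For example, a typical foreground invocation for local testing might look like this (the config and log paths here are illustrative):

$ src/dynomite -c conf/dynomite.yml -v 6 -o /tmp/dynomite.log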

Configuration

Dynomite can be configured through a YAML 1.1 file (note that YAML 1.1 is not JSON compatible) specified by the -c or --conf-file command-line argument at process start. Dynomite parses and understands the following keys in the configuration file:

  • env: Specify environment of a node. Currently supports aws and network (for physical datacenter).
  • datacenter: The name of the datacenter. Please refer to architecture document.
  • rack: The name of the rack. Please refer to architecture document.
  • dyn_listen: The address and port (ip:port) that dynomite nodes use to inter-communicate and gossip.
  • enable_gossip: enable gossip instead of static tokens (default: false). Gossip is experimental.
  • gos_interval: The sleeping time in milliseconds at the end of a gossip round.
  • tokens: The token(s) owned by a node. Currently, we don't support vnodes yet, so this only works with one token for the time being.
  • dyn_seed_provider: A seed provider implementation to provide a list of seed nodes.
  • dyn_seeds: A list of seed nodes in the format address:port:rack:dc:tokens (note that vnodes are not supported yet).
  • listen: The listening address and port (name:port or ip:port) for this server pool.
  • timeout: The timeout value in msec that we wait to establish a connection to the server or to receive a response from a server. By default, we wait indefinitely.
  • preconnect: A boolean value that controls if dynomite should preconnect to all the servers in this pool on process start. Defaults to false.
  • data_store: An integer value that controls if a server pool speaks redis (0) or memcached (1) or other protocol. Defaults to redis (0).
  • auto_eject_hosts: A boolean value that controls whether a server should be ejected temporarily when it fails server_failure_limit consecutive times. See liveness recommendations for more information. Defaults to false.
  • server_retry_timeout: The timeout value in msec to wait for before retrying on a temporarily ejected server, when auto_eject_host is set to true. Defaults to 30000 msec.
  • server_failure_limit: The number of consecutive failures on a server that would lead to it being temporarily ejected when auto_eject_host is set to true. Defaults to 2.
  • servers: A list of local server address, port and weight (name:port:weight or ip:port:weight) for this server pool. Currently, there is just one.
  • secure_server_option: Encrypted communication. Must be one of 'none', 'rack', 'datacenter', or 'all'. 'none' means no communication is encrypted. 'datacenter' means all communication between datacenters is encrypted, but communication within a datacenter is not. 'rack' means all communication between racks and between datacenters is encrypted, but communication between nodes within the same rack is not. 'all' means all communication between all nodes is encrypted.
  • stats_listen: The address and port number for the REST endpoint and for accessing statistics.
  • stats_interval: set stats aggregation interval in msec (default: 30000 msec).
  • mbuf_size: size of mbuf chunk in bytes (default: 16384 bytes).
  • max_msgs: max number of messages to allocate (default: 200000).
  • datastore_connections: Maximum number of connections to the local datastore.
  • local_peer_connections: Maximum number of connections to a local DC peer.
  • remote_peer_connections: Maximum number of connections to a remote DC peer.
  • dyn_port: Port used by Dynomite servers to talk to each other.

For an example, see the configuration file in conf/dynomite.yml.
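
Putting a few of these keys together, here is a minimal single-node sketch in the same style (the values are illustrative only, not a recommended production setup):

dyn_o_mite:
  env: network
  datacenter: dc1
  rack: rack1
  dyn_listen: 127.0.0.1:8101
  listen: 127.0.0.1:8102
  dyn_seed_provider: simple_provider
  tokens: '12345678'
  data_store: 0
  servers:
  - 127.0.0.1:6379:1
  stats_listen: 127.0.0.1:22222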

Finally, to make writing syntactically correct configuration files easier, dynomite provides the command-line argument -t (or --test-conf), which can be used to check a YAML configuration file for syntax errors.
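
For example:

$ src/dynomite -t -c conf/dynomite.yml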

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

dynomite's People

Contributors

akashdeepgoel, akbarahmed, asgs, ayutthaya, balajivenki, charsyam, daa, danielhochman, diegopacheco, felixoid, hashbrowncipher, i-rinat, ipapapa, jasobrown, lampmanyao, manjuraj, matschaffer, mckelvin, mdaxini, paravoid, proger, remotezygote, rprevot, shailesh33, smukil, sobkovych, sobolevn, tan-lawrence, timiblossom, wkruse


dynomite's Issues

Dynomite crashes on restart

We have a very high volume Dynomite installation (30k write ops/s). We cannot restart nodes in the cluster because seemingly the msg queue builds up faster than Dynomite can connect to its peers to distribute msgs.

Should Dynomite crash when it hits its msg queue limit? The code indicates that it should simply ignore subsequent msgs until the queue has space.

[Wed Jun  3 02:40:44 2015] dyn_message.c:215 allocated #msgs 200000 hit max allowable limit
[Wed Jun  3 02:40:44 2015] dyn_message.c:215 allocated #msgs 200000 hit max allowable limit
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [0] /lib64/libpthread.so.0() [0x384460f710]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [1] src/dynomite(msg_put+0x19) [0x4175f9]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [2] src/dynomite(req_recv_done+0x645) [0x419cd5]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [3] src/dynomite(msg_recv+0x21b) [0x417e9b]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [4] src/dynomite(core_core+0x139) [0x40ab49]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [5] src/dynomite(event_wait+0xcb) [0x430f3b]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [6] src/dynomite(core_loop+0xa7) [0x40a887]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [7] src/dynomite(main+0x648) [0x428008]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x384421ed5d]
[Wed Jun  3 02:40:44 2015] dyn_util.c:320 [9] src/dynomite() [0x408f69]
[Wed Jun  3 02:40:44 2015] dyn_signal.c:127 signal 11 (SIGSEGV) received, core dumping

Does this recent commit address this problem? dd05d3d

about fail node resume

I have two racks (rack1, rack2), each with six nodes, and each node has one Redis service on the backend.
My client sets and reads data with the phpredis extension.

Test case 1:
rack1 and rack2 are all up; I randomly set and get data across all nodes. Everything is OK, no problem.
Test case 2:
rack1 is up and some rack2 nodes are down (their backend Redis is still up); I randomly set and get data on all rack1 nodes (avoiding all rack2 nodes). Still no problem,
but the data on the rack1 and rack2 nodes is now different.
Test case 3:
rack1 is up and the downed rack2 nodes are resumed; I randomly get data from all nodes, and some of the data set in test case 2 is missing.

My question:

After a failed node resumes, the data on the rack1 and rack2 nodes is different. How can I avoid some keys going missing on reads?

My config:

node 1:

dyn_o_mite:
  datacenter: default_dc
  dyn_listen: 0.0.0.0:8101
  dyn_port: 8101
  dyn_seed_provider: simple_provider
  dyn_seeds:
  - 72.1.121.12:8101:rc0:default_dc:3346660200
  - 72.1.121.16:8101:rc1:default_dc:1656170642
  - 72.1.121.13:8101:rc0:default_dc:278415870
  - 72.1.121.14:8101:rc1:default_dc:2691937952
  - 72.1.121.15:8101:rc1:default_dc:2565460974
  gos_interval: '500'
  listen: 0.0.0.0:6379
  rack: rc0
  redis: 'true'
  servers:
  - 127.0.0.1:6380:1
  timeout: 150000
  tokens: '400517191'

node 2:

dyn_o_mite:
  datacenter: default_dc
  dyn_listen: 0.0.0.0:8101
  dyn_port: 8101
  dyn_seed_provider: simple_provider
  dyn_seeds:
  - 72.1.121.11:8101:rc0:default_dc:400517191
  - 72.1.121.16:8101:rc1:default_dc:1656170642
  - 72.1.121.13:8101:rc0:default_dc:278415870
  - 72.1.121.14:8101:rc1:default_dc:2691937952
  - 72.1.121.15:8101:rc1:default_dc:2565460974
  gos_interval: '500'
  listen: 0.0.0.0:6379
  rack: rc0
  redis: 'true'
  servers:
  - 127.0.0.1:6380:1
  timeout: 150000
  tokens: '3346660200'

.....

node 6

dyn_o_mite:
  datacenter: default_dc
  dyn_listen: 0.0.0.0:8101
  dyn_port: 8101
  dyn_seed_provider: simple_provider
  dyn_seeds:
  - 72.1.121.11:8101:rc0:default_dc:400517191
  - 72.1.121.12:8101:rc0:default_dc:3346660200
  - 72.1.121.16:8101:rc1:default_dc:1656170642
  - 72.1.121.13:8101:rc0:default_dc:278415870
  - 72.1.121.14:8101:rc1:default_dc:2691937952
  gos_interval: '500'
  listen: 0.0.0.0:6379
  rack: rc1
  redis: 'true'
  servers:
  - 127.0.0.1:6380:1
  timeout: 150000
  tokens: '2565460974'

Data Distribution

Hi,

Can you explain in a few lines the logic of the consistent hashing and data distribution, and give some clues about tokens?

Is there simple logic, as in Redis Cluster's HASH_SLOT = CRC16(key) mod 16384?

I would like to be able to test whether a given key is in the right place, but for that I need to understand how Dynomite routes my key.

I know that Dynomite is moving fast; we have been following it closely and testing it heavily for several months. It is quite hard to know which tag/version should be tested and which parts of the doc/wiki are obsolete, but we love the product and we'll try our best to contribute in our own way.

Max
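
For illustration, a Dynamo-style token ring lookup boils down to the sketch below. This is not Dynomite's actual code: it assumes one token per node and a key hash in the same integer space as the tokens, whereas Dynomite's hash function and distribution are configurable.

import bisect

def owner(key_hash, ring):
    # ring: list of (token, node) pairs sorted by token.
    # The first node whose token is >= the key's hash owns the key;
    # past the largest token, wrap around to the smallest one.
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, key_hash)
    return ring[i % len(ring)][1]

ring = sorted([(400517191, "node1"), (3346660200, "node2")])
print(owner(1234567890, ring))  # -> node2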

What would happen when DataCenter1 crashes and recovers?

Step 1: Do I need to sync the data from datacenter 2, using the SLAVEOF command to fully resync the data for each shard?

Step 2: And then do I need to stop the SLAVEOF and switch the replica back to master at some point?

Step 3: And then Dynomite will start to replicate data to it?

How do I ensure no data is lost between step 2 and step 3?

Best Practice for Redis failure mgmt

Hi,

I am trying to simulate a Redis failure while the dynomite service is still up and running.

On the host 192.168.33.40 I have two redis, one on 6379 and one on 6380.

dyn_o_mite:
  # Node Location
  env: network
  datacenter: dc1
  rack: rc1

  # Dynomite
  listen: 0.0.0.0:8102
  dyn_listen: 0.0.0.0:8101
  dyn_seed_provider: florida_provider
  preconnect: true

  dyn_seeds:
  - 192.168.33.41:8101:rc2:dc1:0

  # Node Info
  redis: true
  tokens: 0
  servers:
  - 127.0.0.1:6379:1
  - 127.0.0.1:6380:1

  # Time Management
  gos_interval: 10000
  timeout: 3000
  server_retry_timeout: 3000

  # Security
  secure_server_option: datacenter
  server_connections: 1
  auto_eject_hosts: true
  server_failure_limit: 3
  pem_key_file: /etc/dynomite/dynomite.pem

Everything is up:

root@redis3:~# redis-cli -h 192.168.33.40 -p 8102
192.168.33.40:8102> ping
PONG

I kill the redis-server running on 6379 (kill -9 redis-server process).

When I connect to dynomite, the service is not available:


root@redis3:~# redis-cli -h 192.168.33.40 -p 8102
192.168.33.40:8102> ping
(error) ERR Storage: Connection refused

Is this case managed? Am I doing something wrong?

Max.

dyn_dnode_server.c:281 dyn: accept on p 8 failed: Too many open files

I get this error roughly 12 hours after the server starts.
Commands used: evalsha, expire, hmset, incrBy, keys, hmget, get.
config:

dyn_o_mite:
  datacenter: default_dc
  dyn_listen: ostack03:8101
  dyn_seed_provider: simple_provider
  dyn_seeds:
  - ostack01:8101:rack:default_dc:2900888940
  - ostack02:8101:rack:default_dc:1094803232
  listen: ostack03:8102
  rack: rack
  servers:
  - 127.0.0.1:6379:1
  tokens: '1795242825'
  redis: true

Preserve twemproxy copyright

As Manju Rajashekhar notes, large portions of Dynomite appear to be copied straight from Twemproxy, but with the string 'nc' replaced by 'dyn'. As Twemproxy is Apache-2 licensed, you may want to re-read Section 4, which says:

  4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
  (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and
  (b) You must cause any modified files to carry prominent notices stating that You changed the files; and
  (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; ...

Since Twemproxy's README prominently features "Copyright 2012 Twitter, Inc.", it might be polite to acknowledge their contribution by retaining that copyright notice in Dynomite's README. If you want to get really polite, you might consider mentioning the project is a fork.

data sync

If I want the data to sync between datacenter A and datacenter B in real time, what should I do?

Configuration example for replication

Looking for enhanced configuration examples that demonstrate how to configure nodes for replication. The existing examples demonstrate sharding to other nodes via dyn_seeds. What is not clear is how to add nodes for replication. My attempts to add seeds belonging to another datacenter (or another rack within the same datacenter) result in a crash. What is the correct way to set up replication?

client TCP server on port 8102 dying

Unfortunately the logs are not helpful here, but the TCP server on port 8102 is going down. The peer server on 8101 and the stats server on 22222 remain up with no problem. The only "troublesome" events that appear in the log prior to the TCP server crashing are several thousand alloc_msg_count: ###### messages, followed by some "Connection reset by peer" messages.

What "skip swap ... as aggregator is busy" means?

Hello,

I'm doing a simple tcp-check through HAProxy:

tcp-check send *1\r\n
tcp-check send $4\r\n
tcp-check send PING\r\n
tcp-check expect string +PONG

From HAProxy's point of view the nodes are up and respond with the expected "PONG", but if I check the Dynomite logs, the connections seem to be dropped. The recurrent messages are "Connection reset by peer" coupled with "skip swap...".

I've pasted my log for one call:

Thanks a lot for your help.

Verbose 7

[Thu Apr 23 13:30:18 2015] dyn_proxy.c:341 accepted c 10 on p 7 from '192.168.30.39:17312'
[Thu Apr 23 13:30:18 2015] dyn_message.c:221 alloc_msg_count : 1
[Thu Apr 23 13:30:18 2015] dyn_message.c:221 alloc_msg_count : 2
[Thu Apr 23 13:30:18 2015] dyn_core.c:324 close c 10 'unknown' on event FF00FF eof 0 done 0 rb 14 sb 7: Connection reset by peer

Verbose 11

[Thu Apr 23 13:29:33 2015] dyn_epoll.c:261 epoll 0001 triggered on conn 0x1502c80
[Thu Apr 23 13:29:33 2015] dyn_core.c:455 event 00FF on p 7
[Thu Apr 23 13:29:33 2015] dyn_client.c:51 ref conn 0x150d450 owner 0x14fb2a0 into pool 'bbc_test'
[Thu Apr 23 13:29:33 2015] dyn_connection.c:324 get conn 0x150d450 client 1
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1247 metric 'client_connections' in pool 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1264 incr field 'client_connections' to 1
[Thu Apr 23 13:29:33 2015] dyn_proxy.c:341 accepted c 9 on p 7 from '192.168.30.39:17303'
[Thu Apr 23 13:29:33 2015] dyn_proxy.c:285 accept on p 7 not ready - eagain
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1198 skip swap of current 0x14fe390 shadow 0x150e410 as aggregator is busy
[Thu Apr 23 13:29:33 2015] dyn_core.c:575 timeout = 30000
[Thu Apr 23 13:29:33 2015] dyn_core.c:555 length of C2G_OutQ : 0
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:249 epoll ep 5
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:261 epoll 0005 triggered on conn 0x150d450
[Thu Apr 23 13:29:33 2015] dyn_core.c:455 event FFFF on c 9
[Thu Apr 23 13:29:33 2015] dyn_message.c:351 get msg 0x1514ee0 id 3 request 1 owner sd 9
[Thu Apr 23 13:29:33 2015] dyn_mbuf.c:108 get mbuf 0x1519008
[Thu Apr 23 13:29:33 2015] dyn_mbuf.c:197 insert mbuf 0x1519008 len 0
[Thu Apr 23 13:29:33 2015] dyn_connection.c:468 recv on sd 9 14 of 16328
[Thu Apr 23 13:29:33 2015] dyn_redis.c:1621 parsed req 3 res 0 type 80 state 0 rpos 14 of 14
00000000  2a 31 0d 0a 24 34 0d 0a  50 49 4e 47 0d 0a         |*1..$4..PING..|
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1247 metric 'client_read_requests' in pool 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1264 incr field 'client_read_requests' to 2
[Thu Apr 23 13:29:33 2015] dyn_server.c:902 server_get_dc pool  'dc1'
[Thu Apr 23 13:29:33 2015] dyn_server.c:932 server_get_rack   'rc1'
[Thu Apr 23 13:29:33 2015] dyn_vnode.c:72 peer name       : '0.0.0.0'
[Thu Apr 23 13:29:33 2015] dyn_vnode.c:73 peer rack       : 'rc1'
[Thu Apr 23 13:29:33 2015] dyn_vnode.c:74 peer dc       : 'dc1'
[Thu Apr 23 13:29:33 2015] dyn_vnode.c:75 peer->processed = 1
[Thu Apr 23 13:29:33 2015] dyn_dnode_peer.c:1217 dyn: key '' on dist 3 maps to server '0.0.0.0:8101'
[Thu Apr 23 13:29:33 2015] dyn_message.c:171 insert msg 3 into tmo rbt with expiry of 10000 msec
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'in_queue' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1384 incr field 'in_queue' to 1
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'in_queue_bytes' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1418 incr by field 'in_queue_bytes' to 14
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'read_requests' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1384 incr field 'read_requests' to 2
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'read_request_bytes' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1418 incr by field 'read_request_bytes' to 28
[Thu Apr 23 13:29:33 2015] dyn_request.c:524 local forward from c 9 to s 6 req 3 len 14 type 80 with key ''
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1198 skip swap of current 0x14fe390 shadow 0x150e410 as aggregator is busy
[Thu Apr 23 13:29:33 2015] dyn_core.c:575 timeout = 10000
[Thu Apr 23 13:29:33 2015] dyn_core.c:555 length of C2G_OutQ : 0
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:249 epoll ep 5
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:261 epoll 0004 triggered on conn 0x1502ad0
[Thu Apr 23 13:29:33 2015] dyn_core.c:455 event FF00 on s 6
[Thu Apr 23 13:29:33 2015] dyn_request.c:749 send next req 3 len 14 type 80 on s 6
[Thu Apr 23 13:29:33 2015] dyn_connection.c:519 sendv on sd 6 14 of 14 in 1 buffers
[Thu Apr 23 13:29:33 2015] dyn_request.c:763 send done req 3 len 14 type 80 on s 6
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'in_queue' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1401 decr field 'in_queue' to 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'in_queue_bytes' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1435 decr by field 'in_queue_bytes' to 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'out_queue' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1384 incr field 'out_queue' to 1
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'out_queue_bytes' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1418 incr by field 'out_queue_bytes' to 14
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1198 skip swap of current 0x14fe390 shadow 0x150e410 as aggregator is busy
[Thu Apr 23 13:29:33 2015] dyn_core.c:575 timeout = 10000
[Thu Apr 23 13:29:33 2015] dyn_core.c:555 length of C2G_OutQ : 0
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:249 epoll ep 5
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:261 epoll 0001 triggered on conn 0x1502ad0
[Thu Apr 23 13:29:33 2015] dyn_core.c:455 event 00FF on s 6
[Thu Apr 23 13:29:33 2015] dyn_message.c:351 get msg 0x1519190 id 4 request 0 owner sd 6
[Thu Apr 23 13:29:33 2015] dyn_mbuf.c:108 get mbuf 0x151d2b8
[Thu Apr 23 13:29:33 2015] dyn_mbuf.c:197 insert mbuf 0x151d2b8 len 0
[Thu Apr 23 13:29:33 2015] dyn_connection.c:468 recv on sd 6 7 of 16328
[Thu Apr 23 13:29:33 2015] dyn_redis.c:2033 parsed rsp 4 res 0 type 118 state 0 rpos 7 of 7
00000000  2b 50 4f 4e 47 0d 0a                               |+PONG..|
[Thu Apr 23 13:29:33 2015] dyn_message.c:189 delete msg 3 from tmo rbt
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'out_queue' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1401 decr field 'out_queue' to 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'out_queue_bytes' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1435 decr by field 'out_queue_bytes' to 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'read_responses' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1384 incr field 'read_responses' to 2
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1366 metric 'read_response_bytes' in pool 0 server 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1418 incr by field 'read_response_bytes' to 14
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1198 skip swap of current 0x14fe390 shadow 0x150e410 as aggregator is busy
[Thu Apr 23 13:29:33 2015] dyn_core.c:575 timeout = 30000
[Thu Apr 23 13:29:33 2015] dyn_core.c:555 length of C2G_OutQ : 0
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:249 epoll ep 5
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:261 epoll 0004 triggered on conn 0x150d450
[Thu Apr 23 13:29:33 2015] dyn_core.c:455 event FF00 on c 9
[Thu Apr 23 13:29:33 2015] dyn_response.c:325 send next rsp 4 on c 9
[Thu Apr 23 13:29:33 2015] dyn_connection.c:519 sendv on sd 9 7 of 7 in 1 buffers
[Thu Apr 23 13:29:33 2015] dyn_response.c:338 send done rsp 4 on c 9
[Thu Apr 23 13:29:33 2015] dyn_message.c:453 put msg 0x1519190 id 4
[Thu Apr 23 13:29:33 2015] dyn_mbuf.c:220 remove mbuf 0x151d2b8 len 0
[Thu Apr 23 13:29:33 2015] dyn_mbuf.c:136 put mbuf 0x151d2b8 len 0
[Thu Apr 23 13:29:33 2015] dyn_message.c:453 put msg 0x1514ee0 id 3
[Thu Apr 23 13:29:33 2015] dyn_mbuf.c:220 remove mbuf 0x1519008 len 0
[Thu Apr 23 13:29:33 2015] dyn_mbuf.c:136 put mbuf 0x1519008 len 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1198 skip swap of current 0x14fe390 shadow 0x150e410 as aggregator is busy
[Thu Apr 23 13:29:33 2015] dyn_core.c:575 timeout = 30000
[Thu Apr 23 13:29:33 2015] dyn_core.c:555 length of C2G_OutQ : 0
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:249 epoll ep 5
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:261 epoll 0019 triggered on conn 0x150d450
[Thu Apr 23 13:29:33 2015] dyn_core.c:455 event FF00FF on c 9
[Thu Apr 23 13:29:33 2015] dyn_core.c:324 close c 9 'unknown' on event FF00FF eof 0 done 0 rb 14 sb 7: Connection reset by peer
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1247 metric 'client_connections' in pool 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1282 decr field 'client_connections' to 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1247 metric 'client_err' in pool 0
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1264 incr field 'client_err' to 2
[Thu Apr 23 13:29:33 2015] dyn_client.c:70 unref conn 0x150d450 owner 0x14fb2a0 from pool 'bbc_test'
[Thu Apr 23 13:29:33 2015] dyn_connection.c:428 put conn 0x150d450
[Thu Apr 23 13:29:33 2015] dyn_stats.c:1198 skip swap of current 0x14fe390 shadow 0x150e410 as aggregator is busy
[Thu Apr 23 13:29:33 2015] dyn_core.c:575 timeout = 30000
[Thu Apr 23 13:29:33 2015] dyn_core.c:555 length of C2G_OutQ : 0
[Thu Apr 23 13:29:33 2015] dyn_epoll.c:249 epoll ep 5

Support BLPOP BRPOP with single Key

BLPOP command is defined as "BLPOP key [key ...] timeout"

This command is very useful, and most of the time we only need to pop one key.

Can Dynomite support "BLPOP key timeout"?

compile error with "struct dmsg"

dyn_dnode_msg.c: In function ‘dmsg_free’:
dyn_dnode_msg.c:366: error: ‘struct dmsg’ has no member named ‘mhdr’

Should ASSERT(STAILQ_EMPTY(&dmsg->mhdr)); be changed to ASSERT(STAILQ_EMPTY(&dmsg->owner->mhdr));?

client cannot connect to dynomite server

I set up a very simple dynomite server with one Redis node. Then I use a Redis client (I tried implementations in both Ruby and Java) to connect to dynomite. The host and port are set to the "listen" value from the dynomite configuration file. However, the client's connection fails, even though it can connect directly to the Redis node.

I guess it may be due to a permission issue. I notice that there is a "dynomite.pem" file under the conf folder, and dynomite loads this file on startup. How can I use this file?

Why don't write operations scale as well as read operations?

Hello,

Looking at this image from the intro blog post:

[chart from the intro blog post: read vs. write throughput as the cluster scales]

Why does the number of write operations not scale as well as the number of read operations? I understand that the total number of write operations might be lower than the number of read operations, but shouldn't the lines be parallel?
If Dynomite were waiting for confirmation from all nodes before accepting a write this might make sense, but it does all of that asynchronously.

Uneven Key Distribution

I seem to get a predictably uneven key distribution depending on the tokens I use. For a three-node rack, I consistently get an approximate 55%/37%/8% distribution of keys. For my testing, I'm creating 100k alphanumeric keys of varying length [10,25].

I used the generate_yamls.py script for this cluster, which of course generated random tokens.

Since Dynomite does not yet have vnodes, is there a way to generate tokens based on the rack size that would offer a slightly more uniform distribution?
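
For what it's worth, one way to get a more uniform spread without vnodes is to space the tokens evenly instead of generating them randomly. A minimal sketch (assuming a 2**32 token space, which the random token values in the examples above suggest):

def even_tokens(n):
    # hypothetical helper: n evenly spaced tokens over a 2**32 ring
    step = 2**32 // n
    return [i * step for i in range(n)]

print(even_tokens(3))  # -> [0, 1431655765, 2863311530]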

help me how to scale out the rack and dc

We are using this with the phpredis extension, and we have two racks (rack1, rack2), each with two nodes.

My questions are:
1. What is a token? Does the same token mean the same dataset?
2. Do I need to give each node the same token, or a different token for each node?
3. How can we add a rack? When I add a new rack to the dc, it can't get the "old" data.
4. We tried rack1 (node1: token1, node2: token2) and rack2 (node3: token2, node4: token1), and we found that only one token's Redis server had its AOF grow; if that Redis goes down, the data is lost too.
5. When the Redis receiving the writes goes down, the data is lost...

Network throttling

Dynomite nodes send data across nodes in different DCs too fast. In a cloud environment such as AWS, this can cause weird behaviors, such as packets getting dropped somewhere in the WAN.

We should have a way to throttle the data we send and then build a large enough cluster to accommodate the traffic (not a small cluster with high network usage).

I am looking at http://en.wikipedia.org/wiki/Token_bucket but if you guys have a better idea, please let me know.
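
For reference, the token bucket algorithm linked above boils down to something like this (a minimal sketch, not a proposal for Dynomite's actual implementation):

import time

class TokenBucket:
    # Allow bursts up to `capacity` while limiting the average send rate.
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens (e.g., bytes) added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, n=1):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False  # caller should delay or drop the send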

update README.md

https://github.com/Netflix/dynomite/blob/master/README.md

I am brand spankin' new to Dynomite, and there are some things that confuse me as a new user. I wanted to point them out so that you can update the documentation to be friendlier to new users. I can't do this myself, as I don't know the answers.

env: Specify environment of a node. Currently supports aws and network (for physical datacenter).
What is the difference, and why should I choose one over the other?

dyn_listen: The port that dynomite nodes use to inter-communicate and gossip.
This seems to be an "IP:PORT" combo, so describing it as a port is misleading.

tokens: The token(s) owned by a node. Currently, we don't support vnode yet so this only works with one token for the time being.
Not sure what a token is, what to set it to, how it works in the grand scheme of things, etc. In the examples, some nodes seem to share a token and some don't. Confusing. I think it represents the range of hash values this server will process.

dyn_seed_provider: A seed provider implementation to provide a list of seed nodes.
Not sure what this is but plan to leave as default as other options are unknown.

dyn_seeds: A list of seed nodes in the format: address:port:rack:dc:tokens (note that vnode is not supported yet)
Instead of referring to them as seed nodes, would it be better to describe them as the locations of the other servers? At least in my naive understanding, that's how I see them.

listen: The listening address and port (name:port or ip:port) for this server pool.
This seems to be the "IP:PORT" to use in a multihomed environment, but then "this server pool" is added and now I am totally confused.

server_connections: The maximum number of connections that can be opened to each server. By default, we open at most 1 server connection.
Is this client -> server traffic or the gossip channel? It seems like gossip, since "we open" is in there.

servers: A list of local server address, port and weight (name:port:weight or ip:port:weight) for this server pool. Usually there is just one.
This overlaps some previous comments ("server pool", "token"). How does this differ from "listen"?

Explain connection pools.
Better explain tokens. I found this: "Each node in a rack has a unique token, which helps to identify which dataset it owns." By itself that is OK, but when it comes to setting it to some value in the file, I am lost.


These seem to be missing from the README, but I have found them in example configuration files online:
backlog: 1024
distribution: vnode
hash: murmur

redis-cli cannot get a response from the dynomite process.

Hi,

There are 2 datacenters, each with a single rack containing 2 nodes (4 machines total). If I use redis-cli to do some tests, it shows:
192.168.7.215:8102> ping
(error) ERR Storage: Connection timed out
(24.12s)

192.168.7.215:8102>get "xx"

Dynomite didn't give any response until it timed out.

Using the config below:
dyn_o_mite:
  datacenter: dc1
  rack: rack1
  dyn_listen: 192.168.7.215:8101
  dyn_seeds:
  - 192.168.7.216:8101:rack1:dc1:1234567892
  - 192.168.7.248:8101:rack2:dc2:1234567891
  - 192.168.7.78:8101:rack2:dc2:1234567892
  listen: 192.168.7.215:8102
  servers:
  - 192.168.7.215:6379:1
  tokens: '1234567891'
  redis: true
  timeout: 30000

dyn_o_mite:
  datacenter: dc1
  rack: rack1
  dyn_listen: 192.168.7.216:8101
  dyn_seeds:
  - 192.168.7.215:8101:rack1:dc1:1234567891
  - 192.168.7.248:8101:rack2:dc2:1234567891
  - 192.168.7.78:8101:rack2:dc2:1234567892
  listen: 192.168.7.216:8102
  servers:
  - 192.168.7.216:6379:1
  tokens: '1234567892'
  redis: true
  timeout: 30000

dyn_o_mite:
  datacenter: dc2
  rack: rack2
  dyn_listen: 192.168.7.248:8101
  dyn_seeds:
  - 192.168.7.215:8101:rack1:dc1:1234567891
  - 192.168.7.216:8101:rack1:dc1:1234567892
  - 192.168.7.78:8101:rack2:dc2:1234567892
  listen: 192.168.7.248:8102
  servers:
  - 192.168.7.248:6379:1
  tokens: '1234567891'
  redis: true
  timeout: 30000

dyn_o_mite:
  datacenter: dc2
  rack: rack2
  dyn_listen: 192.168.7.78:8101
  dyn_seeds:
  - 192.168.7.215:8101:rack1:dc1:1234567891
  - 192.168.7.216:8101:rack1:dc1:1234567892
  - 192.168.7.248:8101:rack2:dc2:1234567891
  listen: 192.168.7.78:8102
  servers:
  - 192.168.7.78:6379:1
  tokens: '1234567892'
  redis: true
  timeout: 30000

Log from 192.168.7.215 below:

[Fri May 15 01:07:49 2015] dynomite.c:198 dynomite-0.1.19 built for Linux 2.6.32-358.el6.x86_64 x86_64 started on pid 15242
[Fri May 15 01:07:49 2015] dynomite.c:203 run, rabbit run / dig that hole, forget the sun / and when at last the work is done / don't sit down / it's time to dig another one
[Fri May 15 01:07:49 2015] dyn_mbuf.c:321 mbuf hsize 64 chunk size 16400 offset 16336 length 16336
[Fri May 15 01:07:49 2015] dyn_message.c:582 msg size 344
[Fri May 15 01:07:49 2015] dyn_connection.c:477 conn size 328
[Fri May 15 01:07:49 2015] dyn_conf.c:1533 setting secure_server_option to default value:none
[Fri May 15 01:07:49 2015] dyn_conf.c:1546 setting env to default value:aws
[Fri May 15 01:07:49 2015] dyn_conf.c:1551 setting pem key file to default value:conf/dynomite.pem
[Fri May 15 01:07:49 2015] dyn_conf.c:247 transform to server 0 '192.168.7.215:6379:1'
[Fri May 15 01:07:49 2015] dyn_server.c:158 init 1 servers in pool 0 'dyn_o_mite'
[Fri May 15 01:07:49 2015] dyn_conf.c:513 transform to pool 0 'dyn_o_mite'
[Fri May 15 01:07:49 2015] dyn_server.c:793 init 1 pools
[Fri May 15 01:07:49 2015] dyn_crypto.c:65 Private RSA structure filled
[Fri May 15 01:07:49 2015] dyn_stats.c:487 stats buffer size 2920
[Fri May 15 01:07:49 2015] dyn_stats.c:1192 m 5 listening on '0.0.0.0:22222'
[Fri May 15 01:07:49 2015] dyn_epoll.c:68 e 6 with nevent 1024
[Fri May 15 01:07:49 2015] dyn_server.c:505 connecting on s 8 to server '192.168.7.215:6379:1'
[Fri May 15 01:07:49 2015] dyn_proxy.c:211 p 9 listening on '192.168.7.215:8102' in redis pool 0 'dyn_o_mite' with 1 servers
[Fri May 15 01:07:49 2015] dyn_dnode_server.c:154 dyn: e 6 with nevent 1024
[Fri May 15 01:07:49 2015] dyn_dnode_server.c:195 dyn: p 10 listening on '192.168.7.215:8101' in redis pool 0 'dyn_o_mite' with 1 servers
[Fri May 15 01:07:49 2015] dyn_conf.c:300 transform to seed peer 0 '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:07:49 2015] dyn_conf.c:300 transform to seed peer 1 '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:07:49 2015] dyn_conf.c:300 transform to seed peer 2 '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:161 dyn: transform to local node to peer 0 'dyn_o_mite'
[Fri May 15 01:07:49 2015] dyn_conf.c:300 transform to seed peer 1 '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:07:49 2015] dyn_conf.c:300 transform to seed peer 2 '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:07:49 2015] dyn_conf.c:300 transform to seed peer 3 '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:07:49 2015] dyn_vnode.c:67 peer name : '192.168.7.215'
[Fri May 15 01:07:49 2015] dyn_vnode.c:68 peer rack : 'rack1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:69 peer dc : 'dc1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:70 peer->processed = 0
[Fri May 15 01:07:49 2015] dyn_server.c:910 server_get_dc pool 'dc1'
[Fri May 15 01:07:49 2015] dyn_server.c:929 server_get_dc pool about to exit 'dc1'
[Fri May 15 01:07:49 2015] dyn_server.c:944 server_get_rack 'rack1'
[Fri May 15 01:07:49 2015] dyn_server.c:976 server_get_rack exiting 'rack1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:47 **** printing continuums for rack 'rack1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:51 next c[0]: idx = 0, token->mag = 1234567891
[Fri May 15 01:07:49 2015] dyn_vnode.c:53 **** end printing continuums for rack 'rack1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:67 peer name : '192.168.7.216'
[Fri May 15 01:07:49 2015] dyn_vnode.c:68 peer rack : 'rack1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:69 peer dc : 'dc1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:70 peer->processed = 0
[Fri May 15 01:07:49 2015] dyn_server.c:910 server_get_dc pool 'dc1'
[Fri May 15 01:07:49 2015] dyn_server.c:944 server_get_rack 'rack1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:47 **** printing continuums for rack 'rack1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:51 next c[0]: idx = 0, token->mag = 1234567891
[Fri May 15 01:07:49 2015] dyn_vnode.c:51 next c[1]: idx = 1, token->mag = 1234567892
[Fri May 15 01:07:49 2015] dyn_vnode.c:53 **** end printing continuums for rack 'rack1'
[Fri May 15 01:07:49 2015] dyn_vnode.c:67 peer name : '192.168.7.248'
[Fri May 15 01:07:49 2015] dyn_vnode.c:68 peer rack : 'rack2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:69 peer dc : 'dc2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:70 peer->processed = 0
[Fri May 15 01:07:49 2015] dyn_server.c:910 server_get_dc pool 'dc2'
[Fri May 15 01:07:49 2015] dyn_server.c:929 server_get_dc pool about to exit 'dc2'
[Fri May 15 01:07:49 2015] dyn_server.c:944 server_get_rack 'rack2'
[Fri May 15 01:07:49 2015] dyn_server.c:976 server_get_rack exiting 'rack2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:47 **** printing continuums for rack 'rack2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:51 next c[0]: idx = 2, token->mag = 1234567891
[Fri May 15 01:07:49 2015] dyn_vnode.c:53 **** end printing continuums for rack 'rack2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:67 peer name : '192.168.7.78'
[Fri May 15 01:07:49 2015] dyn_vnode.c:68 peer rack : 'rack2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:69 peer dc : 'dc2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:70 peer->processed = 0
[Fri May 15 01:07:49 2015] dyn_server.c:910 server_get_dc pool 'dc2'
[Fri May 15 01:07:49 2015] dyn_server.c:944 server_get_rack 'rack2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:47 **** printing continuums for rack 'rack2'
[Fri May 15 01:07:49 2015] dyn_vnode.c:51 next c[0]: idx = 2, token->mag = 1234567891
[Fri May 15 01:07:49 2015] dyn_vnode.c:51 next c[1]: idx = 3, token->mag = 1234567892
[Fri May 15 01:07:49 2015] dyn_vnode.c:53 **** end printing continuums for rack 'rack2'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:274 init 3 seeds and peers in pool 0 'dyn_o_mite'
[Fri May 15 01:07:49 2015] dyn_core.c:523 =====================Peers info=====================
[Fri May 15 01:07:49 2015] dyn_core.c:527 Server pool : 0
[Fri May 15 01:07:49 2015] dyn_core.c:530 ==============================================
[Fri May 15 01:07:49 2015] dyn_core.c:532 Peer DC : 'dc1'
[Fri May 15 01:07:49 2015] dyn_core.c:533 Peer Rack : 'rack1'
[Fri May 15 01:07:49 2015] dyn_core.c:535 Peer name : '192.168.7.215'
[Fri May 15 01:07:49 2015] dyn_core.c:536 Peer pname : '192.168.7.215:8101'
[Fri May 15 01:07:49 2015] dyn_core.c:538 Peer state : 7
[Fri May 15 01:07:49 2015] dyn_core.c:539 Peer port : 8101
[Fri May 15 01:07:49 2015] dyn_core.c:540 Peer is_local : 1
[Fri May 15 01:07:49 2015] dyn_core.c:541 Peer failure_count : 0
[Fri May 15 01:07:49 2015] dyn_core.c:542 Peer num tokens : 1
[Fri May 15 01:07:49 2015] dyn_token.c:96 Token : 1 1234567891 1
[Fri May 15 01:07:49 2015] dyn_core.c:530 ==============================================
[Fri May 15 01:07:49 2015] dyn_core.c:532 Peer DC : 'dc1'
[Fri May 15 01:07:49 2015] dyn_core.c:533 Peer Rack : 'rack1'
[Fri May 15 01:07:49 2015] dyn_core.c:535 Peer name : '192.168.7.216'
[Fri May 15 01:07:49 2015] dyn_core.c:536 Peer pname : '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:07:49 2015] dyn_core.c:538 Peer state : 4
[Fri May 15 01:07:49 2015] dyn_core.c:539 Peer port : 8101
[Fri May 15 01:07:49 2015] dyn_core.c:540 Peer is_local : 0
[Fri May 15 01:07:49 2015] dyn_core.c:541 Peer failure_count : 0
[Fri May 15 01:07:49 2015] dyn_core.c:542 Peer num tokens : 1
[Fri May 15 01:07:49 2015] dyn_token.c:96 Token : 1 1234567892 1
[Fri May 15 01:07:49 2015] dyn_core.c:530 ==============================================
[Fri May 15 01:07:49 2015] dyn_core.c:532 Peer DC : 'dc2'
[Fri May 15 01:07:49 2015] dyn_core.c:533 Peer Rack : 'rack2'
[Fri May 15 01:07:49 2015] dyn_core.c:535 Peer name : '192.168.7.248'
[Fri May 15 01:07:49 2015] dyn_core.c:536 Peer pname : '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:07:49 2015] dyn_core.c:538 Peer state : 4
[Fri May 15 01:07:49 2015] dyn_core.c:539 Peer port : 8101
[Fri May 15 01:07:49 2015] dyn_core.c:540 Peer is_local : 0
[Fri May 15 01:07:49 2015] dyn_core.c:541 Peer failure_count : 0
[Fri May 15 01:07:49 2015] dyn_core.c:542 Peer num tokens : 1
[Fri May 15 01:07:49 2015] dyn_token.c:96 Token : 1 1234567891 1
[Fri May 15 01:07:49 2015] dyn_core.c:530 ==============================================
[Fri May 15 01:07:49 2015] dyn_core.c:532 Peer DC : 'dc2'
[Fri May 15 01:07:49 2015] dyn_core.c:533 Peer Rack : 'rack2'
[Fri May 15 01:07:49 2015] dyn_core.c:535 Peer name : '192.168.7.78'
[Fri May 15 01:07:49 2015] dyn_core.c:536 Peer pname : '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:07:49 2015] dyn_core.c:538 Peer state : 4
[Fri May 15 01:07:49 2015] dyn_core.c:539 Peer port : 8101
[Fri May 15 01:07:49 2015] dyn_core.c:540 Peer is_local : 0
[Fri May 15 01:07:49 2015] dyn_core.c:541 Peer failure_count : 0
[Fri May 15 01:07:49 2015] dyn_core.c:542 Peer num tokens : 1
[Fri May 15 01:07:49 2015] dyn_token.c:96 Token : 1 1234567892 1
[Fri May 15 01:07:49 2015] dyn_core.c:551 Peers Datacenters/racks/nodes .................................................
[Fri May 15 01:07:49 2015] dyn_core.c:555 Peer datacenter........'dc1'
[Fri May 15 01:07:49 2015] dyn_core.c:559 Peer rack........'rack1'
[Fri May 15 01:07:49 2015] dyn_core.c:560 Peer rack ncontinuumm : 2
[Fri May 15 01:07:49 2015] dyn_core.c:561 Peer rack nserver_continuum : 2
[Fri May 15 01:07:49 2015] dyn_core.c:555 Peer datacenter........'dc2'
[Fri May 15 01:07:49 2015] dyn_core.c:559 Peer rack........'rack2'
[Fri May 15 01:07:49 2015] dyn_core.c:560 Peer rack ncontinuumm : 2
[Fri May 15 01:07:49 2015] dyn_core.c:561 Peer rack nserver_continuum : 2
[Fri May 15 01:07:49 2015] dyn_core.c:566 ...............................................................................
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1023 dnode_peer_connect dyn: connect to peer '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1037 dyn: connect to peer '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1082 dyn: connecting on s 11 to peer '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1023 dnode_peer_connect dyn: connect to peer '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1037 dyn: connect to peer '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1082 dyn: connecting on s 12 to peer '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1023 dnode_peer_connect dyn: connect to peer '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1037 dyn: connect to peer '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1082 dyn: connecting on s 13 to peer '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:07:49 2015] dyn_gossip.c:793 Seed provider :::::: 'simple_provider'
[Fri May 15 01:07:49 2015] dyn_gossip.c:848 What?? No rack in Dict for rack : 'dc1'
[Fri May 15 01:07:49 2015] dyn_gossip.c:848 What?? No rack in Dict for rack : 'dc2'
[Fri May 15 01:07:49 2015] dyn_core.c:592 timeout = 30000
[Fri May 15 01:07:49 2015] dyn_core.c:573 length of C2G_OutQ : 0
[Fri May 15 01:07:49 2015] dyn_server.c:542 connected on s 8 to server '192.168.7.215:6379:1'
[Fri May 15 01:07:49 2015] dyn_core.c:592 timeout = 30000
[Fri May 15 01:07:49 2015] dyn_core.c:573 length of C2G_OutQ : 0
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1121 dyn: peer connected on sd 11 to server '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:07:49 2015] dyn_core.c:592 timeout = 30000
[Fri May 15 01:07:49 2015] dyn_core.c:573 length of C2G_OutQ : 0
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1121 dyn: peer connected on sd 13 to server '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:07:49 2015] dyn_core.c:592 timeout = 30000
[Fri May 15 01:07:49 2015] dyn_core.c:573 length of C2G_OutQ : 0
[Fri May 15 01:07:49 2015] dyn_dnode_peer.c:1121 dyn: peer connected on sd 12 to server '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:07:49 2015] dyn_core.c:592 timeout = 30000
[Fri May 15 01:07:49 2015] dyn_core.c:573 length of C2G_OutQ : 0
[Fri May 15 01:07:58 2015] dyn_proxy.c:341 accepted c 14 on p 9 from '192.168.7.113:7595'
[Fri May 15 01:07:58 2015] dyn_proxy.c:285 accept on p 9 not ready - eagain
[Fri May 15 01:07:58 2015] dyn_core.c:592 timeout = 30000
[Fri May 15 01:07:58 2015] dyn_core.c:573 length of C2G_OutQ : 0
[Fri May 15 01:07:58 2015] dyn_core.c:592 timeout = 30000
[Fri May 15 01:07:58 2015] dyn_core.c:573 length of C2G_OutQ : 0
[Fri May 15 01:08:03 2015] dyn_message.c:227 alloc_msg_count : 1
[Fri May 15 01:08:03 2015] dyn_connection.c:508 recv on sd 14 14 of 16320
[Fri May 15 01:08:03 2015] dyn_redis.c:1621 parsed req 1 res 0 type 82 state 0 rpos 14 of 14
00000000 2a 31 0d 0a 24 34 0d 0a 70 69 6e 67 0d 0a |*1..$4..ping..|
[Fri May 15 01:08:03 2015] dyn_server.c:910 server_get_dc pool 'dc1'
[Fri May 15 01:08:03 2015] dyn_server.c:944 server_get_rack 'rack1'
[Fri May 15 01:08:03 2015] dyn_dnode_peer.c:1254 Entering dnode_peer_pool_conn ................................
[Fri May 15 01:08:03 2015] dyn_vnode.c:67 peer name : '192.168.7.215'
[Fri May 15 01:08:03 2015] dyn_vnode.c:68 peer rack : 'rack1'
[Fri May 15 01:08:03 2015] dyn_vnode.c:69 peer dc : 'dc1'
[Fri May 15 01:08:03 2015] dyn_vnode.c:70 peer->processed = 1
[Fri May 15 01:08:03 2015] dyn_vnode.c:67 peer name : '192.168.7.216'
[Fri May 15 01:08:03 2015] dyn_vnode.c:68 peer rack : 'rack1'
[Fri May 15 01:08:03 2015] dyn_vnode.c:69 peer dc : 'dc1'
[Fri May 15 01:08:03 2015] dyn_vnode.c:70 peer->processed = 1
[Fri May 15 01:08:03 2015] dyn_vnode.c:67 peer name : '192.168.7.248'
[Fri May 15 01:08:03 2015] dyn_vnode.c:68 peer rack : 'rack2'
[Fri May 15 01:08:03 2015] dyn_vnode.c:69 peer dc : 'dc2'
[Fri May 15 01:08:03 2015] dyn_vnode.c:70 peer->processed = 1
[Fri May 15 01:08:03 2015] dyn_vnode.c:67 peer name : '192.168.7.78'
[Fri May 15 01:08:03 2015] dyn_vnode.c:68 peer rack : 'rack2'
[Fri May 15 01:08:03 2015] dyn_vnode.c:69 peer dc : 'dc2'
[Fri May 15 01:08:03 2015] dyn_vnode.c:70 peer->processed = 1
[Fri May 15 01:08:03 2015] dyn_dnode_peer.c:1239 dyn: key '' on dist 3 maps to server '192.168.7.215:8101'
[Fri May 15 01:08:03 2015] dyn_server.c:604 update pool 0 'dyn_o_mite' to add 0 servers
[Fri May 15 01:08:03 2015] dyn_request.c:491 forwarding request from client conn '192.168.7.113:7595' to storage conn '192.168.7.113:7595'
[Fri May 15 01:08:03 2015] dyn_message.c:171 insert msg 1 into tmo rbt with expiry of 30000 msec
[Fri May 15 01:08:03 2015] dyn_request.c:537 local forward from c 14 to s 8 req 1 len 14 type 82 with key ''
[Fri May 15 01:08:03 2015] dyn_core.c:592 timeout = 30000
[Fri May 15 01:08:03 2015] dyn_core.c:573 length of C2G_OutQ : 0
[Fri May 15 01:08:19 2015] dyn_gossip.c:726 Gossip is running ...
[Fri May 15 01:08:19 2015] dyn_gossip.c:755 I am still joining the ring!
[Fri May 15 01:08:22 2015] dyn_proxy.c:341 accepted c 15 on p 9 from '192.168.7.215:8960'
[Fri May 15 01:08:22 2015] dyn_proxy.c:285 accept on p 9 not ready - eagain
[Fri May 15 01:08:22 2015] dyn_core.c:592 timeout = 11685
[Fri May 15 01:08:22 2015] dyn_core.c:573 length of C2G_OutQ : 1
[Fri May 15 01:08:22 2015] dyn_dnode_peer.c:1023 dnode_peer_connect dyn: connect to peer '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:08:22 2015] dyn_message.c:227 alloc_msg_count : 2
[Fri May 15 01:08:22 2015] dyn_dnode_peer.c:1023 dnode_peer_connect dyn: connect to peer '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:08:22 2015] dyn_message.c:227 alloc_msg_count : 3
[Fri May 15 01:08:22 2015] dyn_dnode_peer.c:1023 dnode_peer_connect dyn: connect to peer '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:08:22 2015] dyn_message.c:227 alloc_msg_count : 4
[Fri May 15 01:08:22 2015] dyn_core.c:523 =====================Peers info=====================
[Fri May 15 01:08:22 2015] dyn_core.c:527 Server pool : 0
[Fri May 15 01:08:22 2015] dyn_core.c:530 ==============================================
[Fri May 15 01:08:22 2015] dyn_core.c:532 Peer DC : 'dc1'
[Fri May 15 01:08:22 2015] dyn_core.c:533 Peer Rack : 'rack1'
[Fri May 15 01:08:22 2015] dyn_core.c:535 Peer name : '192.168.7.215'
[Fri May 15 01:08:22 2015] dyn_core.c:536 Peer pname : '192.168.7.215:8101'
[Fri May 15 01:08:22 2015] dyn_core.c:538 Peer state : 7
[Fri May 15 01:08:22 2015] dyn_core.c:539 Peer port : 8101
[Fri May 15 01:08:22 2015] dyn_core.c:540 Peer is_local : 1
[Fri May 15 01:08:22 2015] dyn_core.c:541 Peer failure_count : 0
[Fri May 15 01:08:22 2015] dyn_core.c:542 Peer num tokens : 1
[Fri May 15 01:08:22 2015] dyn_token.c:96 Token : 1 1234567891 1
[Fri May 15 01:08:22 2015] dyn_core.c:530 ==============================================
[Fri May 15 01:08:22 2015] dyn_core.c:532 Peer DC : 'dc1'
[Fri May 15 01:08:22 2015] dyn_core.c:533 Peer Rack : 'rack1'
[Fri May 15 01:08:22 2015] dyn_core.c:535 Peer name : '192.168.7.216'
[Fri May 15 01:08:22 2015] dyn_core.c:536 Peer pname : '192.168.7.216:8101:rack1:dc1:1234567892'
[Fri May 15 01:08:22 2015] dyn_core.c:538 Peer state : 4
[Fri May 15 01:08:22 2015] dyn_core.c:539 Peer port : 8101
[Fri May 15 01:08:22 2015] dyn_core.c:540 Peer is_local : 0
[Fri May 15 01:08:22 2015] dyn_core.c:541 Peer failure_count : 0
[Fri May 15 01:08:22 2015] dyn_core.c:542 Peer num tokens : 1
[Fri May 15 01:08:22 2015] dyn_token.c:96 Token : 1 1234567892 1
[Fri May 15 01:08:22 2015] dyn_core.c:530 ==============================================
[Fri May 15 01:08:22 2015] dyn_core.c:532 Peer DC : 'dc2'
[Fri May 15 01:08:22 2015] dyn_core.c:533 Peer Rack : 'rack2'
[Fri May 15 01:08:22 2015] dyn_core.c:535 Peer name : '192.168.7.248'
[Fri May 15 01:08:22 2015] dyn_core.c:536 Peer pname : '192.168.7.248:8101:rack2:dc2:1234567891'
[Fri May 15 01:08:22 2015] dyn_core.c:538 Peer state : 4
[Fri May 15 01:08:22 2015] dyn_core.c:539 Peer port : 8101
[Fri May 15 01:08:22 2015] dyn_core.c:540 Peer is_local : 0
[Fri May 15 01:08:22 2015] dyn_core.c:541 Peer failure_count : 0
[Fri May 15 01:08:22 2015] dyn_core.c:542 Peer num tokens : 1
[Fri May 15 01:08:22 2015] dyn_token.c:96 Token : 1 1234567891 1
[Fri May 15 01:08:22 2015] dyn_core.c:530 ==============================================
[Fri May 15 01:08:22 2015] dyn_core.c:532 Peer DC : 'dc2'
[Fri May 15 01:08:22 2015] dyn_core.c:533 Peer Rack : 'rack2'
[Fri May 15 01:08:22 2015] dyn_core.c:535 Peer name : '192.168.7.78'
[Fri May 15 01:08:22 2015] dyn_core.c:536 Peer pname : '192.168.7.78:8101:rack2:dc2:1234567892'
[Fri May 15 01:08:22 2015] dyn_core.c:538 Peer state : 4
[Fri May 15 01:08:22 2015] dyn_core.c:539 Peer port : 8101
[Fri May 15 01:08:22 2015] dyn_core.c:540 Peer is_local : 0
[Fri May 15 01:08:22 2015] dyn_core.c:541 Peer failure_count : 0
[Fri May 15 01:08:22 2015] dyn_core.c:542 Peer num tokens : 1
[Fri May 15 01:08:22 2015] dyn_token.c:96 Token : 1 1234567892 1
[Fri May 15 01:08:22 2015] dyn_core.c:551 Peers Datacenters/racks/nodes .................................................
[Fri May 15 01:08:22 2015] dyn_core.c:555 Peer datacenter........'dc1'
[Fri May 15 01:08:22 2015] dyn_core.c:559 Peer rack........'rack1'
[Fri May 15 01:08:22 2015] dyn_core.c:560 Peer rack ncontinuumm : 2
[Fri May 15 01:08:22 2015] dyn_core.c:561 Peer rack nserver_continuum : 2
[Fri May 15 01:08:22 2015] dyn_core.c:555 Peer datacenter........'dc2'
[Fri May 15 01:08:22 2015] dyn_core.c:559 Peer rack........'rack2'
[Fri May 15 01:08:22 2015] dyn_core.c:560 Peer rack ncontinuumm : 2
[Fri May 15 01:08:22 2015] dyn_core.c:561 Peer rack nserver_continuum : 2
[Fri May 15 01:08:22 2015] dyn_core.c:566 ...............................................................................
[Fri May 15 01:08:22 2015] dyn_connection.c:561 sendv on sd 11 77 of 77 in 2 buffers
[Fri May 15 01:08:22 2015] dyn_dnode_request.c:334 dnode_req_send_done entering!!!
[Fri May 15 01:08:22 2015] dyn_connection.c:561 sendv on sd 12 30 of 30 in 1 buffers
[Fri May 15 01:08:22 2015] dyn_dnode_request.c:334 dnode_req_send_done entering!!!

Does the config file have something wrong?

Single machine replication

Hi

I'm trying dynomite on a single machine with 2 memcachedb instances and a dynomite instance for each of them.

instance1.yml

dyn_o_mite:
  datacenter: dc1
  rack: rack1
  listen: 127.0.0.1:8101
  dyn_listen: 127.0.0.1:8102
  dyn_seeds:
   - 127.0.0.1:8101:rack1:dc1:12345678
   - 127.0.0.2:8101:rack1:dc1:101134286

  tokens: '12345678'
  servers:
  - 127.0.0.1:21201:1

instance2.yml

dyn_o_mite:
  datacenter: dc1
  rack: rack1
  listen: 127.0.0.2:8101
  dyn_listen: 127.0.0.2:8102
  dyn_seeds:
   - 127.0.0.1:8101:rack1:dc1:12345678
   - 127.0.0.2:8101:rack1:dc1:101134286

  tokens: '101134286'
  servers:
  - 127.0.0.1:21202:1

I can connect to instance1, but as soon as I connect to instance2 I get an "invalid argument" error. Not sure what the problem is. Is my configuration correct?

Cheers,
-Kristoffer

instance2

[Thu Jul 23 15:10:17 2015] dyn_proxy.c:341 accepted c 12 on p 7 from '127.0.0.1:37622'
[Thu Jul 23 15:10:18 2015] dyn_dnode_peer.c:1037 dyn: connect to peer '127.0.0.1:8101:rack1:dc1:12345678'
[Thu Jul 23 15:10:18 2015] dyn_connection.c:524 recv on sd 13 eof rb 0 sb 36
[Thu Jul 23 15:10:18 2015] dyn_core.c:403 req 15 on s 13 timedout
[Thu Jul 23 15:10:18 2015] dyn_core.c:293 dnode close p 13 '127.0.0.1:8101' on event 00FF eof 0 done 0 rb 0 sb 36: Connection timed out
[Thu Jul 23 15:10:18 2015] dyn_dnode_peer.c:527 dyn: dnode_peer_close on peer '127.0.0.1:8101:rack1:dc1:12345678'
[Thu Jul 23 15:10:18 2015] dyn_connection.c:524 recv on sd 12 eof rb 7 sb 41
[Thu Jul 23 15:10:18 2015] dyn_core.c:312 close c 12 '127.0.0.1:37622' on event 00FF eof 1 done 1 rb 7 sb 41  
[Thu Jul 23 15:10:20 2015] dyn_proxy.c:341 accepted c 12 on p 7 from '127.0.0.1:37626'

instance1

[Thu Jul 23 15:10:18 2015] dyn_proxy.c:341 accepted c 10 on p 7 from '127.0.0.1:49279'
[Thu Jul 23 15:10:18 2015] dyn_core.c:312 close c 10 '127.0.0.1:49279' on event FFFF eof 0 done 0 rb 36 sb 0: Invalid argument

Does a failed node cause the whole rack to also go down?

Is there such a concept as a hot backup for a node, so that when a node within a rack is restarted, experiences a failure, etc., it doesn't take the whole rack down?

For instance:
Adding two dyn_seeds with different IPs but same token

What are the servers used for? Can this be used to have multiple Redis instances per node? Is this for future support of the backdoor in #92?

How do you recover from a down rack? The process here seems error-prone for a prod system: #51. Are there other solutions?

Version (in)sanity?

There seems to be a mismatch between the versions tagged here in git and the version string reported by dynomite itself.

# dynomite --version
This is dynomite-0.1.19

https://github.com/Netflix/dynomite/releases contains 0.3.0, 0.4.0

This is likely driven by the following section not being updated:

dynomite/configure.ac, lines 1 to 5 at 438b6db:

# Define the package version numbers and the bug reporting address
m4_define([DN_MAJOR], 0)
m4_define([DN_MINOR], 1)
m4_define([DN_PATCH], 19)
m4_define([DN_BUGS], [[email protected]])

I don't know how the project wants to version itself, but it appears that there's some attempt at using git tags, so I'd propose that one source be chosen as authoritative and override the other.
Also, if there's any way to get the git hash of the build into the version string for pre-release versions, that would be awesome.

Here's a method I read about, could likely be adapted: http://stackoverflow.com/a/1705263
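
One adaptation of that approach might look like this in configure.ac (a sketch; it assumes Autoconf >= 2.64 for m4_esyscmd_s, and the suffix format is illustrative):

# Derive a version suffix from the current git commit; fall back to
# "unknown" when building outside a git checkout.
m4_define([DN_GIT_HASH],
          m4_esyscmd_s([git rev-parse --short HEAD 2>/dev/null || echo unknown]))
m4_define([DN_VERSION], [DN_MAJOR.DN_MINOR.DN_PATCH-DN_GIT_HASH])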

Unable to make install: murmur3 failure

Using the following steps,

$ git clone [email protected]:Netflix/dynomite.git
$ cd dynomite
$ autoreconf -fvi
$ CFLAGS="-ggdb3 -O0" ./configure --enable-debug=full
$ make
$ sudo make install

The following error is returned during make install:

...
make[3]: Leaving directory `/home/etlfm/Desktop/DEV/dynomite/contrib/yaml-0.1.4/tests'
make[2]: Leaving directory `/home/etlfm/Desktop/DEV/dynomite/contrib/yaml-0.1.4'
Making install in murmur3
make[2]: Entering directory `/home/etlfm/Desktop/DEV/dynomite/contrib/murmur3'
make[2]: *** No rule to make target `install'.  Stop.
make[2]: Leaving directory `/home/etlfm/Desktop/DEV/dynomite/contrib/murmur3'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory `/home/etlfm/Desktop/DEV/dynomite/contrib'
make: *** [install-recursive] Error 1
...

This appears to be resolved by the following patch:

diff --git a/contrib/Makefile.am b/contrib/Makefile.am
index 8883d1d..cf22345 100644
--- a/contrib/Makefile.am
+++ b/contrib/Makefile.am
@@ -1,3 +1,3 @@
-SUBDIRS = yaml-0.1.4 murmur3
+SUBDIRS = yaml-0.1.4 

With the above patch, autoreconf works, make works, and most importantly make install works.

I don't believe you need to reference murmur3 in the Makefile.am for contrib. I'm not sure why this is breaking for me and was not encountered previously.

Note: I'm working on forking this repository and submitting a pull request, but I wanted to get this issue logged.

Why doesn't dynomite check the Redis status?

I have two racks (rack1, rack2), each with four nodes and four Redis services.
I set and read data on a random rack using the phpredis extension.
I have a simple failover tactic: if I find that a dynomite node is unreachable, I mark every node in the same rack as failed, so the next Redis request avoids the bad nodes. That worked out fine.

However, when every dynomite rack reports an OK status and I turn off any backend Redis service, the failover tactic above stops working: connecting to the dynomite node succeeds, but some gets and sets lose data.

How can I learn the backend Redis status through dynomite? Do dynomite nodes have any automatic health check for Redis?

gos_interval and florida

gos_interval has no relation to the time between two florida lookups (for the local service), is that right?

Whatever gos_interval is set to, dynomite seems to call my local service every 30 seconds.

Max.

a compile error with struct event_base on mac os x 10.10

$ make
/Applications/Xcode.app/Contents/Developer/usr/bin/make  all-recursive
...(SKIP)...
gcc -DHAVE_CONFIG_H -I. -I..  -D_GNU_SOURCE -I ../src/hashkit -I ../src/proto -I ../src/event -I ../src/seedsprovider -I ../contrib/yaml-0.1.4/include  -Wall -Wshadow -Wpointer-arith -Winline -Wunused-function -Wunused-variable -Wunused-value -Wno-unused-parameter -Wno-unused-value -Wconversion -Wsign-compare -Wstrict-prototypes -Wmissing-prototypes -Wredundant-decls -Wmissing-declarations -g -O2 -MT dyn_dnode_server.o -MD -MP -MF .deps/dyn_dnode_server.Tpo -c -o dyn_dnode_server.o dyn_dnode_server.c
dyn_dnode_server.c:154:63: error: no member named 'ep' in 'struct event_base'
    log_debug(LOG_INFO, "dyn: e %d with nevent %d", ctx->evb->ep, ctx->evb->nevent);
                                                    ~~~~~~~~  ^

Here is a part of my config.h after ./configure --enable-debug=full.

/* Define to 1 if epoll is supported */
/* #undef HAVE_EPOLL */

...(SKIP)...

/* Define to 1 if kqueue is supported */
#define HAVE_KQUEUE 1

Mac OS X has kqueue instead of epoll, and the HAVE_KQUEUE variant of struct event_base has no 'ep' member; that member exists only under HAVE_EPOLL.
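
A guard along these lines would presumably fix the build (a sketch; it assumes nevent exists in both variants of struct event_base and simply drops the epoll descriptor from the message otherwise):

#ifdef HAVE_EPOLL
    /* ep is the epoll descriptor; it only exists in the epoll event base */
    log_debug(LOG_INFO, "dyn: e %d with nevent %d", ctx->evb->ep, ctx->evb->nevent);
#else
    log_debug(LOG_INFO, "dyn: nevent %d", ctx->evb->nevent);
#endif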

default config fails to start when pem file not found

Introduced in #57.

The default config has no configuration set for crypto, yet it attempts to load it by default, and fails when not found.

root@4622f90515db:~# pwd
/root
root@4622f90515db:~# which dynomite
/usr/local/sbin/dynomite
root@4622f90515db:~# dynomite -c dynomite/conf/dynomite.yml
[Sun Mar 15 16:17:35 2015] dynomite.c:194 dynomite-0.1.19 built for Linux 3.18.5-tinycore64 x86_64 started on pid 8668
[Sun Mar 15 16:17:35 2015] dynomite.c:199 run, rabbit run / dig that hole, forget the sun / and when at last the work is done / don't sit down / it's time to dig another one
[Sun Mar 15 16:17:35 2015] dyn_crypto.c:45 Error: file conf/dynomite.pem not exists
[Sun Mar 15 16:17:35 2015] dyn_core.c:87 Failed to initialize crypto!!!
[Sun Mar 15 16:17:35 2015] dynomite.c:205 done, rabbit done
root@4622f90515db:~#

The error message comes from

if (access(file_name, F_OK) < 0) {
    log_error("Error: file %s not exists", file_name);
    return DN_ERROR;
}
- which makes sense, as the default path of conf/dynomite.pem is not reachable from the current working directory; however, the key should not be loaded at all until a pem_key_file setting is added to the config.

This is likely due to the definition here:

#define PEM_KEY_FILE "conf/dynomite.pem"
which is inherited by all configurations, even when unspecified.

I think this shouldn't be declared as a constant; rather, it should default to some empty value so that crypto initialization is not attempted on startup when the parameter is not set.
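
Something along these lines, for instance (a hypothetical sketch; apart from rstatus_t, DN_OK, DN_ERROR, and log_error, the names are illustrative rather than dynomite's actual API):

/* Default to no key file instead of a baked-in path. */
#define PEM_KEY_FILE NULL

rstatus_t
crypto_init(const char *file_name)
{
    if (file_name == NULL) {
        /* no pem_key_file configured; run without crypto */
        return DN_OK;
    }
    if (access(file_name, F_OK) < 0) {
        log_error("Error: file %s not exists", file_name);
        return DN_ERROR;
    }
    /* ... load the key as before ... */
    return DN_OK;
}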

Add/remove node

Hi,

Do you have a procedure for adding/removing nodes that you can share?

Thanks a lot.

Max.

Support ring describe cmd

This is one of the admin commands that must be available as soon as possible. This cmd is similar to Cassandra's nodetool ring cmd.

Default memcache configs contain invalid dyn_seed settings

The default memcache configs in the conf/ directory are invalid.

$ src/dynomite -t -c conf/dynomite_mc1.yml
dynomite: configuration file 'conf/dynomite_mc1.yml' syntax is invalid

The following error is reported

...
[Thu Dec  4 16:15:11 2014] dyn_conf.c:739 conf: directive "dyn_seeds" has an invalid format must match "hostname:port:rack:dc:tokens [name]"
[Thu Dec  4 16:15:11 2014] dyn_conf.c:706 pop '127.0.0.2:8101:RACK1:5622637,721812480,851406979,1036155118,2147653893,2233516174'

...

It would appear a default datacenter needs to be added to the dyn_seeds setting in both memcache configs.

Example:

From

dyn_seeds:
  - 127.0.0.2:8101:RACK1:5622637,721812480,851406979,1036155118,2147653893,2233516174

To

dyn_seeds:
  - 127.0.0.2:8101:RACK1:DC1:5622637,721812480,851406979,1036155118,2147653893,2233516174

Setting the same rack name in different datacenters gets SIGNAL 11

If I set the rack name to default-rack for both datacenters, the second dynomite instance fails to start.

[Mon Jul 20 15:53:34 2015] dynomite.c:194 dynomite-0.1.19 built for Linux 3.18.0-20-rpi2 armv7l started on pid 28532
[Mon Jul 20 15:53:34 2015] dynomite.c:199 run, rabbit run / dig that hole, forget the sun / and when at last the work is done / don't sit down / it's time to dig another one
[Mon Jul 20 15:53:34 2015] dyn_stats.c:1039 m 4 listening on '0.0.0.0:22222'
[Mon Jul 20 15:53:34 2015] dyn_proxy.c:211 p 8 listening on '0.0.0.0:8102' in redis pool 0 'raspberry-pi-chimera' with 1 servers
[Mon Jul 20 15:53:34 2015] dyn_dnode_server.c:195 dyn: p 9 listening on '0.0.0.0:8101' in redis pool 0 'raspberry-pi-chimera' with 1 servers
[Mon Jul 20 15:53:34 2015] dyn_signal.c:127 signal 11 (SIGSEGV) received, core dumping
Segmentation fault (core dumped)
  • Here is my configuration file in one datacenter (shanghai):
raspberry-pi-chimera:
  datacenter: shanghai
  rack: default-rack
  dyn_listen: 0.0.0.0:8101
  listen: 0.0.0.0:8102
  servers:
  - 127.0.0.1:6379:1
  tokens: '2035160051'
  redis: true
  secure_server_option: datacenter
  pem_key_file: /var/dynomite/dynomite.pem
  dyn_seeds:
  - 152.62.40.200:8101:default-rack:beijing:835488289
  • Here is my configuration file in the other datacenter (beijing):
raspberry-pi-chimera:
  datacenter: beijing
  rack: default-rack
  dyn_listen: 0.0.0.0:8101
  listen: 0.0.0.0:8102
  servers:
  - 127.0.0.1:6379:1
  tokens: '835488289'
  redis: true
  secure_server_option: datacenter
  pem_key_file: /var/dynomite/dynomite.pem  

However, if I change the rack names to default-rack-shanghai and default-rack-beijing, they work.

compile problems on CentOS 7

bash-4.2# rpm -qa | grep autoconf
autoconf-2.69-11.el7.noarch
bash-4.2# rpm -qa | grep automake
automake-1.13.4-3.el7.noarch
bash-4.2# autoreconf -fvi
...
configure.ac:36: error: possibly undefined macro: AC_PROG_LIBTOOL
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.

any ideas?
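
This error usually means the libtool package, which provides the AC_PROG_LIBTOOL macro, is not installed, so the most likely fix is:

bash-4.2# yum install libtool
bash-4.2# autoreconf -fvi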

Unable to start dynomite with default config(s) fails with Aborted (core dumped)

On the following host type:

uname -a
Linux etlfm-VirtualBox 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Perform the following:

$ git clone [email protected]:Netflix/dynomite.git
$ cd dynomite
$ autoreconf -fvi
$ CFLAGS="-ggdb3 -O0" ./configure --enable-debug=full
$ make

If you run a single test you get the following output:

$ src/dynomite -t -c conf/dynomite_mc1.yml
[Mon Nov 17 14:03:11 2014] dyn_util.c:320 [0] src/dynomite(array_deinit+0x2f) [0x429f77]
[Mon Nov 17 14:03:11 2014] dyn_util.c:320 [1] src/dynomite() [0x41fbe0]
[Mon Nov 17 14:03:11 2014] dyn_util.c:320 [2] src/dynomite(conf_destroy+0x58) [0x4231ed]
[Mon Nov 17 14:03:11 2014] dyn_util.c:320 [3] src/dynomite(conf_create+0xc8) [0x42318e]
[Mon Nov 17 14:03:11 2014] dyn_util.c:320 [4] src/dynomite() [0x4304bb]
[Mon Nov 17 14:03:11 2014] dyn_util.c:320 [5] src/dynomite(main+0xd1) [0x430721]
[Mon Nov 17 14:03:11 2014] dyn_util.c:320 [6] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fc5c34f2ec5]
[Mon Nov 17 14:03:11 2014] dyn_util.c:320 [7] src/dynomite() [0x407659]
Aborted (core dumped)

Also, any of the default configs results in a core dump:

$ find conf/*.yml -print0 | xargs -0 -n 1 -ippqq echo echo src/dynomite -t -c ppqq \; src/dynomite -t -c ppqq | sh
src/dynomite -t -c conf/dynomite_mc1.yml
Aborted (core dumped)
src/dynomite -t -c conf/dynomite_mc2.yml
Aborted (core dumped)
src/dynomite -t -c conf/dynomite_mc_single.yml
Aborted (core dumped)
src/dynomite -t -c conf/dynomite_redis1.yml
Aborted (core dumped)
src/dynomite -t -c conf/dynomite_redis2.yml
Aborted (core dumped)
src/dynomite -t -c conf/dynomite_redis_single.yml
Aborted (core dumped)
src/dynomite -t -c conf/dynomite.yml
Aborted (core dumped)

Attempting to run with the default configs also results in a core dump

$ find conf/*.yml -print0 | xargs -0 -n 1 -ippqq echo echo src/dynomite -c ppqq \; src/dynomite -c ppqq | sh
... ( trimmed as the errors are very similar ) ...
src/dynomite -c conf/dynomite.yml
[Mon Nov 17 14:00:01 2014] dynomite.c:192 dynomite-0.1.19 built for Linux 3.13.0-24-generic x86_64 started on pid 21017
[Mon Nov 17 14:00:01 2014] dynomite.c:197 run, rabbit run / dig that hole, forget the sun / and when at last the work is done / don't sit down / it's time to dig another one
[Mon Nov 17 14:00:01 2014] dyn_stats.c:994 m 3 listening on '0.0.0.0:22222'
[Mon Nov 17 14:00:01 2014] dyn_proxy.c:211 p 6 listening on '127.0.0.1:8102' in redis pool 0 'dyn_o_mite' with 1 servers
[Mon Nov 17 14:00:01 2014] dyn_dnode_server.c:195 dyn: p 8 listening on '127.0.0.1:8101' in redis pool 0 'dyn_o_mite' with 1 servers
[Mon Nov 17 14:00:01 2014] dyn_util.c:330 assert 'dst->len == 0 && dst->data == NULL' failed @ (dyn_string.c, 92)
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [0] src/dynomite(string_copy+0x43) [0x429c7d]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [1] src/dynomite() [0x42c116]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [2] src/dynomite() [0x42d261]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [3] src/dynomite(array_each+0x9e) [0x42a3b6]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [4] src/dynomite(gossip_pool_init+0x26) [0x42d634]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [5] src/dynomite() [0x4117da]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [6] src/dynomite(core_start+0x2e) [0x41190a]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [7] src/dynomite() [0x43061b]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [8] src/dynomite(main+0x132) [0x430782]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [9] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f9ce2e25ec5]
[Mon Nov 17 14:00:01 2014] dyn_util.c:320 [10] src/dynomite() [0x407659]
Aborted (core dumped)

Unexpected behavior with "noreply"

The issue can be reproduced as follows:

$ cat ~/reproduce.py
import socket
import time


def main():
    val = str(int(time.time()))
    sock = socket.socket()
    sock.connect(('localhost', 8102))
    sock.send('delete foo noreply\r\n')
    sock.send('set foo 0 0 %d noreply\r\n%s\r\n' % (len(val), val))
    sock.send('get foo\r\n')
    resp = sock.recv(1024)
    try:
        assert 'foo' in resp
    except:
        print 'unexpected response: %s' % resp.replace('\r', r'\r').replace('\n', r'\n')
        raise
    finally:
        sock.close()


if __name__ == '__main__':
    main()
$ python ~/reproduce.py
unexpected response: END\r\n
Traceback (most recent call last):
  File "/home/mckelvin/reproduce.py", line 23, in <module>
    main()
  File "/home/mckelvin/reproduce.py", line 14, in main
    assert 'foo' in resp
AssertionError
$ python ~/reproduce.py
$ python ~/reproduce.py
unexpected response: END\r\n
Traceback (most recent call last):
  File "/home/mckelvin/reproduce.py", line 23, in <module>
    main()
  File "/home/mckelvin/reproduce.py", line 14, in main
    assert 'foo' in resp
AssertionError

I also ran the same test against facebook/mcrouter and twitter/twemproxy; this only happens with Netflix/dynomite, so I guess it's a bug.

MessagingService component

We should build this component first, as one of the backbone components that helps nodes communicate. As in Cassandra, this component is leveraged in gossiping, since nodes exchange gossip messages asynchronously. Cassandra also uses it to asynchronously exchange other kinds of messages: Merkle tree requests, cluster snapshot taking, mutation messages, data repair streaming requests, and many others. I am creating this issue just for building the MessagingService.

Plugin Architecture for New Protocols

Hey there!

I've been interested in Dynomite and would like to work on supporting a new protocol with it. I've been reading through the Dynomite source code, and the code base seems to be coupled strongly to redis and memcache (code paths such as: https://github.com/Netflix/dynomite/blob/master/src/dyn_dnode_msg.c#L446 and others).

Given dynomite's lineage with memcache, with redis added later, I would like to send patches that decouple the codebase from redis / memcache explicitly and roughly follow this development path:

  1. Begin by decoupling the config from the redis boolean, and support proto as an arbitrary string in the config file. We can preserve backwards compatibility with the redis boolean if desired, but it'd be nice to deprecate it eventually if that's important to everyone.
  2. Decouple internal msg branches that explicitly check for the redis boolean, and other places that explicitly branch on memcache / redis.
  3. Put in place an interface for a more flexible "Protocol Interface" and move redis and memcache to this interface (see the sketch after this list).
  4. Move to allow dynamic linking to "Protocol Interfaces" that can be built outside of dynomite and decoupled from its build.
  5. Investigate seed providers for bootstrapping on these protocol interfaces? This is an area I'm fuzzy on, but it could be useful to have built-in seed providers for Protocol Interfaces that ship with Dynomite.
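
To make point 3 concrete, a protocol interface could be a small vtable of parser hooks that redis and memcache each implement. A hypothetical sketch (the names are illustrative, not dynomite's current API):

struct protocol_vtable {
    const char *name;                      /* e.g. "redis", "memcache" */
    void (*parse_req)(struct msg *r);      /* parse an inbound request */
    void (*parse_rsp)(struct msg *r);      /* parse a server response */
    rstatus_t (*fragment)(struct msg *r);  /* split multi-key requests, if needed */
};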

Is there anything else I'm missing? Does this sound like a worthwhile effort? Having Dynomite be easily pluggable to other protocols would be the end goal. Let me know what you think!

Blake

Support MSET and MSETNX commands

Dynomite should support the MSET and MSETNX cmds.
Also, as a precaution, Dynomite should check every key in these cmds and ensure that they all belong to the same token/shard.
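
As a rough illustration of that check (a self-contained hypothetical sketch; token_for_key stands in for dynomite's real hash function and token-ring lookup):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t
token_for_key(const char *key)
{
    /* stand-in hash; dynomite would use its configured hash function */
    uint32_t h = 5381;
    while (*key != '\0') {
        h = h * 33 + (uint8_t)*key++;
    }
    return h % 4;   /* pretend the ring has 4 shards */
}

/* true if every key of a multi-key command maps to the same shard */
static bool
keys_share_shard(const char **keys, size_t nkeys)
{
    uint32_t first = token_for_key(keys[0]);
    for (size_t i = 1; i < nkeys; i++) {
        if (token_for_key(keys[i]) != first) {
            return false;   /* cross-shard MSET/MSETNX must be rejected */
        }
    }
    return true;
}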

Dynomite without Dyno

I'm testing Dynomite from several weeks now and my real issue is client side. I do not use Dyno client but just simple redis-cli or simple Redis drivers.

Everything works well: performance increases linearly with the number of nodes, and replication is good too. I'm now working on HA. As there is no retry policy or failover strategy, when a node goes down the whole rack becomes unusable (because queries routed by the other nodes to that node will fail). We should work only on "sane" racks; to ensure consistency, we only work in one rack at a time.

We usually use HAProxy for routing connections to our backends, so I tried a setup like this:

frontend RING1
    bind 0.0.0.0:6379
    acl RING1-DC1-RC1_issue nbsrv(RING1-DC1-RC1) lt 2
    use_backend RING1-DC1-RC2 if RING1-DC1-RC1_issue
    default_backend RING1-DC1-RC1

backend RING1-DC1-RC1
    option tcp-check
    tcp-check send *1\r\n$4\r\nping\r\n
    tcp-check expect string +PONG
    balance leastconn
    server DC1-RC1-R1 192.168.33.40:8102 check inter 100ms
    server DC1-RC1-R2 192.168.33.41:8102 check inter 100ms

backend RING1-DC1-RC2
    option tcp-check
    tcp-check send *1\r\n$4\r\nping\r\n
    tcp-check expect string +PONG
    balance leastconn
    server DC1-RC2-R3 192.168.33.42:8102 check inter 100ms
    server DC1-RC2-R4 192.168.33.43:8102 check inter 100ms

Clients talk to HAProxy on port 6379 and are directed to rack RC1; if one node goes down in that rack, HAProxy routes queries to RC2 (HAProxy itself can be made redundant with a VIP or something else).

I think this is really expensive in terms of nodes, because if you have 40-node racks, losing one node costs you the rest of the rack (39 nodes), and with a ring of 2 racks your service goes down if you hit an issue with one node in each rack. I would like your opinion on this topology: am I going the wrong way?

Max.
