
riak_repl's Introduction

riak_repl

Riak MDC Replication


Pull Request template

Testing

  • manual verification of code
  • eunit (w/ gist of output)
  • EQC (w/ gist of output)
  • riak_test (w/ gist of output)
  • Dialyzer
  • XRef
  • Coverage reports

Documentation

  • internal docs (design docs)
  • external docs (docs.basho.com)
  • man pages

New Feature Deliverables

  • design documentation + diagrams
    • nothing formal
    • to help out during support, "this is how xyz works"
  • eunit tests
  • riak_tests
  • EQC + Pulse tests
  • tests at scale and under heavy load
    • Boston Cluster or AWS
  • notes for public documentation
    • for the docs team

BEAM release process

  1. git tag the specific commit(s) that will be released
  2. run all eunit tests and EQC tests; store the output in a gist
  3. if possible, run all riak_tests for replication
  4. record specific commit(s) that the beam targets in a README.txt file
  5. create a tar file.
  • Note that tar on OS X will include hidden AppleDouble (._*) metadata files in the archive by default. Either set COPYFILE_DISABLE=1 (or build/test the beams on Linux), or pipe the exact files you want into the tar with 'find', as in https://github.com/basho/node_package/blob/develop/priv/templates/fbsd/Makefile#L27 (an example of using -rf with a pipe).
  • include the README.txt file from the step above
  6. once the .tar is built, calculate an MD5 checksum to pass along with the file (see the snippet after this list)
  7. create an entry on the https://github.com/basho/internal_wiki/wiki/Releases page
    • include:
      • link to the gist output
      • version of Erlang the beams were built with
      • MD5 of the file
      • link to compiled beams
  8. notify client services + TAMs
  9. port the PR to the develop branch, if applicable
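
For step 6, the MD5 can be produced with the usual md5/md5sum command-line tools; if it's handier, the same checksum can also be computed from an Erlang shell (the tar file name below is illustrative):

{ok, Bin} = file:read_file("riak_repl_beams.tar"),
io:format("~32.16.0b~n", [binary:decode_unsigned(erlang:md5(Bin))]).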

riak_repl's Issues

Zombie server processes

The send_timeout_close option seems to generate zombie server processes when exposed to bad network conditions.

BNW Bootstrap mode

REPL to an empty cluster is incredibly inefficient. Provide a bootstrap command that blasts all data to a second cluster.

realtime replication does not automatically connect to a sink when a node is added

When a new node is joined to an existing cluster, the new node does not automatically connect to a node in the sink cluster when RT replication is enabled. This makes it impossible to add nodes to a running cluster without having to stop/start RT replication, which means objects written to the new node will not be RT replicated. This is bad. This should be fixed.

BNW real-time postcommit hook called on "old" style cluster

@kellymclaughlin found an issue while running a replication riak_test against CS, using old 1.2 repo configuration (what is now called "default" - not 1.3 BNW).

A bunch of repeated errors showed up in the logs like this:

2013-02-06 16:34:05.681 [debug] <0.5494.0>@riak_kv_put_fsm:decode_postcommit:781 Problem invoking post-commit hook riak_repl2_rt:postcommit -> error:badarg

Looking into it, the only way I can see this getting hit is when BNW postcommit has been installed into the bucket properties, because that is a riak_repl2_rt:postcommit() call. The badarg is a mystery to me, though.

When I look at the code path, it includes a call here:
get_modes(Ring) ->
    RC = get_repl_config(Ring),
    case dict:find(repl_modes, RC) of
        {ok, ReplModes} -> ReplModes;
        error ->
            %% default to mixed modes
            [mode_repl12, mode_repl13]
    end.

Is it possible that we aren't finding repl_modes in the ring? Makes me wonder if the default should really be mixed modes until we officially support BNW in 1.4.

This doesn't seem to be causing any tests to fail, so it's not data loss or corruption and thus isn't being marked as a MUST for 1.3, but I think we should investigate for 1.4.

All RT source nodes connect to same sink when bouncing the source cluster

from ZenDesk https://help.basho.com/tickets/3441

Apple had a 48-node cluster (the source) and bounced each node in 5 minute intervals. After each bounced node came back up, the realtime sync source connected to the same realtime sync sink on the remote cluster (the sink). This is bad because it could have overloaded the one sink node. After all nodes were bounced, every source was connected to the same single sink node.

The reason is that the cluster manager was giving out the same list of IP addresses to each source node as it came back up, as the locator for realtime sync. Even though the round-robin balancer was working, it was getting reset every 5 seconds by the remote-node polling update (the locator polls the remote cluster's IP addresses every 5 seconds). The current cluster manager assumes the list returned by the remote cluster is ordered from least busy to most busy, but that is not actually implemented yet. So the remote just returns the same list every time, and hence it's not actually balanced. Bah.

Two solutions come to mind:

  1. have the remote send over a list that correctly reflects busy-ness
  2. have the local cluster not re-order the list on updates. Just store the IP addresses

Solution 2 is easier I think. But we'll see.
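
A minimal sketch of solution 2, assuming a hypothetical state record in the cluster manager (the module, record, and function names here are illustrative, not the actual riak_core_cluster_mgr code): only reset the round-robin rotation when the sink membership actually changes, so the 5-second poll can't keep snapping every source back to the head of the list.

-module(sink_rotation_sketch).
-export([update_sink_addrs/2]).

-record(state, {sink_addrs = [],   %% last membership list received from the sink
                rotation   = []}). %% current round-robin order

update_sink_addrs(NewAddrs, #state{sink_addrs = Old} = State) ->
    case lists:usort(NewAddrs) =:= lists:usort(Old) of
        true ->
            %% same membership as before: keep the current rotation position
            State;
        false ->
            %% membership actually changed: adopt the new list, restart rotation
            State#state{sink_addrs = NewAddrs, rotation = NewAddrs}
    end.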

Connection Manager

Connection Manager is a "brave new world" component for replication that we believe will also benefit "core".

Its job is to manage connections and sub-protocols to a remote cluster.

The following is an out-of-date documentation page:
https://github.com/basho/internal_wiki/wiki/Replication-Brave-New-World

Connection Manager will move to core after the Apple delivery, which is why all the modules are named "riak_core_*.erl".
The relevant files are as follows. I know it's a lot.

riak_repl/src/riak_core_cluster_conn.erl
riak_repl/src/riak_core_cluster_mgr_sup.erl
riak_repl/src/riak_core_service_mgr.erl
riak_repl/src/riak_core_cluster_conn_sup.erl
riak_repl/src/riak_core_connection.erl
riak_repl/src/riak_core_cluster_mgr.erl
riak_repl/src/riak_core_connection_mgr.erl

Slow connections can cause the creation of many repl server processes

Description

It appears that slow WAN connections can cause communication between two data centers to fail but the server process is never killed. The client data center opens a new connection which creates a new server process. The server processes consume more and more memory over time until the node runs out of memory.

Steps to reproduce

  • Start two servers
  • Install Riak EE
  • Set up MDC replication between the servers
  • Add several large objects (1MB)
  • Use netem to add artificial latency to the connection (6000ms)
  • Wait for processes to build up

Vagrant example

The following Vagrantfile and provisioning script will set up the above scenario. You have to replace RIAK_EE_URL in the provisioning script with the appropriate URL for the amd64 deb package.

Vagrantfile

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant::Config.run do |config|
        [{:name => :one, :ip => "192.168.33.11", :other_ip => "192.168.33.12"},
         {:name => :two, :ip => "192.168.33.12", :other_ip => "192.168.33.11"}]
        .each do |m|
                config.vm.define m[:name] do |c|
                        c.vm.box = "lucid64"
                        c.vm.box_url = "http://files.vagrantup.com/lucid64.box"
                        c.vm.network :hostonly, m[:ip]
                        c.vm.provision :shell do |s|
                                s.path = "setup-riak-ee.sh"
                                s.args = "#{m[:ip]} #{m[:other_ip]} &"
                        end
                end
         end
end

setup-riak-ee.sh

#!/bin/bash

IP=$1
OTHER_IP=$2

sudo apt-get -y install curl

echo "Download riak-ee"
curl --silent RIAK_EE_URL -o riak-ee_1.1.1-1_amd64.deb

echo "Install riak-ee"
sudo dpkg -i riak-ee_1.1.1-1_amd64.deb

echo "Start riak-ee"
riak start

echo "Wait for riak-ee"
riak-admin wait-for-service riak_kv [email protected]

echo "Add data"
dd if=/dev/urandom count=1024 bs=1024 of=1MB_file
for i in {1..10}
do
        curl --silent http://127.0.0.1:8098/riak/b/$i -XPUT -d @1MB_file \
                -H 'content-type:text/plain'
done

echo "Add delay"
tc qdisc add dev eth1 root netem delay 6000ms

echo "Add listener"
riak-repl add-listener [email protected] $IP 9010

echo "Add site"
riak-repl add-site $OTHER_IP 9010 $OTHER_IP

Cancel fullsync hit undefined function

I tried cancelling a fullsync when it got stuck and got this error report:

([email protected])19> riak_repl_console:cancel_fullsync([]).
ok
([email protected])20> 21:13:53.820 [error] gen_fsm <0.4187.0> in state diff_bloom terminated with reason: call to undefined function riak_repl_keylist_server:stop/1 from riak_repl_keylist_server:diff_bloom/2
21:13:53.822 [error] CRASH REPORT Process <0.4187.0> with 3 neighbours exited with reason: call to undefined function riak_repl_keylist_server:stop(<0.4225.0>) in gen_fsm:terminate/7
21:13:53.824 [error] Supervisor ranch_conns_sup had child ranch_conns_sup started with {ranch_conns_sup,start_protocol,undefined} at <0.4186.0> exit with reason call to undefined function riak_repl_keylist_server:stop(<0.4225.0>) in context child_terminated
21:13:54.014 [info] Using fullsync strategy riak_repl_keylist_server.

I think I hit this...
https://github.com/basho/riak_repl/blob/master/src/riak_repl_keylist_server.erl#L368
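
A hedged sketch of the missing piece, assuming stop/1 should be a synchronous all-state event on the keylist-server FSM (illustrative only, not the fix that actually shipped):

%% exported from riak_repl_keylist_server:
stop(Pid) ->
    gen_fsm:sync_send_all_state_event(Pid, stop, infinity).

%% ...with a matching clause in handle_sync_event/4 so any state,
%% including diff_bloom, can shut down cleanly:
handle_sync_event(stop, _From, _StateName, State) ->
    {stop, normal, ok, State}.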

add support to repl for new small riak-object format

riak object smallification is officially part of the riak core cabal, but I'm putting an issue here under replication so that we can track our part of it.

Replication needs to work with mixed cluster versions where the riak object format may be different on the two clusters.

EQC test that runs random commands

There are so many cases now where we need to test replication in the face of other kinds of riak cluster operations. The latest that comes to mind is binary object downgrades. That's just one example.

So, how about we write a riak test that fires random, but legal, operations at the cluster while we do random kinds of replication: mixtures of modes, etc. This is an idea that Andrew had some time ago, and it's looking really useful now.
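
A hedged skeleton of what such an eqc_statem model might look like; the command bodies and state fields here are placeholders, not real cluster or replication calls:

-module(repl_random_ops_eqc).

-include_lib("eqc/include/eqc.hrl").
-include_lib("eqc/include/eqc_statem.hrl").

-compile(export_all).

-record(model, {mode = mode_repl13}).

initial_state() -> #model{}.

%% interleave ordinary cluster operations with replication-mode changes
command(_S) ->
    oneof([{call, ?MODULE, write_object, [binary(), binary()]},
           {call, ?MODULE, switch_mode, [elements([mode_repl12, mode_repl13])]},
           {call, ?MODULE, downgrade_object_format, []}]).

precondition(_S, _Call) -> true.

next_state(S, _Res, {call, _, switch_mode, [Mode]}) -> S#model{mode = Mode};
next_state(S, _Res, _Call) -> S.

postcondition(_S, _Call, _Res) -> true.

prop_random_repl_ops() ->
    ?FORALL(Cmds, eqc_statem:commands(?MODULE),
            begin
                {_History, _Final, Result} = eqc_statem:run_commands(?MODULE, Cmds),
                Result =:= ok
            end).

%% placeholder implementations; a real test would drive riak_test nodes instead
write_object(_Bucket, _Key) -> ok.
switch_mode(_Mode) -> ok.
downgrade_object_format() -> ok.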

Add mechanism to indicate if objects were dropped and a fullsync is needed

Currently, if we drop objects from realtime replication (BNW), we may in some cases be able to observe the drop. Relevant error messages:

  • No nodes available to migrate replication data (on shutdown)
  • No available nodes to proxy objects to (on shutdown)
  • rtq proxy target is down (on shutdown)

However, if a node crashes during realtime replication, we can't write to the logfile and there is no indication of dropped objects on restart.

At least one customer (Anya) wants to be able to detect the need for a full sync by monitoring repl_stats (json) in their operational environment, since they have already built mechanisms to trigger alarms based off polling each node.

We already have a dropped object count in the status. This monitors drops due to connection loss, connection errors, overloaded nodes, and anything else that happens while running.

We may need to add an entry to stats for dropped objects during controlled or uncontrolled shutdowns. A completely crashed node that never comes back obviously can't indicate drops. But perhaps the other end of the connection can mark the loss of connection.

`riak-repl start-fullsync` does nothing on non-leader node

Running riak-repl start-fullsync on a node in the cluster that is not the replication leader results in no messages and no actions. Preferable behavior would be to either communicate with the leader to start a fullsync, or return a message stating that no action is being taken and that the command should be run on the leader (and/or log the message).
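
A rough sketch of the suggested behavior. This assumes a riak_repl_leader:leader_node/0 style call is available to find the leader; the exact leader API may differ, so treat this as illustrative only:

start_fullsync() ->
    case riak_repl_leader:leader_node() of
        undefined ->
            io:format("No replication leader elected; cannot start fullsync~n");
        Node when Node =:= node() ->
            riak_repl_console:start_fullsync([]);
        Leader ->
            io:format("This node is not the replication leader; "
                      "run 'riak-repl start-fullsync' on ~p~n", [Leader])
    end.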

Add more debug info for proxy_get client and server modules

The proxy_get implementation could benefit from having more debug messages that can be traced at run time, mostly for tracing requests to ensure that we are hitting all three phases: the protocol buffer "coordinator" for the get requests, the TCP client, and the TCP server.
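
For example, debug lines like the ones below at each of the three phases could then be switched on at runtime with lager:trace_console([{module, Mod}], debug), without restarting the node (the variable names are illustrative):

%% request side (PB "coordinator" / TCP client):
lager:debug("proxy_get request for ~p/~p from cluster ~p", [Bucket, Key, ClusterId])

%% response side (TCP server):
lager:debug("proxy_get response for ~p/~p: ~p", [Bucket, Key, Result])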

Call to riak_repl_wm_stats:jsonify_stats fails during fullsync

See ZD ticket 3634 for client detail. When a fullsync is running, calls to riak_repl_wm_stats:jsonify_stats will fail with e.g.

{error,{error,function_clause,[{riak_repl_wm_stats,jsonify_stats,[[{fullsync,63,left},...

This is called during webmachine calls to '/riak-repl/stats'.
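
A hedged sketch of the kind of defensive clause the real fix needs; the function below is a simplified stand-in for riak_repl_wm_stats:jsonify_stats, and the output key name is made up:

jsonify_stats([], Acc) ->
    lists:reverse(Acc);
jsonify_stats([{fullsync, Left, left} | Rest], Acc) ->
    %% the 3-tuple that appears in the stats list while a fullsync is running
    jsonify_stats(Rest, [{fullsync_partitions_left, Left} | Acc]);
jsonify_stats([{Key, Value} | Rest], Acc) ->
    jsonify_stats(Rest, [{Key, Value} | Acc]).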

1.2.1 CRASH in Tcp error handler... called as tcp ++ "_error"

In the 1.2.1 code base, we had a bug where a TCP failure could result in bogus error-message formatting that would crash the replication service. Here is what the log file showed:

2013-03-06 02:39:20.145 UTC [error] <0.1806.0> Supervisor poolboy_sup had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.1812.0> exit with reason {{badarg,[{erlang,'++',[tcp,"_error"],[]},{riak_repl_tcp_server,send,3,[{file,"src/riak_repl_tcp_server.erl"},{line,417}]},{riak_repl_keylist_server,diff_bloom,3,[{file,"src/riak_repl_keylist_server.erl"},{line,490}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]},{gen_fsm,sync_send_event,[<16329.11597.0>,{diff_obj,{r_object,<<"dev/faketest16_22">>,<<"test-blobitory-management-collection/1c60744e-033d-4c69-2350-693...">>,...}},...]}} in context child_terminated
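
The badarg is '++' being applied to the atom tcp; the left operand of '++' must be a list. A minimal sketch of that kind of fix, with a hypothetical helper name (not necessarily the patch that was actually applied):

%% builds e.g. tcp_error / ssl_error from the transport atom
transport_error_atom(Transport) when is_atom(Transport) ->
    list_to_atom(atom_to_list(Transport) ++ "_error").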

Andrew already had a patch for that, which I applied as a cherry pick against the 1.2.1 code base, on a branch of repl: cet-apply-error-msg-fix-to-1.2. This resolved the bug. Beams were sent to the customer.

We decided not to merge this fix into the 1.2 code base because it would be confusing to us later in trying to decide which fixes were sent to which customers. Maybe the right thing to do is to create a 1.2.x release and merge it against that.

ZenDesk ticket...
https://basho.zendesk.com/agent/#/tickets/3998

The FIX:
https://github.com/basho/riak_repl/tree/cet-apply-error-msg-fix-to-1.2

The TEST: (Thanks to @metadave Dave Parfitt)
https://gist.github.com/metadave/5cb961f5ea68054e5b5d

Investigate the following REPL error message

To reproduce:

  • Compile with R15B01
  • make devrel
  • dev1/bin/riak-repl add-listener dev1@127.0.0.1 127.0.0.1 8080
  • dev4/bin/riak-repl add-site 127.0.0.1 8080 foo
  • Load basho_bench data into dev1.

2012-05-31 10:58:54.994 [info] <0.1412.0>@riak_repl_keylist_server:build_keylist:153 Full-sync with site "foo"; built keylist for 182687704666362864775460604089535377456991567872 (built in 0.07 secs)
2012-05-31 10:58:55.001 [error] <0.1412.0> gen_fsm <0.1412.0> in state wait_keylist terminated with reason: no function clause matching riak_repl_keylist_server:wait_keylist({#Ref<0.0.0.6404>,keylist_built}, {state,"foo",{sslsocket,new_ssl,<0.1411.0>},ranch_ssl,"./data/riak_repl/work/7224057/foo-127.0.0...",...}) line 172
2012-05-31 10:58:55.003 [error] <0.1412.0> CRASH REPORT Process <0.1412.0> with 1 neighbours exited with reason: no function clause matching riak_repl_keylist_server:wait_keylist({#Ref<0.0.0.6404>,keylist_built}, {state,"foo",{sslsocket,new_ssl,<0.1411.0>},ranch_ssl,"./data/riak_repl/work/7224057/foo-127.0.0...",...}) line 172 in gen_fsm:terminate/7 line 611
2012-05-31 10:58:55.004 [error] <0.980.0> Supervisor ranch_conns_sup had child ranch_conns_sup started with {ranch_conns_sup,start_protocol,undefined} at <0.1410.0> exit with reason no function clause matching riak_repl_keylist_server:wait_keylist({#Ref<0.0.0.6404>,keylist_built}, {state,"foo",{sslsocket,new_ssl,<0.1411.0>},ranch_ssl,"./data/riak_repl/work/7224057/foo-127.0.0...",...}) line 172 in context child_terminated
2012-05-31 10:58:55.040 [info] <0.1533.0>@riak_repl_tcp_server:handle_msg:249 Using fullsync strategy riak_repl_keylist_server.
2012-05-31 10:58:55.041 [info] <0.1533.0>@riak_repl_tcp_server:handle_msg:279 Full-sync on connect
2012-05-31 10:58:55.059 [info] <0.1535.0>@riak_repl_keylist_server:wait_for_partition:120 Full-sync with site "foo"; doing fullsync for 182687704666362864775460604089535377456991567872
2012-05-31 10:58:55.059 [info] <0.1535.0>@riak_repl_keylist_server:build_keylist:139 Full-sync with site "foo"; building keylist for 182687704666362864775460604089535377456991567872
2012-05-31 10:58:55.104 [info] <0.1543.0>@riak_repl_fullsync_helper:handle_cast:284 Sorting keylist "./data/riak_repl/work/7224057/foo-127.0.0.1:8080-127.0.0.1:57843/182687704666362864775460604089535377456991567872.ours.sterm"
2012-05-31 10:58:55.106 [info] <0.1543.0>@riak_repl_fullsync_helper:handle_cast:287 Sorted ./data/riak_repl/work/7224057/foo-127.0.0.1:8080-127.0.0.1:57843/182687704666362864775460604089535377456991567872.ours.sterm in 0.00 seconds

Stale kl_exchange message during fullsync


/home/andrew/riak_test/rt/dev/dev6/log/console.log:2012-08-10 14:23:38.152 [warning] <0.2058.0>@riak_repl_keylist_client:request_partition:153 Full-sync with site "site1"; skipping partition 548063113999088594326381812268606132370974703616 because of error node_not_available
/home/andrew/riak_test/rt/dev/dev6/log/console.log:2012-08-10 14:23:38.186 [error] <0.2058.0> gen_fsm <0.2058.0> in state request_partition terminated with reason: no function clause matching riak_repl_keylist_client:request_partition({kl_exchange,548063113999088594326381812268606132370974703616}, {state,"site1",#Port<0.11954>,ranch_tcp,"./data/riak_repl/work/112824079/site1-127.0.0.1:52588-1...",...})
/home/andrew/riak_test/rt/dev/dev6/log/console.log:2012-08-10 14:23:38.199 [error] <0.2058.0> CRASH REPORT Process <0.2058.0> with 1 neighbours exited with reason: no function clause matching riak_repl_keylist_client:request_partition({kl_exchange,548063113999088594326381812268606132370974703616}, {state,"site1",#Port<0.11954>,ranch_tcp,"./data/riak_repl/work/112824079/site1-127.0.0.1:52588-1...",...}) in gen_fsm:terminate/7
/home/andrew/riak_test/rt/dev/dev6/log/console.log:2012-08-10 14:23:38.207 [error] <0.1582.0> Supervisor riak_repl_client_sup had child "site1" started with riak_repl_tcp_client:start_link("site1") at <0.2048.0> exit with reason no function clause matching riak_repl_keylist_client:request_partition({kl_exchange,548063113999088594326381812268606132370974703616}, {state,"site1",#Port<0.11954>,ranch_tcp,"./data/riak_repl/work/112824079/site1-127.0.0.1:52588-1...",...}) in context child_terminated

The request_partition state should ignore stale kl_exchange messages.
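
Something along these lines in riak_repl_keylist_client would do it; a sketch only, assuming the state record tracks the partition currently being exchanged (the field name here is hypothetical):

%% ignore a keylist exchange for anything but the partition in progress
request_partition({kl_exchange, Partition},
                  #state{partition = Current} = State) when Partition =/= Current ->
    {next_state, request_partition, State}.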

Clients do not handle gen_tcp:send/2 errors

The riak_repl_tcp_client does not handle the return from gen_tcp:send/2 which may return {error, Reason}. This can cause replication clients to hang around indefinitely without replication actually occurring.

A module has been provided below to help observe current client behavior. To observe the issue you will need to set up 2 Riak nodes and connect them via riak-repl. Once that is set up, you can use the below module as follows:

riak_repl_client_test:test_client_ports().                                 
[{<0.8179.0>,ok},{<0.8182.0>,ok}]

riak_repl_client_test:close_client_ports().                                  
[{<0.8179.0>,true},{<0.8182.0>,true}]

riak_repl_client_test:test_client_ports(). 
[{<0.8179.0>,{error,closed}},{<0.8182.0>,{error,closed}}]

riak_repl_client_test:test_client_ports_through_client().
[{<0.8179.0>,
  {status,[{node,'[email protected]'},
           {site,"node2-to-node1"},
           {strategy,riak_repl_keylist_client},
           {fullsync_worker,<0.8184.0>},
           {put_pool_size,5},
           {connected,"127.0.0.1",9012},
           {state,wait_for_fullsync}]}},
 {<0.8182.0>,
  {status,[{node,'[email protected]'},
           {site,"node3-to-node1"},
           {strategy,riak_repl_keylist_client},
           {fullsync_worker,<0.8186.0>},
           {put_pool_size,5},
           {connected,"127.0.0.1",9013},
           {state,wait_for_fullsync}]}}]

The last command, riak_repl_client_test:test_client_ports_through_client/0, forces the client to send a keepalive_ack over the closed port. The expectation is that this would cause the client to fail and reconnect. The client does not fail; it ignores the {error, closed} returned from gen_tcp:send/2. (A sketch of a possible fix follows the module below.)

-module(riak_repl_client_test).

%% Helpers to observe riak_repl_tcp_client behavior: find the client
%% processes and their TCP ports, close the ports out from under them,
%% then poke the clients to see whether they notice.

-compile(export_all).

test_client_ports_through_client() ->
    Pids = get_client_pids(),
    [{Pid, send_keepalive_through_client(Pid)} || Pid <- Pids].

test_client_ports() ->
    Ports = get_client_ports(),
    [{Pid, send_keepalive(Port)} || {Pid, Port} <- Ports].

close_client_ports() ->
    Ports = get_client_ports(),
    [{Pid, close_port(Port)} || {Pid, Port} <- Ports].

get_client_ports() ->
    Pids = get_client_pids(),
    [{Pid, get_port_from_pid(Pid)} || Pid <- Pids].

get_client_pids() ->
   [ Pid2 || {riak_repl_client_sup,P,_,_} <- supervisor:which_children(riak_repl_sup),
           {_,Pid2,_,_} <- supervisor:which_children(P)].

get_port_from_pid(Pid) ->
    get_port_from_status(sys:get_status(Pid)).

get_port_from_status(Status) ->
    {_, _, _, [_,running,_,_,[_,_,{data,[{_,State}]}]]} = Status,
    element(5, State).

close_port(Port) ->
    erlang:port_close(Port).

send_keepalive(Port) ->
    gen_tcp:send(Port, term_to_binary(keepalive)).

send_keepalive_through_client(Pid) ->
    Port = get_port_from_pid(Pid),
    Pid ! {tcp, Port, term_to_binary(keepalive)},
    riak_repl_tcp_client:status(Pid).
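
A hedged sketch of the kind of check the client could wrap around its sends so that an {error, Reason} from gen_tcp:send/2 triggers a disconnect/reconnect instead of being silently ignored (the message sent to self() is illustrative, not the client's real protocol):

send_or_bail(Socket, Data) ->
    case gen_tcp:send(Socket, Data) of
        ok ->
            ok;
        {error, Reason} = Error ->
            %% surface the failure so the client can tear down and reconnect
            self() ! {send_failed, Socket, Reason},
            Error
    end.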

Typo in lager message

Saw this manifested in some repl logs today:

In src/riak_repl_keylist_client.erl, line 225: "exhanging differences for"

Server Stats and Start/Stop/Pause FullSync Do Not Work

Previously, riak_repl_server_sup handled riak_repl_tcp_server. When the fullsync and stats commands were run from riak_repl_console, the following was run:

server_pids() ->
    [P || {_,P,_,_} <- supervisor:which_children(riak_repl_server_sup), P /= undefined].

This listed the riak_repl_tcp_server children, and then the commands were sent like so:

start_fullsync([]) ->
    [riak_repl_tcp_server:start_fullsync(Pid) || Pid <- server_pids()],
    ok.

However, this has changed. Instead of riak_repl_server_sup:start_server/1, the listener is now started via riak_repl_listener_sup:ensure_listeners/1, which calls riak_repl_listener_sup:start_listener/1:

start_listener(Listener = #repl_listener{listen_addr={IP, Port}}) ->
    case riak_repl_util:valid_host_ip(IP) of
        true ->
            lager:info("Starting replication listener on ~s:~p",
                [IP, Port]),
            {ok, RawAddress} = inet_parse:address(IP),
            ranch:start_listener(Listener, 10, ranch_tcp,
                [{ip, RawAddress}, {port, Port}], riak_repl_tcp_server, []);
        _ ->
            lager:error("Cannot start replication listener "
                "on ~s:~p - invalid address.",
                [IP, Port])
    end.

You'll note that it now calls ranch:start_listener/6. ranch is a new dependency that manages socket acceptor pools. The source is located at https://github.com/extend/ranch.

With this, calling supervisor:which_children(riak_repl_server_sup). results in nothing:

[]

This is because children are no longer spawned under riak_repl_server_sup.

However, unlike riak_repl_server_sup, which had riak_repl_tcp_server as a direct child, ranch structures its supervision tree differently.

Note the following:

supervisor:which_children(ranch_sup).               
[{{ranch_listener_sup,#repl_listener{nodename = '[email protected]',
                                     listen_addr = {"127.0.0.1",9010}}},
  <0.31129.0>,supervisor,
  [ranch_listener_sup]}]

This has a ranch_listener_sup as a child. Digging further in:

supervisor:which_children(c:pid(0,31129,0)).
[{ranch_acceptors_sup,<0.31132.0>,supervisor,
                      [ranch_acceptors_sup]},
 {ranch_conns_sup,<0.31131.0>,supervisor,[ranch_conns_sup]},
 {ranch_listener,<0.31130.0>,worker,[ranch_listener]}]

It turns out that riak_repl_tcp_server is spawned from ranch_conns_sup:

supervisor:which_children(c:pid(0,31131,0)).
[{undefined,<0.31145.0>,worker,[ranch_conns_sup]}]

We can then query riak_repl_tcp_server with this pid:

riak_repl_tcp_server:status(c:pid(0,31145,0)).
{status,[{node,'[email protected]'},
         {site,"dev1-dev2-dev3"},
         {strategy,riak_repl_keylist_server},
         {fullsync_worker,<0.31146.0>},
         {dropped_count,0},
         {queue_length,0},
         {queue_byte_size,0},
         {state,wait_for_partition}]}

So the interface to riak_repl_tcp_server still works; however, the migration to ranch has made locating these processes troublesome.
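
Given the supervision tree above, a server_pids/0 replacement could walk ranch's tree instead of riak_repl_server_sup. A rough sketch, based only on the child specs shown above, so treat it as illustrative:

server_pids() ->
    [Pid || {{ranch_listener_sup, _Listener}, ListenerSup, supervisor, _}
                <- supervisor:which_children(ranch_sup),
            {ranch_conns_sup, ConnsSup, supervisor, _}
                <- supervisor:which_children(ListenerSup),
            {_, Pid, worker, _}
                <- supervisor:which_children(ConnsSup),
            is_pid(Pid)].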

riak_test for binary object downgrade while repl running

It's possible to downgrade a cluster from "new" binary riak object format back to the old t2b form. There is already a test for that in riak_kv. However, a downgrade should also work while replication is running, so let's write a test for that.
