Giter VIP home page Giter VIP logo

shards's Introduction

Shards

ETS tables on steroids!

Sharding for ETS tables out-of-box.

CI Codecov Hex Version Docs License

Why might we need Sharding/Partitioning for the ETS tables? The main reason is to keep the lock contention under control enabling ETS tables to scale out and support higher levels of concurrency without lock issues; specially write-locks, which most of the cases might cause significant performance degradation.

Therefore, one of the most common and proven strategies to deal with these problems is Sharding or Partitioning; the principle is pretty similar to DHTs.

This is where shards comes in. Shards is an Erlang/Elixir library fully compatible with the ETS API, but it implements sharding or partitioning on top of the ETS tables, completely transparent and out-of-box.

See the getting started guide and the online documentation.

Installation

Erlang

In your rebar.config:

{deps, [
  {shards, "1.1.0"}
]}.

Elixir

In your mix.exs:

def deps do
  [{:shards, "~> 1.1"}]
end

For more information and examples, see the getting started guide.

Important links

  • Blog Post - Transparent and out-of-box sharding support for ETS tables in Erlang/Elixir.

  • Projects using shards:

    • shards_dist - Distributed version of shards. It was moved to a separate repo since v1.0.0.
    • Nebulex – Distributed Caching framework for Elixir.
    • ExShards – Elixir wrapper for shards; with extra and nicer functions.
    • KVX – Simple Elixir in-memory Key/Value Store using shards (default adapter).
    • Cacherl Distributed Cache using shards.

Testing

$ make test

You can find tests results in _build/test/logs, and coverage in _build/test/cover.

NOTE: shards comes with a helper Makefile, but it is just a simple wrapper on top of rebar3, therefore, you can do everything using rebar3 directly as well (e.g.: rebar3 do ct, cover).

Generating Edoc

$ make docs

NOTE: Once you run the previous command, you will find the generated HTML documentation within doc folder; open doc/index.html.

Contributing

Contributions to shards are very welcome and appreciated!

Use the issue tracker for bug reports or feature requests. Open a pull request when you are ready to contribute.

When submitting a pull request you should not update the CHANGELOG.md, and also make sure you test your changes thoroughly, include unit tests alongside new or changed code.

Before to submit a PR it is highly recommended to run make check before and ensure all checks run successfully.

Copyright and License

Copyright (c) 2016 Carlos Andres Bolaños R.A.

Shards source code is licensed under the MIT License.

shards's People

Contributors

cabol avatar casidiablo avatar moogle19 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

shards's Issues

Allow to create the `state` passing the rest of its attributes

Basically, provide in shards_state functions like:

  • shards_state:new/1: new(NumShards)
  • shards_state:new/2: new(NumShards, Module)
  • shards_state:new/3: new(NumShards, Module, PickShardFun)
  • shards_state:new/4: new(NumShards, Module, PickShardFun, PickNodeFun)

LRU Feature may be

Can we have a feature of autodeleting record if its retrieved for X amount of time making free space for other objects to use the space ? something like LRU logic

Fix `shards_dist` and `shards` specs to consider the case when `rpc` returns `{badrpc, Reason}`

Currently, the module shards_dist uses rpc to run distributed calls, and in the specs the case whet rpc returns {badrpc, Reason} is not being considered.

I see two ways to handle this:

  1. Just fix the specs considering this case (specs have to be fixed not only in shards_dist but also in shards, since it is a wrapper on top of it). In this case we can create a type like:
-type rpc_res(R) :: R | {badrpc, Reason :: term()}

And re use it from the the specs, like so:

-spec update_counter(
        Tab      :: atom(),
        Key      :: term(),
        UpdateOp :: term(),
        State    :: shards_state:state()
      ) -> rpc_res(integer() | [integer()]).
  1. Handle the {badrpc, Reason} internally and raise an exception with the corresponding error.
%% @private
rpc_call(Node, Module, Function, Args) ->
  handle_rpc_res(rpc:call(Node, Module, Function, Args)).

%% @private
handle_rpc_res({badrpc, {'EXIT', {Reason, _}}}) ->
  error(Reason);
handle_rpc_res(Res) ->
  Res.

And reuse the private function rpc_call/4 internally.

Evaluate both alternatives and come up with the best approach.

Fix `shards` to work well with `ordered_set` tables.

Currently, shards doesn't behaves well when tables are ordered_set type, because results are sorted per shards, but not at global level (across all configured shards). The same problem happens with shards_dist across different node.

NOTE: This might be tricky to solve.

Fulfil ETS API for `shards_local`

ETS Functions:

  • shards:all/0
  • shards:delete/1
  • shards:delete/2
  • shards: delete_all_objects/1
  • shards: delete_object/2
  • shards: file2tab/1
  • shards: file2tab/2
  • shards:first/1
  • shards: foldl/3
  • shards:foldr/3
  • shards:from_dets/2 -- NOT SUPPORTED
  • shards:fun2ms/1 -- Use ets:fun2ms/1 instead.
  • shards:give_away/3
  • shards:i/0
  • shards:i/1 -- NOT SUPPORTED
  • shards:info/1
  • shards:info/2
  • shards:init_table/2 -- NOT SUPPORTED
  • shards:insert/2
  • shards: insert_new/2
  • shards: is_compiled_ms/1
  • shards:last/1
  • shards: lookup/2
  • shards: lookup_element/3
  • shards:match/2
  • shards:match/3
  • shards:match/1
  • shards:match_delete/2
  • shards:match_object/2
  • shards:match_object/3
  • shards:match_object/1
  • shards:match_spec_compile/1
  • shards:match_spec_run/2
  • shards:member/2
  • shards:new/2
  • shards:next/2
  • shards:prev/2
  • shards:rename/2
  • shards:repair_continuation/2
  • shards:safe_fixtable/2
  • shards:select/2
  • shards:select/3
  • shards:select/1
  • shards:select_count/2
  • shards:select_delete/2
  • shards:select_reverse/2
  • shards:select_reverse/3
  • shards:select_reverse/1
  • shards: setopts/2
  • shards:slot/2 -- NOT SUPPORTED
  • shards:tab2file/2
  • shards:tab2file/3
  • shards:tab2list/1
  • shards:tabfile_info/1
  • shards:table/1
  • shards:table/2
  • shards:test_ms/2
  • shards:take/2
  • shards:to_dets/2 -- NOT SUPPORTED
  • shards:update_counter/3
  • shards:update_counter/4
  • shards:update_element/3

shards:info/1 bad argument crash

Can lose a race between looking up the shards and calling ets:info/1, getting undefined instead of a list, and:

[error] GenServer Braun.CacheMonitor terminating
** (ArgumentError) argument error
    (stdlib 3.13.2) :lists.keyfind(:size, 1, :undefined)
    (shards 0.6.2) deps/shards/src/shards_local.erl:1531: anonymous fn/3 in :shards_local.shards_info/3
    (shards 0.6.2) deps/shards/src/shards_lib.erl:161: anonymous fn/4 in :shards_lib.keyupdate/4

Unify `pick_shard_fun` and `pick_node_fun` in a single spec

Both pick_shard_fun and pick_node_fun receive the same parameters and produce the same result – given a Key, Range and Op return a value between 0..Range-1.

The spec for both funs might be:

-type pick_fun() :: fun((key(), range(), op()) -> non_neg_integer() | any).

In case of pick_node_fun, the rest of the logic of select from the node list the Nth element (returned by pick_node_fun) might be handled in shards_dist module.

Failed to start shard_sup

I'm trying to run my cache on alpine linux and I'm getting errors. I'm using the nebulex library btw.

Process #PID<0.2814.0> terminating: {{:shutdown, {:failed_to_start_child, Studio.Cache, {:shutdown, {:failed_to_start_child, :shards_sup, {:EXIT, {:undef, [{:shards_sup, :start_link, [Studio.Cache.LocalSupervisor], []}, {:supervisor, :do_start_child, 2, [file: 'supervisor.erl', line: 365]}, {:supervisor, :start_children, 3, [file: 'supervisor.erl', line: 348]}, {:supervisor, :init_children, 2, [file: 'supervisor.erl', line: 314]}, {:gen_server, :init_it, 2, [file: 'gen_server.erl', line: 365]}, {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 333]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 247]}]}}}}}}, {Studio.Application, :start, [:normal, []]}}

Rollback when `insert_new` fails

Despite this operation is not atomic, at least roll-back or delete previously inserted entries if insert_new fails on one shard.

Issues on Erlang v21

In an elixir project:

===> Compiling shards
===> Compiling src/shards_task.erl failed
src/shards_task.erl:281: erlang:get_stacktrace/0: deprecated; use the new try/catch syntax for retrieving the stack backtrace
src/shards_task.erl:284: erlang:get_stacktrace/0: deprecated; use the new try/catch syntax for retrieving the stack backtrace
src/shards_task.erl:287: erlang:get_stacktrace/0: deprecated; use the new try/catch syntax for retrieving the stack backtrace

** (Mix) Could not compile dependency :shards, ".mix/rebar3 bare compile --paths "_build/dev/lib/*/ebin"" command failed. You can recompile this dependency with "mix deps.compile shards", update it with "mix deps.update shards" or clean it with "mix deps.clean shards"

OTP < 18 not supported

shards_owner_sup is making use of the newly introduced map support for supervisor child specs and flags (https://www.erlang.org/downloads/18.0), this prevents older versions from being able to use shards. Maybe co-existence could be possible by detecting the target OTP version and generating either the old tuple format or the new one.

Implement sharding at global level.

Currently, shards only supports sharding locally, where each created table (logical table) is represented internally and transparently by multiple physical ETS tables. The idea is extent the API at cluster level, having multiple erlang nodes with their own local shards inside. Here is needed a consistent hashing algorithm, so the idea is to use Google Jumping Consistent Hashing. See also toy_kv.

Compilation error with OTP 26

When compiling using OTP 26 and rebar 3.22.0 we get the following error:

❯ rebar3 compile +debug_info
===> Fetching rebar3_proper v0.12.1
===> Analyzing applications...
===> Compiling rebar3_proper
===> Fetching rebar3_ex_doc v0.2.23
===> Analyzing applications...
===> Compiling rebar3_ex_doc
===> Verifying dependencies...
===> Analyzing applications...
===> Compiling shards
===> Compiling src/shards.erl failed
src/shards.erl:1505:23: type variable 'Pos' is only used once (is unbound)
src/shards.erl:1505:28: type variable 'Value' is only used once (is unbound)

These are the detailed rebar and OTP versions:

❯ rebar3 --version
rebar 3.22.0 on Erlang/OTP 26 Erts 14.1.1
❯ erl -eval '{ok, Version} = file:read_file(filename:join([code:root_dir(), "releases", erlang:system_info(otp_release), "OTP_VERSION"])), io:fwrite(Version), halt().' -noshell
26.1.2

We believe this to be related to this change introduced in OTP 26: https://www.erlang.org/patches/otp-26.0#OTP-18389

Improve `shards_local:info/1,2` to be compliant with ETS functions

The idea is to return information about the shards as well, like number of shards, a list of shards tables. etc. So instead of return only a list with the info for each shard, return a mixed tuple list with the regular ETS info and additionally with the shards info. For items like memory and size it would be necessary to sum all shards values.

file2tab,tab2file not working

shards:new(live,[{scope, g},{nodes, [node()]}]).
shards:insert(live,{k2,val2}).
true
shards:insert(live,{k4,val2}).
true
shards:insert(live,{k23,val2}).
true

shards:lookup(live,k23).
[{k23,val2}]

shards:tab2list(live).
** exception error: undefined function shards_dist:tab2list/2
(ejabberd@localhost)10> shards:tab2file(live,"/tmp/test").
** exception error: undefined function shards_dist:tab2file/3

above functions dont work.

from ets it does work.
ets:tab2list(live_2).
[{k2,val2}]

Allow to call `shards_local` without the state – using a default state.

Most of the cases, the tables are created with default parameters, normally shards is used locally, using default pick funs. For those cases, it is not necessary to recover the state first, it can be created with default parameters, and avoid an unnecessary call to the ETS control table – such as shards does currently. To achieve this, two things are required:

  1. Improve shards_state module, add setters functions in order to be able to create a sate and modify its properties.
  2. In shards_local module, for all those function that receives the state as parameter, add an equivalent function without the state, creating a default one. E.g.: for shards_local:insert/3:
insert(Tab, ObjOrObjL, State)

Create an equivalent function without the State param, like this:

insert(Tab, ObjOrObjL) ->
  insert(Tab, ObjOrObjL, shards_state:new()).

Improve `file2tab` and `tab2file` in `shards_local` to be compliant with ETS functions

Current file2tab and tab2file works with a list of files instead of a single filename (as ETS functions).

For tab2file function the idea is generate one file per shard using ets:tab2file/3, and also generate a master file with the given Filename that holds the information about the other shards files in order to be able to recover them later using ets:file2tab/1,2.

For file2tab the idea would be read the master file given by Filename, get the rest of the shards files info and then restore the table (using ets:file2tab/1,2 for each shard).

rebar2 compatibility

Projects that still rely on rebar2 are unable to build shards (example on Mac):
cc /Users/luis.rascao/Projects/shards/c_src/jumping_hash.o -arch x86_64 -shared -L /Users/luis.rascao/Projects/erlang/17.4-hipe/lib/erl_interface-3.7.20/lib -lerl_interface -lei -o /Users/luis.rascao/Projects/shards/c_src/../priv/jumping_hash.so Undefined symbols for architecture x86_64: "_enif_get_uint", referenced from: _compute in jumping_hash.o "_enif_get_ulong", referenced from: _compute in jumping_hash.o "_enif_make_badarg", referenced from: _compute in jumping_hash.o "_enif_make_int", referenced from: _compute in jumping_hash.o

rebar2 compat would help adoption

Operation of the shards:info/2 does not match

Thank you for a nice library.

This is to report that the shards:info/2 is different from the ets:info/2 behavior.

> shards:info(mytab, name).
** exception error: bad argument
     in function  ets:lookup_element/3
        called as ets:lookup_element(mytab,'$shards_state',2)
...
> ets:info(mytab, name).
undefined

Improve documentation

  • Create a guides folder and put all documentation there
  • Create a first getting-started.md file and more part of the README there
  • Create a shards.md in order to document everything related to design, overall configuration, etc.

update_counter spec is not the same as ETS.

The current shards spec expects an integer result when calling update_counter/3, however ETS allows to pass a list of operations:

%%  From Erlang docs
update_counter(Tab, Key, X3 :: [UpdateOp]) -> [Result]

This unfortunately is messing with dialyzer since I'm calling something like this in distributed mode:

case Shards.update_counter(@table, key, [{2, increment}, {4, 1, 0, now}]) do
   [count, _] when count < limit ->
      {:ok, count}

   # key not found, insert it
   {:badrpc, {:EXIT, {:badarg, _trace}}} -> 
      true = Shards.insert(@table, {key, increment, now, now})
      {:ok, increment}

   _other ->
      {:error, :invalid}
end

From dialyzer I get:

The pattern
[_count, _]

can never match the type
integer()
________________________________________________________________________________
The pattern
{:badrpc, {:EXIT, {:badarg, __trace}}}

can never match the type
integer()

Is this just a matter of updating the function spec?

Migrate from pg2 to pg (OTP 23)

Hi,

As of OTP 23, The pg2 module has been deprecated and replaced with similar but faster pg module.

I think for most parts changing pg2 with pg will do the trick.

But there are couple places in this library which used create/1 and delete/1 functions.
These functions are not present in pg.

All create/1 calls can be deleted.

Groups are automatically created when any process joins, and are removed when all processes leave the group. 
Non-existing group is considered empty (containing no processes).

source: https://erlang.org/doc/man/pg.html

But delete/1 is an issue! Since it may be used to clear a group but pg does not have a simple way of doing that.
A substitute would be:

pg:leave(Group, pg:get_members(Group))

I don't know if doing this is necessary!

And I wanted to thank you for your amazing libraries. ❤️

Shards 0.6.2 doesn't compile on OTP 23.0.3 on OS X 10.15

Compiling src/shards_dist.erl failed
src/shards_dist.erl:125: pg2:join/2 is deprecated and will be removed in OTP 24; use use 'pg' instead
src/shards_dist.erl:130: pg2:get_members/1 is deprecated and will be removed in OTP 24; use use 'pg' instead
src/shards_dist.erl:135: pg2:leave/2 is deprecated and will be removed in OTP 24; use use 'pg' instead
src/shards_dist.erl:145: pg2:get_members/1 is deprecated and will be removed in OTP 24; use use 'pg' instead
src/shards_dist.erl:440: pg2:delete/1 is deprecated and will be removed in OTP 24; use use 'pg' instead
src/shards_dist.erl:441: pg2:create/1 is deprecated and will be removed in OTP 24; use use 'pg' instead

** (Mix) Could not compile dependency :shards, "/Users/.../.mix/rebar3 bare compile --paths="/Users/.../Projects/.../_build/dev/lib/*/ebin"" command failed. You can recompile this dependency with "mix deps.compile shards", update it with "mix deps.update shards" or clean it with "mix deps.clean shards"

Happened directly after upgrading OTP. Reinstalled erlang and elixir once afterwards. Unsure why it is failing due to depreciations.

Probably same issue as: #40 (comment)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.