
Cache

The library implements a segmented in-memory cache.


Inspiration

Cache uses N disposable ETS tables instead of a single one. Eviction and quota policies are applied at the segment level: when the quota or TTL criteria are exceeded, the oldest ETS table is destroyed and a new one is created. This approach outperforms traditional timestamp-indexing techniques.

The write operation always uses the youngest segment. The read operation looks the key up from the youngest to the oldest table until it is found; at the same time, the key is moved to the youngest segment to prolong its TTL. If no ETS table contains the key, a cache miss occurs.

The downside is the inability to assign a precise TTL to a single cache entry: the TTL is always approximated to the nearest segment (e.g. a cache with a 60-second TTL and 10 segments has 6-second accuracy on TTL).
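The read path described above can be sketched in plain ETS terms. This is an illustrative sketch, not the library's actual implementation; it only shows the "walk segments youngest-to-oldest and promote on hit" idea:

```erlang
%% Sketch (not the library's code): Segments is a list of ETS tables
%% ordered from youngest to oldest. A hit in an older segment moves the
%% entry to the youngest segment, prolonging its TTL.
lookup([Youngest | _] = Segments, Key) ->
   lookup(Segments, Key, Youngest).

lookup([], _Key, _Youngest) ->
   undefined;                               %% cache miss
lookup([Seg | Older], Key, Youngest) ->
   case ets:lookup(Seg, Key) of
      [{Key, Val}] when Seg =/= Youngest ->
         ets:delete(Seg, Key),              %% promote entry to youngest segment
         ets:insert(Youngest, {Key, Val}),
         Val;
      [{Key, Val}] ->
         Val;                               %% already in the youngest segment
      [] ->
         lookup(Older, Key, Youngest)       %% try the next older segment
   end.
```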

Key features

  • Key/value interface to read/write cached entities
  • Naive transform interface (accumulators, lists, binaries) to modify entities in place
  • Check-and-store variants of put behavior
  • Asynchronous I/O to cache buckets
  • Sharding of cache buckets

Getting started

The latest version of the library is available on its master branch. All development, including new features and bug fixes, takes place on the master branch using forking and pull requests, as described in the contribution guidelines.

The stable library release is available via hex packages; add the library as a dependency to rebar.config:

{deps, [
   cache
]}.

Usage

The library exposes its primary public interface through the exports of the module cache.erl. Experimental features are available through interface extensions. Please note that future releases of the library may promote experimental features to the primary interface.

Build the library and run the development console to evaluate the key features:

make && make run

spawn and configure

Use cache:start_link(...) to spawn a new cache instance. It supports configuration using a property list:

  • type - the type of ETS table to use as a segment; the default is set. See the ets:new/2 documentation for supported values.
  • n - the number of cache segments; the default is 10.
  • ttl - time to live of cached items in seconds; the default is 600 seconds. It is recommended to use a multiple of n: the oldest cache segment is evicted every ttl / n seconds.
  • size - the number of items to store in the cache. It is recommended to use a multiple of n; each cache segment holds about size / n items. The size policy is applied only to the youngest segment.
  • memory - the rough number of bytes available for cache items. Each cache segment is allowed to take about memory / n bytes. Note: policy enforcement accounts for the Erlang word size.
  • policy - the cache eviction policy; the default is lru. Supported values are lru (Least Recently Used) and mru (Most Recently Used).
  • check - time in seconds at which to enforce the cache policy. This timeout is disabled by default, in which case the policy is enforced every ttl / n seconds; an explicit timeout helps to optimize size/memory policy enforcement in high-throughput systems.
  • stats - the cache statistics handler, either a fun/2 or an {M, F} tuple.
  • heir - ownership of an evicted ETS segment is given away to this process. See ets:give_away/3 for details.
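Putting the options together, a cache with explicit quota settings might be spawned as follows. The option values here are illustrative examples, not recommendations:

```erlang
%% Illustrative configuration; values are examples only.
{ok, _Pid} = cache:start_link(my_cache, [
   {type,   set},               %% ETS table type used for each segment
   {n,      10},                %% number of disposable segments
   {ttl,    600},               %% oldest segment evicted every 600 / 10 = 60 sec
   {size,   10000},             %% about 10000 / 10 = 1000 items per segment
   {memory, 10 * 1024 * 1024},  %% about 1 MB per segment
   {policy, lru}                %% evict least recently used entries
]).
```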

key/value interface

The library implements a traditional key/value interface through the put, get and remove functions. The function get prolongs the TTL of the item; use lookup to keep the TTL untouched.

application:start(cache).
{ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 60}]).

ok  = cache:put(my_cache, <<"my key">>, <<"my value">>).
Val = cache:get(my_cache, <<"my key">>).

asynchronous i/o

The library provides synchronous and asynchronous implementations of the same functions. The asynchronous variant of a function is annotated with the _ suffix. E.g. get(...) is a synchronous cache lookup operation (the process is blocked until the cache returns); get_(...) is an asynchronous variant that delivers the result of execution to the caller's mailbox.

application:start(cache).
{ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 60}]).

Ref = cache:get_(my_cache, <<"my key">>).
receive {Ref, Val} -> Val end.

transform element

The library allows you to read and modify (in place) a cached element. You can apply any function to a cached element; the result of the function is returned. The apply acts as a transformer with three possible outcomes:

  • undefined (e.g. fun(_) -> undefined end) - no action is taken, old cache value remains;
  • unchanged value (e.g. fun(X) -> X end) - no action is taken, old cache value remains;
  • new value (e.g. fun(X) -> <<"x", X/binary>> end) - the value in cache is replaced with the result of the function.

application:start(cache).
{ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 60}]).

cache:put(my_cache, <<"my key">>, <<"x">>).
cache:apply(my_cache, <<"my key">>, fun(X) -> <<"x", X/binary>> end).
cache:get(my_cache, <<"my key">>).

The library implements helper functions to transform elements with append or prepend.

application:start(cache).
{ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 60}]).

cache:put(my_cache, <<"my key">>, <<"b">>).
cache:append(my_cache, <<"my key">>, <<"c">>).
cache:prepend(my_cache, <<"my key">>, <<"a">>).
cache:get(my_cache, <<"my key">>).

accumulator

The function acc treats the cached element as an accumulator and adds the given value to it.

application:start(cache).
{ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 60}]).

cache:acc(my_cache, <<"my key">>, 1).
cache:acc(my_cache, <<"my key">>, 1).
cache:acc(my_cache, <<"my key">>, 1).

check-and-store

The library implements check-and-store semantics for put operations:

  • add - store the key/value pair only if the cache does not already hold data for this key
  • replace - store the key/value pair only if the cache does hold data for this key
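Assuming the add and replace functions named above follow the put calling convention, usage looks roughly like this (a sketch; the exact return terms for the rejected cases are not shown here):

```erlang
{ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 60}]).

cache:add(my_cache, <<"my key">>, <<"a">>).     %% key absent: value is stored
cache:add(my_cache, <<"my key">>, <<"b">>).     %% key present: store is rejected
cache:replace(my_cache, <<"my key">>, <<"c">>). %% key present: value is replaced
```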

configuration via Erlang sys.config

Cache instances are configurable via sys.config. These cache instances are supervised by the application supervisor.

{cache, [
   {my_cache, [{n, 10}, {ttl, 60}]}
]}

distributed environment

The cache application uses the standard Erlang distribution model. Please note that Erlang distribution uses a single TCP/IP connection for message passing between nodes. Therefore, frequent reads/writes of large entries might impact overall Erlang performance.

The global cache instance is visible to all Erlang nodes in the cluster.

%% at [email protected]
{ok, _} = cache:start_link({global, my_cache}, [{n, 10}, {ttl, 60}]).
Val = cache:get({global, my_cache}, <<"my key">>).

%% at [email protected]
ok  = cache:put({global, my_cache}, <<"my key">>, <<"my value">>).
Val = cache:get({global, my_cache}, <<"my key">>).

A local cache instance is accessible from any Erlang node in the cluster.

%% [email protected]
{ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 60}]).
Val = cache:get(my_cache, <<"my key">>).

%% [email protected]
ok  = cache:put({my_cache, '[email protected]'}, <<"my key">>, <<"my value">>).
Val = cache:get({my_cache, '[email protected]'}, <<"my key">>).

sharding

The module cache_shards provides simple sharding on top of cache. It uses a simple hash(Key) rem NumShards approach and keeps NumShards in the application environment. This feature is still experimental; its interface is subject to change in future releases.

{ok, _} = cache_shards:start_link(my_cache, 8, [{n, 10}, {ttl, 60}]).
ok = cache_shards:put(my_cache, key1, "Hello").
{ok,"Hello"} = cache_shards:get(my_cache, key1).

cache_shards exposes only a small subset of the cache API, but you can get the shard name for your key and then use cache directly.

{ok, Shard} = cache_shards:get_shard(my_cache, key1).
{ok,my_cache_2}
cache:lookup(Shard, key1).
"Hello"

How to Contribute

The library is Apache 2.0 licensed and accepts contributions via GitHub pull requests.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

The development requires Erlang/OTP version 19.0 or later and essential build tools.

commit message

The commit message helps us write good release notes and speeds up the review process. The message should address two questions: what changed and why. The project follows the template defined in the chapter Contributing to a Project of the Git book.

Short (50 chars or less) summary of changes

More detailed explanatory text, if necessary. Wrap it to about 72 characters or so. In some contexts, the first line is treated as the subject of an email and the rest of the text as the body. The blank line separating the summary from the body is critical (unless you omit the body entirely); tools like rebase can get confused if you run the two together.

Further paragraphs come after blank lines.

Bullet points are okay, too

Typically a hyphen or asterisk is used for the bullet, preceded by a single space, with blank lines in between, but conventions vary here

Bugs

If you detect a bug, please bring it to our attention via GitHub issues. Please make your report detailed and accurate so that we can identify and replicate the issues you experience:

  • specify the configuration of your environment, including which operating system you're using and the versions of your runtime environments
  • attach logs, screen shots and/or exceptions if possible
  • briefly summarize the steps you took to resolve or reproduce the problem

Changelog

  • 2.3.0 - sharding of cache bucket (single node only)
  • 2.0.0 - various changes to the asynchronous API, not compatible with version 1.x
  • 1.0.1 - production release

Contributors

License

Copyright 2014 Dmitry Kolesnikov

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

cache's People

Contributors

dams, fogfish, kianmeng, manuel-rubio, yzh44yzh, zephyrean


cache's Issues

transaction

Hello,
can you provide transactions in this library?

When I use this library, I may use it like this:
Value = cache:get({global, test}, key),
NewValue = ....,
cache:put({global, test}, key, NewValue),

Without a transaction, the cache may be modified by another process in between.

cache expire issue

Run the following in the Erlang shell:

1> cache:start_link(c, [{n,10}, {ttl, 60}]).
2> cache:put(c, x, test, 10).

What is the upper bound ttl of x in cache? I thought it should not be more than 16 seconds but somehow x stays there for minutes. Is this a bug?

Cache destroyed

Hello!
Please tell me, is this normal behavior of the cache, that after any wrong call it is destroyed?

23> {ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 600}]).
{ok,<0.79.0>}
25> Val11 = cache:get(my_cache, <<"my key">>).
undefined
26> cache:i(my_cache).
[{heap,[90132]},
 {expire,[1384955909]},
 {size,[0]},
 {memory,[307]}]
27> cache:i(my_cache1).
** exception exit: {noproc,{gen_server,call,[my_cache1,i]}}
     in function  gen_server:call/2 (gen_server.erl, line 180)
28> cache:i(my_cache).
** exception exit: {noproc,{gen_server,call,[my_cache,i]}}
     in function  gen_server:call/2 (gen_server.erl, line 180)
29> Val11 = cache:get(my_cache, <<"my key">>).
** exception exit: {noproc,
                       {gen_server,call,[my_cache,{get,<<"my key">>},60000]}}
     in function  gen_server:call/3 (gen_server.erl, line 188)
30> {ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 600}]).
{ok,<0.90.0>}
31> {ok, _} = cache:start_link(my_cache, [{n, 10}, {ttl, 600}]).
** exception error: no match of right hand side value
                    {error,{already_started,<0.90.0>}}
32> Val11 = cache:get(my_cache, <<"my key">>).
** exception exit: {noproc,
                       {gen_server,call,[my_cache,{get,<<"my key">>},60000]}}
     in function  gen_server:call/3 (gen_server.erl, line 188)
33> cache:i(my_cache).
** exception exit: {noproc,{gen_server,call,[my_cache,i]}}
     in function  gen_server:call/2 (gen_server.erl, line 180)

cache:ttl(...) returns negative value

(cache@localhost.localdomain)1> cache:start_link(c, [{n,10}, {ttl, 60}]).
{ok,<0.68.0>}
(cache@localhost.localdomain)2> cache:put(c, x, test, 10).
ok
(cache@localhost.localdomain)3> cache:ttl(c, x).
6
(cache@localhost.localdomain)4> timer:sleep(10000).
ok
(cache@localhost.localdomain)5> cache:ttl(c, x).
-4
(cache@localhost.localdomain)6>

Set a limit on the use memory

How can I set a limit on the memory used when I use this project in my app?
I tried to start it on a VPS:
{ok, _} = cache:start_link(my_cache, [{memory, 1000000}, {n, 10}, {ttl, 60}]).
and I get this error:

** exception exit: undef
     in function  erlang:sysinfo/1
        called as erlang:sysinfo(wordsize)
     in call from cache_bucket:init/2 (src/cache_bucket.erl, line 79)
     in call from cache_bucket:init/1 (src/cache_bucket.erl, line 73)
     in call from gen_server:init_it/6 (gen_server.erl, line 304)
     in call from proc_lib:init_p_do_apply/3 (proc_lib.erl, line 239)

When I add it to sys.config I get the error too, when starting up my app.
I also tried adding {env, [{memory, 206870912}]} to myapp.app.src, but it did not produce the expected effect of limiting the memory.

Get and put value from function if not exists.

Hello.
Thank you for your library.

I didn't find simple logic like:

  1. Get the value by key.
  2. If the value does not exist, calculate it with a passed function.
  3. Put the new value into the cache and return the result (value).

This logic must be atomic, as other processes can ask for this key at the same time.

I hoped to implement this logic with the cache:apply function, but I can't, as this function always puts data into the cache. It doesn't compare the old value with the new value, which can cause useless load.

What solution do you see in this situation?
Thanks.

start cache on distributed nodes makes it crash

Hi,

my node went down because cache was started on distributed nodes; the logs are attached, please have a look!

([email protected])1> 05:12:08.352 [warning] lager_error_logger_h dropped 63 messages in the last second that exceeded the limit of 50 messages/sec
05:12:08.352 [info] global: Name conflict terminating {pswd_cache,<29575.6921.0>}
05:12:08.352 [info] global: Name conflict terminating {sms_cache,<29575.6922.0>}
05:12:08.352 [info] global: Name conflict terminating {user_token_cache,<29575.6923.0>}
05:15:01.117 [error] Received a MNESIA down event, removing on node '[email protected]' all pids of node '[email protected]'
05:23:17.800 [error] Received a MNESIA down event, removing on node '[email protected]' all pids of node '[email protected]'
05:39:32.342 [info] global: Name conflict terminating {pswd_cache,<29575.4243.0>}
05:39:32.342 [info] global: Name conflict terminating {sms_cache,<29575.4244.0>}
05:39:32.342 [info] global: Name conflict terminating {user_token_cache,<29575.4245.0>}
05:39:35.178 [error] Received a MNESIA down event, removing on node '[email protected]' all pids of node '[email protected]'
05:48:52.009 [info] global: Name conflict terminating {pswd_cache,<29575.7284.0>}
05:48:52.009 [info] global: Name conflict terminating {sms_cache,<29575.7285.0>}
05:48:52.009 [info] global: Name conflict terminating {user_token_cache,<29575.7286.0>}
05:49:02.789 [error] Received a MNESIA down event, removing on node '[email protected]' all pids of node '[email protected]'
05:49:57.855 [info] global: Name conflict terminating {pswd_cache,<29575.818.0>}
05:49:57.859 [info] global: Name conflict terminating {sms_cache,<29575.819.0>}
05:49:57.860 [info] global: Name conflict terminating {user_token_cache,<29575.820.0>}

Q: Cache information, cache:i()

Please clarify the parameters of the reported cache state; do I understand them correctly?

Is this the number of segments in the cache:
{heap, [integer()]} - cache segments references

The time remaining for the oldest segment (ms):
{expire, [integer()]} - cache segments expire times

The total number of entries in the cache:
{size, [integer()]} - cardinality of cache segments

The total memory occupied by the cache on this instance (in words):
{memory, [integer()]} - memory occupied by each cache segment

can you provide an API to get multiple data

The API I would like to use: I saved some data to the cache, and I want to get the data page by page.
Of course, I could use cache:all() instead, but that is not a good way because there is too much data.

Not compiling with Erlang 19

Hello,

It does not compile with Erlang 19 because of the type specs.

All type specs like:

-spec(new/4 :: (atom(), integer(), integer(), integer()) -> #heap{}).

should be converted to:

-spec(new(atom(), integer(), integer(), integer()) -> #heap{}).

Silviu

Multiple entries in a single key

Hello there,

is there a way to store multiple entries in a single key? (Like an ETS duplicate_bag)

My use case would be a list that only contains the N most recently added entries while discarding older ones. Is there a way to do this using this library or would you recommend another one/rolling my own solution?

Thank you for your time!

stick to one namespace/prefix

With 2.3.0 there is suddenly a module named sharded_cache. Is it possible to stick to one prefix, i.e. cache? Thanks.

Sharding support

Nice work. I've spent some time exploring caching libraries for Erlang. And this library is really good.

Did you consider adding sharding? Currently all read/write queries go through gen_server:call to a single process, and this is not good for my case.

Sharding could be added as a separate library on top of "cache", or it could be implemented inside "cache". Which option do you think would be better?

I am ready to do it myself and send PR to you.
