dgilland / cacheout Goto Github PK

View Code? Open in Web Editor NEW

403.0 13.0 43.0 231 KB

A caching library for Python

Home Page: https://cacheout.readthedocs.io

License: MIT License

Python 100.00%

caching memoization lru mru lfu fifo lifo rr python python3

cacheout's People

Contributors

Stargazers

Watchers

cacheout's Issues

sanic web framework will it be safe ?

Can this library be used in Tornado or sanic ? Will it be safe?

TTL setting is unclear according to the documentation

@cache.memoize(ttl=5, typed=True) - default time units are never explained. So I need to look all over the code to figure out what 5 means.
cache = Cache(maxsize=256, ttl=0, timer=time.time, default=None) # defaults - timer is callable I suppose. What is the specification of the callable? What should it return?
It's a great project. In fact this is the first python cache where I see a human operatable cache.set method instead of intricate annotated set mechanisms (which is present as well). But ttl is my second question and it is almost not described in the documentation.

LRU cache throws KeyError on add()

Might be fixed by 937d2df, but thought I'd report just in case. Follow on issue from #4.

In [1]: import cacheout

In [2]: c = cacheout.LRUCache()

In [3]: c.add('foo', 'bar')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-09536a570d71> in <module>()
----> 1 c.add('foo', 'bar')

~/anaconda3/lib/python3.6/site-packages/cacheout/cache.py in add(self, key, value, ttl)
    271         """
    272         with self._lock:
--> 273             self._add(key, value, ttl=ttl)
    274 
    275     def _add(self, key, value, ttl=None):

~/anaconda3/lib/python3.6/site-packages/cacheout/cache.py in _add(self, key, value, ttl)
    274 
    275     def _add(self, key, value, ttl=None):
--> 276         if self._has(key):
    277             return
    278         self._set(key, value, ttl=ttl)

~/anaconda3/lib/python3.6/site-packages/cacheout/cache.py in _has(self, key)
    181     def _has(self, key):
    182         # Use get method since it will take care of evicting expired keys.
--> 183         return self._get(key, default=_NOTSET) is not _NOTSET
    184 
    185     def size(self):

~/anaconda3/lib/python3.6/site-packages/cacheout/lru.py in _get(self, key, default)
     14     def _get(self, key, default=None):
     15         value = super()._get(key, default=default)
---> 16         self._cache.move_to_end(key)
     17         return value

KeyError: 'foo'

Cache statistics

I'm going to write the cache statistics module. Besides cache hits/misses and cache frequency, are there any statistics that need to be recorded?

I don't understand what cache frequency means. Does it mean the number of times a cache entry was accessed?

To-Do List:

A typo in docstring?

I think the policy in the link should be "least-frequently-used eviction policy", not "least-recently-used eviction policy".

cacheout/src/cacheout/lfu.py

Line 11 in 1f4b78c

least-recently-used eviction policy.

on-get callback

I think on_get callback may just be simple like this:

cache = Cache(on_get=on_get)

The callback function.

Callable[[key: Hashable, value: Any],  None]

In fact, I've hardly ever seen a on-get listener in cache libraries. Do we really need it? 🤷

OrderedDict Mutated during Iteration

Code:

This is the code I'm using python=3.6

from cacheout import Cache
import time
import os
import re


TIMED_WINDOW = os.getenv('TIMED_WINDOW', 3)
MAX_EVENTS = 256

c = Cache(maxsize=MAX_EVENTS, ttl=TIMED_WINDOW, timer=time.time)
c.set('v1', 1)
c.set('v2', 2)
c.set('v3', 3)
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
c.set('a1', 1)
c.set('a2', 2)
c.set('a3', 3)
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))

Error:

Traceback (most recent call last):
  File "windowed.py", line 24, in <module>
    print(c.get_many(re.compile(r"v.*")))
  File "C:\tools\Anaconda3\envs\py36\lib\site-packages\cacheout\cache.py", line 248, in get_many
    return {key: self.get(key, default=default) for key in self._filter(iteratee)}
  File "C:\tools\Anaconda3\envs\py36\lib\site-packages\cacheout\cache.py", line 248, in <dictcomp>
    return {key: self.get(key, default=default) for key in self._filter(iteratee)}
  File "C:\tools\Anaconda3\envs\py36\lib\site-packages\cacheout\cache.py", line 500, in _filter
    yield from filter(filter_by, target)
RuntimeError: OrderedDict mutated during iteration

I think that when it try to get many it is removing the cache by ttl rule.

on-set callback

First of all, on-set callbcak and RemovalCause.SET coincide.

We can replace RemovalCause.SET with on-set callbcak.

cache = Cache(on_set=on_set)

Callable[[key: Hashable, new_value: Any, old_value: Any], None]

Roadmap "Layered caching (multi-level caching)"

When are you planning to add support for layered caching? Has there already been put effort into? I'd be interested in "guided" contribution 😏

how about add a callback to expired object?

just a thought :)

LRU cache get() with default throws KeyError

Maybe I'm not understanding the semantics, but this is not what I would expect:

$ ipython
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import cacheout

In [2]: cacheout.__version__
Out[2]: '0.10.1'

In [3]: c = cacheout.LRUCache()

In [4]: c.get('foo', default='bar')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-4-09ce1ce78da4> in <module>()
----> 1 c.get('foo', default='bar')

~/anaconda3/lib/python3.6/site-packages/cacheout/cache.py in get(self, key, default)
    214         """
    215         with self._lock:
--> 216             return self._get(key, default=default)
    217 
    218     def _get(self, key, default=None):

~/anaconda3/lib/python3.6/site-packages/cacheout/lru.py in _get(self, key, default)
     14     def _get(self, key, default=None):
     15         value = super()._get(key, default=default)
---> 16         self._cache.move_to_end(key)
     17         return value

KeyError: 'foo'

Looking at the traceback, it appears that the move_to_end() call shouldn't occur if the default value was returned, but there's no way to know whether the value returned is the default or just happened to exist in the cache with the same value as the default. Maybe _get() needs to return an is_default bool?
Ubuntu 14.04

How we can use it in webserver?

Memoization to disk

This lib looks pretty neat!

However, I found it because I was looking for persistent caching, so to disk.

Do you have any plans to implement this?

Are the default functions single-threaded?

If you have a cache miss, and you want to call the default callable to get the right value for the cache, are the calls to that function (at least for a single key) single threaded? That is, if I have 100 calls all hit at once for the same key, but it's missing from the cache, is the callable only going to be called once?

This is a specific dimension of thread safety. The functions I want to use are thread-safe, but they're sometimes very expensive (several seconds). I don't want to startup 100 identical calls to replace the one cache value...

An error in LFU?

from cacheout.lfu import LFUCache
cache = LFUCache(maxsize=4, ttl=0, timer=time.time, default=None)
for i in range(4):
    cache.add(1,1)
    cache.add(2,1)
for i in range(3):
    cache.add(3,1)
    cache.add(4,1)
cache.add(5,1)
cache.add(6,1)
cache.keys()

output should be "odict_keys([1, 2, 4, 6])"
but is instead "odict_keys([1, 2, 3, 4, 6])"

cacheout

The in operator returns True after cache expiration

After a cached expires (with finite ttl) the in operator still returns True (that the item is in the cache) until the get() method is called and corrects the internal state. It should be consistent with get() and return False right after expiration happens. Observed in the latest cachout==0.11.2.

It seems that the __contains__() method is not checking self.expired(key), in contrast to _get() which does. That would be the cause.

In addition there's a method has() which internally calls get() and checks expiration and distinguishes non-presence from None value. Does it mean that not checking expiration in __contains__ is intentional?

import cacheout
# works for all implementations (descendants) the Cache class
cache = cacheout.Cache(ttl=1)

Only calls to the in operator. After expiration it returns the wrong value.

cache.set('foo', 'bar')
for i in range(3):
    print('foo' in cache)
    time.sleep(1.5)

# True
# True # <-- wrong
# True # <-- wrong

Call the get() modifies the state and the in operator returns correct value afterwards.

cache.set('foo', 'bar') 
for i in range(3): 
    print('foo' in cache, cache.get('foo')) 
    time.sleep(1.5) 

# True bar
# True None # <-- wrong
# False None

cache.set('foo', 'bar') 
for i in range(3): 
    print(cache.get('foo'), 'foo' in cache)
    time.sleep(1.5) 

# bar True
# None False # OK
# None False

Eviction callback

Is there any option to add an "eviction callback" when a key is deleted?
I personally looking for a cache/TTLdict structure that I can tell when keys are evicted from it (by timeout).

Update expire time for specific entrys

What I need is to add expire time for specific keys. Currently cacheout provides api "set" to create new entry or update existing entry. So a simple way is:

v = cache.get(k, None)
if v is not None:
    cache.set(k, v, ttl=100)
else:
    raise KeyError(k)

Will you consider to provide apis like "expire" in Redis to update expire time directly?

dgilland / cacheout Goto Github PK

cacheout's People

Contributors

Stargazers

Watchers

Forkers

cacheout's Issues

Code:

Error:

Recommend Projects

Recommend Topics

Recommend Org