
cacheout's People

Contributors

allinolcp, dgilland, fried-sausage, johnbergvall, uncle-lv, xcess


cacheout's Issues

TTL setting is unclear according to the documentation

@cache.memoize(ttl=5, typed=True) - the default time units are never explained, so I need to look all over the code to figure out what 5 means.
cache = Cache(maxsize=256, ttl=0, timer=time.time, default=None) # defaults - timer is a callable, I suppose. What is the specification of that callable? What should it return?
It's a great project. In fact, this is the first Python cache where I see a human-operable cache.set method instead of intricate annotated set mechanisms (which are present as well). But ttl is my second question, and it is barely described in the documentation.
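As I understand it, the units of ttl are whatever the timer counts in; with the default timer=time.time, ttl is in seconds. A minimal stdlib-only sketch of how the two interact (make_entry and is_expired are illustrative names, not cacheout's internals):

```python
import time


def make_entry(value, ttl, timer=time.time):
    # Store the expiration timestamp alongside the value.
    # With timer=time.time, ttl is in seconds; the timer only needs
    # to return a monotonically increasing number.
    return {"value": value, "expires_at": timer() + ttl}


def is_expired(entry, timer=time.time):
    return timer() >= entry["expires_at"]


entry = make_entry("hello", ttl=5)
print(is_expired(entry))  # False immediately after creation
```

A custom timer could count in any unit (e.g. a tick counter), in which case ttl would be measured in ticks rather than seconds.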

LRU cache throws KeyError on add()

Might be fixed by 937d2df, but I thought I'd report it just in case. Follow-on issue from #4.

In [1]: import cacheout

In [2]: c = cacheout.LRUCache()

In [3]: c.add('foo', 'bar')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-09536a570d71> in <module>()
----> 1 c.add('foo', 'bar')

~/anaconda3/lib/python3.6/site-packages/cacheout/cache.py in add(self, key, value, ttl)
    271         """
    272         with self._lock:
--> 273             self._add(key, value, ttl=ttl)
    274 
    275     def _add(self, key, value, ttl=None):

~/anaconda3/lib/python3.6/site-packages/cacheout/cache.py in _add(self, key, value, ttl)
    274 
    275     def _add(self, key, value, ttl=None):
--> 276         if self._has(key):
    277             return
    278         self._set(key, value, ttl=ttl)

~/anaconda3/lib/python3.6/site-packages/cacheout/cache.py in _has(self, key)
    181     def _has(self, key):
    182         # Use get method since it will take care of evicting expired keys.
--> 183         return self._get(key, default=_NOTSET) is not _NOTSET
    184 
    185     def size(self):

~/anaconda3/lib/python3.6/site-packages/cacheout/lru.py in _get(self, key, default)
     14     def _get(self, key, default=None):
     15         value = super()._get(key, default=default)
---> 16         self._cache.move_to_end(key)
     17         return value

KeyError: 'foo'

Cache statistics

I'm going to write the cache statistics module. Besides cache hits/misses and cache frequency, are there any statistics that need to be recorded?

I don't understand what cache frequency means. Does it mean the number of times a cache entry was accessed?

To-Do List:

  • Total cache hits
  • Total cache misses
  • Hit rate
  • Miss rate
  • Eviction rate
  • Total size (entity count)
  • Total size in bytes
  • Average size of entities in bytes
  • Largest N keys
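The hit/miss counters on the to-do list could be sketched like this; the class and method names (CacheStats, record_hit, ...) are illustrative, not cacheout's actual API:

```python
from collections import Counter


class CacheStats:
    """Minimal tally of cache hits, misses, and evictions."""

    def __init__(self):
        self.counts = Counter()

    def record_hit(self):
        self.counts["hits"] += 1

    def record_miss(self):
        self.counts["misses"] += 1

    def record_eviction(self):
        self.counts["evictions"] += 1

    @property
    def hit_rate(self):
        total = self.counts["hits"] + self.counts["misses"]
        return self.counts["hits"] / total if total else 0.0

    @property
    def miss_rate(self):
        total = self.counts["hits"] + self.counts["misses"]
        return self.counts["misses"] / total if total else 0.0


stats = CacheStats()
stats.record_hit()
stats.record_miss()
print(stats.hit_rate)  # 0.5
```

Byte sizes and "largest N keys" would need hooks into the cache's storage, so they are harder to track from a standalone counter like this.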

on-get callback

I think the on_get callback could be as simple as this:

cache = Cache(on_get=on_get)

The callback signature:

Callable[[key: Hashable, value: Any], None]

In fact, I've hardly ever seen an on-get listener in cache libraries. Do we really need it? 🤷
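One way such a hook could be wired up, sketched with a plain dict subclass (the on_get parameter here is the proposal being discussed, not an existing cacheout feature):

```python
class ObservableDict(dict):
    """Dict that invokes an optional callback on every get()."""

    def __init__(self, on_get=None):
        super().__init__()
        self._on_get = on_get

    def get(self, key, default=None):
        value = super().get(key, default)
        if self._on_get is not None:
            # Matches the proposed Callable[[Hashable, Any], None] shape.
            self._on_get(key, value)
        return value


seen = []
cache = ObservableDict(on_get=lambda k, v: seen.append((k, v)))
cache["a"] = 1
cache.get("a")
print(seen)  # [('a', 1)]
```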

OrderedDict Mutated during Iteration

Code:

This is the code I'm using (Python 3.6):

from cacheout import Cache
import time
import os
import re


TIMED_WINDOW = os.getenv('TIMED_WINDOW', 3)
MAX_EVENTS = 256

c = Cache(maxsize=MAX_EVENTS, ttl=TIMED_WINDOW, timer=time.time)
c.set('v1', 1)
c.set('v2', 2)
c.set('v3', 3)
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
c.set('a1', 1)
c.set('a2', 2)
c.set('a3', 3)
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))
time.sleep(1)
print(c.get_many(re.compile(r"v.*")))
print(c.get_many(re.compile(r"a.*")))

Error:

Traceback (most recent call last):
  File "windowed.py", line 24, in <module>
    print(c.get_many(re.compile(r"v.*")))
  File "C:\tools\Anaconda3\envs\py36\lib\site-packages\cacheout\cache.py", line 248, in get_many
    return {key: self.get(key, default=default) for key in self._filter(iteratee)}
  File "C:\tools\Anaconda3\envs\py36\lib\site-packages\cacheout\cache.py", line 248, in <dictcomp>
    return {key: self.get(key, default=default) for key in self._filter(iteratee)}
  File "C:\tools\Anaconda3\envs\py36\lib\site-packages\cacheout\cache.py", line 500, in _filter
    yield from filter(filter_by, target)
RuntimeError: OrderedDict mutated during iteration

I think that when get_many() iterates the cache, the TTL rule is evicting expired entries, which mutates the dict mid-iteration.
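That diagnosis fits the traceback: the dict comprehension calls get() (which may evict expired keys) while _filter() is still iterating the underlying OrderedDict. A common fix is to snapshot the matching keys before fetching; this stdlib sketch reproduces the shape of the problem and the workaround (get_and_maybe_evict is a stand-in for cache.get, not cacheout code):

```python
from collections import OrderedDict

d = OrderedDict(v1=1, v2=2, v3=3)


def get_and_maybe_evict(key):
    # Stand-in for cache.get(): deleting here mimics TTL eviction.
    value = d.get(key)
    if value is not None:
        del d[key]
    return value


# Safe: materialize the matching keys first, then fetch.
# Iterating d directly while get_and_maybe_evict() deletes entries
# would raise "RuntimeError: OrderedDict mutated during iteration".
keys = [k for k in d if k.startswith("v")]
result = {k: get_and_maybe_evict(k) for k in keys}
print(result)  # {'v1': 1, 'v2': 2, 'v3': 3}
```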

on-set callback

First of all, the on-set callback and RemovalCause.SET coincide.

We can replace RemovalCause.SET with the on-set callback.

cache = Cache(on_set=on_set)
Callable[[key: Hashable, new_value: Any, old_value: Any], None]

LRU cache get() with default throws KeyError

Maybe I'm not understanding the semantics, but this is not what I would expect:

$ ipython
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import cacheout

In [2]: cacheout.__version__
Out[2]: '0.10.1'

In [3]: c = cacheout.LRUCache()

In [4]: c.get('foo', default='bar')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-4-09ce1ce78da4> in <module>()
----> 1 c.get('foo', default='bar')

~/anaconda3/lib/python3.6/site-packages/cacheout/cache.py in get(self, key, default)
    214         """
    215         with self._lock:
--> 216             return self._get(key, default=default)
    217 
    218     def _get(self, key, default=None):

~/anaconda3/lib/python3.6/site-packages/cacheout/lru.py in _get(self, key, default)
     14     def _get(self, key, default=None):
     15         value = super()._get(key, default=default)
---> 16         self._cache.move_to_end(key)
     17         return value

KeyError: 'foo'

Looking at the traceback, it appears that the move_to_end() call shouldn't occur if the default value was returned, but there's no way to know whether the value returned is the default or just happened to exist in the cache with the same value as the default. Maybe _get() needs to return an is_default bool?
Ubuntu 14.04
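The is_default suggestion can be avoided with a private sentinel, so the LRU layer can tell a real hit from a returned default without comparing values. A sketch of that approach on a plain dict (LRUDict and _touch are illustrative names, not cacheout's):

```python
_NOTSET = object()  # private sentinel no caller can pass in


class LRUDict(dict):
    def get(self, key, default=None):
        value = super().get(key, _NOTSET)
        if value is _NOTSET:
            return default  # miss: don't touch recency ordering
        self._touch(key)    # hit: mark as most recently used
        return value

    def _touch(self, key):
        # In a real LRU this would be OrderedDict.move_to_end(key),
        # which is safe here because the key is known to exist.
        pass


c = LRUDict()
print(c.get("foo", default="bar"))  # 'bar', no KeyError
```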

Memoization to disk

This lib looks pretty neat!

However, I found it because I was looking for persistent caching, so to disk.

Do you have any plans to implement this?

Are the default functions single-threaded?

If you have a cache miss and you want to call the default callable to get the right value for the cache, are the calls to that function (at least for a single key) single-threaded? That is, if I have 100 calls all hit at once for the same key, but it's missing from the cache, is the callable only going to be called once?

This is a specific dimension of thread safety. The functions I want to use are thread-safe, but they're sometimes very expensive (several seconds). I don't want to start up 100 identical calls to replace the one cache value...
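If the cache itself doesn't coalesce concurrent misses, the usual pattern is per-key "single flight" loading in a wrapper. A sketch under that assumption (SingleFlight is illustrative, not part of cacheout):

```python
import threading
from collections import defaultdict


class SingleFlight:
    """Ensure at most one loader call per key across threads."""

    def __init__(self):
        self._cache = {}
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock table
        self.calls = 0

    def get_or_load(self, key, loader):
        with self._guard:
            lock = self._locks[key]
        with lock:  # only one thread loads a given key at a time
            if key not in self._cache:
                self.calls += 1
                self._cache[key] = loader()
            return self._cache[key]


sf = SingleFlight()
threads = [
    threading.Thread(target=sf.get_or_load, args=("k", lambda: 42))
    for _ in range(100)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sf.calls)  # 1: the loader ran once despite 100 concurrent calls
```

The per-key lock means an expensive load for one key doesn't block loads for other keys, only duplicate loads of the same key.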

An error in LFU?

import time

from cacheout.lfu import LFUCache

cache = LFUCache(maxsize=4, ttl=0, timer=time.time, default=None)
for i in range(4):
    cache.add(1,1)
    cache.add(2,1)
for i in range(3):
    cache.add(3,1)
    cache.add(4,1)
cache.add(5,1)
cache.add(6,1)
cache.keys()

The output should be "odict_keys([1, 2, 4, 6])",
but instead it is "odict_keys([1, 2, 3, 4, 6])".

The in operator returns True after cache expiration

After a cache entry expires (with a finite ttl), the in operator still returns True (i.e., that the item is in the cache) until the get() method is called and corrects the internal state. It should be consistent with get() and return False right after expiration happens. Observed in the latest cacheout==0.11.2.

It seems that the __contains__() method is not checking self.expired(key), in contrast to _get(), which does. That would be the cause.

In addition, there's a has() method which internally calls get(), checks expiration, and distinguishes non-presence from a None value. Does that mean not checking expiration in __contains__ is intentional?

import time

import cacheout

# works for all implementations (descendants) of the Cache class
cache = cacheout.Cache(ttl=1)

Using only the in operator, the value after expiration is wrong:

cache.set('foo', 'bar')
for i in range(3):
    print('foo' in cache)
    time.sleep(1.5)

# True
# True # <-- wrong
# True # <-- wrong

Calling get() modifies the state, and the in operator returns the correct value afterwards:

cache.set('foo', 'bar') 
for i in range(3): 
    print('foo' in cache, cache.get('foo')) 
    time.sleep(1.5) 

# True bar
# True None # <-- wrong
# False None
cache.set('foo', 'bar') 
for i in range(3): 
    print(cache.get('foo'), 'foo' in cache)
    time.sleep(1.5) 

# bar True
# None False # OK
# None False
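The suggested fix (have __contains__ consult expired() the way _get() does) can be sketched on a plain dict; TTLDict is an illustration of the idea, not cacheout's internals:

```python
import time


class TTLDict:
    """Toy TTL map whose membership test checks expiry."""

    def __init__(self, ttl):
        self._ttl = ttl
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._data[key] = (value, time.time() + self._ttl)

    def expired(self, key):
        _, expires_at = self._data[key]
        return time.time() >= expires_at

    def __contains__(self, key):
        # Consistent with get(): expired entries don't count as present.
        return key in self._data and not self.expired(key)


d = TTLDict(ttl=0.05)
d.set("foo", "bar")
assert "foo" in d
time.sleep(0.15)
print("foo" in d)  # False: expired entries no longer report membership
```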

Eviction callback

Is there any option to add an "eviction callback" for when a key is deleted?
I'm personally looking for a cache/TTL-dict structure that can tell me when keys are evicted from it (by timeout).

Update expire time for specific entries

What I need is to update the expire time for specific keys. Currently cacheout provides the "set" API to create a new entry or update an existing one. So a simple workaround is:

v = cache.get(k, None)
if v is not None:
    cache.set(k, v, ttl=100)
else:
    raise KeyError(k)

Would you consider providing an API like Redis's "expire" to update the expiration time directly?
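The workaround above could be packaged as a helper; a sentinel also makes it safe for entries whose value is None, which the `is not None` check above would misclassify as missing. The expire() name is hypothetical (modelled on Redis, not a cacheout API), and StubCache is a minimal get/set stand-in so the sketch runs on its own:

```python
def expire(cache, key, ttl):
    """Redis-style expire(): reset the ttl of an existing entry."""
    sentinel = object()
    value = cache.get(key, default=sentinel)
    if value is sentinel:
        raise KeyError(key)
    # Re-setting with a new ttl resets the entry's expiration.
    cache.set(key, value, ttl=ttl)


class StubCache:
    """Minimal stand-in with get/set so the helper can be exercised."""

    def __init__(self):
        self._data, self.ttls = {}, {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, ttl=None):
        self._data[key] = value
        self.ttls[key] = ttl


c = StubCache()
c.set("k", "v")
expire(c, "k", ttl=100)
print(c.ttls["k"])  # 100
```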

Using cacheout in the context of multiprocessing

Hi,
I have used cacheout in my app, and it works like a charm. But later we moved from threads to processes.
The problem is that cacheout is thread-safe, not process-safe, so I want to share an instance of CacheManager between all processes.

Is that a good idea?

single key info

1. How can I get a single key and find out how long until it expires?
2. How can I extend the expiration time of a single key?
