156

I'm looking for a Python caching library but can't find anything so far. I need a simple dict-like interface where I can set keys and their expiration and get them back cached. Sort of something like:

cache.get(myfunction, duration=300) 

which will give me the item from the cache if it exists or call the function and store it if it doesn't or has expired. Does anyone know something like this?

6
  • i think you're missing item in your example. Commented Sep 15, 2009 at 13:45
  • Yes, this would probably need a key... And, 2.x. Commented Sep 15, 2009 at 13:50
  • 4
    within the same process or shared between processes? threaded or not? Commented Sep 15, 2009 at 14:37
  • 1
    It should be thread-safe, sorry, I should have mentioned. I don't need to share between processes. Commented Sep 18, 2009 at 10:20
  • 7
    Try DiskCache: Apache2 licensed, 100% coverage, thread-safe, process-safe, multiple eviction policies and fast (benchmarks). Commented Mar 21, 2016 at 18:13
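Given the clarifications above (thread-safe, in-process, per-key duration), the requested interface can be sketched with just the standard library; the class and parameter names here are invented for illustration:

```python
# Hypothetical sketch of the interface asked for above: a thread-safe,
# in-process cache whose get() returns the cached value, or calls the
# supplied function and stores the result with a per-key duration.
import threading
import time

class TTLCache:
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (value, expires_at)

    def get(self, key, fn, duration=300):
        now = time.time()
        with self._lock:
            entry = self._data.get(key)
            if entry is not None and entry[1] > now:
                return entry[0]  # still fresh: serve from cache
            # Missing or expired: recompute. Note fn runs while the lock
            # is held, which serializes computation -- fine for a sketch.
            value = fn(key)
            self._data[key] = (value, now + duration)
            return value
```

Usage would look like `cache.get('answer', my_function, duration=300)`, close to the call shape in the question.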

15 Answers

100

From Python 3.2 you can use the decorator @lru_cache from the functools library. It's a Least Recently Used cache, so there is no expiration time for the items in it, but as a fast hack it's very useful.

from functools import lru_cache

@lru_cache(maxsize=256)
def f(x):
    return x * x

for x in range(20):
    print(f(x))
for x in range(20):
    print(f(x))
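Since lru_cache itself has no expiration, a common workaround is to pass an extra "time bucket" argument so entries are keyed by a time window; the helper name and the 600-second window below are arbitrary example choices:

```python
# Workaround for lru_cache's lack of expiration: add a hidden time-bucket
# argument so cached entries are only reused within one time window.
import time
from functools import lru_cache

def ttl_hash(seconds=600):
    """Return the same value for any call within a `seconds`-long window."""
    return round(time.time() / seconds)

@lru_cache(maxsize=256)
def f(x, _ttl=None):
    return x * x

# Call as f(12, _ttl=ttl_hash()): results are reused within one window
# and recomputed once the window rolls over (the stale entry simply
# ages out of the LRU cache).
```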

5 Comments

cachetools offers a nice implementation of these, and it's compatible with Python 2 and Python 3.
big +1 for cachetools... seems pretty cool and has a couple more caching algorithms :)
This should never be suggested! Stay compatible.
@roboslone, two years (minus 4 days..) from your comment about not being thread safe, it may have changed. I have cachetools 2.0.0 and I see in the code that it uses an RLock. /usr/lib/python2.7/site-packages/cachetools/func.py
@Motty: The documentation for cachetools 4.0.0.0 says this: "Please be aware that all these classes are not thread-safe. Access to a shared cache from multiple threads must be properly synchronized, e.g. by using one of the memoizing decorators with a suitable lock object" (bold mine)
56

Take a look at Beaker:

2 Comments

Ah, I kept searching for this and all I found was a wiki that mentioned how to use it as a WSGI middleware. It looks like what I need, thank you.
See also dogpile, supposedly the new and improved Beaker.
30

You might also take a look at the Memoize decorator. You could probably get it to do what you want without too much modification.

2 Comments

That's clever. A few changes and the decorator could even expire after a set time.
You could definitely write a space-based limit to the cache in the decorator. That would be helpful if you wanted a function to, for example, generate the fibonacci sequence term by term. You want caching, but you only need the last two values - saving all of them is just space inefficient.
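The comments above suggest the decorator could be made to expire entries after a set time; a minimal sketch of such a memoize decorator (the decorator name and ttl parameter are invented) might look like:

```python
# Minimal sketch of a memoize decorator with expiry, along the lines the
# comments above suggest. Only positional, hashable arguments are handled.
import functools
import time

def memoize_with_expiry(ttl=60):
    def decorator(fn):
        cache = {}  # args -> (value, expires_at)

        @functools.wraps(fn)
        def wrapper(*args):
            now = time.time()
            hit = cache.get(args)
            if hit is not None and hit[1] > now:
                return hit[0]  # cached and still fresh
            value = fn(*args)
            cache[args] = (value, now + ttl)
            return value
        return wrapper
    return decorator

@memoize_with_expiry(ttl=30)
def slow_square(x):
    return x * x
```

A space limit, as the second comment proposes, could be added by evicting the oldest entry when `len(cache)` exceeds a threshold.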
16

No one has mentioned shelve yet. https://docs.python.org/2/library/shelve.html

It isn't memcached, but looks much simpler and might fit your need.
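Since shelve only provides persistent dict-like storage, expiration has to be layered on top; a sketch under that assumption (the function name and tuple layout are invented):

```python
# Sketch of using the standard-library shelve module as a simple
# persistent cache with expiry: values are stored alongside an
# expiry timestamp and recomputed when stale.
import shelve
import time

def cached_call(path, key, fn, duration=300):
    with shelve.open(path) as db:
        if key in db:
            value, expires_at = db[key]
            if expires_at > time.time():
                return value  # cached and still fresh
        value = fn(key)  # missing or expired: recompute and store
        db[key] = (value, time.time() + duration)
        return value
```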

1 Comment

I wrote a thread- and multiprocess-safe wrapper for the standard shelve module (including a helper function for caching http requests) in case that is useful for anyone: github.com/cristoper/shelfcache
15

Joblib https://joblib.readthedocs.io supports caching functions in the Memoize pattern. Mostly, the idea is to cache computationally expensive functions.

>>> from joblib import Memory
>>> mem = Memory(cachedir='/tmp/joblib')
>>> import numpy as np
>>> square = mem.cache(np.square)
>>>
>>> a = np.vander(np.arange(3)).astype(np.float)
>>> b = square(a)
________________________________________________________________________________
[Memory] Calling square...
square(array([[ 0.,  0.,  1.],
              [ 1.,  1.,  1.],
              [ 4.,  2.,  1.]]))
___________________________________________________________square - 0...s, 0.0min
>>> c = square(a)

You can also do fancy things like using the @memory.cache decorator on functions. The documentation is here: https://joblib.readthedocs.io/en/latest/generated/joblib.Memory.html

1 Comment

As a sidenote, joblib really shines when you're working with large NumPy arrays, since it has special methods to deal with them specifically.
9

I think the python memcached API is the prevalent tool, but I haven't used it myself and am not sure whether it supports the features you need.

1 Comment

That one's the industry standard, but all I want is a simple in-memory storage mechanism that can hold 100 keys or so, and memcached is a bit overkill. Thank you for the answer, though.
9
import time

class CachedItem(object):
    def __init__(self, key, value, duration=60):
        self.key = key
        self.value = value
        self.duration = duration
        self.timeStamp = time.time()

    def __repr__(self):
        return '<CachedItem {%s:%s} expires at: %s>' % (
            self.key, self.value, time.time() + self.duration)

class CachedDict(dict):
    def get(self, key, fn, duration):
        if key not in self \
                or self[key].timeStamp + self[key].duration < time.time():
            print 'adding new value'
            o = fn(key)
            self[key] = CachedItem(key, o, duration)
        else:
            print 'loading from cache'
        return self[key].value

if __name__ == '__main__':
    fn = lambda key: 'value of %s is None' % key

    ci = CachedItem('a', 12)
    print ci

    cd = CachedDict()
    print cd.get('a', fn, 5)
    time.sleep(2)
    print cd.get('a', fn, 6)
    print cd.get('b', fn, 6)
    time.sleep(2)
    print cd.get('a', fn, 7)
    print cd.get('b', fn, 7)

3 Comments

I did something like that, but you need locks for multithreading and a size parameter to avoid it growing infinitely. Then you need some function to sort the keys by accesses to discard the least-accessed ones, etc etc...
The repr line is incorrect (should use the self.timeStamp). As well it's a poor implementation that needlessly does math for every get(). The expiry time should be calculated in the CachedItem init.
In fact, if you're only implementing the get method, this shouldn't be a dict subclass, it should be an object with an embedded dict.
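Pulling the comments' critiques together (expiry precomputed once in the constructor, a correct __repr__, and composition instead of subclassing dict), a revised sketch might look like this (written in Python 3 here):

```python
# Revised sketch incorporating the critiques in the comments above:
# the expiry time is computed once in __init__, __repr__ uses the
# stored expiry, and the cache wraps a dict rather than subclassing it.
import time

class CachedItem(object):
    def __init__(self, key, value, duration=60):
        self.key = key
        self.value = value
        self.expires_at = time.time() + duration  # computed once, not per get()

    def __repr__(self):
        return '<CachedItem {%s:%s} expires at: %s>' % (
            self.key, self.value, self.expires_at)

class CachedDict(object):
    def __init__(self):
        self._items = {}

    def get(self, key, fn, duration):
        item = self._items.get(key)
        if item is None or item.expires_at < time.time():
            item = CachedItem(key, fn(key), duration)
            self._items[key] = item
        return item.value
```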
6

Try Redis; it is one of the cleanest and easiest solutions for applications to share data in an atomic way, or if you have some web server platform. It's very easy to set up; you will need the Python Redis client: http://pypi.python.org/pypi/redis

1 Comment

It should be mentioned that it is out-of-process and needs to be accessed over TCP.
6

You can use my simple solution to the problem. It is really straightforward, nothing fancy:

class MemCache(dict):
    def __init__(self, fn):
        dict.__init__(self)
        self.__fn = fn

    def __getitem__(self, item):
        if item not in self:
            dict.__setitem__(self, item, self.__fn(item))
        return dict.__getitem__(self, item)

mc = MemCache(lambda x: x*x)

for x in xrange(10):
    print mc[x]

for x in xrange(10):
    print mc[x]

It indeed lacks expiration functionality, but you can easily extend it by specifying a particular rule in MemCache's constructor.

Hopefully the code is self-explanatory, but just to mention: the cache is passed a translation function as one of its constructor parameters, which is used in turn to generate the cached output for a given input.

Hope it helps

2 Comments

+1 for suggesting something simple. Depending on the problem, it might just be the tool for the job. P.S. You don't need the else in __getitem__ :)
Why would he not need to else in the __getitem__ ? That's where he populates the dict...
4

This project aims to provide "Caching for humans" (seems like it's fairly unknown though)

Some info from the project page:

Installation

pip install cache

Usage:

import pylibmc
from cache import Cache

backend = pylibmc.Client(["127.0.0.1"])
cache = Cache(backend)

@cache("mykey")
def some_expensive_method():
    sleep(10)
    return 42

# writes 42 to the cache
some_expensive_method()

# reads 42 from the cache
some_expensive_method()

# re-calculates and writes 42 to the cache
some_expensive_method.refresh()

# get the cached value or throw an error
# (unless default= was passed to @cache(...))
some_expensive_method.cached()

1 Comment

This requires an external memcached server and was mentioned 11 years before this answer
2

Look at gocept.cache on PyPI; it manages timeouts.

Comments

0

Look at bda.cache http://pypi.python.org/pypi/bda.cache - uses ZCA and is tested with zope and bfg.

Comments

0

ExpiringDict is another option:

https://pypi.org/project/expiringdict/

Comments

0

Besides all the tools mentioned by other users earlier, you can also use the cacheout library from PyPI.

It allows setting a cache timeout (TTL) for all keys or for a particular key, and getting the value of a particular key when needed.

Hope this helps!

Comments

-8

keyring is the best Python caching library. You can use

keyring.set_password("service", "jsonkey", json_res)
json_res = keyring.get_password("service", "jsonkey")
json_res = keyring.core.delete_password("service", "jsonkey")

2 Comments

That's a keyring library, not a caching library.
@StavrosKorokithakis Actually, I implemented caching of keys through keyring
