Pickling the Cache - Page 14
June 7, 2001
A real life application is an extension of the cache example
given in the dictionary chapter. Recall that there we were
calling a function that performed a time intensive calculation
based on its three arguments. During the course of a program run
many of our calls to it ended up being with the same set of
arguments. We were able to obtain a significant performance
improvement by caching the results in a dictionary, keyed by the
arguments that produced them. However, it was also the case that
many different sessions of this program were being run many times
over the course of days, weeks, and months. Therefore, by
pickling the cache we were able to keep from having to start over
with every session. Following is a pared down version of the
module for doing this.
"""sole module: contains function sole, save, show"""
import cPickle
_soleMemCacheD = {}
_soleDiskFileS = "solecache"
# This initialization code will be executed
# when this module is first loaded.
file = open(_ soleDiskFileS, 'r')
_soleMemCacheD = cPickle. load( file)
file. close()
# Public functions
def sole( m, n, t):
""" sole( m, n, t): perform the sole calculation
using the cache."""
global _soleMemCacheD
if _soleMemCacheD. has_ key(( m, n, t)):
return _soleMemCacheD[( m, n, t)]
else:
# . . . do some time-consuming calculations . . .
_soleMemCacheD[( m, n, t)] = result
return result
def save():
""" save(): save the updated cache to disk."""
global _soleMemCacheD, _soleDiskFileS
file = open(_ soleDiskFileS, 'w')
cPickle. dump(_ soleMemCacheD, file)
file. close()
def show():
""" show(): print the cache"""
global _soleMemCacheD
print _soleMemCacheD
[Lines 16 and 17 above are one line. They have been split for
formatting purposes.]
This code assumes the cache file already exists. If you want to
play around with it, use the following to initialize the cache
file:
>>> import cPickle
>>> file = open(" solecache", W)
>>> cPickle. dump({}, file)
You will also, of course, need to replace the comment "# . . .
do some time-consuming calculations" with an actual
calculation. Note that for production code, this is a situation
where you probably would use an absolute pathname for your cache
file. Also, concurrency is not being handled here. If two people
run overlapping sessions, you will only end up with the additions
of the last person to save. If this were an issue, you could
limit this overlap window significantly by using the dictionary
update method in the save function.
Pickling Objects Into Files - Page 13
The Quick Python Book
Shelving Objects - Page 15
|