Description
Originally reported by: wickman (Bitbucket: wickman, GitHub: wickman)
As far as I can tell, build_zipmanifest is not cached.
From a recent profile:
    ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
         1    0.000    0.000    4.942    4.942  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/_twitter_common_python/pex.py:124(_execute_internal)
         1    0.000    0.000    3.830    3.830  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/_twitter_common_python/environment.py:114(activate)
         1    0.001    0.001    3.823    3.823  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/_twitter_common_python/environment.py:121(_activate)
         1    0.000    0.000    3.737    3.737  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/_twitter_common_python/environment.py:105(update_candidate_distributions)
        56    0.003    0.000    3.731    0.067  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/_twitter_common_python/environment.py:84(load_internal_cache)
         1    0.032    0.032    3.728    3.728  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/_twitter_common_python/environment.py:63(write_zipped_internal_cache)
        55    0.002    0.000    2.947    0.054  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/_twitter_common_python/util.py:46(distribution_from_path)
        57    0.005    0.000    2.945    0.052  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/pkg_resources.py:1703(__init__)
        57    0.183    0.003    2.938    0.052  /Users/wickman/clients/science/dist/aurora_client.pex/.bootstrap/pkg_resources.py:1452(build_zipmanifest)

This is the profile for starting up a PEX file (zipped python environment, see https://mail.python.org/pipermail/distutils-sig/2014-January/023727.html) with a number of exploded eggs inside. build_zipmanifest is called with the same archive every time we construct a Distribution via EggMetadata:
    def __init__(self, module):
        EggProvider.__init__(self,module)
        self.zipinfo = build_zipmanifest(self.loader.archive)
        self.zip_pre = self.loader.archive+os.sep

It's not an unreasonable assumption that each time you construct a new Distribution, it will either be on disk or part of its own zip archive, meaning these would not be duplicated calls.
In our case, all eggs are in a single zip. This means 57 calls of roughly 50 ms each (about 2.9 s cumulative in the profile above) instead of a single 50 ms call, just to run this Python application, which has 57 egg dependencies.
The proposal is to cache calls to build_zipmanifest (perhaps invalidating should the archive's os.stat mtime change).
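
To make the proposal concrete, here is a rough sketch of what an mtime-invalidated cache could look like. It is not the pkg_resources code: `_build_manifest` is a hypothetical stand-in for build_zipmanifest written with the standard zipfile module, and `cached_zipmanifest` and `_manifest_cache` are names made up for illustration.

```python
import os
import zipfile

# Hypothetical module-level cache: archive path -> (mtime, manifest).
_manifest_cache = {}


def _build_manifest(path):
    """Stand-in for build_zipmanifest (assumption): map member names to ZipInfo."""
    with zipfile.ZipFile(path) as zf:
        return {name.replace('/', os.sep): zf.getinfo(name)
                for name in zf.namelist()}


def cached_zipmanifest(path):
    """Return the manifest for path, rebuilding only if the archive's mtime changed."""
    path = os.path.normpath(path)
    mtime = os.stat(path).st_mtime
    cached = _manifest_cache.get(path)
    if cached is None or cached[0] != mtime:
        # First request for this archive, or it was modified since we cached it.
        cached = (mtime, _build_manifest(path))
        _manifest_cache[path] = cached
    return cached[1]
```

With something along these lines, constructing 57 Distributions from one archive would read the zip directory once and pay only an os.stat per subsequent call. Comparing mtime alone can miss a rewrite that lands within the filesystem's timestamp resolution, so a real implementation might also want to compare file size.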