
My friend just started learning Python and Flask, and is missing a lot of "best practices", e.g., a requirements.txt file.

He recently asked me for help, and to keep the project clean I want to set up a CI service (Travis), but I need to work out this file first.

Since he did not have a requirements.txt to begin with, the only information I have is his import statements and the output of pip freeze.

Since pip freeze does not distinguish a direct requirement of the project from an indirect requirement pulled in by one of the packages, I want to find all "top-level" packages in the list. A "top-level package" is a package that is not required by any other package in the list. For example, urllib3 is required by requests, so when requests is present, urllib3 should not appear in the final result.
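To make the definition concrete, here is a minimal sketch on a hand-written dependency map. The map and the `top_level` helper are hypothetical illustrations, not data read from the actual environment:

```python
def top_level(requires):
    """Return the packages that no other package in the mapping requires."""
    required = set()
    for deps in requires.values():
        required.update(deps)
    return sorted(name for name in requires if name not in required)


# Hypothetical, hand-written dependency map (not taken from a real environment).
requires = {
    "requests": ["urllib3", "idna", "chardet", "certifi"],
    "urllib3": [],
    "idna": [],
    "chardet": [],
    "certifi": [],
    "Flask": ["Werkzeug", "Jinja2"],
    "Werkzeug": [],
    "Jinja2": [],
}

print(top_level(requires))
```

With this map, only Flask and requests remain, because every other entry is required by one of them.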

Is there a way to achieve this?


If anyone wants to help me with this specific instance, here's the output of pip freeze:

apturl==0.5.2
arrow==0.12.1
asn1crypto==0.24.0
binaryornot==0.4.4
blinker==1.4
Bootstrap-Flask==1.0.9
Brlapi==0.6.6
certifi==2018.1.18
chardet==3.0.4
Click==7.0
colorama==0.3.7
command-not-found==0.3
configparser==3.5.0
cookiecutter==1.6.0
cryptography==2.1.4
cupshelpers==1.0
decorator==4.1.2
defer==1.0.6
distro-info==0.18
dominate==2.3.5
Flask==1.0.2
Flask-Bootstrap4==4.0.2
Flask-Login==0.4.1
Flask-Mail==0.9.1
Flask-Moment==0.6.0
Flask-SQLAlchemy==2.3.2
Flask-WTF==0.14.2
future==0.17.1
httpie==0.9.8
httplib2==0.9.2
idna==2.6
ipython==5.5.0
ipython-genutils==0.2.0
itsdangerous==1.1.0
Jinja2==2.10
jinja2-time==0.2.0
keyring==10.6.0
keyrings.alt==3.0
language-selector==0.1
launchpadlib==1.10.6
lazr.restfulclient==0.13.5
lazr.uri==1.0.3
louis==3.5.0
macaroonbakery==1.1.3
Mako==1.0.7
MarkupSafe==1.1.0
mysqlclient==1.3.14
netifaces==0.10.4
oauth==1.0.1
olefile==0.45.1
pexpect==4.2.1
pickleshare==0.7.4
Pillow==5.1.0
poyo==0.4.2
prompt-toolkit==1.0.15
protobuf==3.0.0
pycairo==1.16.2
pycrypto==2.6.1
pycups==1.9.73
Pygments==2.2.0
pygobject==3.26.1
pymacaroons==0.13.0
PyNaCl==1.1.2
pyRFC3339==1.0
python-apt==1.6.3
python-dateutil==2.7.5
python-debian==0.1.32
pytz==2018.3
pyxdg==0.25
PyYAML==3.12
reportlab==3.4.0
requests==2.18.4
requests-unixsocket==0.1.5
ruamel.yaml==0.15.34
SecretStorage==2.3.1
simplegeneric==0.8.1
simplejson==3.13.2
six==1.11.0
SQLAlchemy==1.2.14
system-service==0.3
systemd-python==234
traitlets==4.3.2
ubuntu-drivers-common==0.0.0
ufw==0.35
unattended-upgrades==0.1
urllib3==1.22
usb-creator==0.3.3
visitor==0.1.3
wadllib==1.3.2
wcwidth==0.1.7
Werkzeug==0.14.1
whichcraft==0.5.2
WTForms==2.2.1
xkit==0.0.0
zope.interface==4.3.2

And here are the import statements, plus pymysql, which he told me he also uses.

import os
from flask import *
from flask_bootstrap import Bootstrap
from flask_moment import Moment
from flask_wtf import FlaskForm
from wtforms import *
from wtforms.validators import *
from flask_sqlalchemy import SQLAlchemy
from flask_mail import Mail, Message
from werkzeug.security import generate_password_hash, check_password_hash
from flask_login import login_required, login_user, login_fresh, login_url, LoginManager, UserMixin, logout_user
  • You should do it like this: create a new virtual environment → install the dependencies from the imports via pip → check that everything works → use pip freeze. Commented Jan 21, 2019 at 14:46
  • @KlausD. Yeah. As an experienced Python developer, I try to follow these good practices. But the problem is, my friend doesn't, and it's now a problem. Commented Jan 21, 2019 at 14:47
  • I don't get the question. Why are only the top-level packages required? Installing them will automatically install the rest (the ones they depend on), so whether the list contains only the top-level packages or all of them seems pretty much irrelevant. What am I missing? Commented Jan 21, 2019 at 14:56
  • @CristiFati Yes. While installing the minimal list will indeed install the full list as a consequence, putting the full output of pip freeze in requirements.txt isn't a good idea. Therefore, I want a minimal and maintainable list. Commented Jan 21, 2019 at 15:14
  • A few recommendations: pipdeptree (pip dependency tree); pipreqs (generates a requirements.txt file for any project based on its imports). Commented Jan 22, 2019 at 19:58

1 Answer

At first I wanted to suggest using pip's API, but pip is meant to be used as a command-line tool only ([PyPA]: Using pip from your program). Note that I did use it successfully; I'm just not posting that code (at least for now).
Here's a way that uses pkg_resources ([ReadTheDocs]: Package Discovery and Resource Access using pkg_resources).

code00.py:

#!/usr/bin/env python

import os
import pkg_resources
import sys


def get_pkgs(reqs_file="requirements_orig.txt"):
    if reqs_file and os.path.isfile(reqs_file):
        ret = dict()
        with open(reqs_file) as f:
            for item in f.readlines():
                name, ver = item.strip("\n").split("==")[:2]
                ret[name] = ver, ()
        return ret
    else:
        return {
            item.project_name: (item.version, tuple([dep.name for dep in item.requires()]))
            for item in pkg_resources.working_set
        }


def print_pkg_data(text, pkg_info):
    print("{:s}\nSize: {:d}\n\n{:s}".format(text, len(pkg_info), "\n".join(["{:s}=={:s}".format(*item) for item in pkg_info])))


def main(*argv):
    pkgs = get_pkgs(reqs_file=None)
    full_pkg_info = [(name, data[0]) for name, data in sorted(pkgs.items())]
    print_pkg_data("----------FULL LIST----------", full_pkg_info)
    deps = set()
    for name in pkgs:
        deps = deps.union(pkgs[name][1])
    min_pkg_info = [(name, data[0]) for name, data in sorted(pkgs.items()) if name not in deps]
    print_pkg_data("\n----------MINIMAL LIST----------", min_pkg_info)


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)

Output:

(py_064_03.06.08_test0) e:\Work\Dev\StackOverflow\q054292236> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code00.py
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] 064bit on win32

----------FULL LIST----------
Size: 133

Babel==2.6.0
Click==7.0
Django==2.1.4
Flask==1.0.2
Jinja2==2.10
Keras==2.2.4
Keras-Applications==1.0.6
Keras-Preprocessing==1.0.5
Markdown==3.0.1
MarkupSafe==1.1.0
Pillow==5.3.0
PyQt5==5.9.2
PyQt5-sip==4.19.13
PyYAML==3.13
Pygments==2.3.1
QtAwesome==0.5.3
QtPy==1.5.2
Send2Trash==1.5.0
Sphinx==1.8.3
Werkzeug==0.14.1
absl-py==0.6.1
alabaster==0.7.12
asn1crypto==0.24.0
astor==0.7.1
astroid==2.1.0
backcall==0.1.0
bleach==3.0.2
certifi==2018.11.29
cffi==1.11.5
chardet==3.0.4
cloudpickle==0.6.1
colorama==0.4.1
cryptography==2.4.2
cycler==0.10.0
decorator==4.3.0
defusedxml==0.5.0
djangorestframework==3.9.0
docutils==0.14
entrypoints==0.2.3
fatiando==0.5
funcsigs==1.0.2
future==0.17.1
gast==0.2.0
grpcio==1.17.1
h5py==2.9.0
html5lib==1.0.1
idna==2.8
imagesize==1.1.0
ipaddr==2.2.0
ipykernel==5.1.0
ipython==7.2.0
ipython-genutils==0.2.0
ipywidgets==7.4.2
isort==4.3.4
itsdangerous==1.1.0
jedi==0.13.2
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.2.4
jupyter-console==6.0.0
jupyter-core==4.4.0
keyboard==0.13.2
keyring==17.1.1
kiwisolver==1.0.1
lazy-object-proxy==1.3.1
llvmlite==0.26.0
lxml==4.2.5
matplotlib==3.0.2
mccabe==0.6.1
mistune==0.8.4
nbconvert==5.4.0
nbformat==4.4.0
notebook==5.7.4
numba==0.41.0
numpy==1.15.4
numpydoc==0.8.0
opencv-python==3.4.4.19
packaging==18.0
pandas==0.23.4
pandocfilters==1.4.2
parso==0.3.1
patsy==0.5.1
pickleshare==0.7.5
pip==18.1
prometheus-client==0.5.0
prompt-toolkit==2.0.7
protobuf==3.6.1
psutil==5.4.8
pyOpenSSL==18.0.0
pycodestyle==2.4.0
pycparser==2.19
pycryptodome==3.7.2
pyflakes==2.0.0
pygame==1.9.4
pylint==2.2.2
pynput==1.4
pyparsing==2.3.0
python-dateutil==2.7.5
pytz==2018.7
pywin32==224
pywin32-ctypes==0.2.0
pywinpty==0.5.5
pyzmq==17.1.2
qtconsole==4.4.3
requests==2.21.0
rope==0.11.0
scapy==2.4.0
scipy==1.2.0
setuptools==40.6.3
sip==4.19.8
six==1.12.0
snowballstemmer==1.2.1
sphinxcontrib-websupport==1.1.0
spyder==3.3.2
spyder-kernels==0.3.0
statsmodels==0.9.0
tensorboard==1.12.1
tensorflow-gpu==1.12.0
tensorflow-tensorboard==1.5.1
termcolor==1.1.0
terminado==0.8.1
testpath==0.4.2
thrift==0.11.0
tornado==5.1.1
traitlets==4.3.2
typed-ast==1.1.1
urllib3==1.24.1
wcwidth==0.1.7
webencodings==0.5.1
wheel==0.32.3
widgetsnbextension==3.4.2
wrapt==1.10.11
xlrd==1.2.0

----------MINIMAL LIST----------
Size: 37

Babel==2.6.0
Click==7.0
Django==2.1.4
Flask==1.0.2
Keras==2.2.4
Keras-Applications==1.0.6
Keras-Preprocessing==1.0.5
Markdown==3.0.1
Pillow==5.3.0
PyQt5==5.9.2
PyQt5-sip==4.19.13
PyYAML==3.13
QtAwesome==0.5.3
QtPy==1.5.2
Sphinx==1.8.3
djangorestframework==3.9.0
fatiando==0.5
funcsigs==1.0.2
ipaddr==2.2.0
keyboard==0.13.2
lxml==4.2.5
opencv-python==3.4.4.19
pandas==0.23.4
patsy==0.5.1
pip==18.1
pyOpenSSL==18.0.0
pycryptodome==3.7.2
pygame==1.9.4
pynput==1.4
pywin32==224
scapy==2.4.0
spyder==3.3.2
statsmodels==0.9.0
tensorflow-gpu==1.12.0
tensorflow-tensorboard==1.5.1
thrift==0.11.0
xlrd==1.2.0

Notes:

  • (Stating the obvious): In order to get a pkg's info, that pkg needs to be installed. That's why in my example I didn't use your file (I named it requirements_orig.txt), but the pkgs installed in my VEnv

  • As you can see, in my case the number of pkgs dropped from 133 to 37, which I'd say is pretty manageable (of course, more filtering can be done)

  • I created the data structures based on the assumption that a pkg name is a primary key (uniquely identifies a pkg). If this is false, the code would require a bit of change
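On the "more filtering" point: in the output above, Click is still in the minimal list although Flask depends on it. One likely cause is that the dependency is declared as "click" while the installed project is named "Click", so the plain string comparison misses it. Below is a hedged sketch of a variant that normalizes names before comparing. It swaps pkg_resources for importlib.metadata (Python 3.8+), and the helper names (`normalize`, `requirement_name`, `minimal_packages`, `installed_packages`) are my own, introduced for illustration; skipping requirements that carry an "extra ==" marker is a rough heuristic, not a full marker evaluation.

```python
import re
from importlib import metadata


def normalize(name):
    """Lowercase a project name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(req):
    """Extract the normalized project name from a requirement string like 'click>=5.1'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
    return normalize(match.group(1)) if match else None


def minimal_packages(pkgs):
    """pkgs maps name -> (version, [requirement strings]); return top-level (name, version) pairs."""
    deps = set()
    for name, (_version, reqs) in pkgs.items():
        for req in reqs:
            # Heuristic: ignore requirements that only apply to optional extras.
            if re.search(r"extra\s*==", req):
                continue
            dep = requirement_name(req)
            if dep and dep != normalize(name):
                deps.add(dep)
    return sorted((name, ver) for name, (ver, _reqs) in pkgs.items()
                  if normalize(name) not in deps)


def installed_packages():
    """Collect name -> (version, requirement strings) for the current environment."""
    pkgs = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            pkgs[name] = (dist.version, dist.requires or [])
    return pkgs


if __name__ == "__main__":
    for name, ver in minimal_packages(installed_packages()):
        print("{:s}=={:s}".format(name, ver))
```

As with the original script, this only sees pkgs that are installed in the interpreter running it, so it should be run from the environment being analyzed.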

Final note: If you also want to take your module's import list into account (to strip out even more pkgs, if possible), you could also try [Python.Docs]: modulefinder - Find modules used by a script. I used it in [SO]: What files are required for Py_Initialize to run? (@CristiFati's answer), only from the command line, but it should be trivial to use from a script.
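A minimal sketch of what using modulefinder from a script could look like. The helper name `imported_top_level_modules` and the throwaway demo script are hypothetical; only `ModuleFinder`, `run_script`, `modules` and `badmodules` come from the standard library. Note that the result contains import names (standard library included), so mapping them to distribution names (e.g. flask_bootstrap to Flask-Bootstrap4) is a separate step that is not shown here.

```python
import os
import tempfile
from modulefinder import ModuleFinder


def imported_top_level_modules(script_path):
    """Return (found, missing) sets of top-level module names a script imports.

    'found' includes transitive imports and standard library modules;
    'missing' holds names that could not be located in this environment.
    """
    finder = ModuleFinder()
    finder.run_script(script_path)
    found = {name.split(".")[0] for name in finder.modules if name != "__main__"}
    missing = {name.split(".")[0] for name in finder.badmodules}
    return found, missing


if __name__ == "__main__":
    # Demo on a throwaway script; point this at the real application file instead.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo_app.py")
        with open(path, "w") as f:
            f.write("import json\nimport some_module_that_is_not_installed\n")
        found, missing = imported_top_level_modules(path)
        print("json found:", "json" in found)
        print("missing:", sorted(missing))
```

Anything reported as missing is either not installed in the environment running the analysis or imported conditionally, so the result is best treated as a starting point for manual review.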


1 Comment

Sure. As I said at the beginning this is "a way", meaning there can be others (maybe even better) as well.
