Managing Python Dependencies

Published on:
Aug. 16, 2018

I’m a member of a few groups on social media for Python programmers. As with any group, digital or “real”, several discussion topics come up again and again. Some of the more experienced members seem weary of seeing the same questions, and newcomers sometimes feel less than welcome when their questions go unanswered.

To (hopefully) help reduce some answerer fatigue, and help people get the answers they’re looking for, I’m going to devote a few blog posts to handling problems that come up repeatedly. Today’s topic is dependency management in Python. If you have suggestions for other topics that could be covered in the future, let me know and I’ll write about them if I can.

Note: the following tutorial is for Linux/Unix systems (including macOS). If you are running one of these systems and have issues with the instructions below, please get in touch. Unfortunately, including instructions for Windows introduces some complexities that are beyond the scope of this blog.

The Issue


Every programmer who’s moved from learning things to building things has probably installed a library or package to help accomplish a task. Using packages built by other people is a huge timesaver, and lets you work on solving new problems instead of re-hashing the same old things over and over.

What happens when you have an old project that depends on a specific version of a package, but you want to use a new version of the same package for a new project? What if you want to test a new version of Python before you commit to upgrading an app you built? If you aren’t careful, you can wind up in a situation like this:

(Comic: “Python Environment”, XKCD #1987, by Randall Munroe)

This is a pretty common problem, and there are plenty of tools that can help solve it. Here we’re going to learn how to avoid the rat’s nest above by using Virtualenv and pip. In this tutorial we’ll create a new virtual environment, install some things in it, keep a record of our dependencies, and use that record to automatically recover our virtual environment if it gets destroyed.

Installing Pip and Virtualenv


Pip (Pip installs packages) is the package installer officially recommended by the Python Packaging Authority. If you have Python 2 >= 2.7.9 or Python 3 >= 3.4, you most likely already have it installed, although some Linux distributions ship it as a separate package. If you’re using an older version of Python, you can install pip with:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py

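For a detailed list of installation options available, see pip’s installation documentation. To confirm pip installed correctly, you can check its man page. Not every installation ships one, in which case running pip help prints similar information. To open the man page, run: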

man pip

Pip’s man page lists a bunch of useful commands you can use with pip to help install, uninstall, and upgrade dependencies. We’ll get to some of these later.
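If you’d rather check from Python itself, here is a minimal sketch that asks the current interpreter for its pip. The function name pip_version is my own; the only assumption is that pip is run as a module of whichever Python executes the script, which also tells you which Python that pip belongs to.

```python
import subprocess
import sys
from typing import Optional


def pip_version() -> Optional[str]:
    """Return the output of `python -m pip --version`, or None if pip is missing."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # this interpreter has no usable pip
    return result.stdout.strip()


print(pip_version() or "pip is not available for this interpreter")
```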

Once you have pip installed, Virtualenv can be installed with:

pip install virtualenv

This will install virtualenv globally, so it can be used from anywhere.

Creating and Using a Virtual Environment


Let’s make a new project directory, and give it its own virtual environment. Navigate to your preferred parent folder in your terminal and execute the following:

mkdir BlogTutorialProject && cd $_
virtualenv -p python3 tutorial_env

That first line is normal Bash ($_ expands to the last argument of the previous command, so we cd into the directory we just made), but what is that second line doing? Virtualenv is the tool we just installed, and using the -p flag when executing it allows us to specify which Python interpreter to use. In place of python3, you can use any version of Python you have installed. python3 just happens to be the name of the Python 3 interpreter on Debian systems.

After executing the commands above, if you run ls you should see a new directory called tutorial_env that wasn’t there before. This is our virtual environment, and it’s where everything we’ll install in this tutorial is actually located. To activate your new virtual environment, run:

source tutorial_env/bin/activate

You should now see (tutorial_env) before your normal command prompt. This lets you know that your new virtual environment is now active. When you’re done working in this virtualenv and would like to deactivate it, simply run the command deactivate. Now that we’re in a virtual environment, let’s install some packages. Run:

pip install django numpy pandas

Once the installer runs, check inside your tutorial_env/lib/[python_version]/site-packages directory. You should see all of the packages you just installed. Now all of your packages are installed locally where they’re easy to get to and track. You can also check out the source code to see how your favorite tools work, and even customize them. Neat!
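You can also ask Python where it thinks it lives. The sketch below is only an illustration (the function name is mine): it reports whether the running interpreter belongs to a virtual environment and where its site-packages directory is. It assumes the usual convention that older virtualenv releases set sys.real_prefix, while newer ones make sys.base_prefix differ from sys.prefix.

```python
import sys
import sysconfig


def in_virtual_environment() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and the
    # standard-library venv module) make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


print("running inside a virtual environment:", in_virtual_environment())
print("interpreter prefix:", sys.prefix)
print("site-packages:", sysconfig.get_paths()["purelib"])
```

Run it with the environment activated and the prefix should point into tutorial_env; run it after deactivate and it should point at your system Python.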

Tracking Packages With Pip


Earlier we took a look at pip’s man page, and I promised we’d explore some of those options. Now’s the time. If you’ve followed along from the beginning, your virtual environment should be active and you should have django, numpy, and pandas installed. If you aren’t sure, you can run

pip list

to see what’s currently installed. You should see Django, numpy, pandas, and all of their dependencies, along with pip itself and the setuptools and wheel packages that virtualenv typically installs into a new environment.
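The same information is available from inside Python. As a rough sketch, assuming Python 3.8 or newer (where importlib.metadata is part of the standard library), the function below builds a list similar to what pip list prints. The name installed_packages is my own, not something pip provides.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) pairs for installed distributions."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version", "")
        if name:  # skip entries with broken or missing metadata
            found.add((name, version))
    return sorted(found, key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(f"{name}=={version}")
```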

Pip also includes a utility to record your dependencies in a file. Run:

pip freeze > requirements.txt

You should now have a file in your directory called requirements.txt that contains a list of dependencies along with their version numbers. If you ever have to re-create a project or clone it onto a new machine, this requirements file is going to be a real lifesaver. To show exactly how, go ahead and deactivate your virtual environment with deactivate, and delete the tutorial_env directory completely. Don’t panic, I’ve done this before.
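Each line pip freeze writes has the form name==version. To make that concrete, here is a small sketch that reads such lines back into a dictionary. It is deliberately simplified: it only understands plain name==version pins and ignores everything else a requirements file can contain. The sample contents and version numbers are made up for illustration; yours will depend on when you ran pip freeze.

```python
def parse_requirements(text):
    """Map package names to pinned versions for lines shaped like name==version."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # blank lines and anything that is not a simple pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


# Hypothetical file contents, for illustration only.
sample = """\
Django==2.1
numpy==1.15.0
pandas==0.23.4
"""
print(parse_requirements(sample))
```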

DO NOT DELETE YOUR REQUIREMENTS.TXT FILE!

Using virtualenv and pip, we can quickly and painlessly reconstruct our environment. Run:

virtualenv -p python3 tutorial_env2
source tutorial_env2/bin/activate

to create a new virtual environment and activate it.

Note: If you are on a Debian-based distribution of Linux, you should check your requirements.txt file for a line that reads: pkg-resources==0.0.0. This line is the result of a bug in some versions of Debian-based systems. If it’s there, remove it. If not, you don’t need to worry about it.
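Deleting that line by hand in a text editor is perfectly fine. If you’d like to script it, something like the following sketch would do; the function name is mine, and it assumes the line appears exactly as pkg-resources==0.0.0 and that requirements.txt sits in the current directory.

```python
from pathlib import Path


def drop_pkg_resources(text):
    """Return requirements text without the spurious pkg-resources==0.0.0 line."""
    kept = [
        line
        for line in text.splitlines()
        if line.strip() != "pkg-resources==0.0.0"
    ]
    return "\n".join(kept) + "\n" if kept else ""


# Rewrites the file in place, and does nothing if it does not exist.
requirements = Path("requirements.txt")
if requirements.exists():
    requirements.write_text(drop_pkg_resources(requirements.read_text()))
```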

Now you can use the pip command:

pip install -r requirements.txt

to reinstall all of your dependencies. Just like that, the virtual environment is as good as new!
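If you want to convince yourself the rebuilt environment really matches, you can compare the pinned versions against what is installed. This is a sketch under a few assumptions: Python 3.8 or newer, exact string comparison of versions, and a hand-written dictionary of pins standing in for the contents of requirements.txt. The function name find_mismatches is hypothetical.

```python
from importlib import metadata


def find_mismatches(pins, installed_version=metadata.version):
    """Return {name: (wanted, found)} for pins that are missing or differ."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None  # the package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Hypothetical pins; in practice these would come from requirements.txt.
pins = {"Django": "2.1", "numpy": "1.15.0"}
for name, (wanted, found) in find_mismatches(pins).items():
    print(f"{name}: wanted {wanted}, found {found or 'nothing installed'}")
```

If nothing is printed, every pinned package is installed at the version recorded in the pins.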

Summary


Virtualenv and pip are really handy tools. I create a new virtual environment for every project I start, and I use pip to install and track dependencies. Virtual environments keep my system from becoming a mess, and allow me to try new versions of packages when they’re released with no fear of causing permanent damage. More than once I’ve had to move a project to a new machine or clone it into a different environment, and virtualenv and pip have made that process much easier. There is more you can do with pip and a requirements file, but that can wait for a follow-up post.

If anyone reading this has suggestions or requests for future iterations, please let me know.