Tips for Pinning Python Requirements Files

I love you Python, but you got baggage. Or, er, you’re not great at putting things into bags. What I’m trying to say is, your packaging is a mess. At this point Python lovers, myself included, accept our Stockholm syndrome.

This post came out of a conversation my team had: how do you pin requirements in a requirements.txt file on an older Python project, one where you may not understand the dependency structure?

Now, before you start saying to “simply” use Poetry or Anaconda, let’s take a breath. In this moment we’re exercising our choice of boring technology. Sometimes sticking with requirements files provides the least amount of friction.

Vocab

First Party Imports: Python packages directly imported in the source code (e.g. import requests).

Third Party Imports: Python packages required by our first party imports. Also known as transitive dependencies.

You’ve probably read the advice online to pip freeze > requirements.txt. This works™, but it introduces some confusion. The resulting requirements.txt is an unintelligible mix of dependencies. Differentiating first and third party imports becomes impossible. Now comes a spiral of ideas on how to deal with it.

First Party Dependencies Only

The first idea is to list only your first party imports in requirements.txt and let pip work out the rest. I too have gone down this road, and it does not bear fruit. A first party dependency can provide ranges for its transitive dependencies. Ranges turn installations into a function of time: installing today won’t necessarily give you the same result as installing a week from now. The person on the other end maintaining the library is a human. Semver doesn’t magically prevent packages from breaking; humans make mistakes. Remember when cryptography shipped a “minor” update that introduced Rust and broke a bunch of CI/CD pipelines? What’s a developer to do?
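To make the “function of time” point concrete, here is a toy sketch. It is not how pip resolves versions; newest_in_range and every version number in it are made up for illustration. The only point is that the same range picks a different version once a new release shows up.

```python
# Toy model: a version range picks the newest matching release,
# so the result depends on what has been published so far.
# newest_in_range and all version numbers below are hypothetical.


def parse(version):
    """Turn '1.4.2' into (1, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newest_in_range(available, lower, upper):
    """Return the newest version v with lower <= v < upper, or None."""
    matches = [v for v in available if parse(lower) <= parse(v) < parse(upper)]
    return max(matches, key=parse) if matches else None


# What the index offers today, and again after a new minor release.
today = ["1.2.0", "1.3.0", "1.3.1"]
next_week = today + ["1.4.0"]

print(newest_in_range(today, "1.0.0", "2.0.0"))      # 1.3.1
print(newest_in_range(next_week, "1.0.0", "2.0.0"))  # 1.4.0
```

Same requirement, same command, different install. A pin (==1.3.1) takes the calendar out of the equation.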

Pin All Dependencies

Pinning all your app’s dependencies gives you better reproducibility while reducing unintended updates. It ensures that the next person setting the project up has fewer issues. It also helps with static analysis tools like Dependabot. Have you ever tried to produce dependency graphs without installing Python packages? Dustin Ingram has a great post on the complexities involved.
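If you want a quick check that a requirements file really is fully pinned, something like the following could work. It is a minimal sketch: the PINNED pattern and the unpinned helper are my own simplification, not a full requirement parser, and the example file contents are illustrative.

```python
import re

# Matches "name==version" with optional extras and an optional marker,
# e.g. "requests[socks]==2.31.0". Wildcards like "==1.*" do not count.
# A deliberately simple pattern, not a full requirement parser.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;.*)?$"
)


def unpinned(requirements_text):
    """Return requirement lines that are not pinned with ==."""
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


example = """\
requests==2.31.0
urllib3>=1.26,<3   # a range, not a pin
python-gitlab
"""
print(unpinned(example))  # ['urllib3>=1.26,<3', 'python-gitlab']
```

An empty list means every requirement line uses ==. Anything it prints is a line that can drift over time.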

But, the requirements will still result in spaghetti. How do we untangle our spaghetti? How do we remove or add dependencies? These two workflows should help you get through the day.

Adding New Requirements

The lesser of two evils is adding new requirements. You can:

  1. Add your pinned requirements to requirements.txt
  2. Create a new, empty venv. Don’t use your old one; it will have your dev dependencies as well as any extra packages you installed.
  3. pip install -r requirements.txt
  4. pip freeze > requirements.txt

This will capture both your first party imports and their transitive dependencies. You can teach this manual process to anyone on your team. Simplicity and standard tooling bring value.
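If you do end up scripting the steps above, a sketch with only the standard library might look like this. The function names and the .venv-pin directory are placeholders I picked, and it assumes your Python ships with ensurepip so the venv can bootstrap pip.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(venv_dir):
    """Path to the interpreter inside a venv (layout differs on Windows)."""
    if os.name == "nt":
        return Path(venv_dir) / "Scripts" / "python.exe"
    return Path(venv_dir) / "bin" / "python"


def refreeze(requirements="requirements.txt", venv_dir=".venv-pin"):
    """Install requirements into a fresh venv, then overwrite the file with pip freeze."""
    # clear=True wipes any existing directory so stale packages cannot leak in.
    venv.create(venv_dir, with_pip=True, clear=True)
    python = str(venv_python(venv_dir))
    subprocess.run([python, "-m", "pip", "install", "-r", requirements], check=True)
    frozen = subprocess.run(
        [python, "-m", "pip", "freeze"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    Path(requirements).write_text(frozen)
    return frozen
```

Calling refreeze() from the project root mirrors the manual steps. It overwrites requirements.txt, so run it on a clean git tree and review the diff before committing.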

Removing Requirements

Removing packages requires a bit more work. Say you’ve removed every import <removedpackage> from your codebase. Now comes the harder part: determining which first party imports are left. I take the naive approach of grepping:

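# find all lines starting with from or import, allowing leading whitespace
# (--include limits the recursive search to .py files without relying on ** globbing)
grep -Erh --include='*.py' "^[[:space:]]*(from|import) " . | \
    # remove relative imports (from . / from ..); we don't care about local ones
    grep -v "from \." | \
    # remove leading whitespace in results
    sed 's/^[[:space:]]*//' | \
    # remove duplicates
    sort -u > allreqs.py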

This will give you a rough list of import statements in a file called allreqs.py. You could build a script that scrapes the AST of your Python files for imports instead, some other day. To filter out libraries from the stdlib you can:

  1. Clean up any syntax errors, things like from <package> import (
  2. Run isort allreqs.py
  3. Delete all the stdlib requirements
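If “some other day” turns out to be today, the AST route could look roughly like this. It is a sketch, not a replacement for the grep workflow: imported_modules and direct_imports are names I made up, and the stdlib filter relies on sys.stdlib_module_names, which needs Python 3.10 or newer.

```python
import ast
import sys
from pathlib import Path


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level > 0 means a relative import (from . import x); skip those.
            names.add(node.module.split(".")[0])
    return names


def direct_imports(root="."):
    """Scan every .py file under root and drop stdlib modules."""
    found = set()
    for path in Path(root).rglob("*.py"):
        try:
            found |= imported_modules(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # unparsable file; skip it rather than crash
    # sys.stdlib_module_names needs Python 3.10+.
    return sorted(found - set(sys.stdlib_module_names))
```

Two caveats: it will happily scan a venv directory sitting inside root, and your own local packages show up in the result next to the real dependencies, so you still need to eyeball the list.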

You can now cross reference the remaining imports against your current requirements.txt. Python libraries do not guarantee that pip install <package> and import <package> match. For instance, the package you get from pip install python-gitlab is imported as import gitlab. Once you’ve found all your first party imports, remove all other dependencies from requirements.txt and:

  1. Create a new, empty venv. Don’t use your old one; it will have your dev dependencies as well as any extra packages you installed.
  2. pip install -r requirements.txt
  3. pip freeze > requirements.txt

Conclusion

Before you spend too much time automating, consider whether it’s worth it. While manual, these processes require little expertise and are easy to teach. In many projects, requirements files are flat out ignored. Cleaning up your dependencies doesn’t have to make life miserable. Your CI pipelines will thank you.