Tubthumper: An open-source Python Package of Retry Utilities
Monday, January 9th, 2023
Trying my hand at open-source Python with a package of retry utilities named after the English anarcho-communist rock band Chumbawamba's 1997 hit Tubthumping. Docs here, source code here.
In mid-September of 2021, I decided I wanted to try my hand at writing and publishing an open-source Python package. From what I can tell, this is a bit backwards, choosing a solution in search of a problem, but I was interested in gaining a bit more experience with the latest tools and best practices, all while trying a slightly different approach (more on that in a minute). This is a hobby, a little volunteer side-hustle, and certainly not the kind of thing you should do at your day job.
That out of the way, I ended up choosing to write a package of retry utilities, implementing best practices like exponential backoff and jitter. The existing solutions were either no longer supported, lacking a simple API, or short on features. Writing a first-class package of retry utilities seemed like a worthy, if admittedly minimal, contribution to open-source software.
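To make "exponential backoff and jitter" concrete, here's a minimal sketch of the "full jitter" strategy. The function and parameter names (`backoff_with_jitter`, `init_backoff`, `exponential`) are my own placeholders for this post, not tubthumper's actual API:

```python
import random

def backoff_with_jitter(attempt: int, init_backoff: float = 0.1, exponential: float = 2.0) -> float:
    """Return a randomized sleep time (seconds) for the given zero-indexed attempt.

    "Full jitter": draw uniformly between zero and an exponentially growing cap,
    so a crowd of failing clients doesn't retry in lockstep.
    """
    cap = init_backoff * exponential ** attempt  # 0.1, 0.2, 0.4, 0.8, ...
    return random.uniform(0, cap)
```

With the defaults, the fourth attempt (`attempt=3`) sleeps somewhere between 0 and 0.8 seconds; the randomness is the whole point, spreading retries out over time.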
So, what exactly do I mean by first-class? Yes, that’s right, it’s time to list some requirements!
- Published on PyPI.org, the Python Package Index.
- Fully type annotated, passing Mypy’s static type checker with strict flags.
- 100% test coverage across all supported versions of Python and operating systems.
- No external dependencies, i.e. standard library only.
- High-quality, auto-generated documentation hosted online.
Finally, specific to my use case (retry utilities): compatibility with Python's full feature set, e.g. methods (and their varieties), async support, the inspect module, and dunder attributes.
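That last requirement is easy to underestimate. Here's a sketch of what "compatibility" means in practice, using only the standard library; the decorator below is a no-op placeholder standing in for real retry logic, not tubthumper's implementation:

```python
import functools
import inspect

def retry_noop(func):
    """Placeholder decorator: a real one would wrap func in retry logic."""
    @functools.wraps(func)  # copies __name__, __doc__, __module__; sets __wrapped__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@retry_noop
def fetch(url: str, timeout: float = 1.0) -> str:
    """Fetch a URL."""
    return f"GET {url}"

# Dunder attributes and the inspect module still see the original function
print(fetch.__name__)            # fetch
print(inspect.signature(fetch))  # (url: str, timeout: float = 1.0) -> str
```

Without `functools.wraps`, the wrapped function's name, docstring, and signature would all be clobbered, breaking help(), documentation tools, and anything else that introspects your code.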
With those requirements defined, I went about implementing many best practices from the Python packaging ecosystem, with one major deviation: instead of relying on virtual environments and running everything on the bare metal of a developer's machine, I wanted to execute all developer workflows in a reproducible Docker environment. As far as I can tell, this isn't a standard approach in open source, despite being common in industry. The benefits are pretty obvious: no more "well, it runs on my machine" responses to GitHub issues, and the ability to debug CI workflows locally. You can find the implementation details on GitHub, but the short version is that I create a Docker image with all the relevant OS and Python packages installed, mount the repo as a volume installed in "editable mode," and more or less everything else is standard.
One downside to this approach is a growing number of shell scripts. There are ways to make these high quality, but as Google's Shell Style Guide puts it, "Shell should only be used for small utilities or simple wrapper scripts," and there are good reasons for that advice. When I get around to it, I'm hoping to drastically reduce the use of shell in this repo by migrating more workflows to tox, but that's for another day.
So, let’s talk about those best practices. First off, I think code should be both auto-formatted and linted as part of any decent continuous integration pipeline. Autoformatting automates the annoying task of writing code to a style guide, in Python’s case PEP 8. I find this lets me focus on writing functional code, leaving it to the autoformatter to make everything match the required style. For Python code, I use black, along with isort to order imports. With the preponderance of shell scripts in mind, I autoformatted them using shfmt.
Linting is a form of static analysis, checking code for everything from simple syntax issues to subtle bugs. Linters aren’t just useful for novice programmers learning a new language (although they’re great for that!), they’re an essential part of an experienced developer’s workflow, replacing some of the error-checking functionality of a compiler. As Itamar Turner-Trauring put it in the blog post that inspired me to adopt it, Pylint is both useful and unusable. In particular, with a bit of configuration, I’ve found Pylint’s sometimes overly opinionated linting quite helpful when working on a project on my own with no one to review my code. For shell scripts, I’ve adopted the beloved ShellCheck, which has taught me most of what I know about writing them.
The next best practice is the use of Mypy as a static type checker of type-annotated Python code. The type annotations of a retry library can get a bit tricky, e.g. a decorator function that accepts both normal and async functions, so some of this repo was stretching the limits of Mypy’s abilities. That said, the typing side of Python has seen incremental improvements over the past few releases, e.g. PEP 612 in 3.10, and Mypy is doing an admirable job of keeping up with them. Even so, I haven’t been entirely impressed with Mypy, so you might also want to try Microsoft’s Pyright, Facebook’s Pyre, or Google’s Pytype.
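To give a flavor of the tricky bits, here's a sketch (not tubthumper's actual code) of a signature-preserving retry decorator built on PEP 612's `ParamSpec`; the async branch is where Mypy's limits start to show, hence the `type: ignore`:

```python
import asyncio
import functools
from typing import Any, Callable, TypeVar

try:  # ParamSpec landed in the standard library in Python 3.10 (PEP 612)
    from typing import ParamSpec
except ImportError:
    from typing_extensions import ParamSpec

P = ParamSpec("P")
T = TypeVar("T")

def retries(limit: int) -> Callable[[Callable[P, T]], Callable[P, T]]:
    """Retry the decorated function up to ``limit`` times, keeping its signature."""
    def decorator(func: Callable[P, T]) -> Callable[P, T]:
        if asyncio.iscoroutinefunction(func):
            @functools.wraps(func)
            async def async_wrapper(*args: P.args, **kwargs: P.kwargs) -> Any:
                for attempt in range(limit):
                    try:
                        return await func(*args, **kwargs)
                    except Exception:
                        if attempt == limit - 1:
                            raise
            # Expressing "T is an Awaitable in this branch" defeats Mypy here
            return async_wrapper  # type: ignore[return-value]

        @functools.wraps(func)
        def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
            for attempt in range(limit):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == limit - 1:
                        raise
            raise AssertionError("unreachable")
        return wrapper
    return decorator
```

Thanks to `ParamSpec`, Mypy still knows the exact argument types of any function you decorate, rather than degrading everything to `(*args: Any, **kwargs: Any)`.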
Type annotations have been a divisive topic in Python-land, but personally, I love them. Not only does this kind of static type checking improve the code quality of projects great and small, I find it also improves how I think about the interfaces of my code, and if anything, causes me to use more, not less, duck typing. It improves readability and doubles as documentation, too.
With all of that static analysis out of the way, the next best practice is testing. In the case of a retry utility, we only need unit tests, so I used Python’s built-in unittest, using some of its newer mock features to test the async portion of tubthumper. Tox is the standard tool for running unit test suites across a variety of Python versions, and as I mentioned above, I was impressed enough by its proficiency as a workflow runner that I plan to use it to remove a handful of bash scripts in the future. Particularly useful was the tox-gh-actions package, which makes it easy to speed up your CI pipeline in GitHub Actions by parallelizing unit test runs for different Python versions. In combination with GitHub Actions’ matrix strategy, I was able to test every supported version of Python on the three major operating systems (macOS, Linux, & Windows), all in parallel. Finally, I used the coverage package to track test coverage, requiring 100% as its own form of test.
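Those "newer mock features" deserve a quick illustration. The sketch below uses `AsyncMock` and `IsolatedAsyncioTestCase` (both added in Python 3.8); `call_with_retries` is a hypothetical stand-in for a retry utility, not tubthumper's API:

```python
import unittest
from unittest.mock import AsyncMock

async def call_with_retries(func, retry_limit=3):
    """Hypothetical stand-in for a retry utility: await func(), retrying on failure."""
    for attempt in range(retry_limit):
        try:
            return await func()
        except Exception:
            if attempt == retry_limit - 1:
                raise

class TestAsyncRetry(unittest.IsolatedAsyncioTestCase):
    async def test_retries_until_success(self):
        # First await raises ConnectionError, second returns "ok"
        mock = AsyncMock(side_effect=[ConnectionError("boom"), "ok"])
        result = await call_with_retries(mock)
        self.assertEqual(result, "ok")
        self.assertEqual(mock.await_count, 2)

if __name__ == "__main__":
    unittest.main()
```

`AsyncMock` lets you script a sequence of failures and successes without any real I/O, and `await_count` verifies the retry loop actually ran.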
Some of you may be wondering why I haven’t mentioned pre-commit by now. This is a popular Python package used to run test suites, linters, and other developer workflows as part of Git’s pre-commit hook. It’s a cool idea, forcing discipline on developers by automating test runs, code autoformatters and the like as part of making a commit, but it seemed against the spirit of the project to force someone to install a giant dependency that even includes a Node environment on their developer machine to get this to work. I did write a few of my own pre-commit shell scripts at one point, but take it from me: this is NOT a good idea; you will mess something up and regret it!
I DID, however, include a few more testing tools as part of the continuous integration process. One was check-wheel-contents, a neat little package intended solely to check the contents of a Python wheel for typical issues. Another was twine’s check command, which confirms your distribution’s long description will render correctly on PyPI. The third was doctest, but before we get to testing documentation, we ought to talk over generating it.
The standard in Python packaging is to use Sphinx, typically with the annoying reStructuredText format. Luckily, many folks are moving to writing their documentation in Markdown via MyST, a parser extension compatible with Sphinx, so of course I did that. The killer feature of Sphinx is the ability to auto-generate documentation from docstrings, which is just lovely. Sphinx has a pretty odd-looking syntax to make this work, but luckily the Napoleon extension supports Google-style docstrings, making things much more readable. The Sphinx ecosystem has all kinds of other nice extensions to improve your documentation, from the site’s style (I chose Furo) to providing links to your code’s source files on GitHub. I ended up writing my own code to improve the code linking, inspired by some custom code NumPy uses for the same thing.
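For those who haven't seen them, here's what a Google-style docstring looks like; Napoleon translates these sections into Sphinx's native field-list markup so you never write `:param func:` by hand. The function and its parameter names are hypothetical examples for this post, not tubthumper's API:

```python
def retry_call(func, retry_limit=3):
    """Call ``func``, retrying on failure.

    Args:
        func: Zero-argument callable to invoke.
        retry_limit: Maximum number of attempts.

    Returns:
        Whatever ``func`` returns on its first successful attempt.

    Raises:
        Exception: Re-raised from the final failed attempt.
    """
    for attempt in range(retry_limit):
        try:
            return func()
        except Exception:
            if attempt == retry_limit - 1:
                raise
```

Compare that to the equivalent `:param func: ...` / `:raises Exception: ...` field lists and it's easy to see why Napoleon is so popular.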
The other killer feature of Sphinx is the doctest extension, which will run code snippets in your documentation and make sure they behave correctly. This is an incredible tool for keeping documentation fresh and in-sync with code!
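The underlying mechanism is the standard library's doctest module, which the Sphinx extension builds on. A minimal example, driven here with `DocTestFinder`/`DocTestRunner` so it runs the same way in any context:

```python
import doctest

def double(x: int) -> int:
    """Double ``x``.

    >>> double(21)
    42
    """
    return 2 * x

# Find the >>> examples in the docstring and execute them, just as
# Sphinx's doctest extension does for snippets in your documentation.
runner = doctest.DocTestRunner()
for test in doctest.DocTestFinder().find(double, "double"):
    runner.run(test)
```

If the docstring's expected output ever drifts from the code's actual behavior, the run fails, so stale examples can't silently survive a release.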
The standard for hosting documentation is Read the Docs, which makes it easy to do things like 1) integrate documentation generation with your CI pipeline, and 2) use a custom domain (which is how I have them host tubthumper’s documentation on a subdomain of my personal site). I ended up writing a bit of code that goes through some (quite frankly unreasonable) hoops to spruce up my documentation with things like custom badges and reports from Mypy, coverage, Pylint, and Pytest. I can’t quite recommend that part, but the result is pretty nice to look at nonetheless.
It’s now been over a year and three releases of tubthumper since I first started this project, and I can say it’s been quite helpful for staying up-to-date with the latest & greatest in the Python packaging space. For example, this past December, I migrated to using pyproject.toml for project configuration, removing the old setup.py/setup.cfg. I’m not sure if my retry utility package will take off (the name isn’t exactly viral), but I plan to maintain this thing for the foreseeable future in the hopes of staying on the leading edge.
Questions | Comments | Suggestions
If you have any feedback and want to continue the conversation, please get in touch; I'd be happy to hear from you! Feel free to use the form, or just email me directly at firstname.lastname@example.org.