Overloading Functions in Python

Disclaimer: This is not a tutorial (though the article referenced a bit further down is a good tutorial on this topic). This is merely my record of something I learned recently, mostly so that I don't forget it and so that I can remind myself of times of growth during times of feeling crushed under the weight of imposter syndrome. If it actually helps someone else, even better.

Last week, I was watching the live stream of one of my favorite podcasts, Python Bytes, and one of the hosts, Brian Okken, mentioned an article by Martin Heinz titled "The Correct Way to Overload Functions in Python".

Immediately, a real-world use case in one of my open-source projects came to mind.

I was implementing new functionality in the pythonbible library to be able to count the number of books, chapters, or verses in a given Bible reference.

Recently, one of the users of that library had expressed some confusion as to what to pass in as an argument to a particular function. Should a reference be passed in, or should it be a list of references? I thought it was pretty obvious if you looked at the implementation of that function, or even from the name of the function, and I had added type hints to make it even more explicit. I hadn't anticipated that this user came from a different career background than mine and was using my library in a different environment. I'm a software engineer who has spent the majority of the past two decades hands-on writing code, much of it in Python, and I do my development in an IDE where I can easily look at the code in the libraries I'm using. This user is at an earlier career stage, is primarily a data scientist, and is using my library in a notebook environment.

I saw this as a good opportunity to avoid similar confusion in the future (for the user, not for the contributor) by overloading these counter functions to accept either a reference object or a list of reference objects. I then went one step further and allowed a third option: a string parameter. After getting some user feedback, I discovered that most of the time, users start with a string that contains one or more references, use the pythonbible library to find the references contained within that string, and then do further things with that list of references. By allowing a string parameter in the counter functions, the step of finding the references in that string can be done for the user behind the scenes.

Here's what the implementation of the book counter ended up looking like at the time of this writing:

from functools import singledispatch
from typing import List

from pythonbible.normalized_reference import NormalizedReference
from pythonbible.parser import get_references


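# Base implementation: singledispatch falls back to this for any argument type
# without a registered overload, which is how a list of references ends up here.
@singledispatch
def count_books(references: List[NormalizedReference]) -> int:
    """Return the count of books of the Bible included in the given reference(s)."""
    return _get_number_of_books_in_references(references)


# Overload for a single reference; the dispatch type is read from the annotation.
@count_books.register
def _count_books_single(reference: NormalizedReference) -> int:
    return _get_number_of_books_in_reference(reference)


# Overload for a string: parse it into references first, then count.
@count_books.register
def _count_books_string(reference: str) -> int:
    return _get_number_of_books_in_references(get_references(reference))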


def _get_number_of_books_in_references(references: List[NormalizedReference]) -> int:
    return sum(_get_number_of_books_in_reference(reference) for reference in references)


def _get_number_of_books_in_reference(reference: NormalizedReference) -> int:
    return (
        reference.end_book.value - reference.book.value + 1 if reference.end_book else 1
    )

The following three statements are all valid and all return the same value (1).

import pythonbible as bible

bible.count_books("Genesis 1:1")
bible.count_books(bible.NormalizedReference(bible.Book.GENESIS, 1, 1, 1, 1))
bible.count_books([bible.NormalizedReference(bible.Book.GENESIS, 1, 1, 1, 1)])
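
One detail of the implementation above is that the undecorated base function is the fallback for every type that has no registered overload, so a list is handled there only because nothing else claims it. A possible variant (a sketch, not what pythonbible does) registers list explicitly and lets the fallback raise a clear error for unsupported types. The Ref class below is a hypothetical stand-in for NormalizedReference with books as plain integers, and the string overload is left out because it depends on the library's parser.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Optional


@dataclass
class Ref:
    """Hypothetical stand-in for NormalizedReference (books as plain ints)."""

    book: int
    end_book: Optional[int] = None


def _books_in(ref: Ref) -> int:
    # Same arithmetic as the helper above: an inclusive range of books, or 1.
    return ref.end_book - ref.book + 1 if ref.end_book else 1


@singledispatch
def count_books(references) -> int:
    # Fallback: reached only for types with no registered overload.
    raise TypeError(f"unsupported argument type: {type(references).__name__}")


@count_books.register
def _(references: list) -> int:
    # The annotation must be a plain class such as list, not List[Ref].
    return sum(_books_in(ref) for ref in references)


@count_books.register
def _(reference: Ref) -> int:
    return _books_in(reference)


single = count_books(Ref(1))                   # one book
several = count_books([Ref(1), Ref(1, 3)])     # 1 + 3 books
```

With this shape, passing something unexpected such as an integer fails immediately with a TypeError that names the offending type, rather than failing later inside the summing helper.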

What I Learned

There is always something else in the Python standard library I haven't yet discovered.

From reading the article referenced above, I learned about both the single and multiple dispatch approaches to function overloading and how to implement them in Python.

Multiple dispatch, which I did not end up needing in this case, requires a third-party library, but single dispatch is part of the functools module of the Python standard library and has been since Python 3.4.
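
To make the distinction concrete, here is a minimal sketch showing that functools.singledispatch chooses an implementation from the type of the first argument only. The describe function and its overloads are made up for illustration and are not taken from pythonbible or the article.

```python
from functools import singledispatch


@singledispatch
def describe(value, other=None) -> str:
    # Fallback for any first-argument type without a registered overload.
    return "fallback"


@describe.register
def _(value: int, other=None) -> str:
    return "int"


@describe.register
def _(value: str, other=None) -> str:
    return "str"


# Only the first argument's type is consulted; the second is ignored.
results = [describe(1, "x"), describe("x", 1), describe(2.5, 1)]

# Subclasses use the closest registered base class: bool is a subclass of int.
bool_result = describe(True)

# dispatch() reports which implementation a given type would select.
float_uses_fallback = describe.dispatch(float) is describe.registry[object]
```

Here results is ["int", "str", "fallback"], bool_result is "int", and float_uses_fallback is True. Choosing an implementation from the types of both arguments at once is what multiple dispatch adds.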

While going through the interviewing process last summer, I discovered several gaps in my Python knowledge. I've been gradually filling those gaps, but this one was new to me.

I should not assume the users of my software libraries are like me.

This should have been a lot more obvious to me than it apparently was.

I should not assume that those who use the software I create have a certain Python skill level or general software development skill level, or even that software development is their primary focus.

I should make my software libraries as simple and intuitive as possible. Even experienced developers might not be inclined to use my software if it's not simple and intuitive. If it's too complex or confusing to use, they will look elsewhere or write it themselves.

User feedback is awesome.

I'm ecstatic that people are actually using the software I write. It's encouraging to hear from users, both that they are using it at all and how they're using it (often in ways I never anticipated). Constructive criticism has also been helpful for product design and planning and for motivating me to keep working on it.