Python defaultdict: Handling Non-existing Keys with Default Values

How to use the defaultdict class in Python.

Introduction

In this tutorial we are going to learn how to use the defaultdict class. This class is a subclass of the built-in Python dict class [1] and it behaves pretty much like dict, with the exception that it can be initialized with a default factory that defines the value to be returned when we access a non-existing key.

Note however that if we don’t provide the default factory argument (or explicitly set it to None), we will get the same KeyError exception that we would get in case we used a regular dict (as we will see below).

To define the default factory we can either use our own custom function or, for simpler use cases, use Python’s built-in classes for common data types.

This tutorial was tested using version 3.10.11 of Python, running on Windows. The IDE used was PyCharm.

A regular dictionary

We will start with a very simple example using a regular dict. In the code below, we are simply creating an empty dictionary and adding a new key-value pair to it. After that, we try to access both the existing key and then a non-existing key.

    my_dict = {}
    my_dict["existing-key"] = 20
    print(my_dict["existing-key"])
    print(my_dict["non-existing-key"])

Upon running the previous script, we get the output shown in figure 1.

Figure 1 – Output from the script, showing the KeyError exception.

As expected, when we access the existing key, we get the value we assigned to it. In contrast, when we access the non-existing key, we get a KeyError exception, which is the standard dictionary behavior.

As a side note, we could use the dictionary get method to safely access a key that may not exist: it returns None by default, or another default value if we provide one.
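As a quick illustration of the get method mentioned above, here is a minimal sketch. The variable names are chosen just for this example.

```python
my_dict = {"existing-key": 20}

# get returns the stored value when the key exists.
existing_value = my_dict.get("existing-key")

# For a missing key, get returns None unless we supply a fallback value.
missing_value = my_dict.get("non-existing-key")
fallback_value = my_dict.get("non-existing-key", 0)

print(existing_value)   # 20
print(missing_value)    # None
print(fallback_value)   # 0
```

Note that get never adds the missing key to the dictionary; it only reads.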

Using the defaultdict

Now we will alter the previous example to use the defaultdict class instead. We will start by importing it from the collections module.

 from collections import defaultdict 

After this we will create a new instance of the class. As input of the constructor, we will pass the default factory we want to be called whenever we access a non-existing key. This default factory must be a callable that will receive no arguments and should return as output the default value for when we access a non-existing dictionary key.

In our case, since we were working with integers in our original example, we will pass the built-in int class as default factory. You have probably already used it to convert strings to integers by calling int("your numeric string"). You may also notice that it is documented as part of Python’s built-in functions. But if we look at its documentation in detail, we can see that we are in fact using a class constructor when we call it.

Regardless of the previous details, we can check in this same documentation that the int class constructor defaults to 0 when we pass no argument to it. So, if we use it as default factory, since the defaultdict won’t supply it any argument, it will return the integer 0 when we access a non-existing dictionary key.

 my_dict = defaultdict(int) 
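If you want to confirm the reasoning above before moving on, the small sketch below calls int with no arguments and inspects the default_factory attribute, where defaultdict keeps the callable it was given. The variable name default_int is just for illustration.

```python
from collections import defaultdict

# Calling the int constructor with no arguments yields 0.
default_int = int()
print(default_int)  # 0

# defaultdict stores the callable itself, not the value it produces.
my_dict = defaultdict(int)
print(my_dict.default_factory is int)  # True
```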

Now we will do what we did before: add a key-value pair to our dictionary and then access an existing and a non-existing key.

    my_dict["existing-key"] = 20
    print(my_dict["existing-key"])
    print(my_dict["non-existing-key"])

To make the code easier to run, the full script is shown below.

    from collections import defaultdict

    my_dict = defaultdict(int)
    my_dict["existing-key"] = 20
    print(my_dict["existing-key"])
    print(my_dict["non-existing-key"])

If you run this code, you will get a result similar to figure 2.

Figure 2 – Output of the script using the defaultdict.

As expected, when we accessed the non-existing key, we no longer got a KeyError exception but instead we got the default value of 0.
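One detail worth knowing is that looking up a missing key with the [] operator does more than return the default: the default value is also stored in the dictionary under that key. The sketch below, with variable names chosen for illustration, shows the dictionary growing after a single lookup.

```python
from collections import defaultdict

my_dict = defaultdict(int)
size_before = len(my_dict)

# This lookup calls the default factory and stores the result under the key.
value = my_dict["non-existing-key"]

size_after = len(my_dict)

print(value)                          # 0
print(size_before, size_after)        # 0 1
print("non-existing-key" in my_dict)  # True
```

Keep this in mind if you iterate over the dictionary or check its length after reading keys that may not exist.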

There are other built-in data types that you can use as default factory. The example below shows the usage of dict as default factory in case you need, for example, a dictionary of dictionaries.

    from collections import defaultdict

    my_dict = defaultdict(dict)
    my_dict["non-existing-key"]["prop1"] = "prop1"
    my_dict["non-existing-key"]["prop2"] = "prop2"
    print(my_dict["non-existing-key"])

The result of running the previous script is shown in figure 3. As can be seen, using dict as default factory avoids having to explicitly initialize the inner dictionary when we use a key for the first time.

Figure 3 – Using dict as default factory.
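The same idea works with list, another built-in type whose constructor returns an empty value when called without arguments. The sketch below is not part of the original examples: the word list and the grouping by first letter are made up for illustration.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = defaultdict(list)
for word in words:
    # The first access for a given letter creates an empty list,
    # so we can append right away without initializing it ourselves.
    groups[word[0]].append(word)

print(dict(groups))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```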

Important: defaultdict calls the default factory in its implementation of the __missing__ method, which is called only by the __getitem__ method of the dict class when the requested key is missing [1]. In other words, the default only applies when we use the [] operator with a missing key, not when we call the get method. The get method will still behave as if we were using a dict: it will return None or a default value, if provided.
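To make the difference between get and the [] operator concrete, here is a minimal sketch. The variable names are chosen for illustration only.

```python
from collections import defaultdict

my_dict = defaultdict(int)

# get bypasses the default factory: it behaves exactly as on a regular dict.
via_get = my_dict.get("non-existing-key")
via_get_default = my_dict.get("non-existing-key", -1)
present_after_get = "non-existing-key" in my_dict

# The [] operator triggers the default factory.
via_brackets = my_dict["non-existing-key"]
present_after_brackets = "non-existing-key" in my_dict

print(via_get, via_get_default, present_after_get)  # None -1 False
print(via_brackets, present_after_brackets)         # 0 True
```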

Using a function as default factory

In the previous example, we saw that we could use some built-in classes as default factories as their parameterless constructors would be called. This is certainly useful for many use cases, but there may be scenarios where the default value of those classes is not what we intend. There may also be cases where we are using classes that don’t have a parameterless constructor.

So, we can in fact use an arbitrary function as default factory, as long as it takes no parameters. In this section, we will create a function that returns a default string and use it as default factory. The full snippet is shown below.

    from collections import defaultdict

    def my_default_factory():
        return "An awesome default string"

    my_dict = defaultdict(my_default_factory)
    print(my_dict["non-existing-key"])

By running the previous code, we will get the output from figure 4.

Figure 4 – Using a function as default factory.

Note that we could not have used the str built-in class to achieve this result as its constructor, when receiving no parameters, creates an empty string.
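For comparison, the short sketch below shows what we would get with str as default factory. The variable names are illustrative.

```python
from collections import defaultdict

# The str constructor with no arguments returns an empty string.
empty = str()

my_dict = defaultdict(str)
value = my_dict["non-existing-key"]

print(repr(empty))  # ''
print(repr(value))  # ''
```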

To illustrate what happens if we use a function that expects a parameter, let’s update our snippet as shown below:

    from collections import defaultdict

    def my_default_factory(param):
        return "An awesome default string"

    my_dict = defaultdict(my_default_factory)
    print(my_dict["non-existing-key"])

This time, if we run the code, we will get a TypeError exception, as exemplified in figure 5.

Figure 5 – Exception thrown from invoking the default factory without the expected argument.
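If you prefer to see the failure without the script stopping, the sketch below wraps the lookup in a try block. It relies on the fact that calling a function without a required argument raises a TypeError; the error_name variable is just for illustration.

```python
from collections import defaultdict

def my_default_factory(param):
    return "An awesome default string"

# Creating the defaultdict succeeds: the factory is not called yet.
my_dict = defaultdict(my_default_factory)

try:
    my_dict["non-existing-key"]
    error_name = None
except TypeError as error:
    # The factory is called with no arguments, so the call itself fails.
    error_name = type(error).__name__
    print(error_name, error)
```

Notice that the problem only surfaces on the first lookup of a missing key, not when the dictionary is created.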

To finish this section, we will tweak the original example to make it a little more compact by using a lambda instead of a named function. This approach is very useful if your default factory is simple and doesn’t need to be reused anywhere else.

    from collections import defaultdict

    my_dict = defaultdict(lambda: "my awesome string")
    print(my_dict["non-existing-key"])

Running this code will produce output equivalent to figure 4, except that the printed string is the one returned by the lambda, as we can see in figure 6.

Figure 6 – Using a lambda as default factory.

Not setting a default factory

Just as a final note, we can create a new defaultdict without passing any default factory. When we do this, it will behave as a dict object and raise a KeyError when we access a non-existing key.

    from collections import defaultdict

    my_dict = defaultdict()
    print(my_dict["non-existing-key"])

If you run the previous script, you will get a result similar to figure 7.

Figure 7 – KeyError exception thrown by defaultdict.

Note that this happens because the default factory, when not passed, is implicitly set to None. So, if you call the constructor of the defaultdict with a value of None, you will still get this same exception. If your intention is actually that None is returned when you access a non-existing key, you can simply use a custom function like we have shown in the previous section.
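The two cases described above can be compared side by side. This is a minimal sketch; the variable names are illustrative, and a lambda is used here in place of a named function.

```python
from collections import defaultdict

# Passing None explicitly is the same as passing no default factory.
no_factory = defaultdict(None)
try:
    no_factory["non-existing-key"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# A factory that returns None gives us None as the default value.
none_default = defaultdict(lambda: None)
value = none_default["non-existing-key"]

print(raised_key_error)  # True
print(value)             # None
```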

References

[1] https://docs.python.org/3/library/collections.html#defaultdict-objects
