14

One way to make a numpy array hashable is setting it to read-only. This has worked for me in the past. But when I use such a numpy array in a tuple, the whole tuple is no longer hashable, which I do not understand. Here is the sample code I put together to illustrate the problem:

import numpy as np npArray = np.ones((1,1)) npArray.flags.writeable = False print(npArray.flags.writeable) keySet = (0, npArray) print(keySet[1].flags.writeable) myDict = {keySet : 1} 

First I create a simple numpy array and set it to read-only. Then I add it to a tuple and check if it is still read-only (which it is).

When I want to use the tuple as key in a dictionary, I get the error TypeError: unhashable type: 'numpy.ndarray'.

Here is the output of my sample code:

False False Traceback (most recent call last): File "test.py", line 10, in <module> myDict = {keySet : 1} TypeError: unhashable type: 'numpy.ndarray' 

What can I do to make my tuple hashable and why does Python show this behavior in the first place?

3
  • 1
    Where did you get the idea that setting the writeable flag to False would make the array hashable? That doesn't work, even before you bring the tuple into the picture. Commented Oct 26, 2017 at 19:12
  • A numpy array does not have the required hashing method (something like __hash__). Commented Oct 26, 2017 at 19:16
  • The method is described here: stackoverflow.com/questions/16589791/… Commented Oct 26, 2017 at 19:22

2 Answers 2

18

You claim that

One way to make a numpy array hashable is setting it to read-only

but that's not actually true. Setting an array to read-only just makes it read-only. It doesn't make the array hashable, for multiple reasons.

The first reason is that an array with the writeable flag set to False is still mutable. First, you can always set writeable=True again and resume writing to it, or do more exotic things like reassign its shape even while writeable is False. Second, even without touching the array itself, you could mutate its data through another view that has writeable=True.

>>> x = numpy.arange(5) >>> y = x[:] >>> x.flags.writeable = False >>> x array([0, 1, 2, 3, 4]) >>> y[0] = 5 >>> x array([5, 1, 2, 3, 4]) 

Second, for hashability to be meaningful, objects must first be equatable - == must return a boolean, and must be an equivalence relation. NumPy arrays don't do that. The purpose of hash values is to quickly locate equal objects, but when your objects don't even have a built-in notion of equality, there's not much point to providing hashes.


You're not going to get hashable tuples with arrays inside. You're not even going to get hashable arrays. The closest you can get is to put some other representation of the array's data in the tuple.

Sign up to request clarification or add additional context in comments.

2 Comments

+1, thanks for the clarification on the internals. I'll have to go with converting it to a string first, if the idea of hashing the array is off the table.
Can one create some sort of wrapper around numpy arrays that makes them hashable and equatable? So that one doesn't have to keep going back and forth between byte/string/etc. representations of arrays?
12

The fastest way to hash a numpy array is likely tostring.

In [11]: %timeit hash(y.tostring()) 

What you could do is rather than use a tuple define a class:

class KeySet(object): def __init__(self, i, arr): self.i = i self.arr = arr def __hash__(self): return hash((self.i, hash(self.arr.tostring()))) 

Now you can use it in a dict:

In [21]: ks = KeySet(0, npArray) In [22]: myDict = {ks: 1} In [23]: myDict[ks] Out[23]: 1 

2 Comments

This feels pythonic, thx for the fast answer! And I can also convert it back with a function wrapping numpy.fromstring(), which I will require later on.
Doesn't this break robust use in a dictionary upon hash collisions, given that there is no __eq__ defined here?

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.