I'm having trouble understanding how the hashability of numpy objects is managed.
    >>> import numpy as np
    >>> class Vector(np.ndarray):
    ...     pass
    >>> nparray = np.array([0.])
    >>> vector = Vector(shape=(1,), buffer=nparray)
    >>> ndarray = np.ndarray(shape=(1,), buffer=nparray)
    >>> nparray
    array([ 0.])
    >>> ndarray
    array([ 0.])
    >>> vector
    Vector([ 0.])
    >>> '__hash__' in dir(nparray)
    True
    >>> '__hash__' in dir(ndarray)
    True
    >>> '__hash__' in dir(vector)
    True
    >>> hash(nparray)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'numpy.ndarray'
    >>> hash(ndarray)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'numpy.ndarray'
    >>> hash(vector)
    -9223372036586049780
    >>> nparray.__hash__()
    269709177
    >>> ndarray.__hash__()
    269702147
    >>> vector.__hash__()
    -9223372036586049780
    >>> id(nparray)
    4315346832
    >>> id(ndarray)
    4315234352
    >>> id(vector)
    4299616456
    >>> nparray.__hash__() == id(nparray)
    False
    >>> ndarray.__hash__() == id(ndarray)
    False
    >>> vector.__hash__() == id(vector)
    False
    >>> hash(vector) == vector.__hash__()
    True

How come
- numpy objects define a __hash__ method but are nevertheless not hashable
- a class deriving from numpy.ndarray defines __hash__ and is hashable?
Am I missing something?
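To make the first point concrete, here is a minimal sketch in plain Python 3, without numpy, of how an attribute named __hash__ can show up in dir() while hash() still fails. The class names Unhashable and Child are hypothetical, and this is not claimed to be the mechanism numpy 1.6.1 uses under Python 2.7 (there, calling __hash__() directly evidently returns a number); it only shows that the presence of the attribute and hashability are separate things.

```python
from collections.abc import Hashable


class Unhashable:
    # Assigning None to __hash__ marks a class as unhashable in Python 3.
    # The attribute still exists, so dir() keeps listing it.
    __hash__ = None


class Child(Unhashable):
    # Hypothetical subclass that explicitly opts back in to the
    # identity-based default hash inherited from object.
    __hash__ = object.__hash__


obj = Unhashable()

# The name is present even though hashing is disabled.
has_attribute = "__hash__" in dir(obj)

try:
    hash(obj)
    hash_works = True
except TypeError:
    hash_works = False

# The abstract base class reports the effective behaviour.
is_hashable = isinstance(obj, Hashable)

# The subclass is hashable again because it overrides __hash__.
child_hash_works = isinstance(hash(Child()), int)

print(has_attribute, hash_works, is_hashable, child_hash_works)
```

Checking isinstance(obj, Hashable) or simply calling hash() is therefore a more reliable test than looking for the name in dir().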
I'm using Python 2.7.1 and numpy 1.6.1
Thanks for any help!
EDIT: added the objects' ids
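The ids and hashes posted above are not equal, but they do look related. The sketch below checks one possible relation using only the numbers from the session: rotating the id right by 4 bits as a 64-bit value and reading the result as a signed integer. The helper name rotate_right_4 is hypothetical, and the 64-bit width is an assumption about the machine; this is an observation about the posted values, not a confirmed statement about numpy's or Python's internals.

```python
def rotate_right_4(address, bits=64):
    """Rotate an unsigned `bits`-wide integer right by 4 bits,
    then reinterpret the result as a signed integer of the same width."""
    mask = (1 << bits) - 1
    rotated = ((address >> 4) | (address << (bits - 4))) & mask
    if rotated >= 1 << (bits - 1):
        rotated -= 1 << bits
    return rotated


# (id, hash) pairs copied from the interactive session in the question.
observed = {
    "nparray": (4315346832, 269709177),
    "ndarray": (4315234352, 269702147),
    "vector": (4299616456, -9223372036586049780),
}

# For each object, does the rotated id reproduce the reported hash?
matches = {
    name: rotate_right_4(obj_id) == obj_hash
    for name, (obj_id, obj_hash) in observed.items()
}
print(matches)
```

If the relation holds for all three objects, the values returned by __hash__() would be consistent with an identity-based hash rather than a content-based one.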
EDIT2: Following deinonychusaur's comment, and trying to figure out whether hashing is based on content, I played with numpy.ndarray.dtype and found something quite strange:
    >>> [Vector(shape=(1,), buffer=np.array([1], dtype=mytype), dtype=mytype) for mytype in ('float', 'int', 'float128')]
    [Vector([ 1.]), Vector([1]), Vector([ 1.0], dtype=float128)]
    >>> [id(Vector(shape=(1,), buffer=np.array([1], dtype=mytype), dtype=mytype)) for mytype in ('float', 'int', 'float128')]
    [4317742576, 4317742576, 4317742576]
    >>> [hash(Vector(shape=(1,), buffer=np.array([1], dtype=mytype), dtype=mytype)) for mytype in ('float', 'int', 'float128')]
    [269858911, 269858911, 269858911]

I'm puzzled... is there some (type-independent) caching mechanism in numpy?
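One possible explanation for the identical ids that does not involve numpy at all: in CPython, id() is only unique among objects that are alive at the same time, so a temporary that is discarded right after id() returns can hand its memory to the next temporary. The sketch below uses a hypothetical plain class, Thing, as a stand-in for Vector. Whether the temporaries really share an id depends on the interpreter and allocator, so no particular output is claimed for that part.

```python
class Thing:
    """Hypothetical stand-in for Vector; any plain class will do."""


# Each Thing() here is unreferenced as soon as id() returns, so the
# interpreter is free to reuse the same memory for the next one.
temporary_ids = [id(Thing()) for _ in range(3)]

# Keeping the objects alive in a list forces them to coexist,
# so their ids are guaranteed to differ.
kept = [Thing() for _ in range(3)]
kept_ids = [id(obj) for obj in kept]

print("distinct ids among temporaries:", len(set(temporary_ids)))
print("distinct ids among kept objects:", len(set(kept_ids)))
```

If the hash is derived from the id, repeated ids among short-lived objects would produce repeated hashes as well, regardless of dtype.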
The Vector class simply derives from numpy.ndarray, which is not hashable, yet the Vector class is, even if it's mutable.
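On the mutability point, a small sketch with a hypothetical plain class, MutableBox, shows that the default hash inherited from object does not depend on the contents, so mutating an instance does not invalidate it as a dictionary key. Whether Vector actually ends up with this default in numpy 1.6.1 is an assumption here, not something the session confirms.

```python
class MutableBox:
    """Hypothetical mutable class relying on the default __hash__ and __eq__."""

    def __init__(self, value):
        self.value = value


box = MutableBox(1)
hash_before = hash(box)

# Use the instance as a dictionary key, then mutate it.
registry = {box: "first"}
box.value = 2
hash_after = hash(box)

# The hash is tied to the object's identity, not to box.value,
# so the lookup still succeeds after the mutation.
still_found = registry[box]
print(hash_before == hash_after, still_found)
```

The usual concern about mutable keys applies to content-based hashes, where changing the contents changes the hash after the key has been stored.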