5

From the ‘Special method lookup for new-style classes’ section of the ‘Data model’ chapter in the Python documentation (bold emphasis mine):

For new-style classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary. That behaviour is the reason why the following code raises an exception (unlike the equivalent example with old-style classes):

    >>> class C(object):
    ...     pass
    ...
    >>> c = C()
    >>> c.__len__ = lambda: 5
    >>> len(c)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: object of type 'C' has no len()

The rationale behind this behaviour lies with a number of special methods such as __hash__() and __repr__() that are implemented by all objects, including type objects. If the implicit lookup of these methods used the conventional lookup process, they would fail when invoked on the type object itself:

    >>> 1 .__hash__() == hash(1)
    True
    >>> int.__hash__() == hash(int)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: descriptor '__hash__' of 'int' object needs an argument

Incorrectly attempting to invoke an unbound method of a class in this way is sometimes referred to as ‘metaclass confusion’, and is avoided by bypassing the instance when looking up special methods:

    >>> type(1).__hash__(1) == hash(1)
    True
    >>> type(int).__hash__(int) == hash(int)
    True

I can't quite follow the text in bold…

3 Answers

4

To understand what's going on here, you need to have a (basic) understanding of the conventional attribute lookup process. Take a typical introductory object-oriented programming example - fido is a Dog:

    class Dog(object):
        pass

    fido = Dog()

If we say fido.walk(), the first thing Python does is to look for a function called walk in fido (as an entry in fido.__dict__) and call it with no arguments - so, one that's been defined something like this:

    def walk():
        print "Yay! Walking! My favourite thing!"

    fido.walk = walk

and fido.walk() will work. If we hadn't done that, Python would look for an attribute walk in type(fido) (which is Dog) and call it with the instance as the first argument (i.e., self). That second step is what makes the usual way of defining methods in Python work:

    class Dog(object):
        def walk(self):
            print "Yay! Walking! My favourite thing!"
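Here is a minimal sketch of that two-step lookup, written for Python 3 (where every class is new-style) and returning strings instead of printing. The second dog, rex, and the returned strings are illustrative assumptions and not part of the original example.

```python
class Dog(object):
    def walk(self):
        # Found on type(dog); Python passes the instance as self.
        return "walking as %s" % type(self).__name__


def walk():
    # A plain function stored on the instance; it receives no arguments.
    return "walking with no self"


fido = Dog()
rex = Dog()  # hypothetical second dog, left unmodified

# This lands in fido.__dict__ and shadows Dog.walk for fido only.
fido.walk = walk

instance_result = fido.walk()    # uses the entry in fido.__dict__
class_result = rex.walk()        # falls back to Dog.walk(rex)
explicit_result = Dog.walk(rex)  # the same call spelled out
```

The instance-level function shadows the class-level method only for fido; rex still goes through Dog and receives itself as self.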

Now, when you call repr(fido), it ends up calling the special method __repr__. It might be (poorly, but illustratively) defined like this:

    class Dog(object):
        def __repr__(self):
            return 'Dog()'

But, the bold text is saying that it also makes sense to do this:

 repr(Dog) 

Under the lookup process I just described, the first thing Python would look for is a method called __repr__ assigned to Dog... and hey, look, there is one, because we just poorly but illustratively defined it. So, under that process, Python would call:

Dog.__repr__() 

And it blows up in our face:

    >>> Dog.__repr__()
    Traceback (most recent call last):
      File "<pyshell#38>", line 1, in <module>
        Dog.__repr__()
    TypeError: __repr__() takes exactly 1 argument (0 given)

because __repr__() expects a Dog instance to be passed to it as its self argument. We could do this to make it work:

    class Dog(object):
        def __repr__(self=None):
            if self is None:
                # return repr of Dog
            # return repr of self

But then we would need to do this every time we write a custom __repr__ function. That the method needs to know how to find the __repr__ of the class is a problem, but not much of one: it can just delegate to Dog's own class (type(Dog)) and call its __repr__ with Dog as its self argument:

            if self is None:
                return type(Dog).__repr__(Dog)

First, this breaks if the class name changes in the future, since we've had to mention it twice in the same line. But the bigger problem is that this is basically going to be boilerplate: 99% of implementations will just delegate up the chain, or forget to and hence be buggy. So, Python takes the approach described in those paragraphs: repr(foo) skips looking for a __repr__ attached to foo itself, and goes straight to:

type(foo).__repr__(foo) 
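A short sketch of that rule in action, assuming Python 3 (where all classes are new-style). The lambda attached to the instance and the variable names are illustrative only.

```python
class Dog(object):
    def __repr__(self):
        return "Dog()"


fido = Dog()

# An instance-level __repr__ is ignored by the implicit lookup.
fido.__repr__ = lambda: "instance-level repr"

implicit = repr(fido)       # behaves like type(fido).__repr__(fido)
explicit = fido.__repr__()  # ordinary attribute lookup finds the lambda
class_repr = repr(Dog)      # behaves like type(Dog).__repr__(Dog)

# Calling the method through the class without an instance still fails.
try:
    Dog.__repr__()
    direct_call_failed = False
except TypeError:
    direct_call_failed = True
```

So repr(Dog) never touches Dog.__repr__ at all: it goes to the __repr__ found on type(Dog) and passes Dog as self.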

4 Comments

Good! But how should I understand that 1 .__hash__() is OK while int.__hash__() is not? I think 1 and int are both instances of their types (1 of int, int of type). This inconsistency is driving me crazy...
@ymfoi for the same reason that Dog.__hash__() doesn't work: int.__hash__ doesn't know how to hash int itself, only how to hash instances of int, and it expects to be given an instance of int as self. To hash int itself, we need type.__hash__(int).
Well, it's strange that 1 .__hash__() does know how to do the hashing: it invokes int.__hash__. But why doesn't int.__hash__() know it can use type.__hash__(int)?
@ymfoi because the lookup rules for 1 .__hash__ say that an __hash__ directly on 1 takes precedence if it exists - it doesn't exist, so Python looks for __hash__ on int, which does exist, so it uses it. For int.__hash__, the one on int does exist, so it picks that one.
1

What you have to remember is that classes are instances of their metaclass. Some operations need to be performed not just on instances, but on types as well. If the method found on the object itself were run when the object is a class, it would fail: that method requires an instance of the class as self, whereas what we have is the class, which is an instance of the metaclass.

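    class MC(type):
        def foo(self):
            print 'foo'

    class C(object):
        __metaclass__ = MC
        def bar(self):
            print 'bar'

    C.foo()    # works: foo is found on type(C), which is MC, and C is passed as self
    C().bar()  # works: bar is found on C, and the new instance is passed as self
    C.bar()    # TypeError: bar is found on C itself, so nothing is passed as self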
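The example above uses the Python 2 __metaclass__ attribute. A rough Python 3 equivalent, assuming the metaclass= keyword spelling and returning strings so the results can be inspected, might look like this (the result variable names are illustrative):

```python
class MC(type):
    def foo(cls):
        return "foo"


class C(metaclass=MC):
    def bar(self):
        return "bar"


foo_result = C.foo()    # found on type(C), i.e. MC; C is passed as cls
bar_result = C().bar()  # found on C; the new instance is passed as self

try:
    C.bar()             # found on C itself, so nothing is passed as self
    bar_on_class_failed = False
except TypeError:
    bar_on_class_failed = True

# Methods defined on the metaclass are not reachable from instances of C.
foo_visible_on_instance = hasattr(C(), "foo")
```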

1 Comment

Hmm, could you please give an example using int and type(int)?
1

Normal attribute retrieval obj.attr looks up attr in the instance attributes and class attributes of obj. It is defined in object.__getattribute__ and type.__getattribute__.

An implicit special method call special(obj, *args, **kwargs) (e.g. hash(1)) looks up __special__ (e.g. __hash__) in the class attributes of obj (e.g. 1), bypassing the instance attributes of obj rather than performing the normal attribute retrieval obj.__special__, and then calls it. The rationale is as follows. When obj is itself a class, its instance attributes (e.g. function attributes) may require a receiver argument (usually called self) that is an instance of obj, and special(obj, *args, **kwargs) does not provide one. The class attributes of obj, by contrast, may require a receiver argument that is an instance of type(obj), and special(obj, *args, **kwargs) does provide one: obj.

Example

The special method __hash__ takes a single argument. Compare these two expressions:

    >>> 1 .__hash__
    <method-wrapper '__hash__' of int object at 0x103c1f930>
    >>> int.__hash__
    <slot wrapper '__hash__' of 'int' objects>
  • The first expression retrieves the method vars(type(1))['__hash__'].__get__(1) bound to 1 from the class attribute vars(type(1))['__hash__']. So the class attribute requires a receiver argument which is an instance of type(1) to be called, and we have already provided one: 1.
  • The second expression retrieves the function vars(int)['__hash__'].__get__(None, int) from the instance attribute vars(int)['__hash__']. So the instance attribute requires a receiver argument which is an instance of int to be called, and we have not provided one yet.
    >>> 1 .__hash__()
    1
    >>> int.__hash__(1)
    1

Since the built-in function hash takes a single argument, hash(1) can provide the 1 required in the first call (a class attribute call) while hash(int) cannot provide the 1 required in the second call (an instance attribute call). Consequently, hash(obj) should bypass the instance attribute vars(obj)['__hash__'] and directly access the class attribute vars(type(obj))['__hash__']:

    >>> hash(1) == vars(type(1))['__hash__'].__get__(1)()
    True
    >>> hash(int) == vars(type(int))['__hash__'].__get__(int)()
    True
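The same idea can be sketched as a small helper that only approximates the implicit lookup. The names implicit_special and Box are hypothetical, and the real interpreter uses type slots rather than a Python-level loop like this. It walks the method resolution order of type(obj), so inherited special methods are found too, and it never consults the instance attributes of obj. Python 3 is assumed.

```python
def implicit_special(obj, name):
    """Roughly emulate how an implicit call locates a special method."""
    cls = type(obj)
    for klass in cls.__mro__:
        namespace = vars(klass)
        if name in namespace:
            attr = namespace[name]
            # Bind through the descriptor protocol when the attribute has one.
            getter = getattr(type(attr), "__get__", None)
            return attr if getter is None else getter(attr, obj, cls)
    raise AttributeError(name)


class Box(object):  # hypothetical class used only for this illustration
    pass


box = Box()
box.__hash__ = lambda: 42  # instance attribute, bypassed by hash(box)

hash_of_one = implicit_special(1, "__hash__")()
hash_of_int = implicit_special(int, "__hash__")()
hash_of_box = implicit_special(box, "__hash__")()
```

In each case the helper's result agrees with the built-in hash, while box.__hash__() on its own still returns the instance-level value.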
