Virtual Functions Dilemma in C++

Question

I have two questions to ask...

a)

    class A {
        int a;
    public:
        virtual void f() {}
    };

    class B {
        int b;
    public:
        virtual void f1() {}
    };

    class C : public A, public B {
        int c;
    public:
        virtual void f() {}   // Virtual is optional here
        virtual void f1() {}  // Virtual is optional here
        virtual void f2() {}
    };

    class D : public C {
        int d;
    public:
        void f2() {}
    };

Now C++ says that there won't be 3 virtual pointers in C's instance but only 2. And then, how could a call to say,

C* c = new D();

c->f2(); // Since there is no virtual pointer corresponding to the virtual function defined in f2(). How is the late binding done ?..

I read that the pointer to this function is added to the virtual table of the first superclass of C. Why is that so? Why is there no separate virtual table for C?

sizeof(*c); // It would be 24 and not 28.. Why ?...

Also, considering the above code, say I do this:

    void (C::*a)() = &C::f;
    void (C::*b)() = &C::f1;
    printf("%u", a);
    printf("%u", b);
    // Both the above printf() statements print the same address. Why is that so ?...

    // Now consider this,
    C* c1 = new C();
    (c1->*a)();
    (c1->*b)();

    // In spite of a and b having the same address, the function invoked is different. How is the definition of the function bound here ?...

Hope I get a reply soon.

I formatted your code. You can use the {} to do it yourself.
– Mark B, Commented Oct 4, 2011 at 15:20
"Now C++ says that there won't be 3 virtual pointers in C's instance but only 2." No, C++ says nothing about that.
– R. Martinho Fernandes, Commented Oct 4, 2011 at 15:21
I meant: why are there not 3 virtual pointers in C's instance? Is there any specific reason for that?
– Rishi Mehta, Commented Oct 4, 2011 at 15:24
How do you disable optimization in Visual C++? Is it by using the volatile keyword?
– Rishi Mehta, Commented Oct 4, 2011 at 15:58

Mark B · Accepted Answer · 2011-10-04 17:07:44Z

The C++ standard makes no mention of virtual tables so the compiler is free to optimize it in any way it chooses. In this case it appears to have consolidated C's vtable with one of the parent ones, but this certainly isn't required. What is required is that if you do:

    C* c = new D();
    c->f2();

then it calls D::f2, because f2 is virtual in C.
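To make that requirement concrete, here is a minimal sketch using the hierarchy from the question. The data members are left out, each function returns its own name so the dispatch target can be observed, and the helper `call_f2_through_c` is a hypothetical name added only for illustration.

```cpp
#include <string>

// The question's hierarchy without data members. Each function returns its
// own name so that the function actually chosen at run time is visible.
class A {
public:
    virtual ~A() = default;
    virtual std::string f() { return "A::f"; }
};

class B {
public:
    virtual ~B() = default;
    virtual std::string f1() { return "B::f1"; }
};

class C : public A, public B {
public:
    std::string f() override { return "C::f"; }
    std::string f1() override { return "C::f1"; }
    virtual std::string f2() { return "C::f2"; }
};

class D : public C {
public:
    std::string f2() override { return "D::f2"; }
};

// Hypothetical helper: calls f2 through a C* that really points at a D.
std::string call_f2_through_c() {
    D object;
    C* c = &object;
    return c->f2();  // resolved at run time, regardless of how tables are laid out
}
```

Whatever the compiler does with its tables, `call_f2_through_c()` has to return "D::f2"; the number of virtual pointers in the object is not observable through this call.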

Member function pointers aren't allowed to be converted even to void*, let alone unsigned, so it's no surprise that they may not print in an expected manner in printf (which just reads raw bytes to print out). The reason is that with %u you're lying to printf, telling it to print an unsigned int when you're actually passing in a parameter of something that is totally NOT an unsigned int. In other words, the a and b member function pointers are actually different in spite of what printf appears to be telling you. Since they're really different it's no surprise that they work properly.
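Rather than printing the pointers, you can check what they do by calling through them. The sketch below is an illustration only: `invoke` and `demo_member_pointers` are hypothetical names, and the functions return their own names instead of void so the result can be inspected. Note the call syntax `(object.*member)()`.

```cpp
#include <string>

struct A { virtual ~A() = default; virtual std::string f() { return "A::f"; } };
struct B { virtual ~B() = default; virtual std::string f1() { return "B::f1"; } };

struct C : A, B {
    std::string f() override { return "C::f"; }
    std::string f1() override { return "C::f1"; }
    virtual std::string f2() { return "C::f2"; }
};

struct D : C {
    std::string f2() override { return "D::f2"; }
};

// Hypothetical helper: calls a pointer to member function on an object.
std::string invoke(C& object, std::string (C::*member)()) {
    return (object.*member)();
}

std::string demo_member_pointers() {
    std::string (C::*a)() = &C::f;
    std::string (C::*b)() = &C::f1;
    std::string (C::*p)() = &C::f2;

    C c;
    D d;
    // a and b select different functions on the same object, and a call
    // through p on a D still dispatches virtually to D::f2.
    return invoke(c, a) + " " + invoke(c, b) + " " + invoke(d, p);
}
```

`demo_member_pointers()` returns "C::f C::f1 D::f2". The last part shows that taking the address of a virtual member function does not bypass run-time binding: the pointer identifies which virtual function to call, not a fixed piece of code.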

If you want to try to print the real function pointer that the compiler gives you, the "most portable" way is to memcpy it into a vector of unsigned char and then print that. Lengthy example:

    #include <cstddef>
    #include <cstring>   // std::memcpy
    #include <iostream>
    #include <vector>

    class Foo
    {
    public:
        virtual void f1() { }
        virtual void f2() { }
        void f3() { }
    };

    int main()
    {
        void (Foo::*a)() = &Foo::f1;
        void (Foo::*b)() = &Foo::f2;
        void (Foo::*c)() = &Foo::f3;

        // operator<< has no overload for member function pointers, so each one
        // is converted to bool and a non-null pointer prints as 1.
        std::cout << a << std::endl;
        std::cout << sizeof(a) << std::endl;
        std::cout << b << std::endl;
        std::cout << sizeof(b) << std::endl;
        std::cout << c << std::endl;
        std::cout << sizeof(c) << std::endl;

        // Copy the raw bytes of each pointer and print them in hex.
        std::vector<unsigned char> a_vec(sizeof(a));
        std::memcpy(&a_vec[0], &a, sizeof(a));
        for (std::size_t i = 0; i < sizeof(a); ++i)
        {
            std::cout << std::hex << static_cast<unsigned>(a_vec[i]) << " ";
        }
        std::cout << std::endl;

        std::vector<unsigned char> b_vec(sizeof(b));
        std::memcpy(&b_vec[0], &b, sizeof(b));
        for (std::size_t i = 0; i < sizeof(b); ++i)
        {
            std::cout << std::hex << static_cast<unsigned>(b_vec[i]) << " ";
        }
        std::cout << std::endl;

        std::vector<unsigned char> c_vec(sizeof(c));
        std::memcpy(&c_vec[0], &c, sizeof(c));
        for (std::size_t i = 0; i < sizeof(c); ++i)
        {
            std::cout << std::hex << static_cast<unsigned>(c_vec[i]) << " ";
        }
        std::cout << std::endl;

        return 0;
    }

On g++ 4.2 this produces:

    1
    8
    1
    8
    1
    8
    1 0 0 0 0 0 0 0
    5 0 0 0 0 0 0 0
    c6 1d 5 8 0 0 0 0

And you can see clearly here that all three member function pointers are different.

No, there are only 2 virtual pointers in C. I have checked it. Alexander is right, but how is the optimization done? And even if the optimization is done, how is the call resolved?
Also, how does the last snippet in the code work? Could you please explain that? It shows very odd behaviour.
My question is: how come a and b have the same address, and how is the call to f and f1 resolved?
@Rishi Mehta a and b do not have the same address. They only appear to when you illegally print them with printf.
What do you mean by illegally print them? Could you tell me the correct way to print them? Also, if one is allowed to get the address of a virtual function in this way, the whole point of virtual functions being bound at run time is lost, isn't it?

Alexandre C. · Accepted Answer · 2011-10-04 15:23:29Z

The vtable for C is usually merged with the vtable for one of its superclasses (A or B) as an optimization. But you shouldn't rely on this.

Mike Seymour Over a year ago

@Rishi: by adding an entry for f2() to the end of one of the two existing vtables, rather than creating a new vtable.

Rishi Mehta Over a year ago

How does the compiler know the offset, given that the number of virtual functions in a class can vary?

Luc Touraille Over a year ago

@RishiMehta: when you modify a base class, derived classes need to be recompiled, so the number of virtual functions provided by a base class is always known to derived classes (but not the other way around, which is useless and would imply recompiling base classes when deriving them).

Luc Touraille · Accepted Answer · 2011-10-04 17:12:22Z

A good read if you would like to understand what is going on under the hood: Inside the C++ Object Model, by Stanley Lippman. The content is starting to show its age, but it provides a comprehensive presentation of some techniques that were (and sometimes still are) used to implement C++ features such as inheritance, polymorphism, templates, etc.

Now, to answer your question: first of all, you should know that the way a vendor must implement a given feature is usually not specified by the C++ standard. This is the case here: an implementation is not required to use virtual method tables at all (even though they often do).

That being said, we can still try to guess what is happening here. First, let's see what the memory would look like if we created an A instance:

    A someA;

     ________________            ----------------
    |   @A_vtable    | vptr ---->|    @A::f     |
     ________________            ----------------
    |  [some value]  | a             A_vtable
     ________________
          someA

You can see that an instance of A contains a virtual table pointer (vptr) in addition to its member variable. This vptr points to A's virtual table, which contains the address of A's implementation of f.

An instance of B should be quite similar, so I won't bother drawing one. Let's see now what a C instance would look like:

    C someC;

     ________________          ------->----------------
    |  @C_A_vtable   | A_vptr /        |    @C::f     |
     ________________                  ----------------
    |  [some value]  | a               |    @C::f2    |
     ----------------                  ----------------
    |  @C_B_vtable   | B_vptr \           C_A_vtable
     ________________          \
    |  [some value]  | b        \
     ________________            \
          someC                   ---->----------------
                                       |    @C::f1    |
                                       ----------------
                                          C_B_vtable

You can see that someC contains an A part and a B part, each containing a vptr. This way, we can cast a C into an A or a B simply by using an offset into the object. Now, regarding the method added by C, you'll notice that I placed its address at the end of the existing vtable for A: instead of creating an entirely new table, which would require an additional vptr, I simply extended the existing one. A call to f2 will simply fetch the right address from the table pointed to by A_vptr and call it, in exactly the same way as for the other virtual methods.

D's instances just need to set their two vptrs to point to the correct tables: one containing the addresses of C::f (since f is not overridden) and D::f2, and the other containing the address of C::f1.
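The layout guessed above can be emulated by hand with arrays of function pointers. This is only a sketch of the idea, not what any particular compiler emits: every name in it (Slot, Object, make_C, make_D, call_f1, call_f2) is hypothetical, the member c is left out as in the diagram, and the this pointer that real member functions receive is omitted to keep it short.

```cpp
#include <string>

// One table entry: a plain function pointer standing in for a virtual function.
using Slot = std::string (*)();

std::string C_f()  { return "C::f"; }
std::string C_f1() { return "C::f1"; }
std::string C_f2() { return "C::f2"; }
std::string D_f2() { return "D::f2"; }

// Table reached through the A part: slot 0 is f, slot 1 is the appended f2.
const Slot C_A_vtable[] = { C_f, C_f2 };
// Table reached through the B part: slot 0 is f1.
const Slot C_B_vtable[] = { C_f1 };

// D reuses C's entries except for the f2 slot, which it overrides.
const Slot D_A_vtable[] = { C_f, D_f2 };
const Slot D_B_vtable[] = { C_f1 };

// Emulated object: one table pointer per polymorphic base, as in the diagram.
struct Object {
    const Slot* A_vptr;
    int a;
    const Slot* B_vptr;
    int b;
};

Object make_C() { return Object{ C_A_vtable, 0, C_B_vtable, 0 }; }
Object make_D() { return Object{ D_A_vtable, 0, D_B_vtable, 0 }; }

// A call to f2 is a fixed slot index in the table reached through A_vptr.
std::string call_f2(const Object& object) { return object.A_vptr[1](); }

// A call to f1 goes through the second table pointer instead.
std::string call_f1(const Object& object) { return object.B_vptr[0](); }
```

The slot indices are compile-time constants: call_f2 always reads index 1, and only the table behind A_vptr differs between an emulated C and an emulated D. That is why no third table pointer is needed for f2 in this scheme.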

Branko Dimitrijevic · Accepted Answer · 2011-10-04 16:52:54Z

Here is how my Visual C++ 2010 lays out objects of these classes in memory:

    object_a              {a=-858993460 }                              A
        __vfptr           0x009d5740 const A::`vftable'                *
            [0]           0x009d11f9 A::f(void)                        *
        a                 -858993460                                   int
    object_b              {b=-858993460 }                              B
        __vfptr           0x009d574c const B::`vftable'                *
            [0]           0x009d1203 B::f1(void)                       *
        b                 -858993460                                   int
    object_c              {c=-858993460 }                              C
        A                 {a=-858993460 }                              A
            __vfptr       0x009d5764 const C::`vftable'{for `A'}       *
                [0]       0x009d108c C::f(void)                        *
            a             -858993460                                   int
        B                 {b=-858993460 }                              B
            __vfptr       0x009d5758 const C::`vftable'{for `B'}       *
                [0]       0x009d10a5 C::f1(void)                       *
            b             -858993460                                   int
        c                 -858993460                                   int
    object_d              {d=-858993460 }                              D
        C                 {c=-858993460 }                              C
            A             {a=-858993460 }                              A
                __vfptr   0x009d5780 const D::`vftable'{for `A'}       *
                    [0]   0x009d108c C::f(void)                        *
                a         -858993460                                   int
            B             {b=-858993460 }                              B
                __vfptr   0x009d5774 const D::`vftable'{for `B'}       *
                    [0]   0x009d10a5 C::f1(void)                       *
                b         -858993460                                   int
            c             -858993460                                   int
        d                 -858993460                                   int

As you can see, multiple inheritance produces more than one virtual table per type and more than one virtual table pointer per object.

Based on that, answers to your questions are as follows:

c->f2(); // Since there is no virtual pointer corresponding to the virtual function defined in f2(). How is the late binding done ?.

The compiler knows the layout of C, so it knows which __vfptr to use and at which offset in that table the entry for f2 sits.

sizeof(*c); // It would be 24 and not 28.. Why ?...

On my system (in 32-bit build):

sizeof(C) == sizeof(__vfptr) + sizeof(a) + sizeof(__vfptr) + sizeof(b) + sizeof(c) == 4 + 4 + 4 + 4 + 4 == 20

Apparently, your compiler does something different.
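To check what your own compiler does, you can measure the object instead of guessing. The following is a rough sketch: the classes are the question's A, B and C written as structs so the members are public, while base_offset and two_vptr_size are hypothetical names introduced here. It assumes one pointer-sized table pointer per polymorphic base, which is common but not required by the standard.

```cpp
#include <cstddef>

// The question's classes, written as structs so the data members are public.
struct A { int a = 0; virtual void f() {} };
struct B { int b = 0; virtual void f1() {} };

struct C : A, B {
    int c = 0;
    void f() override {}
    void f1() override {}
    virtual void f2() {}
};

// Hypothetical helper: byte offset of the Base subobject inside a C object.
template <typename Base>
std::ptrdiff_t base_offset(C& object) {
    char* whole = reinterpret_cast<char*>(&object);
    char* part = reinterpret_cast<char*>(static_cast<Base*>(&object));
    return part - whole;
}

// Size a C would need for two table pointers and three ints with no padding.
// Assumption: one pointer-sized table pointer per polymorphic base.
constexpr std::size_t two_vptr_size = 2 * sizeof(void*) + 3 * sizeof(int);
```

Printing sizeof(C), base_offset<A>(object) and base_offset<B>(object) for a C object shows where each base part starts and how much padding the compiler added. With 4-byte pointers and 4-byte ints, two_vptr_size works out to the 20 bytes computed above; larger values on other builds usually come from wider pointers and alignment padding rather than from extra table pointers.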

    void (C::*a)() = &C::f;
    void (C::*b)() = &C::f1;
    printf("%u", a);
    printf("%u", b);
    // Both the above printf() statements print the same address. Why is that so ?...

Because they are member function pointers, not ordinary function pointers. Implementation details vary, but these might be small structures or even thunks. Apparently, both function calls are "covered" by the same structure or thunk in this case, but there may be a separate "part" of the member pointer that is not visible through printf and differs between a and b.

Please keep in mind that all this is an implementation detail and you should never write code that relies on it.
