7

Any idea why the C# version of sqrt (System.Math.Sqrt) is ~10 times slower than the C++ version? Furthermore, the C# version seems to have one extra digit of precision. I ran my test under MSVC 2012.

I used double, and I called System.Math.Sqrt once before running the benchmark in order to force JIT compilation.

13
  • 5
    It's managed code. Having it run significantly slower--even with a JIT--is to be expected, although a figure of 10X is a bit surprising. Commented Dec 24, 2012 at 18:42
  • 20
    Can you post your benchmark code? Commented Dec 24, 2012 at 18:57
  • 5
    If not done right, it might be caused by differences in the optimizers. For instance, if the result is not used, the C++ compiler might simply erase it. If the computations are constant, the compiler might run some sort of constant folding. Commented Dec 24, 2012 at 19:15
  • 3
    Still, sqrt(2.0) is a constant value. One of the optimizers might replace that with the result. Commented Dec 24, 2012 at 19:20
  • 20
    This has nothing to do with C# or C++. At best this has to do with certain implementations of certain functions in certain libraries. The only true way to figure out what's really happening is to profile the actual executed code and step through it instruction by instruction to evaluate what happens where, why, and what it costs. Anything else is mere guesswork, which more often than not goes wrong. Commented Dec 24, 2012 at 21:07

1 Answer

21

I can speak only from the C side (which also applies to C++). I have no system that can run C# to work from.

The first program I wrote was the trivial:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        printf("%f\n", sqrt(2.0));
    }

Using gcc -S -O3 sqrt.c I got the generated assembly in sqrt.s and looked at that.

        .file   "sqrt.c"
        .section    .rodata.str1.1,"aMS",@progbits,1
    .LC1:
        .string "%f\n"
        .text
        .p2align 4,,15
    .globl main
        .type   main, @function
    main:
    .LFB14:
        .cfi_startproc
        movsd   .LC0(%rip), %xmm0
        movl    $.LC1, %edi
        movl    $1, %eax
        jmp     printf
        .cfi_endproc
    .LFE14:
        .size   main, .-main
        .section    .rodata.cst8,"aM",@progbits,8
        .align 8
    .LC0:
        .long   1719614413
        .long   1073127582
        .ident  "GCC: (SUSE Linux) 4.5.1 20101208 [gcc-4_5-branch revision 167585]"
        .section    .comment.SUSE.OPTs,"MS",@progbits,1
        .string "Ospwg"
        .section    .note.GNU-stack,"",@progbits

One will note that there is no call to sqrt in the code - it looks like it's just loading a constant (which it is).

This became more apparent when I wrote a version that takes its input from a variable and compiled it to show what a call to sqrt looks like.

I'm not going for any sort of elegance with this code.

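    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>   /* atoi */

    /* void main and argv[0] are kept as written so the source matches the assembly listing */
    void main(int argc, char **argv) {
        /* the value is only known at run time, so sqrt() cannot be folded */
        double num = atoi(argv[0]);
        printf("%f\n", sqrt(num));
    }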

While gcc -O3 -S sqrt.c worked, compiling and linking this second program (sqrt2.c) at -O3 returned

    /tmp/cckmgfMS.o: In function `main':
    sqrt2.c:(.text+0x46): undefined reference to `sqrt'
    collect2: ld returned 1 exit status

It was calling sqrt, and I forgot to link the math library.

With the math library linked in (-lm), the generated assembly shows the call to sqrt:

        .file   "sqrt2.c"
        .section    .rodata.str1.1,"aMS",@progbits,1
    .LC0:
        .string "%f\n"
        .text
        .p2align 4,,15
    .globl main
        .type   main, @function
    main:
    .LFB14:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movq    (%rsi), %rdi
        xorl    %eax, %eax
        call    atoi
        cvtsi2sd    %eax, %xmm1
        sqrtsd  %xmm1, %xmm0
        ucomisd %xmm0, %xmm0
        jp      .L5
    .L2:
        movl    $.LC0, %edi
        movl    $1, %eax
        addq    $8, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        jmp     printf
    .L5:
        .cfi_restore_state
        movapd  %xmm1, %xmm0
        call    sqrt
        jmp     .L2
        .cfi_endproc
    .LFE14:
        .size   main, .-main
        .ident  "GCC: (SUSE Linux) 4.5.1 20101208 [gcc-4_5-branch revision 167585]"
        .section    .comment.SUSE.OPTs,"MS",@progbits,1
        .string "Ospwg"
        .section    .note.GNU-stack,"",@progbits

One can see in this code the sqrtsd instruction and the call to sqrt (reached through jp .L5 only when the result is NaN), and the lack of the constants that the optimizer put in before.

From the comment above:

I have of course store the 10^6 sqrt(2.0) calls by doing a sum in a variable ( i.e: var += sqrt(2.0) ) and print it on screen at the end to be sure that compilator will not skip some codes. – Guillaume07 Dec 24 '12 at 19:19

So, consider - if you are dealing with constants, this is something that the C and C++ optimizers will identify and optimize out.


Lacking access to C#, I looked at how Java deals with the line:

System.out.println(Math.sqrt(2.0)); 

This instruction is compiled to the Java byte code of:

    0  getstatic java.lang.System.out : java.io.PrintStream [16]
    3  ldc2_w <Double 2.0> [22]
    6  invokestatic java.lang.Math.sqrt(double) : double [24]
    9  invokevirtual java.io.PrintStream.println(double) : void [30]

One can see that the Java compiler doesn't have access to information about the output of sqrt() that would let it optimize the call into a constant. It is possible that the JIT optimizer has access to information about the purity of calls through Math to StrictMath and replaces multiple calls of Math.sqrt(2.0) with the same value (without calling it again); however, it still has to call it once at that point to get the value. That said, I don't have any insight into what goes on at runtime in the JIT and how calls to pure functions that end up native might be optimized.

However, the C optimizer is still ahead of the game with a big loop (assuming that the JIT optimizer only needs to make one call to sqrt() to get that first value).

When looking at the optimization of the loop in C, the optimizer even precalculates the loop.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double sum = 0;
        int i = 0;
        for(i; i < 10; i++) {
            sum += sqrt(2.0);
        }
        printf("%f\n", sum);
    }

through gcc -O3 -S sqrt3.c (still no -lm needed) becomes:

        .file   "sqrt3.c"
        .section    .rodata.str1.1,"aMS",@progbits,1
    .LC1:
        .string "%f\n"
        .text
        .p2align 4,,15
    .globl main
        .type   main, @function
    main:
    .LFB14:
        .cfi_startproc
        movsd   .LC0(%rip), %xmm0
        movl    $.LC1, %edi
        movl    $1, %eax
        jmp     printf
        .cfi_endproc
    .LFE14:
        .size   main, .-main
        .section    .rodata.cst8,"aM",@progbits,8
        .align 8
    .LC0:
        .long   2034370
        .long   1076644038
        .ident  "GCC: (SUSE Linux) 4.5.1 20101208 [gcc-4_5-branch revision 167585]"
        .section    .comment.SUSE.OPTs,"MS",@progbits,1
        .string "Ospwg"
        .section    .note.GNU-stack,"",@progbits

And one can see that this code is identical to the first one, with a different constant in the .LC0 section. The loop has been calculated down to just "the ultimate value is this, don't bother doing it at run time."

8
  • 4
    This is the answer. I'm wondering, though, how come the MSVC C# compiler isn't optimizing it out to a constant? It seems that this is one of the simpler optimizations for compilers to implement. Commented Jan 31, 2013 at 17:10
  • 4
    @suslik The C# compiler is likely looking at it and seeing "This is a call to a system library that calls into a native code library - I'll leave it as it is because I don't know what it is going to do." That is, no information on the purity of the function call to native code is available for the optimizer to examine. Commented Jan 31, 2013 at 17:13
  • @suslik I've added another two bits of code - what the Java call compiles to, and another that shows how gcc handles looping over the constant (precalculating the value at compile time rather than actually doing the loop). Commented Jan 31, 2013 at 17:33
  • 1
    I have to correct the author on one important thing. There is no such thing as the "MSVC C#" compiler. The C# compiler is called "csc.exe", which likely stands for "C Sharp Compiler". Besides, MSVC stands for "Microsoft Visual C", and "Visual C" is NOT Visual C#. There is also MSVC++, "Microsoft Visual C++"; in both cases it's a managed variation of C or C++. Commented Feb 1, 2013 at 14:33
  • @MichaelT: Interestingly, one conclusion that one might draw from this is that Java is slow because part of it is written in C (and thus can't be understood by the Java compiler) whereas if parts of Java weren't written in C but rather all in Java, then it would be as fast as C! (In fact, that's precisely how e.g. the Self system got its amazing performance; when it came out, it was faster than some production-quality C++ compilers despite being even more dynamic than, say, Ruby, Python or JavaScript and certainly Java and C#.) Commented May 26, 2013 at 16:05
