I am using g++ to compile C++ code for a scientific simulation package.
Currently I compile with the -O3 and -funroll-loops flags. I see a big difference between -O0, -O1, -O2, and -O3, but almost no difference from adding -funroll-loops.
Do you have any suggestions for further optimization flags, or tricks I can use to get even better performance?
Thanks!
Edit, as suggested in the comments: I am asking here about 'pure' compiler optimization, i.e. are there cleverer things to do than just -O3? The compute-intensive part of the code manipulates blitz::Array objects in huge loops.
Edit 2: The code does a lot of floating-point (double) math.
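To give an idea of the shape of the hot loops, here is a minimal sketch. It is a hypothetical stand-in and not the actual simulation code: it uses std::vector<double> instead of blitz::Array, and the function name axpy_sum and the arithmetic are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel (not the real code): element-wise update
// y[i] += a * x[i] over a large array of doubles, followed by a
// running sum of the updated values.
double axpy_sum(double a, const std::vector<double>& x, std::vector<double>& y)
{
    // Both arrays are assumed to have the same length.
    assert(x.size() == y.size());

    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += a * x[i];  // double-precision multiply-add per element
        sum += y[i];       // floating-point reduction across the loop
    }
    return sum;
}
```

Loops of roughly this form, run over very large arrays, are where the program spends most of its time.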