I have a C++ program that is compiled with gcc (version 4.5.1) using the -O3 flag. I'm wondering whether it would be worthwhile to write an SSE2 version of this program (or at least of its busiest parts). However, I'm worried that the compiler has already done this through automatic vectorization.

Question: How do I determine (a) whether or not my program is using SSE/SSE2 and (b) how much time is spent using SSE/SSE2 (i.e. profiling)?

1 Answer

The easiest way to tell whether you are gaining any benefit from compiler vectorization is to build the code with and without the -ftree-vectorize flag and compare the run times.

-O3 enables that option automatically, so do the comparison at -O2 (with and without -ftree-vectorize), or keep -O3 and add -fno-tree-vectorize to switch it off.
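As a rough sketch of that comparison, the file below times one simple loop. The names `saxpy` and `time_saxpy`, the file name `bench.cpp`, and the array sizes are all made up for illustration; in practice you would time the hot loops of your own program instead.

```cpp
// Hypothetical micro-benchmark (bench.cpp). Build it twice and compare
// the times reported for the same arguments, for example:
//   g++ -O2 -c bench.cpp                    (vectorizer off)
//   g++ -O2 -ftree-vectorize -c bench.cpp   (vectorizer on)
#include <cassert>
#include <cstddef>
#include <ctime>
#include <vector>

// Independent iterations: a typical candidate for auto-vectorization.
void saxpy(float a, const float* x, float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// Runs the kernel `repeats` times over arrays of length n and returns
// the elapsed CPU time in seconds.
double time_saxpy(std::size_t n, int repeats) {
    assert(n > 0 && repeats > 0);
    std::vector<float> x(n, 1.0f), y(n, 2.0f);

    std::clock_t start = std::clock();
    for (int r = 0; r < repeats; ++r)
        saxpy(0.5f, &x[0], &y[0], n);
    std::clock_t stop = std::clock();

    // Read one result through a volatile so the work cannot be discarded.
    volatile float sink = y[n / 2];
    (void)sink;

    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Call `time_saxpy` from your own `main` with a size large enough to run for a second or so, and repeat the measurement a few times, since a single run can be noisy.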

To see which loops were vectorized, which were not, and why, add the -ftree-vectorizer-verbose=N option, where a higher N gives more detail (for example, -ftree-vectorizer-verbose=2).
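To get a feel for the report, it can help to feed the compiler one loop that should vectorize and one that should not. This is only a sketch: the function names and the file name `loops.cpp` are hypothetical, and the exact wording of the report depends on the gcc version.

```cpp
// Hypothetical test file (loops.cpp) for reading the vectorizer report:
//   g++ -O3 -ftree-vectorizer-verbose=2 -c loops.cpp
#include <cassert>
#include <cstddef>

// Independent iterations: the vectorizer can normally handle this loop.
void scale(float* out, const float* in, float k, std::size_t n) {
    assert(n == 0 || (out != 0 && in != 0));
    for (std::size_t i = 0; i < n; ++i)
        out[i] = k * in[i];
}

// Loop-carried dependency: each iteration needs the previous result,
// so this loop is expected to be reported as not vectorized.
void prefix_sum(float* data, std::size_t n) {
    assert(n == 0 || data != 0);
    for (std::size_t i = 1; i < n; ++i)
        data[i] += data[i - 1];
}
```

The report refers to loops by source line, so you can match each message to one of the two functions. As far as I know, newer gcc releases expose similar information through -fopt-info-vec.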

The last option, of course, is to look at the assembly. Vectorized code is easy to spot there: it uses packed SSE instructions (such as addps or mulpd) rather than their scalar counterparts (addss, mulsd).
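A small function is enough to try this out. The sketch below uses made-up names (`add_arrays`, `add.cpp`); the instructions mentioned in the comments are what you would typically expect on x86 with SSE enabled, not a guaranteed listing.

```cpp
// Hypothetical example (add.cpp) for inspecting the generated assembly:
//   g++ -O3 -S add.cpp -o add.s
#include <cassert>
#include <cstddef>

// Element-wise sum of two float arrays.
// If the loop is vectorized, add.s should contain packed instructions
// such as addps together with movaps or movups.
// If it is not, you should only see scalar ones such as addss and movss.
void add_arrays(const float* a, const float* b, float* c, std::size_t n) {
    assert(n == 0 || (a != 0 && b != 0 && c != 0));
    for (std::size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

Expect to find both forms in a vectorized function: the compiler usually keeps a scalar loop to handle the leftover elements when n is not a multiple of the vector width.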

1 Comment

Note that you may well still see some SSE code even when nothing has been vectorized, since gcc tends to optimise various scalar floating point operations using SSE instructions (e.g. float<->int conversions).
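To illustrate the point with a minimal sketch (function and file names are hypothetical): the code below has no loops at all, so there is nothing to vectorize, yet on x86-64 gcc will typically still emit SSE instructions for it.

```cpp
// Hypothetical example (convert.cpp):
//   g++ -O2 -S convert.cpp -o convert.s
#include <cassert>

// float -> int: typically compiled to the scalar SSE instruction cvttss2si.
int truncate_to_int(float x) {
    // The conversion is only defined when the value fits in an int.
    assert(x > -2147483648.0f && x < 2147483648.0f);
    return static_cast<int>(x);
}

// int -> float: typically compiled to the scalar SSE instruction cvtsi2ss.
float int_to_float(int i) {
    return static_cast<float>(i);
}
```

So the mere presence of SSE instructions does not show that a loop was vectorized. The instructions to look for are the packed ones, whose mnemonics end in ps or pd, as opposed to ss or sd.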
