Timeline for Parallelization of element-wise matrix multiplication
Current License: CC BY-SA 3.0
7 events
| when | what | | by | license | comment |
|---|---|---|---|---|---|
| Aug 4, 2015 at 16:10 | comment | added | mtall | Another observation: Armadillo uses column-major layout, so to get best performance you need to work on columns instead of rows. | |
| Aug 4, 2015 at 16:07 | comment | added | mtall | To speed up your first code, use the -O3 optimization switch in GCC or clang (or the equivalent in MSVC) to enable auto-vectorization. This will make Armadillo use SSE2 instructions. For even more speed, use -O3 -march=native, which will enable AVX instructions. More information is on the Armadillo FAQ page. | |
| May 10, 2015 at 1:03 | answer | added | Bill Greene | timeline score: 2 | |
| May 9, 2015 at 23:43 | history | edited | The Quantum Physicist | CC BY-SA 3.0 | added 134 characters in body |
| May 9, 2015 at 23:07 | answer | added | Brian Borchers | timeline score: 3 | |
| May 9, 2015 at 20:16 | history | edited | The Quantum Physicist | CC BY-SA 3.0 | deleted 7 characters in body |
| May 9, 2015 at 20:08 | history | asked | The Quantum Physicist | CC BY-SA 3.0 | |
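
The two comments by mtall (column-major layout, and compiler auto-vectorization) can be illustrated with a minimal sketch. It deliberately does not use Armadillo: it assumes a matrix stored as a flat `std::vector<double>` in column-major order, and `elementwise_product` is a hypothetical helper name chosen for this example, not a library function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Armadillo): element-wise product of two
// n_rows x n_cols matrices stored column-major in flat vectors, so that
// element (r, c) lives at index r + c * n_rows.
std::vector<double> elementwise_product(const std::vector<double>& a,
                                        const std::vector<double>& b,
                                        std::size_t n_rows,
                                        std::size_t n_cols) {
    std::vector<double> out(n_rows * n_cols);
    // Outer loop over columns, inner loop over rows: consecutive inner
    // iterations touch adjacent memory, which suits a column-major layout
    // and gives the compiler a simple loop to auto-vectorize.
    for (std::size_t c = 0; c < n_cols; ++c) {
        const std::size_t offset = c * n_rows;
        for (std::size_t r = 0; r < n_rows; ++r) {
            out[offset + r] = a[offset + r] * b[offset + r];
        }
    }
    return out;
}
```

Built with the flags mentioned in the comment, for example `g++ -O3 -march=native`, a loop of this shape is a typical candidate for auto-vectorization; which instruction set is actually used depends on the compiler and the host CPU.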