A "Fair Fight" performance comparison between Single-Threaded CPU, Multi-Threaded CPU (OpenMP), and GPU (OpenGL Compute Shader).
The task is
- CPU Single: Standard C++ loop.
- CPU OpenMP: Uses `#pragma omp parallel for` to utilize all available cores.
- GPU: OpenGL 4.6 Compute Shader (Naive implementation).
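
As a rough illustration of how the two CPU variants differ, here is a minimal sketch. The workload `heavy_op` and the function names `run_single` and `run_omp` are hypothetical placeholders, not the benchmark's actual kernel; the only point is that the OpenMP version is the same loop with a `#pragma omp parallel for` in front of it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-element workload (an integer LCG iterated many times).
// It only stands in for whatever the real benchmark computes.
inline std::uint32_t heavy_op(std::uint32_t x, int iterations) {
    assert(iterations >= 0);
    for (int i = 0; i < iterations; ++i) {
        x = x * 1664525u + 1013904223u;  // one LCG step, wraps modulo 2^32
    }
    return x;
}

// CPU Single: a plain sequential loop over all elements.
void run_single(std::vector<std::uint32_t>& data, int iterations) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] = heavy_op(data[i], iterations);
    }
}

// CPU OpenMP: the same loop, with iterations split across threads.
// Each iteration touches only data[i], so no synchronization is needed.
// A signed loop index is used because older OpenMP implementations require it.
void run_omp(std::vector<std::uint32_t>& data, int iterations) {
    const long long n = static_cast<long long>(data.size());
    #pragma omp parallel for
    for (long long i = 0; i < n; ++i) {
        data[i] = heavy_op(data[i], iterations);
    }
}
```

Compiled without OpenMP enabled (for example without `-fopenmp` on GCC/Clang), the pragma is ignored and `run_omp` simply runs sequentially, which makes it easy to check that both paths produce identical results.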
Hardware: Intel Core i7-4500U (2 Cores / 4 Threads) | Intel HD 4400 | NVIDIA GT 735M
| Device | Time (ms) | Speedup vs Single | Speedup vs OpenMP |
|---|---|---|---|
| CPU Single Thread | ~20,339 ms | 1x | - |
| CPU OpenMP (4 Threads) | ~11,742 ms | 1.7x | 1x |
| Intel HD 4400 | ~243 ms | 83x | 48x |
| NVIDIA GT 735M | ~149 ms | 136x | 78x |
Note: The i7-4500U is a dual-core CPU with HyperThreading, so OpenMP scaling is limited compared to true quad-core systems.
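
The speedup columns are simply the baseline time divided by the device time. A small sketch of that arithmetic follows; the helper names `speedup` and `parallel_efficiency` are illustrative and not part of the project. With the timings from the table, OpenMP reaches a speedup of about 1.73x, which is roughly 43% efficiency per hardware thread but about 87% per physical core.

```cpp
#include <cassert>

// Speedup of a device relative to a baseline, both measured in milliseconds.
double speedup(double baseline_ms, double device_ms) {
    assert(baseline_ms > 0.0 && device_ms > 0.0);
    return baseline_ms / device_ms;
}

// Parallel efficiency: achieved speedup divided by the number of workers
// (hardware threads or physical cores, depending on what is being judged).
double parallel_efficiency(double achieved_speedup, int workers) {
    assert(workers > 0);
    return achieved_speedup / workers;
}
```

For example, `speedup(20339.0, 11742.0)` gives the OpenMP figure, and passing that result to `parallel_efficiency` with `4` or `2` gives the per-thread and per-core efficiencies.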
- CMake (3.15+)
- C++ Compiler with OpenMP support (GCC/Clang/MSVC)
- vcpkg (for GLFW/GLAD)
```bash
# Install dependencies (if not already done)
./vcpkg install glfw3 glad --triplet=x64-linux

# Configure
cmake -B build -S . -DCMAKE_TOOLCHAIN_FILE=/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake

# Build
cmake --build build --config Release
```

Standard Run (CPU + Default GPU):

```bash
./build/gpgpu_vs_omp
```

Run on Discrete GPU (NVIDIA Optimus):

```bash
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia ./build/gpgpu_vs_omp --skip-single
```

Options:

- `--skip-single`: Skips the slow single-threaded benchmark.
- `--skip-omp`: Skips the multi-threaded benchmark.
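
For readers curious how such flags could be handled, here is one possible sketch. The `Options` struct and `parse_options` function are assumptions made for illustration; the project's actual argument handling may look different.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the two documented flags.
struct Options {
    bool skip_single = false;  // set by --skip-single
    bool skip_omp = false;     // set by --skip-omp
};

// Scans the arguments and switches on each flag it recognizes.
// Unrecognized arguments are ignored in this sketch.
Options parse_options(const std::vector<std::string>& args) {
    Options opts;
    for (const std::string& arg : args) {
        if (arg == "--skip-single") {
            opts.skip_single = true;
        } else if (arg == "--skip-omp") {
            opts.skip_omp = true;
        }
    }
    return opts;
}
```

In a `main(int argc, char** argv)`, the arguments could be passed as `parse_options({argv + 1, argv + argc})`, and each benchmark would then run only if its skip flag is unset.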