
Foadsf/GPGPU_vs_OpenMP

 
 

GPGPU vs OpenMP Benchmark

A "Fair Fight" performance comparison between Single-Threaded CPU, Multi-Threaded CPU (OpenMP), and GPU (OpenGL Compute Shader).

The task is a $1024 \times 1024$ matrix multiplication, which costs $O(N^3)$ operations with $N = 1024$.
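To make the $O(N^3)$ cost concrete, here is a minimal sketch of the textbook triple loop on row-major square matrices. The function name `matmulNaive` and the flat `std::vector<float>` layout are assumptions for illustration and are not taken from the repository. For $N = 1024$ the innermost statement runs $1024^3 \approx 1.07 \times 10^9$ times.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference kernel: C = A * B for n x n row-major matrices.
// Returns the number of multiply-add steps performed, which is n^3.
std::size_t matmulNaive(const std::vector<float>& a,
                        const std::vector<float>& b,
                        std::vector<float>& c,
                        std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    c.assign(n * n, 0.0f);
    std::size_t steps = 0;
    for (std::size_t i = 0; i < n; ++i) {          // row of A and C
        for (std::size_t j = 0; j < n; ++j) {      // column of B and C
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {  // dot product
                sum += a[i * n + k] * b[k * n + j];
                ++steps;
            }
            c[i * n + j] = sum;
        }
    }
    return steps;
}
```

Doubling $N$ multiplies the step count by eight, which is why the single-threaded run is so much slower than the others.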

The Contenders

  1. CPU Single: Standard C++ loop.
  2. CPU OpenMP: Uses #pragma omp parallel for to utilize all available cores.
  3. GPU: OpenGL 4.6 Compute Shader (Naive implementation).
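The two parallel contenders differ mainly in how the output elements are divided among workers. The sketch below is an assumption about how such kernels typically look, not the repository's code. `matmulOmp` splits the rows across threads with `#pragma omp parallel for`, `kNaiveShader` is a plausible naive GLSL compute shader in which each invocation computes one output element, and `groupsFor` gives the work-group count per dimension. All names, the buffer bindings, and the 16 x 16 local size are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical OpenMP kernel: each thread computes a subset of the rows of C.
// If OpenMP is not enabled at compile time, the pragma is ignored and the
// loop simply runs serially.
void matmulOmp(const std::vector<float>& a, const std::vector<float>& b,
               std::vector<float>& c, int n) {
    assert(n >= 0);
    const std::size_t un = static_cast<std::size_t>(n);
    assert(a.size() == un * un && b.size() == un * un);
    c.assign(un * un, 0.0f);
#pragma omp parallel for
    for (int i = 0; i < n; ++i) {  // signed index also works with older OpenMP
        const std::size_t row = static_cast<std::size_t>(i) * un;
        for (std::size_t j = 0; j < un; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < un; ++k) {
                sum += a[row + k] * b[k * un + j];
            }
            c[row + j] = sum;
        }
    }
}

// Hypothetical naive compute shader: one invocation per element of C.
const char* const kNaiveShader = R"glsl(#version 460 core
layout(local_size_x = 16, local_size_y = 16) in;
layout(std430, binding = 0) readonly buffer MatA { float A[]; };
layout(std430, binding = 1) readonly buffer MatB { float B[]; };
layout(std430, binding = 2) writeonly buffer MatC { float C[]; };
layout(location = 0) uniform uint N;

void main() {
    uint row = gl_GlobalInvocationID.y;
    uint col = gl_GlobalInvocationID.x;
    if (row >= N || col >= N) return;  // guard for sizes not divisible by 16
    float sum = 0.0;
    for (uint k = 0u; k < N; ++k) {
        sum += A[row * N + k] * B[k * N + col];
    }
    C[row * N + col] = sum;
}
)glsl";

// Number of work groups needed per dimension, rounding up.
constexpr unsigned groupsFor(unsigned n, unsigned localSize = 16) {
    return (n + localSize - 1) / localSize;
}
```

Under these assumptions a $1024 \times 1024$ multiply would be dispatched as `glDispatchCompute(64, 64, 1)`, giving one GPU invocation for each of the 1,048,576 output elements.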

Performance Results (Sony Vaio SVF15N2C5E)

Hardware: Intel Core i7-4500U (2 Cores / 4 Threads) | Intel HD 4400 | NVIDIA GT 735M

| Device | Time (ms) | Speedup vs Single | Speedup vs OpenMP |
|---|---|---|---|
| CPU Single Thread | ~20,339 | 1x | - |
| CPU OpenMP (4 Threads) | ~11,742 | 1.7x | 1x |
| Intel HD 4400 | ~243 | 83x | 48x |
| NVIDIA GT 735M | ~149 | 136x | 76x |
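The speedup columns are ratios of the measured times. As a small sketch of that arithmetic (the helper name `speedup` is an assumption, not part of the benchmark), each entry is the baseline time divided by the candidate time. Recomputing from the rounded times reproduces the listed figures when truncated, except for the GT 735M versus OpenMP cell, where the rounded times give roughly 78.8x, so the listed 76x may come from unrounded measurements.

```cpp
#include <cassert>

// Rounded timings from the results table, in milliseconds.
constexpr double kSingleMs = 20339.0;
constexpr double kOpenMpMs = 11742.0;
constexpr double kIntelMs  = 243.0;
constexpr double kNvidiaMs = 149.0;

// Hypothetical helper: how many times faster the candidate is than the baseline.
double speedup(double baselineMs, double candidateMs) {
    assert(candidateMs > 0.0);
    return baselineMs / candidateMs;
}
```

For example, `speedup(kSingleMs, kIntelMs)` is about 83.7 and `speedup(kOpenMpMs, kIntelMs)` is about 48.3.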

Note: The i7-4500U is a dual-core CPU with HyperThreading, so OpenMP scaling is limited compared to true quad-core systems.
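One way to quantify this is parallel efficiency: the speedup divided by the number of workers. The sketch below uses a hypothetical helper, `parallelEfficiency`, with the rounded timings from the table. Measured against the 2 physical cores the OpenMP run reaches about 87% efficiency, but against the 4 logical threads only about 43%.

```cpp
#include <cassert>

// Hypothetical helper: fraction of ideal linear scaling that was achieved.
// singleMs and parallelMs are wall-clock times; workers is the core or thread count.
double parallelEfficiency(double singleMs, double parallelMs, int workers) {
    assert(parallelMs > 0.0 && workers > 0);
    const double speedup = singleMs / parallelMs;
    return speedup / static_cast<double>(workers);
}
```

Calling `parallelEfficiency(20339.0, 11742.0, 2)` and `parallelEfficiency(20339.0, 11742.0, 4)` gives the two figures above.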

Prerequisites

  • CMake (3.15+)
  • C++ Compiler with OpenMP support (GCC/Clang/MSVC)
  • vcpkg (for GLFW/GLAD)

Build Instructions

    # Install dependencies (if not already done)
    ./vcpkg install glfw3 glad --triplet=x64-linux

    # Configure
    cmake -B build -S . -DCMAKE_TOOLCHAIN_FILE=/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake

    # Build
    cmake --build build --config Release

Usage

Standard Run (CPU + Default GPU):

./build/gpgpu_vs_omp 

Run on Discrete GPU (NVIDIA Optimus):

__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia ./build/gpgpu_vs_omp --skip-single 
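When comparing runs it can help to record whether the offload variables were actually set for the process. The following is a sketch of such a check and is not something the benchmark is known to do. The helper names `envEquals` and `primeOffloadRequested` are hypothetical; only the two environment variable names come from the command above.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// True if the environment variable `name` is set to exactly `expected`.
bool envEquals(const char* name, const char* expected) {
    assert(name != nullptr && expected != nullptr);
    const char* value = std::getenv(name);
    return value != nullptr && std::string(value) == expected;
}

// Hypothetical helper: reports whether the launch environment requests
// NVIDIA PRIME render offload via the two variables used in the command above.
bool primeOffloadRequested() {
    return envEquals("__NV_PRIME_RENDER_OFFLOAD", "1") &&
           envEquals("__GLX_VENDOR_LIBRARY_NAME", "nvidia");
}
```

This only inspects the environment. The GPU that ends up doing the work is decided by the driver, so the OpenGL renderer string remains the authoritative check.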

Options:

  • --skip-single: Skips the slow single-threaded benchmark.
  • --skip-omp: Skips the multi-threaded benchmark.
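Flags like these are usually handled with a simple scan over `argv`. The sketch below shows one way to do it. The `Options` struct and `parseOptions` function are assumptions for illustration; only the two flag names are taken from the list above.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the two documented switches.
struct Options {
    bool skipSingle = false;  // set by --skip-single
    bool skipOmp = false;     // set by --skip-omp
};

// Scans the command line and records which benchmarks to skip.
// Unrecognized arguments are ignored in this sketch.
Options parseOptions(int argc, const char* const* argv) {
    assert(argv != nullptr);
    Options opts;
    for (int i = 1; i < argc; ++i) {  // argv[0] is the program name
        const std::string arg = argv[i];
        if (arg == "--skip-single") {
            opts.skipSingle = true;
        } else if (arg == "--skip-omp") {
            opts.skipOmp = true;
        }
    }
    return opts;
}
```

A `main(int argc, char** argv)` could call `parseOptions(argc, argv)` and then skip the corresponding benchmark when a flag is set.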
