I have developed a distributed-memory MPI application that processes a grid. Now I want to apply shared-memory techniques with OpenMP (essentially making it a hybrid parallel program) to see if it can become any faster or more efficient. I'm having a hard time with OpenMP, especially with a nested for loop. My application prints the grid to the screen every half second, but when I parallelize the printing with OpenMP, execution becomes 10 times slower, or doesn't proceed at all. The console lags and refreshes with random / unexpected data. In other words, it goes completely wrong. Take a look at the following function, which does the printing:
    void display2dGrid(char** grid, int nrows, int ncolumns, int ngen) {
        //#pragma omp parallel
        updateScreen();
        int y, x;
        //#pragma omp parallel shared(grid)   // garbage
        //#pragma omp parallel private(y)     // garbage output!
        //#pragma omp for
        for (y = 0; y < nrows; y++) {
            //#pragma omp parallel shared(grid)   // nothing?
            //#pragma omp parallel private(x)     // 10 times slower!
            for (x = 0; x < ncolumns; x++) {
                printf("%c ", grid[y][x]);
            }
            printf("\n");
        }
        printf("Gen #%d\n", ngen);
        fflush(stdout);
    }

(updateScreen() just clears the screen and writes from the top left corner again.)
The function is executed by only one process, which makes it seem like a good target for thread parallelization. As you can see, I have tried many approaches, and each is worse than the last. In the best case I get semi-proper output every 2 seconds (because it refreshes very slowly); in the worst case I get garbage output.
I would appreciate any help. Is there a place where I can find more information on how to properly parallelize loops with OpenMP? Thanks in advance.