
I have a quick question regarding OpenMP. Normally one can run two code blocks in parallel sections like this (written in Fortran, with two sections):

!$OMP PARALLEL SECTIONS
!$OMP SECTION
   < Fortran code block A >
!$OMP SECTION
   < Fortran code block B >
!$OMP END PARALLEL SECTIONS
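For concreteness, here is a minimal self-contained sketch of that construct; the work in each section is just a toy placeholder computation:

```fortran
program sections_demo
   implicit none
   integer, parameter :: n = 1000
   real :: a(n), b(n)
   integer :: i

!$OMP PARALLEL SECTIONS PRIVATE(i)
!$OMP SECTION
   ! Block A: executed by one thread
   do i = 1, n
      a(i) = sin(real(i))
   end do
!$OMP SECTION
   ! Block B: executed concurrently by another thread
   do i = 1, n
      b(i) = cos(real(i))
   end do
!$OMP END PARALLEL SECTIONS

   print *, 'done:', a(1), b(1)
end program sections_demo
```

Compile with OpenMP enabled (e.g. gfortran -fopenmp); with at least two threads, blocks A and B run concurrently.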

What I really want is to run Fortran code blocks A and B inside a do loop that itself must not be parallelized, because it is a time-stepping loop: each new step depends on the previous step's results. Before the parallel sections, I also need to run some serial code (call it block C). Blocks A, B and C are all functions of the loop variable t. Naively, one might simply embed the parallel sections inside the do loop:

do t = 1, tmax
   < Fortran serial code block C >
!$OMP PARALLEL SECTIONS
!$OMP SECTION
   < Fortran code block A >
!$OMP SECTION
   < Fortran code block B >
!$OMP END PARALLEL SECTIONS
end do

However, the overhead of creating threads at every iteration will presumably slow this down considerably, possibly even making it slower than plain serial code. So one might hope for a smarter way to structure this.

I was wondering whether you could give me some hints on how to do this. What is the best approach (fastest computation) here?

  • That the per-iteration thread-creation overhead will largely slow things down is not at all obvious; it really depends on what you compute. If you solve a simple ODE, then yes. If you solve a PDE with millions of degrees of freedom, you can probably neglect it, as I do. The threads are not created each time, just synchronized and re-used. Commented Sep 16, 2014 at 18:48
  • I concur with @VladimirF. Most OpenMP implementations spin up the required number of threads at the start of the program (some at the first parallel region encountered) and don't kill them until the program terminates. The best hint I can offer is to test your proposed program design, gather some data, and if the picture the data paints is unclear, come back to SO for illumination. Commented Sep 16, 2014 at 18:52

1 Answer


I concur with both comments that it is not at all obvious how large the OpenMP overhead would be compared to the computation. If, after performing the corresponding measurements, you find it to be really high, then the typical way to handle this case is to put the loop inside the parallel region:

!$OMP PARALLEL PRIVATE(t)
do t = 1, tmax
!$OMP SINGLE
   < Fortran code block C >
!$OMP END SINGLE
!$OMP SECTIONS
!$OMP SECTION
   < Fortran code block A >
!$OMP SECTION
   < Fortran code block B >
!$OMP END SECTIONS
end do
!$OMP END PARALLEL

Each thread will loop independently. The SECTIONS construct has an implicit barrier at its end so the threads are synchronised before the next loop iteration. If there is some additional code before the end of the parallel region that does not synchronise, an explicit barrier has to be inserted just before end do.

The SINGLE construct is used to isolate block C such that it gets executed by one thread only.
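Putting the pieces together, here is a hedged, self-contained sketch of the pattern; the contents of blocks A, B and C are toy placeholder computations, not the asker's actual code:

```fortran
program loop_in_parallel
   implicit none
   integer, parameter :: tmax = 100, n = 1000
   real :: c(n), a(n), b(n)
   integer :: t, i

   c = 0.0; a = 0.0; b = 0.0

!$OMP PARALLEL PRIVATE(t, i)
   ! Every thread executes the time loop; work inside is divided up.
   do t = 1, tmax
!$OMP SINGLE
      ! Block C: serial part, executed by exactly one thread per step
      do i = 1, n
         c(i) = c(i) + real(t)
      end do
!$OMP END SINGLE
      ! END SINGLE carries an implicit barrier: all threads see C's results
!$OMP SECTIONS
!$OMP SECTION
      ! Block A: uses this step's C
      do i = 1, n
         a(i) = a(i) + c(i)
      end do
!$OMP SECTION
      ! Block B: runs concurrently with A
      do i = 1, n
         b(i) = b(i) - c(i)
      end do
!$OMP END SECTIONS
      ! END SECTIONS also has an implicit barrier before the next step
   end do
!$OMP END PARALLEL

   print *, a(1), b(1)
end program loop_in_parallel
```

The key design point is that the thread team is created once, at !$OMP PARALLEL, and then re-used for every iteration; only barrier synchronisation is paid per step.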
