This is a case of the granularity being too fine. Granularity is the amount of work between synchronization points versus the cost of synchronization. Let's say your MPI_Reduce takes a microsecond or two. (A figure that has stayed fairly constant over the past few decades!) That's enough time for a few thousand arithmetic operations, so for speedup to occur you need many thousands of operations between reductions. You don't have that, so your runtime is completely dominated by the cost of the MPI calls, and that cost does not go down with the number of processes.
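You can see this with a toy cost model (the figures below are hypothetical, just roughly in line with the numbers above: ~1 ns per operation, ~2 µs per reduction):

```python
# Toy cost model: time per iteration = (local work split over p processes)
# plus one reduction. Assumed (hypothetical) figures:
#   t_op   = 1e-9  seconds per arithmetic operation
#   t_sync = 2e-6  seconds per MPI_Reduce
t_op = 1e-9
t_sync = 2e-6

def time_per_iter(n_ops, p):
    """Modeled iteration time: n_ops operations over p processes + one sync."""
    return (n_ops / p) * t_op + t_sync

def speedup(n_ops, p):
    return time_per_iter(n_ops, 1) / time_per_iter(n_ops, p)

# Fine-grained: 100 ops between reductions -> sync dominates, speedup ~1.
# Coarse-grained: 1,000,000 ops between reductions -> near-linear speedup.
for n_ops in (100, 1_000_000):
    print(n_ops, [round(speedup(n_ops, p), 2) for p in (1, 2, 4, 8)])
```

With 100 operations per reduction the modeled speedup on 8 processes is barely above 1; with a million operations per reduction it is close to 8. That is the whole story: grow the work between synchronization points (or reduce how often you call MPI_Reduce) and the scaling comes back.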