Accounting for the edit.
Source Link
eyorble

Assignment is typically much faster than computations involving large symbolic matrices, so let's construct the list of expressions so that we can offload them to ParallelTable:

{f1, f2, f3} = ParallelTable[
   ReleaseHold[expr],
   {expr,
    Thread[Hold[{3*D[h1[x], x]*Exp[1*x],
                 3*D[h2[x], x]*Exp[2*x],
                 3*D[h3[x], x]*Exp[3*x]}]]}];

Thread[Hold[...]] is used to prevent the expressions from being evaluated as part of the expr iterator. Without it, the expressions would be evaluated on the main kernel before ever being dispatched to the parallel kernels, which means there would be no speedup.
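To make the mechanism concrete, here is a minimal sketch (with arbitrary placeholder expressions a + b and c*d) of what Thread[Hold[...]] produces:

```mathematica
(* Hold keeps the list unevaluated; Thread then distributes Hold over its elements *)
Thread[Hold[{a + b, c*d}]]
(* {Hold[a + b], Hold[c*d]} *)
```

Each held element is shipped to a subkernel as-is, and ReleaseHold triggers the actual evaluation there.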

Note that if your expressions have side-effects (e.g. assignments within the expression), trying this may result in some very unusual bugs, as the parallel kernels will typically not share their side-effects with each other.
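As a minimal illustration of that caveat (using a hypothetical counter variable), increments performed on the subkernels are not reflected on the main kernel:

```mathematica
counter = 0;
ParallelTable[counter++; i^2, {i, 4}];
counter
(* still 0 on the main kernel: each subkernel incremented its own private copy *)
```

If shared state is genuinely needed, SetSharedVariable[counter] makes the variable live on the main kernel instead, at a significant synchronization cost.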

Regarding the second part of the edited question: passing large arrays back from parallel kernels appears to be comparatively slow. I assume that the intent of Plus@Table is to replicate the effect of Sum (which would be Plus@@Table, in most cases). In that case, the fastest code I could find for this particular case is:

({f1, f2} = {ParallelSum[3^x, {x, 10^5}], ParallelSum[5^x, {x, 10^5}]}); // AbsoluteTiming 

{4.61335, Null}
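For reference, the Sum / Plus @@ Table equivalence mentioned above can be checked on a small symbolic case (f here is an arbitrary undefined symbol):

```mathematica
Sum[f[x], {x, 5}] === Plus @@ Table[f[x], {x, 5}]
(* True *)
```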

Note that ParallelSum is used directly instead of using Sum inside a ParallelTable. This is because more localized parallel operations tend to be faster, as memory access can be made more consistent. The ParallelTable trick above assumes that there is no particular correlation to exploit between the individual expressions, and that random memory access is not the primary contributing factor to the slowness of the evaluation. Notably, this is rather faster than Parallelize's result using Sum:

{k1, k2} = Parallelize[{Sum[3^x, {x, 10^5}], Sum[5^x, {x, 10^5}]}]; // AbsoluteTiming 

{10.9441, Null}
