
As pointed out by @szabolcs, part of the problem is associated with the distribution over the kernels, but I think that is not the only problem.

We can generalize the foo function to run in parallel and achieve the same performance as the unpackaged code.

The code is much slower in the packaged case because it is different and a lot more complex. If you create a function with the same instructions as the unpackaged code, you should get equivalent timing and the same results.

    BeginPackage["package`"];

    foo::usage = "foo[x, test] is a function to calculate stuff";

    Begin["`Private`"];

    (* test selects ParallelTable (True) or the serial Table (default, False) *)
    foo[x_Real, test_: False] :=
      If[TrueQ[test],
        ParallelTable[
          Sum[BesselJ[0, 10^-9 k]/(n + x^k), {k, 0, 10000}],
          {n, 0, 12}],
        Table[
          Sum[BesselJ[0, 10^-9 k]/(n + x^k), {k, 0, 10000}],
          {n, 0, 12}]
      ];

    End[];
    EndPackage[];

    (* Unpackaged reference computation *)
    ClearSystemCache[];
    res1 = AbsoluteTiming[
       Table[
         Sum[BesselJ[0, 10^-9 k]/(n + 1.6^k), {k, 0, 10000}],
         {n, 0, 12}]];

    (* Packaged function: parallel, explicitly serial, and default *)
    ClearSystemCache[];
    res3a = AbsoluteTiming[foo[1.6, True]];
    ClearSystemCache[];
    res3b = AbsoluteTiming[foo[1.6, False]];
    ClearSystemCache[];
    res3c = AbsoluteTiming[foo[1.6]];

    {res1[[1]], res3a[[1]], res3b[[1]], res3c[[1]]}
    (* {4.5524, 1.92163, 4.38698, 4.38055} *)

    res1[[2]] == res3a[[2]] == res3b[[2]] == res3c[[2]]
    (* True *)
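Since part of the cost is tied to the distribution over the kernels, it may help to keep kernel startup out of the measurement. The following is only a sketch, assuming local subkernels are configured on your machine; `parTiming` is a name I made up for illustration. As far as I know, the first parallel call launches the subkernels automatically if none are running, and `ClearSystemCache[]` only affects the kernel it runs in, so both are handled explicitly here.

```mathematica
(* Launch the subkernels ahead of time so their startup is not timed.
   Assumption: local subkernels are available in this installation. *)
If[$KernelCount == 0, LaunchKernels[]];

(* Clear the caches of the main kernel and of every subkernel *)
ClearSystemCache[];
ParallelEvaluate[ClearSystemCache[]];

(* Same summand as in foo, with x fixed at 1.6 *)
parTiming = AbsoluteTiming[
   ParallelTable[
     Sum[BesselJ[0, 10^-9 k]/(n + 1.6^k), {k, 0, 10000}],
     {n, 0, 12}]];

(* Wall-clock seconds for the parallel evaluation only *)
First[parTiming]
```

If the number obtained this way differs a lot from a first, cold call, the difference is probably startup or distribution overhead rather than the computation itself.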

The code is much slower in the packaged case because it is different and a lot more complex. If you create a function with the same instructions as the unpackaged code, you should get equivalent timing and the same results.

For instance, consider only the serial case:

    BeginPackage["package`"];

    foo::usage = "foo[x] is a function to calculate stuff";

    Begin["`Private`"];

    (* Same instructions as the unpackaged code, with x as a parameter *)
    foo[x_Real] :=
      Table[
        Sum[BesselJ[0, 10^-9 k]/(n + x^k), {k, 0, 10000}],
        {n, 0, 12}];

    End[];
    EndPackage[];

which runs in roughly the same time as the unpackaged code (using ClearSystemCache[] to ensure correct timing):

    ClearSystemCache[];
    res1 = AbsoluteTiming[
       Table[
         Sum[BesselJ[0, 10^-9 k]/(n + 1.6^k), {k, 0, 10000}],
         {n, 0, 12}]];

    ClearSystemCache[];
    res3 = AbsoluteTiming[foo[1.6]];

with comparable timing and exactly the same results:

    {res1[[1]], res3[[1]]}
    (* {4.35217, 4.23688} *)

    res1[[2]] == res3[[2]]
    (* True *)
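The clear-then-time pattern used above can be wrapped in a small helper so that it is harder to forget the cache reset. This is just a sketch; `freshTiming` is a hypothetical name of mine, not a built-in.

```mathematica
(* Hold the argument so it is evaluated only inside AbsoluteTiming *)
SetAttributes[freshTiming, HoldFirst];

(* Clear the system cache, then return {seconds, result} *)
freshTiming[expr_] := (ClearSystemCache[]; AbsoluteTiming[expr]);

(* Usage with the same summand as above, for n = 0 only *)
check = freshTiming[
   Sum[BesselJ[0, 10^-9 k]/(0 + 1.6^k), {k, 0, 10000}]];

First[check]  (* elapsed seconds *)
Last[check]   (* value of the sum *)
```

The HoldFirst attribute matters here: without it the expression would be evaluated before the helper is entered, and the reported time would be close to zero.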

Now define an unpackaged function equivalent to your packaged one:

    Options[bar] = {RunInParallel -> False};

    bar[x_, OptionsPattern[]] :=
      Block[{TableCommand, SumCommand},
        (* Choose the heads used for the outer table and the inner sum *)
        Which[
          OptionValue[RunInParallel] === False,
            TableCommand = Table; SumCommand = Sum,
          OptionValue[RunInParallel] === True,
            TableCommand = ParallelTable; SumCommand = Sum,
          True,
            TableCommand = OptionValue[RunInParallel][[1]];
            SumCommand = OptionValue[RunInParallel][[2]]
        ];
        TableCommand[
          SumCommand[BesselJ[0, 10^-9 k]/(n + x^k), {k, 0, 50000}],
          {n, 0, 12}]
      ]

Evaluating it and comparing it with your packaged function (res2) gives:

    (* res2 is the timing result of your packaged function *)
    ClearSystemCache[];
    res4 = AbsoluteTiming[bar[1.6, RunInParallel -> False]];

    {res2[[1]], res4[[1]]}
    (* {42.4399, 42.0656} *)

    res2[[2]] == res4[[2]]
    (* True *)

All results are equivalent:

    res1[[2]] == res2[[2]] == res3[[2]] == res4[[2]]
    (* True *)
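The third branch of the Which in bar, where the option carries a pair of heads, is the part that makes the function "a lot more complex". A reduced sketch of that mechanism, with much smaller limits so it runs instantly, might look like the following. The names `baz` and `Heads` are hypothetical and only serve the illustration, and I use Module instead of Block for the local variables.

```mathematica
(* Option holding the heads for the outer iteration and the inner sum *)
Options[baz] = {Heads -> {Table, Sum}};

baz[x_, OptionsPattern[]] :=
  Module[{tableHead, sumHead},
    {tableHead, sumHead} = OptionValue[Heads];
    (* The heads are resolved at run time, as in bar *)
    tableHead[sumHead[1/(n + x^k), {k, 0, 20}], {n, 0, 3}]
  ];

(* The default and the explicit pair should agree *)
baz[1.6] == baz[1.6, Heads -> {Table, Sum}]

(* The same computation written directly *)
baz[1.6] == Table[Sum[1/(n + 1.6^k), {k, 0, 20}], {n, 0, 3}]
```

Only serial heads are exercised here. Passing ParallelTable as the first head is the intended use in bar, but I have not checked how the local symbols are distributed to the subkernels in this reduced version.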