What is the fastest way to make multiple selections from a list? Compiled methods included.
For example, here are two methods for selecting a subset, compared:-
biglist = {{5, "e", 500}, {4, "d", 400}, {3, "c", 300}, {2, "b", 200}, {1, "a", 100}}; subset = {2, 5, 4}; Cases[biglist, {#, __}] & /@ subset {{{2, "b", 200}}, {{5, "e", 500}}, {{4, "d", 400}}}
Cases[biglist, {Apply[Alternatives, subset], __}] {{5, "e", 500}, {4, "d", 400}, {2, "b", 200}}
Only first method returns items in the order matched in the subset list, but it is much slower:-
n = 10000; biggerlist = Map[{#, FromCharacterCode[Mod[# - 1, 26] + 97], #*100} &, Range[n]]; unsortedbiglist = RandomSample[biggerlist, n]; unsortedsubset = RandomSample[Range[n], Round[n/10]]; Row[{First[Timing[selection1 = Map[Cases[unsortedbiglist, {#, __}] &, unsortedsubset];]], " seconds"}] 1.123 seconds
Row[{First[Timing[selection2 = Cases[unsortedbiglist, {Apply[Alternatives, unsortedsubset], __}];]], " seconds"}] 0.171 seconds
The selections are the same, but differently ordered:-
SameQ[Flatten[selection1, 1], Extract[selection2, Flatten[Map[Position[First /@ selection2, #] &, unsortedsubset], 1]]] True
Including the sorting routine in the selection process still gives a better timing:-
Row[{First[Timing[selection3 = Function[selection2, Extract[selection2, Flatten[Map[Position[First /@ selection2, #] &, unsortedsubset], 1]]][Cases[unsortedbiglist, {Apply[Alternatives, unsortedsubset], __}]];]], " seconds"}] 0.359 seconds
SameQ[Flatten[selection1, 1], selection3] True
Nevertheless, only the slow Cases method returns {} when there are unmatched subset items, which is sometimes useful.
Ideas for speedy selections would be great.