I have a very large file that contains data in the format {x, y}. I got the minimum Y value by using

    data[[All, 2]] // Min[#] &

Now I want to get the X value corresponding to the minimum Y value. How can I get it? Thanks a lot for your help!
One solution that is three to four times as fast as the fastest solution so far (halirutan's compiled Do loop) is:

    data[[Ordering[data[[All, 2]], 1], 1]]

The obligatory timings:

    MinimalBy[data, Last][[1, 1]] // RepeatedTiming
    SortBy[data, Last][[1, 1]] // RepeatedTiming
    SortBy[data, {Last}][[1, 1]] // RepeatedTiming
    TakeSmallestBy[data, Last, 1][[1, 1]] // RepeatedTiming
    minByLast[data] // RepeatedTiming
    data[[Ordering[data[[All, 2]], 1], 1]] // RepeatedTiming

Output for random seed 5:

    {2.0, 362.181}
    {0.53, 362.181}
    {0.49, 362.181}
    {0.756, 362.181}
    {0.12, 362.181}
    {0.04, {362.181}}

Random seed 42:

    {1.7, 375.714}
    {0.50, 375.714}
    {0.46, 375.714}
    {0.78, 375.714}
    {0.12, 375.714}
    {0.032, {375.714}}

I note that even if you compile halirutan's code with CompilationTarget->"C", Ordering is still almost twice as fast.
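To make the indexing in the Ordering one-liner easier to follow, here is a minimal sketch on a tiny made-up list (the name `toy` and its values are assumptions for illustration, not the data timed above). It shows why the result comes back wrapped in a list and how to unwrap it.

```mathematica
(* Toy data, made up for illustration: five {x, y} pairs *)
toy = {{10., 4.2}, {20., 1.5}, {30., 3.3}, {40., 0.7}, {50., 2.9}};

(* Ordering[list, 1] returns a one-element list with the position of the smallest entry *)
pos = Ordering[toy[[All, 2]], 1]   (* {4} *)

(* Indexing with a list of positions gives a list of x values *)
toy[[pos, 1]]                      (* {40.} *)

(* Indexing with the bare position gives the bare x value *)
toy[[First[pos], 1]]               (* 40. *)
```

The extra braces in the last timing result above come from the same effect: `pos` is a list, so `Part` returns a list.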
Ordering[#,1] is probably treated as a special case, because data[[Ordering[data[[All, 2]], 2], 1]] is almost 10 times slower...

Ordering[#,1] can be seen as the first stage in an unfinished sort. Ordering[#,2] is a successive stage, so it necessarily must be slower.

data[[Ordering[data[[All, 2]], 1], 1]][[1]] or data[[Sequence @@ Ordering[data[[All, 2]], 1], 1]] to get the same output format (without the List) as the other methods.

As it seems, a compiled stupid Do loop is a viable alternative and still the fastest on my machine:

    minByLast = Compile[{{data, _Real, 2}},
      Module[{min = First[data]},
        Do[If[Last[min] > Last[d], min = d], {d, data}];
        First[min]
      ]
    ]

And in comparison with the methods proposed, it still seems to win:

    SeedRandom[5]
    data = RandomReal[1000, {2000000, 2}];
    MinimalBy[data, Last][[1, 1]] // RepeatedTiming
    SortBy[data, Last][[1, 1]] // RepeatedTiming
    SortBy[data, {Last}][[1, 1]] // RepeatedTiming
    TakeSmallestBy[data, Last, 1][[1, 1]] // RepeatedTiming
    minByLast[data] // RepeatedTiming

SortBy is faster than non-compiled MinimalBy is beyond me...

TakeSmallestBy function which should outperform all other solutions, because it is specifically made for, well, taking the smallest element from a list.

minByLast is the fastest even when the time needed by Compile is included in the timing measurement.

@BlackKow raised an interesting point about the speed of the two solutions we proposed in comments. Out of curiosity, I timed the two solutions on a random data set:

    SeedRandom[5]
    data = RandomReal[1000, {2000000, 2}];
    MinimalBy[data, Last] // RepeatedTiming
    SortBy[data, Last][[1, 1]] // RepeatedTiming
    SortBy[data, {Last}][[1, 1]] // RepeatedTiming

    (* Out:
    {1.68, {{362.181, 0.000374484}}}
    {0.507, 362.181}
    {0.473, 362.181}
    *)

It seems that SortBy is still faster than MinimalBy. The stable sort version (third option) is slightly faster still, since it doesn't go into breaking ties after the sort-by-last has been completed.
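The tie-breaking difference between the two SortBy forms can be seen on a tiny made-up list (the name `ties` and its values are assumptions for illustration only). This is a sketch of the behavior, not a timing test.

```mathematica
(* Two elements share the same last value, 1 *)
ties = {{3, 1}, {2, 1}, {1, 0}};

(* Plain function: ties are broken by the canonical order of the elements *)
SortBy[ties, Last]     (* {{1, 0}, {2, 1}, {3, 1}} *)

(* Function wrapped in a list: tied elements keep their original order *)
SortBy[ties, {Last}]   (* {{1, 0}, {3, 1}, {2, 1}} *)
```

If several rows share the minimum Y, the two forms can therefore return different X values from `[[1, 1]]`.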
MinimalBy should take O(N)...

TakeSmallestBy[data, Last, 1]

TakeSmallestBy, introduced in version 10.1, is twice as slow as a SortBy and 4x slower than a simple Do loop written in less than a minute. This makes such things nothing more than syntactic sugar.

My one candidate:

    SeedRandom[5]
    data = RandomReal[1000, {2000000, 2}];
    (* First@Extract[data, Ordering[data[[All, 2]], 1]] // RepeatedTiming *) (* Sjoerd's *)
    First@Nearest[#2 -> #1, Min[#2]] & @@ Transpose[data] // RepeatedTiming

    (*
    {0.038, 362.181}  (* Didn't read Sjoerd's answer carefully enough first *)
    {0.029, 362.181}
    *)

Sorry about that, Sjoerd!
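For readers unfamiliar with the rule form of Nearest, here is a minimal sketch of the same idea on a tiny made-up list (the names `toy`, `xs` and `ys` and their values are assumptions for illustration). The Y values act as the points to search, and the X values are what gets returned.

```mathematica
(* Toy data, made up for illustration: five {x, y} pairs *)
toy = {{10., 4.2}, {20., 1.5}, {30., 3.3}, {40., 0.7}, {50., 2.9}};

(* Split into the x column and the y column *)
{xs, ys} = Transpose[toy];

(* Find the y closest to the minimum y and return the x paired with it *)
Nearest[ys -> xs, Min[ys]]           (* {40.} *)
First @ Nearest[ys -> xs, Min[ys]]   (* 40. *)
```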
Nearest was improved in V10.1.

Nearest has become very fast indeed. In v9 the same code is almost 20 times slower. +1

Nearest isn't as fast as Pick in the two closest questions, mathematica.stackexchange.com/q/10143 and mathematica.stackexchange.com/q/900. And, to my mind, Pick seems clearer. (I answered one question with this method, but just checked, and Pick is faster under certain conditions.) If you think I should add it to another question (of the two I mentioned, say), please suggest it and I will do it.

MinimalBy[data, Last]

SortBy[data, Last][[1, 1]]

Sort take much more time in general case?