How to find rows that have maximum value?

Question

Suppose if I have following list

{ {10,b,30}, {100,a,40}, {1000,b,10}, {1000,b,70}, {100,b,20}, {10,b,70} }

How to find rows that have max value in 3rd column, in this case

(*{{1000,b,70},{10,b,70}}*)

How big is your real life list? I think Pick would probably be fastest (@J.M solution) if the list is long. — Mike Honeychurch
– Mike Honeychurch, Commented Jan 28, 2012 at 23:23

Community · Accepted Answer · 2017-04-13 12:56:06Z

With:

dat = {{10, b, 30}, {100, a, 40}, {1000, b, 10}, {1000, b, 70}, {100, b, 20}, {10, b, 70}};

Perhaps most directly:

Cases[dat, {_, _, Max@dat[[All, 3]]}]

More approaches:

Last @ SplitBy[SortBy[dat, {#[[3]] &}], #[[3]] &]
Pick[dat, #, Max@#] &@dat[[All, 3]]
Reap[Fold[(If[#2[[3]] >= #, Sow@#2]; #2[[3]]) &, dat]][[2, 1]]

Of these Pick appears to be concise and efficient, so it is my recommendation.

Edit: Position and Extract are three times as efficient as Pick on some data. Using Transpose is slightly more efficient on packed rectangular data.

dat ~Extract~ Position[#, Max@#] & @ dat[[All, 3]]
dat ~Extract~ Position[#, Max@#] & @ Part[dat\[Transpose], 3]

Here are some timings performed in version 7:

SetAttributes[timeAvg, HoldFirst] timeAvg[func_] := Do[If[# > 0.3, Return[#/5^i]] & @@ Timing@Do[func, {5^i}], {i, 0, 15}] SeedRandom[1] dat = RandomInteger[99999, {500000, 3}]; Cases[dat, {_, _, Max@dat[[All, 3]]}] // timeAvg Last@SplitBy[SortBy[dat, {#[[3]] &}], #[[3]] &] // timeAvg Pick[dat, #, Max@#] &@dat[[All, 3]] // timeAvg Reap[Fold[(If[#2[[3]] >= #, Sow@#2]; #2[[3]]) &, dat]][[2, 1]] // timeAvg dat ~Extract~ Position[#, Max@#] &@dat[[All, 3]] // timeAvg dat ~Extract~ Position[#, Max@#] &@Part[dat\[Transpose], 3] // timeAvg

0.1278

0.764

0.0904

0.904

0.02996

0.02496

(In actuality I restarted the Kernel between each individual timing line as otherwise each run gets slower, unfairly biasing the test toward the earlier lines.)

These can be further optimized by using faster position functions for numeric data.
Michael E2 recommended compiling (probably faster in versions after 7):

pos = Compile[{{list, _Real, 1}, {pat, _Real}}, Position[list, pat]]; dat ~Extract~ pos[#, Max@#] & @ Part[dat\[Transpose], 3] // timeAvg

0.01372

My favorite method is SparseArray properties:

spos = SparseArray[Unitize[#], Automatic, 1]["AdjacencyLists"] &; dat[[spos[# - Max@#]]] & @ Part[dat\[Transpose], 3] // timeAvg

0.002872

This is now about 30X faster than Pick, my original recommendation.

In this case, the #[[3]] & is replaceable with Last[], e.g. Last[SplitBy[SortBy[data, Last], Last]]. — J. M.'s missing motivation
– J. M.'s missing motivation, Commented Jan 28, 2012 at 23:26
@Istvan perhaps, but I was typing these (and updating the question) as I thought of them. I didn't compare performance (or significantly consider performance, really) until after I posted the four methods. After testing I could have changed my answer to keep only the Pick version but by that time someone else posted a method using Pick. Also, I think it is interesting to show different approaches, even if for a specific problem some of them are contrived or awkward because they may not be on another problem. — Mr.Wizard
– Mr.Wizard, Commented Jan 29, 2012 at 16:56
The compiled version of Position is even faster (I think it is optimized to check only level 1). Try sticking pos = Compile[{{list, _Real, 1}, {pat, _Real}}, Position[list, pat]] into the last two. I get a speedup of more than a factor of 2 in V9.0.1 — Michael E2
– Michael E2, Commented Sep 18, 2013 at 18:38
This is the earliest ref. to timeAvg I've found on Mma.SE. I thought you might like to compare it with a new function in V10, Needs["GeneralUtilities"]; AccurateTiming[Range[10^4]]`. — Michael E2
– Michael E2, Commented Jul 15, 2014 at 5:44

J. M.'s missing motivation · Accepted Answer · 2012-01-28 23:17:56Z

14

This works:

data = {{10, b, 30}, {100, a, 40}, {1000, b, 10}, {1000, b, 70}, {100, b, 20}, {10, b, 70}}; Pick[data, data[[All, 3]], Max[data[[All, 3]]]]

answered Jan 28, 2012 at 23:17

J. M.'s missing motivation

127k11 gold badges411 silver badges590 bronze badges

$\begingroup$ Or maybe Pick[#, #[[All, 3]], Max[Last /@ #]] &[data], following Wizard's style. $\endgroup$

J. M.'s missing motivation
– J. M.'s missing motivation

2012-01-28 23:31:46 +00:00
Commented Jan 28, 2012 at 23:31
$\begingroup$ It appears that I copied this in my edit but I did not. For what it's worth you are doing the extraction twice which seems inefficient, so I think mine is still better. EDIT I see you see that now. I need to refresh the page more often I guess. $\endgroup$

Mr.Wizard
– Mr.Wizard

2012-01-28 23:32:10 +00:00
Commented Jan 28, 2012 at 23:32
$\begingroup$ Not that much; but I suppose a With[] can be useful... $\endgroup$

J. M.'s missing motivation
– J. M.'s missing motivation

2012-01-28 23:36:49 +00:00
Commented Jan 28, 2012 at 23:36

Add a comment |

rm -rf · Accepted Answer · 2012-01-28 23:20:43Z

You can use Select to choose only those rows with the maximum value in the third column.

list = {{10, b, 30}, {100, a, 40}, {1000, b, 10}, {1000, b, 70}, {100, b, 20}, {10, b, 70}}; With[{max = Max@list[[All, 3]]}, Select[list, (#[[3]] == max) &]]

Greg Hurst · Accepted Answer · 2016-10-28 21:27:54Z

As of version 10 you can use MaximalBy:

data = {{10, b, 30}, {100, a, 40}, {1000, b, 10}, {1000, b, 70}, {100, b, 20}, {10, b, 70}}; MaximalBy[data, Last]

{{1000, b, 70}, {10, b, 70}}

Community · Accepted Answer · 2017-04-13 12:56:02Z

Here's a method with a stable sort involving two of my favorite functions, Reap and Sow:

Module[{a = -Infinity}, Reap[ Sow[{##}, a = Max[a, #3]; #3] & @@@ dat, _, If[#1 == a, #2, Unevaluated[Sequence[]]] & ][[2, 1]] ]

The way Reap and Sow work is that Sow attaches to each term a tag, and Reap collects those tags according to a Pattern (second parameter), and a function can then be applied to the collected terms (third parameter).

In this case, I use the third element of the tuple as the tag, while keeping a running total of the Max value, a. And for the function, it determines which tuple has a tag equal to the Max, spitting out an empty Sequence if it doesn't.

As a curious note, initially I tried attaching the test to the Pattern parameter, but it is applied before the list has been fully traversed, so it included tuples that did not have a max third term. Apparently, the function is applied after the list has been traversed, so a had attained its maximum value by the point it was used.

Community · Accepted Answer · 2017-04-13 12:55:49Z

Update: Here's a nice and short one (if not fast):

data = {{10, b, 30}, {100, a, 40}, {1000, b, 10}, {1000, b, 70}, {100, b, 20}, {10, b, 70}} Last@SplitBy[SortBy[data, Last], Last] (* ==> {{10, b, 70}, {1000, b, 70}} *)

You got many nice solutions. I'd like to add one more, which is less general, and only works when there's a singe maximum, but illustrates nicely how Ordering is useful for minimum/maximum element problems:

Analogously to SortBy, we can define

MaxBy[list_, fun_] := list[[First@Ordering[fun /@ list, -1]]]

Then with your data,

MaxBy[data, Last]

Again, this will give you a single result only, not two as in your example.

@Prashant: I'll only note that, as shown by the result of SortBy[data, Last], the sorting method used is unstable. It may or may not matter for your application, but you need to keep this in mind. — J. M.'s missing motivation
– J. M.'s missing motivation, Commented Jan 28, 2012 at 23:58

LCarvalho · Accepted Answer · 2016-10-28 21:20:22Z

Another option:

data = {{10, b, 30}, {100, a, 40}, {1000, b, 10}, {1000, b, 70}, {100, b, 20}, {10, b, 70}}

$\left( \begin{array}{ccc} 10 & b & 30 \\ 100 & a & 40 \\ 1000 & b & 10 \\ 1000 & b & 70 \\ 100 & b & 20 \\ 10 & b & 70 \\ \end{array} \right)$

data[[#]] & /@ First /@ Position[data, Max@data[[All, 3]]]

$\left( \begin{array}{ccc} 1000 & b & 70 \\ 10 & b & 70 \\ \end{array} \right)$

Stack Exchange Network

How to find rows that have maximum value?

7 Answers 7

Linked

Hot Network Questions

How to find rows that have maximum value?

7 Answers 7

Linked

Related

Hot Network Questions