Function to find the most common numeric ordered pairings (value, count)

Question

Let us say that I have a list of numeric ordered pairs where the first number represents a value to be counted and the second number represents the count of that value. The following is one way to generate the type of list I am thinking of:

Table[{RandomInteger[{1, 20}], RandomInteger[{1, 100}]}, 60]

As far as I am aware, there is no built-in function in MMA that can tell me which counted value has the highest total count. For example, given the following starting list, I'm looking for a result that gives, for each counted value, the total of all counts across all pairs:

nop = {{14,7},{10,59},{12,99},{4,66},{16,80},{7,41},{1,37},{1,91},{19,6},{8,65},{18,90},{17,97},{4,64},{17,93},{20,67},{14,75},{19,88},{1,84},{10,5},{10,80},{3,78},{16,90},{4,71},{1,47},{5,97},{11,6},{4,73},{8,94},{11,42},{10,29},{2,34},{6,94},{13,64},{8,72},{4,19},{13,33},{13,47},{11,97},{3,27},{18,94},{7,34},{3,61},{15,59},{10,57},{6,25},{10,63},{5,99},{6,14},{11,17},{9,42},{15,63},{15,93},{18,58},{2,44},{19,48},{1,16},{11,41},{5,50},{17,15},{1,18}};

The result (sorted) would be:

result = {{1,293},{2,78},{3,166},{4,293},{5,246},{6,133},{7,75},{8,231},{9,42},{10,293},{11,203},{12,99},{13,144},{14,82},{15,215},{16,170},{17,205},{18,242},{19,142},{20,67}};

Please note that in result, there are three values that have an equivalent and maximal count: 1, 4, 10. I looked at Commonest, but I couldn't get it to work with example data such as the above.

Ultimately, I came up with the following which will take a list like nop and from that return the value portion(s) of the ordered pair(s) with the highest count value once the sub lists have been collated.

ClearAll[CommonestPairs];
CommonestPairs[events_List, counts_List] := CommonestPairs[Thread[{events, counts}]];
CommonestPairs[data_List] := Module[
  {grouped, keySort, valueCounts},
  (* group the sublists by their first element. this creates an association with the key being the value of a first element and the value being any sublists that have that key as a first element *)
  grouped = GroupBy[data, First];
  (* create a list of lists where the first element will be a key from the previous step and the value will be the sum of all second elements in the lists associated with the key *)
  valueCounts = {#, Total[grouped[#][[All, 2]]]} & /@ Keys[grouped];
  (* PositionIndex gets all of the index values for each of the second (count) elements in valueCounts, which are then sorted in ascending order *)
  keySort = KeySort[PositionIndex[valueCounts[[All, 2]]]];
  (* get the index(es) associated with the largest key value and then retrieve the first element from valueCounts for each *)
  valueCounts[[Last[keySort]]][[All, 1]]
];

Please forgive the verbosity of my comments within. I have a tenuous grasp of a lot of these concepts at the moment! That being said, for the input list of nop, it gives the expected answer of 1, 4, and 10.

My question is whether there's something extant in the standard MMA functions that solves these types of problems that I should use, and if not, how the above could be streamlined? As to the latter, I ask because whenever I write something at this point, I'm pretty sure there's a nice, elegant, less cumbersome solution that will cause me to learn more about MMA. FWIW, I saw another post that made use of GatherBy, but ultimately, that approach used the same number of steps (at least as far as my handling of it is concerned) as the approach I used in the function CommonestPairs above. Thank you.

lericr · Accepted Answer · 2024-08-16 22:42:05Z

Maybe not exactly what you're asking for, but related. Your data is basically a weighted data structure, and Commonest will work on that kind of data.

wd = WeightedData @@ Thread[nop]
Commonest[wd] (* {10, 4, 1} *)

You can do statistics with this and it also works with Histogram.

Histogram[wd, {1}]
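As a minimal sketch of what "doing statistics" with such data can look like, here is the same construction on a tiny made-up list. The names pairs and wdSmall are hypothetical and not part of the answer above; the weights are simply the second elements of each pair.

```wolfram
(* Hypothetical small data set: value 1 has total weight 3 + 2 = 5, value 2 has weight 4 *)
pairs = {{1, 3}, {2, 4}, {1, 2}};

(* Thread transposes the pairs, so this is WeightedData[{1, 2, 1}, {3, 4, 2}] *)
wdSmall = WeightedData @@ Thread[pairs];

(* Value(s) with the largest total weight *)
Commonest[wdSmall]

(* Weighted mean: (1*3 + 2*4 + 1*2)/(3 + 4 + 2) = 13/9 *)
Mean[wdSmall]
```

Other descriptive statistics such as Median or Variance should accept the same WeightedData object, though that is not checked here.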

That is so sweet. Exactly what I was looking for: you took something that I had written, which in my opinion is rather inscrutable, and turned it into something succinct and totally approachable.
– anonmous, Commented Aug 16, 2024 at 22:45

lericr · Accepted Answer · 2024-08-16 22:58:24Z

If you like the GroupBy functionality, there are functions that work on associations "transparently".

If you know how many "largest" counts you want to take, you could use TakeLargest:

TakeLargest[GroupBy[nop, First -> Last, Total], 5] (* <|10 -> 293, 4 -> 293, 1 -> 293, 5 -> 246, 18 -> 242|> *)

If you just want the max:

Max[GroupBy[nop, First -> Last, Total]] (* 293 *)

If you want all of the "maximals":

MaximalBy[GroupBy[nop, First -> Last, Total], Identity] (* <|10 -> 293, 4 -> 293, 1 -> 293|> *)

And you can also just sort an association by values:

ReverseSort[GroupBy[nop, First -> Last, Total]]
(* <|1 -> 293, 4 -> 293, 10 -> 293, 5 -> 246, 18 -> 242, 8 -> 231, 15 -> 215, 17 -> 205, 11 -> 203, 16 -> 170, 3 -> 166, 13 -> 144, 19 -> 142, 6 -> 133, 12 -> 99, 14 -> 82, 2 -> 78, 7 -> 75, 20 -> 67, 9 -> 42|> *)

Succinct and powerful. I'll definitely be making use of these.
– anonmous, Commented Aug 17, 2024 at 13:49

creidhne · Accepted Answer · 2024-08-16 22:34:14Z

The GroupBy function has a form that can apply Total to get your result. We group the pairs in nop by their first element, then total the last elements of each group. The result is an association, which we convert to a list of rules with Normal and then turn into pairs by replacing Rule with List.

nop = {{14,7},{10,59},{12,99},{4,66},{16,80},{7,41},{1,37},{1,91},{19,6},{8,65},{18,90},{17,97},{4,64},{17,93},{20,67},{14,75},{19,88},{1,84},{10,5},{10,80},{3,78},{16,90},{4,71},{1,47},{5,97},{11,6},{4,73},{8,94},{11,42},{10,29},{2,34},{6,94},{13,64},{8,72},{4,19},{13,33},{13,47},{11,97},{3,27},{18,94},{7,34},{3,61},{15,59},{10,57},{6,25},{10,63},{5,99},{6,14},{11,17},{9,42},{15,63},{15,93},{18,58},{2,44},{19,48},{1,16},{11,41},{5,50},{17,15},{1,18}};
result = {{1,293},{2,78},{3,166},{4,293},{5,246},{6,133},{7,75},{8,231},{9,42},{10,293},{11,203},{12,99},{13,144},{14,82},{15,215},{16,170},{17,205},{18,242},{19,142},{20,67}};
Sort[(GroupBy[nop, First -> Last, Total] // Normal) /. Rule -> List] == result
(* True *)
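To see what each step contributes, here is a small sketch on a made-up three-pair list. The names pairs and totals are hypothetical; KeyValueMap is shown only as one alternative way to turn the association into pairs, not as the method used in this answer.

```wolfram
(* Hypothetical small input: value 1 appears twice (counts 3 and 2), value 2 once *)
pairs = {{1, 3}, {2, 4}, {1, 2}};

(* Group by the first element, keep only the last element, then total each group *)
totals = GroupBy[pairs, First -> Last, Total]   (* <|1 -> 5, 2 -> 4|> *)

(* Association -> list of rules -> list of pairs *)
List @@@ Normal[totals]                         (* {{1, 5}, {2, 4}} *)

(* Alternative: build the pairs directly from keys and values *)
KeyValueMap[List, totals]                       (* {{1, 5}, {2, 4}} *)
```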

Sorting the same GroupBy result in descending order of total, with ties broken by ascending value, shows that the values with the largest totals are 1, 4, and 10.

ReverseSortBy[(GroupBy[nop, First -> Last, Total] // Normal) /. Rule -> List, {Last@#, -First@#}&]

{{1,293},{4,293},{10,293},{5,246},{18,242},{8,231},{15,215},{17,205},{11,203},{16,170},{3,166},{13,144},{19,142},{6,133},{12,99},{14,82},{2,78},{7,75},{20,67},{9,42}}

Your expected answer is easily found from result:

MaximalBy[result, Last][[All, 1]] (* {1, 4, 10} *)
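If a single reusable function is wanted, the pieces above can be combined. This is only a sketch: commonestPairsCompact is a hypothetical name, and the Sort is an added assumption so that tied values come back in ascending order regardless of the order in which they first appear in the input.

```wolfram
ClearAll[commonestPairsCompact];

(* Total the counts per value, keep the entries with the maximal total,
   and return just the values in ascending order *)
commonestPairsCompact[data_List] :=
  Sort[Keys[MaximalBy[GroupBy[data, First -> Last, Total], Identity]]];

(* Hypothetical small input: values 3 and 1 both total 5, value 2 totals 4 *)
commonestPairsCompact[{{3, 5}, {1, 2}, {1, 3}, {2, 4}}]  (* {1, 3} *)
```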
