It's a bit late to add a comment, but I found that GeneralUtilities` has some operators such as AssociationPairs and AssociationMapThread. I used them to adjust the internal Dataset format. Since a GroupBy leaves a single association in the Dataset whose keys are the grouping keys, you need to process what is essentially a single association and make each k -> v pair in that association a "row." I used

dsGroupByResult[AssociationPairs] 

to fix the structure. However, this loses the column names, and the "rows" are now lists.

To restore the column names, I use AssociationMapThread and restructure the result back into a list of associations. My GroupBys usually output an association of values (e.g. mean, min, max for a numerical leaf column), so I just use ##2, since the values already have keys.

dsGroupByResult[AssociationMapThread[<|"theGroupingKeyColumnName" -> #1 (* or whatever *), ##2|>] &]
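For comparison, here is a minimal sketch of the same reshaping using only documented built-ins. It swaps in KeyValueMap for the GeneralUtilities` operators, so it is not the method described above, just one way to get a similar result. The sample data and the column names "species" and "len" are made up for illustration.

```wolfram
(* Hypothetical sample data; column names are illustrative only *)
data = {
   <|"species" -> "a", "len" -> 1.0|>,
   <|"species" -> "a", "len" -> 3.0|>,
   <|"species" -> "b", "len" -> 5.0|>};

(* GroupBy with a reducer returns a single association keyed by group,
   where each value is itself an association of summary statistics *)
grouped = GroupBy[data, Key["species"],
   <|"mean" -> Mean[#[[All, "len"]]], "max" -> Max[#[[All, "len"]]]|> &];

(* Turn each key -> value pair into a row: #1 is the grouping key and
   #2 is the association of statistics, merged into the row *)
rows = KeyValueMap[<|"species" -> #1, #2|> &, grouped];

Dataset[rows]
```

Because each value in the grouped result is already an association, splicing it into the new row keeps its keys as column names, which is the same idea as using ##2 above.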

I think both of these functions should be included in the standard package.
