6
$\begingroup$

Very basic question. Probably showing my ignorance, but if I have a dataset like the example in the docs

dataset = Dataset[{
  <|"a" -> 1, "b" -> "x", "c" -> {1}|>,
  <|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>,
  <|"a" -> 3, "b" -> "z", "c" -> {3}|>,
  <|"a" -> 4, "b" -> "x", "c" -> {4, 5}|>,
  <|"a" -> 5, "b" -> "y", "c" -> {5, 6, 7}|>,
  <|"a" -> 6, "b" -> "z", "c" -> {}|>}]

and I want to group by column "b", I type

dataset[GroupBy["b"]] 

I get the beautiful result

[Image: screenshot of the grouped dataset]

But if I just want columns "a" and "c", how do I get them? Naively I type

dataset[GroupBy["b"], {"a", "c"}] 

but that crashes and burns:

[Image: screenshot of the resulting error message]

What's going on? How do I select what columns I want after GroupBy?

$\endgroup$
  • $\begingroup$ Note you get a nested dataset after you GroupBy["b"] $\endgroup$ Commented Jul 18, 2017 at 15:53
  • $\begingroup$ So GroupBy doesn't "descend"? $\endgroup$ Commented Jul 18, 2017 at 16:44
  • $\begingroup$ Dataset`AscendingQ and Dataset`DescendingQ can tell you whether it is or not. $\endgroup$ Commented Jul 18, 2017 at 16:53

3 Answers

10
$\begingroup$
dataset[GroupBy[Key["b"] -> KeyDrop["b"]]] 

or

dataset[GroupBy[Key["b"] -> KeyTake[{"a", "c"}]]] 

or

dataset[GroupBy["b"], KeyTake[{"a", "c"}]] 

or

dataset[GroupBy["b"], All, {"a", "c"}] 
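
To check that these forms agree, here is a minimal sketch on plain associations rather than on a Dataset. The names `data`, `grouped`, `viaDrop`, `viaRule` and `viaMap` are illustrative only, and using `Map` at level `{2}` to stand in for the `All` step is a modelling assumption, not a statement about how Dataset is implemented internally.

```mathematica
(* The rows from the question, as a plain list of associations *)
data = {
   <|"a" -> 1, "b" -> "x", "c" -> {1}|>, <|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>,
   <|"a" -> 3, "b" -> "z", "c" -> {3}|>, <|"a" -> 4, "b" -> "x", "c" -> {4, 5}|>,
   <|"a" -> 5, "b" -> "y", "c" -> {5, 6, 7}|>, <|"a" -> 6, "b" -> "z", "c" -> {}|>};

(* Group by "b" and transform each row in the same step (first two forms) *)
viaDrop = GroupBy[data, Key["b"] -> KeyDrop["b"]];
viaRule = GroupBy[data, Key["b"] -> KeyTake[{"a", "c"}]];

(* Group first, then act on every row of every group (assumed analogue of All) *)
grouped = GroupBy[data, Key["b"]];
viaMap = Map[KeyTake[{"a", "c"}], grouped, {2}];

(* Compare the three nested associations *)
viaDrop === viaRule === viaMap
```

The comparison should return True: each route ends with an association keyed by "x", "y" and "z" whose values are lists of rows holding only "a" and "c".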
$\endgroup$
  • $\begingroup$ Thanks for the answer, but it's only half what I'm trying to understand. What specifically is wrong with dataset[GroupBy["b"], {"a", "c"}]? Why is the "All" needed? How would I have known that? What did the error message try to tell me? $\endgroup$ Commented Jul 18, 2017 at 15:54
  • $\begingroup$ @ChrisNadovich I try to stay away from explaining too much when it comes to datasets. I failed to grasp the details, and there are people around who didn't fail, while I'm just using them intuitively :) You can hold off on accepting so as not to discourage those folks from answering. $\endgroup$ Commented Jul 19, 2017 at 8:29
  • $\begingroup$ Intuition never seems to work for me with these Mathematica Dataset things. I do just fine with SQL intuition, when using SQL, but it leads me to madness here. Anyway, I think I found the answer I was looking for. Group by is considered "descending" (you can look it up!) but it, in fact, does not descend. It's a kind of a non-operator. So that's why the "All" is needed: because All would have been needed to descend past the rows without the GroupBy, so it's still needed. $\endgroup$ Commented Jul 21, 2017 at 17:02
6
$\begingroup$

This answer (19542) is incorrect in stating that GroupBy is not a descending operator. Evaluate the following:

ClearAll[f]; dataset[GroupBy["b"] /* f] 
f[<|
  "x" -> {<|"a" -> 1, "b" -> "x", "c" -> {1}|>, <|"a" -> 4, "b" -> "x", "c" -> {4, 5}|>},
  "y" -> {<|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>, <|"a" -> 5, "b" -> "y", "c" -> {5, 6, 7}|>},
  "z" -> {<|"a" -> 3, "b" -> "z", "c" -> {3}|>, <|"a" -> 6, "b" -> "z", "c" -> {}|>}
|>]

When you GroupBy, you add another level to the Association hierarchy. The first level (where f is applied above) is the Association whose keys are the grouping values.

dataset[GroupBy["b"], f] 

[Image: output of the query above]

The second level (where f is above) contains all the records that have been grouped. This is a list of Associations. You must indicate which of these Associations in the list you want to access. In this case all are wanted so All is specified.

dataset[GroupBy["b"], All, f] 

[Image: output of the query above]

f is now at the level where the Associations can be accessed. Here you can enter the keys of the items you wish to return.

dataset[GroupBy["b"], All, {"a", "b"}] 

[Image: output of the query above]

As can be seen from the steps above, GroupBy does act while descending. The issue the OP has (hopefully now: had) is not understanding the effect of applying GroupBy.
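
The same walk through the levels can be reproduced on plain associations, which may make the extra level easier to see. This is only a sketch: `data` and `grouped` are illustrative names, `f` is left undefined on purpose, and `Map` with a level specification is assumed here as a stand-in for the successive Dataset query arguments.

```mathematica
ClearAll[f];

(* The rows from the question, as a plain list of associations *)
data = {
   <|"a" -> 1, "b" -> "x", "c" -> {1}|>, <|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>,
   <|"a" -> 3, "b" -> "z", "c" -> {3}|>, <|"a" -> 4, "b" -> "x", "c" -> {4, 5}|>,
   <|"a" -> 5, "b" -> "y", "c" -> {5, 6, 7}|>, <|"a" -> 6, "b" -> "z", "c" -> {}|>};

grouped = GroupBy[data, Key["b"]];

f[grouped]              (* like GroupBy["b"] /* f: f wraps the whole grouped association *)
Keys[grouped]           (* {"x", "y", "z"}: the keys at this level are the group labels *)
Map[f, grouped]         (* like [GroupBy["b"], f]: f wraps each group's list of rows *)
Map[f, grouped, {2}]    (* like [GroupBy["b"], All, f]: f wraps each row association *)

(* "a" and "c" are not keys of the grouped association, only of the rows inside it *)
Lookup[grouped, {"a", "c"}]   (* {Missing["KeyAbsent", "a"], Missing["KeyAbsent", "c"]} *)
```

The last line suggests why asking for {"a", "c"} directly after GroupBy["b"] cannot succeed: at that level the only keys are "x", "y" and "z".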

Hope this helps.

$\endgroup$
3
$\begingroup$

Practical answers are given by @Kuba, but the real answer I was looking for is this.

Even though GroupBy is considered "descending" (you can look it up!), it, in fact, does not descend. It's a kind of a non-operator. So that's why selecting the columns doesn't work in my question.

First I need to start at the top level and move past the rows. That is why the All is needed: it would have been needed to descend past the rows without the GroupBy, so it is still needed with it.

$\endgroup$
