21

I have a List name availableSeats I am sorting and grouping by the blockIndex property like below:

availableSeats.stream() .sorted(Comparator.comparing(SeatedTicketAssignment::getBlockIndex)) .collect(Collectors.groupingBy(SeatedTicketAssignment::getBlockIndex)) .forEach((block, blockAssignments) -> { //Rest of the code } 

The problem is that the result of grouping by is not sorted by the blockIndex.

3 Answers 3

50

Keep in mind, Collectors#groupingBy(Function) will return a HashMap, which does not guarantee order. If you want the ordering to be in the order that the aggregate identity (your i % 2 == 0 result) shows up, then you can use LinkedHashMap:

.collect(Collectors.groupingBy(i -> i % 2 == 0, LinkedHashMap::new, Collectors.toList())) 

Would return a LinkedHashMap<Boolean, List<SeatedTicketAssignment>> (since your collector is grouped by a boolean). Additionally, since the list utilized by the collector is an ArrayList, it should preserve the iteration order of the stream relative to the list.

Sign up to request clarification or add additional context in comments.

5 Comments

If the key type is Boolean, you can use partitioningBy and get a map that has an intrinsic order, however, where in the question does that key function appear? Since the order is that of the previous sorting (by the same value) step, removing the sort operation and collecting into a TreeMap would be much simpler…
Thaks for saving my time.
@Holger, the list created by the third argument of groupingBy is supposed to preserve the entered order of the initial list?
@Cristiano It returns an ArrayList, so relative to whatever order the stream was in before for a sequential stream.
@Cristiano the toList() collector preserves the order. But the final result is a Map containing up to two groups. When grouping into a LinkedHashMap, the map’s order depends on the first element. When grouping into a TreeMap, it will always be false, true. When using partitioningBy(i -> i % 2 == 0), the order also will always be false, true, but this has not been documented. Further, when using partitioningBy(i -> i % 2 == 0) both keys will always exist even when not appearing in the stream (then, with an empty list). This missing information has been added to the doc in Java 9
3

Unfortunately Stream API implementation is not aware of the fact, that the stream you pass is already sorted by what you need so "grouping" is actually trivial. Thus it uses the default way which is essentially similar to this SO answer i.e. create a Map for groups and fill it with elements of the stream. And by default the Map implementation that is used is the HashMap (see code here) which is good for performance reasons but bad for your goal because HashMap doesn't preserve the order of the keys than first sorting.

It might seems a bit unlucky that Group By is implemented in Stream API only as a "collector" so you can't first group and then sort in a one-liner. But this seems intentional: there is no way to implement Group By without fully materializing result so it can't be lazy and thus has to be a collector. @Rogue provided a nice trick with LinkedHashMap but to me it is to bound to implementation details. Still I would write a few more lines of code and first group and then sort the entries of the list (i.e. actually grouped HashMap) by key. Most probably it would be faster.

11 Comments

Why sorting afterwards, when you can sort while grouping? Collectors.groupingBy(SeatedTicketAssignment::getBlockIndex, TreeMap::new, Collectors.toList()), if that’s not a one-liner, I don’t know…
@Holger, it really depends on the data. The issue is that TreeMap as any other sorting has O(N*log(N)) complexity against O(N) for HashMap. If groupping just reduce number of elements just 2 of 3 time TreeMap would probably win, but if we group in something like sqrt(N) buckets, first grouping and then sorting will be faster. Another point is that IMHO this as Rogue answer relies on implementation details too much which is bad (Example: would you guarantee, that the order will always be preserved if later someone change the code to parallel streams?). Thus I prefer explicit code.
Rogue’s answer does not rely on implementation details. Preserving the encounter order is guaranteed, regardless of whether you use a parallel stream or not. But his answer, using LinkedHashMap, still relies on a preceding sort operation, which not only has an O(n log n) worst case, but also needs a temporary O(n) storage operation before the collection. Plus the actual collect operation. So collecting directly into a TreeMap without a preceding sort operation will likely be faster.
As a fun fact, even if you don’t need a concurrent map here, if you are looking for a concurrent map that is intrinsically sorted, ConcurrentSkipListMap is there for you since Java 6…
groupingByConcurrent is an unordered collector, so it does not maintain the stream’s original encounter order, if there ever was one, but if you use ConcurrentSkipListMap as target, it will sort the inserted elements anyway and that sorted order is what you are interested in.
|
3

Since the groupingBy collector does not require sorted input, you can sort the groups after collection. This will be faster than sorting items first, anyway, assuming that there are fewer groups than items:

availableSeats.stream() .collect(Collectors.groupingBy(SeatedTicketAssignment::getBlockIndex)) .entrySet().stream() .sorted(Comparator.comparing(Map.Entry::getKey)) .forEach(mapEntry -> { //Rest of the code } 

1 Comment

You can also use Map.Entry.comparingByKey()

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.