Stream doesn't have a last() method:
```java
Stream<T> stream;
T last = stream.last(); // No such method
```

What's the most elegant and/or efficient way to get the last element (or null for an empty Stream)?
Do a reduction that simply returns the current value:
```java
Stream<T> stream;
T last = stream.reduce((a, b) -> b).orElse(null);
```

This is O(n), even if the work is divided across CPU cores: since the stream doesn't know what the reduction function does, it still has to evaluate it for every element.

This heavily depends on the nature of the Stream. Keep in mind that "simple" doesn't necessarily mean "efficient". If you suspect the stream to be very large, to carry heavy per-element operations, or to have a source which knows the size in advance, the following might be substantially more efficient than the simple solution:
```java
static <T> T getLast(Stream<T> stream) {
    Spliterator<T> sp = stream.spliterator();
    if (sp.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED)) {
        for (;;) {
            Spliterator<T> part = sp.trySplit();
            if (part == null) break;
            if (sp.getExactSizeIfKnown() == 0) {
                sp = part;
                break;
            }
        }
    }
    T value = null;
    for (Iterator<T> it = recursive(sp); it.hasNext(); )
        value = it.next();
    return value;
}

private static <T> Iterator<T> recursive(Spliterator<T> sp) {
    Spliterator<T> prev = sp.trySplit();
    if (prev == null)
        return Spliterators.iterator(sp);
    Iterator<T> it = recursive(sp);
    if (it != null && it.hasNext())
        return it;
    return recursive(prev);
}
```

You may illustrate the difference with the following example:
```java
String s = getLast(
    IntStream.range(0, 10_000_000).mapToObj(i -> {
        System.out.println("potential heavy operation on " + i);
        return String.valueOf(i);
    }).parallel()
);
System.out.println(s);
```

It will print:
```
potential heavy operation on 9999999
9999999
```

In other words, it did not perform the operation on the first 9999999 elements but only on the last one.
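For contrast, a small counter sketch (class and variable names are illustrative, not from the answers) shows that the plain `reduce((a, b) -> b)` from the first answer still evaluates the mapping function for every element, even on a sized sequential stream:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

public class ReduceCost {
    public static void main(String[] args) {
        AtomicLong mapped = new AtomicLong();
        // The reduction keeps only its second argument, but the stream
        // cannot know that, so map() runs for every element.
        int last = IntStream.range(0, 1_000_000)
                .map(i -> { mapped.incrementAndGet(); return i; })
                .reduce((a, b) -> b)
                .getAsInt();
        System.out.println(last);         // 999999
        System.out.println(mapped.get()); // 1000000
    }
}
```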
Comments:

- Why the hasCharacteristics() block? What value does it add that is not already covered by the recursive() method? The latter already navigates to the last split point. Furthermore, recursive() can never return null, so you can remove the it != null check.
- The hasCharacteristics() block is a shortcut for a SIZED/SUBSIZED stream, which can guarantee non-empty split halves, so we never need to go back to the left side. Note that in that case recursive() won't actually recurse, as trySplit() has already proven to return null. The null check stems from an earlier version, but then I discovered that for non-SUBSIZED streams you have to deal with possibly empty split parts, i.e. you have to iterate to find out whether a part has values; therefore I moved the Spliterators.iterator(…) call into a recursive method, to be able to back up to the left side if the right side is empty. The loop is still the preferred operation.
- One should be careful with .parallel(), as this may actually perform some operations (like sorting) in parallel, unexpectedly consuming more CPU cores.
- Nothing is processed merely by calling .parallel(), but indeed, it can have an effect on sorted() or distinct(). I don't think that there should be an effect for any of the other intermediate operations…

Guava has Streams.findLast:
```java
Stream<T> stream;
Optional<T> last = Streams.findLast(stream);
```

Comment:

- It can be more efficient than reduce((a, b) -> b) because it uses Spliterator.trySplit internally.

This is just a refactoring of Holger's answer, because the code, while fantastic, is a bit hard to read/understand, especially for people who were not C programmers before Java. Hopefully my refactored example class is a little easier to follow for those who are not familiar with spliterators, what they do, or how they work.
```java
public class LastElementFinderExample {
    public static void main(String[] args) {
        String s = getLast(
            LongStream.range(0, 10_000_000_000L).mapToObj(i -> {
                System.out.println("potential heavy operation on " + i);
                return String.valueOf(i);
            }).parallel()
        );
        System.out.println(s);
    }

    public static <T> T getLast(Stream<T> stream) {
        Spliterator<T> sp = stream.spliterator();
        if (isSized(sp)) {
            sp = getLastSplit(sp);
        }
        return getIteratorLastValue(getLastIterator(sp));
    }

    private static boolean isSized(Spliterator<?> sp) {
        return sp.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED);
    }

    private static <T> Spliterator<T> getLastSplit(Spliterator<T> sp) {
        return splitUntil(sp, s -> s.getExactSizeIfKnown() == 0);
    }

    private static <T> Iterator<T> getLastIterator(Spliterator<T> sp) {
        return Spliterators.iterator(splitUntil(sp, null));
    }

    private static <T> T getIteratorLastValue(Iterator<T> it) {
        T result = null;
        while (it.hasNext()) {
            result = it.next();
        }
        return result;
    }

    private static <T> Spliterator<T> splitUntil(Spliterator<T> sp, Predicate<Spliterator<T>> condition) {
        Spliterator<T> result = sp;
        for (Spliterator<T> part = sp.trySplit(); part != null; part = result.trySplit()) {
            if (condition == null || condition.test(result)) {
                result = part;
            }
        }
        return result;
    }
}
```

Here is another solution (not that efficient):
```java
List<String> list = Arrays.asList("abc", "ab", "cc");
long count = list.stream().count();
list.stream().skip(count - 1).findFirst().ifPresent(System.out::println);
```

Comments:

- Stream has no substream method, and even if there were one, this wouldn't work, because count is a terminal operation. So what is the story behind this?
- You should check count == 0 first, as Stream.skip doesn't like -1 as input. Besides that, the question didn't say that you can acquire the Stream twice. Nor did it say that acquiring the Stream twice is guaranteed to get the same number of elements.

Parallel unsized streams with skip methods are tricky, and @Holger's implementation gives a wrong answer. Also, @Holger's implementation is a bit slower because it uses iterators.
An optimisation of @Holger's answer:
```java
public static <T> Optional<T> last(Stream<? extends T> stream) {
    Objects.requireNonNull(stream, "stream");
    Spliterator<? extends T> spliterator = stream.spliterator();
    Spliterator<? extends T> lastSpliterator = spliterator;

    // Note that this method does not work very well with
    // unsized parallel streams when used with skip methods;
    // in those cases it will return Optional.empty.

    // Find the last spliterator with an estimated size.
    // Meaningful only on unsized parallel streams.
    if (spliterator.estimateSize() == Long.MAX_VALUE) {
        for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null;
             prev = spliterator.trySplit()) {
            lastSpliterator = prev;
        }
    }

    // Find the last spliterator on sized streams.
    // Meaningful only on parallel streams (note that unsized was transformed into sized).
    for (Spliterator<? extends T> prev = lastSpliterator.trySplit(); prev != null;
         prev = lastSpliterator.trySplit()) {
        if (lastSpliterator.estimateSize() == 0) {
            lastSpliterator = prev;
            break;
        }
    }

    // Find the last element of the last spliterator.
    // Parallel streams only perform the operation on one element.
    AtomicReference<T> last = new AtomicReference<>();
    lastSpliterator.forEachRemaining(last::set);
    return Optional.ofNullable(last.get());
}
```

Unit testing using JUnit 5:
```java
@Test
@DisplayName("last sequential sized")
void last_sequential_sized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed();
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}

@Test
@DisplayName("last sequential unsized")
void last_sequential_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}

@Test
@DisplayName("last parallel sized")
void last_parallel_sized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(1);
}

@Test
@DisplayName("last parallel unsized")
void last_parallel_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(1);
}

@Test
@DisplayName("last parallel unsized with skip")
void last_parallel_unsized_with_skip() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    // Unfortunately, unsized parallel streams do not work very well with skip:
    //assertThat(Streams.last(stream)).hasValue(expected);
    //assertThat(count).hasValue(1);

    // @Holger's implementation gives a wrong answer!
    //assertThat(Streams.getLast(stream)).hasValue(9_950_000L); //!!!
    //assertThat(count).hasValue(1);

    // This is also not a very good answer, but better:
    assertThat(Streams.last(stream)).isEmpty();
    assertThat(count).hasValue(0);
}
```

The only solution that supports both scenarios is to avoid detecting the last spliterator on unsized parallel streams. The consequence is that the solution will perform operations on all elements, but it will always give the right answer.
Note that sequential streams perform operations on all elements anyway.
```java
public static <T> Optional<T> last(Stream<? extends T> stream) {
    Objects.requireNonNull(stream, "stream");
    Spliterator<? extends T> spliterator = stream.spliterator();

    // Find the last spliterator on sized streams (meaningful only on parallel streams)
    if (spliterator.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED)) {
        for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null;
             prev = spliterator.trySplit()) {
            if (spliterator.getExactSizeIfKnown() == 0) {
                spliterator = prev;
                break;
            }
        }
    }

    // Find the last element of the spliterator:
    //AtomicReference<T> last = new AtomicReference<>();
    //spliterator.forEachRemaining(last::set);
    //return Optional.ofNullable(last.get());

    // A better variant that supports native parallel streams:
    return (Optional<T>) StreamSupport.stream(spliterator, stream.isParallel())
            .reduce((a, b) -> b);
}
```

With regard to the unit testing of that implementation, the first three tests are exactly the same (sequential & sized parallel). The tests for unsized parallel are here:
```java
@Test
@DisplayName("last parallel unsized")
void last_parallel_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(10_000_000L);
}

@Test
@DisplayName("last parallel unsized with skip")
void last_parallel_unsized_with_skip() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());
    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}
```

Comment:

- These tests create the stream via StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel()), going through an Iterable detour that has no characteristics at all; in other words, they create an unordered stream. Thus, the result has nothing to do with parallel or using skip, but just with the fact that "last" has no meaning for an unordered stream, so any element is a valid result.

We needed the last element of a Stream in production. I am still not sure we really did, but various members of my team said we did, for various "reasons". I ended up writing something like this:
```java
private static class Holder<T> implements Consumer<T> {
    T t = null;
    // needed because null elements could be valid
    boolean set = false;

    @Override
    public void accept(T t) {
        this.t = t;
        set = true;
    }
}

/**
 * When a Stream is SUBSIZED, it means that all children (direct or not) are also SIZED and SUBSIZED,
 * meaning we "always" know their size, no matter how many splits there are from the initial one.
 * <p>
 * When a Stream is SIZED, it means that we know its current size, but nothing about its "children",
 * a Set for example.
 */
private static <T> Optional<Optional<T>> last(Stream<T> stream) {
    Spliterator<T> suffix = stream.spliterator();
    // nothing left to do here
    if (suffix.getExactSizeIfKnown() == 0) {
        return Optional.empty();
    }
    return Optional.of(Optional.ofNullable(compute(suffix, new Holder<>())));
}

private static <T> T compute(Spliterator<T> sp, Holder<T> holder) {
    Spliterator<T> s;
    while (true) {
        Spliterator<T> prefix = sp.trySplit();
        // we can't split any further,
        // BUT don't look at prefix.getExactSizeIfKnown() == 0, because this
        // does not mean that the suffix can't be split even further down
        if (prefix == null) {
            s = sp;
            break;
        }
        // if the prefix is known to have no elements, just drop it and continue with the suffix
        if (prefix.getExactSizeIfKnown() == 0) {
            continue;
        }
        // if the suffix has no elements, try to split the prefix further
        if (sp.getExactSizeIfKnown() == 0) {
            sp = prefix;
        }
        // after a split, a stream that is not SUBSIZED can give birth to a spliterator that is
        if (sp.hasCharacteristics(Spliterator.SUBSIZED)) {
            return compute(sp, holder);
        } else {
            // if we don't know the size of the suffix or prefix, just try to walk them
            // individually, starting from the suffix, and see if we find our "last" there
            T suffixResult = compute(sp, holder);
            if (!holder.set) {
                return compute(prefix, holder);
            }
            return suffixResult;
        }
    }
    s.forEachRemaining(holder::accept);
    return holder.t;
}
```

And some usages of it:
```java
Stream<Integer> st = Stream.concat(Stream.of(1, 2), Stream.empty());
System.out.println(2 == last(st).get().get());

st = Stream.concat(Stream.empty(), Stream.of(1, 2));
System.out.println(2 == last(st).get().get());

st = Stream.concat(Stream.iterate(0, i -> i + 1), Stream.of(1, 2, 3));
System.out.println(3 == last(st).get().get());

st = Stream.concat(Stream.iterate(0, i -> i + 1).limit(0), Stream.iterate(5, i -> i + 1).limit(3));
System.out.println(7 == last(st).get().get());

st = Stream.concat(Stream.iterate(5, i -> i + 1).limit(3), Stream.iterate(0, i -> i + 1).limit(0));
System.out.println(7 == last(st).get().get());

String s = last(
    IntStream.range(0, 10_000_000).mapToObj(i -> {
        System.out.println("potential heavy operation on " + i);
        return String.valueOf(i);
    }).parallel()
).get().get();
System.out.println(s.equalsIgnoreCase("9999999"));

st = Stream.empty();
System.out.println(last(st).isEmpty());

st = Stream.of(1, 2, 3, 4, null);
System.out.println(last(st).get().isEmpty());

st = Stream.of((Integer) null);
System.out.println(last(st).isPresent());

IntStream is = IntStream.range(0, 4).filter(i -> i != 3);
System.out.println(last(is.boxed()));
```

First is the return type of Optional<Optional<T>>; it looks weird, I agree. If the first Optional is empty, it means that there are no elements in the Stream; if the second Optional is empty, it means that the element that was last was actually null, i.e. Stream.of(1, 2, 3, null) (unlike Guava's Streams::findLast, which throws an exception in such a case).
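The double-Optional convention, distinguishing "no elements" from "the last element was null", can also be shown in isolation with a plain iterator loop (a hedged sketch; the class name NullableLast is made up):

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.stream.Stream;

public class NullableLast {
    // Outer Optional empty -> the stream had no elements.
    // Inner Optional empty -> the stream's last element was null.
    static <T> Optional<Optional<T>> last(Stream<T> stream) {
        Iterator<T> it = stream.iterator();
        if (!it.hasNext()) {
            return Optional.empty();
        }
        T value = null;
        while (it.hasNext()) {
            value = it.next();
        }
        return Optional.of(Optional.ofNullable(value));
    }

    public static void main(String[] args) {
        System.out.println(last(Stream.empty()).isPresent());                        // false
        System.out.println(last(Stream.of(1, 2, (Integer) null)).get().isPresent()); // false
        System.out.println(last(Stream.of(1, 2, 3)).get().get());                   // 3
    }
}
```

This walks every element, so it keeps only the semantics of the answer above, not its splitting optimisation.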
I admit I got inspired mainly by Holger's answer to a similar question of mine, and by Guava's Streams::findLast.
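As a condensed, standalone sanity check of the trim-then-reduce approach from the optimisation shown earlier (the class name LastStreams is illustrative), this sketch trims a SIZED/SUBSIZED spliterator down to its last non-empty part and then reduces over what remains:

```java
import java.util.Objects;
import java.util.Optional;
import java.util.Spliterator;
import java.util.stream.LongStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class LastStreams {
    @SuppressWarnings("unchecked")
    public static <T> Optional<T> last(Stream<? extends T> stream) {
        Objects.requireNonNull(stream, "stream");
        Spliterator<? extends T> spliterator = stream.spliterator();
        // Split repeatedly, keeping the suffix, until nothing is left to split
        if (spliterator.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED)) {
            for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null;
                 prev = spliterator.trySplit()) {
                if (spliterator.getExactSizeIfKnown() == 0) {
                    spliterator = prev;
                    break;
                }
            }
        }
        // (a, b) -> b is associative, so this is safe on parallel streams too
        return (Optional<T>) StreamSupport.stream(spliterator, stream.isParallel())
                .reduce((a, b) -> b);
    }

    public static void main(String[] args) {
        System.out.println(last(LongStream.rangeClosed(1, 1_000_000).boxed()));
        System.out.println(last(LongStream.rangeClosed(1, 1_000_000).boxed().parallel()));
        System.out.println(last(Stream.empty())); // Optional.empty
    }
}
```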
If you need the last element of a Stream, you may want to reconsider your design and whether you really want to be using a Stream. Streams are not necessarily ordered or finite. If your Stream is unordered, infinite, or both, the last element has no meaning. In my mind, the point of a Stream is to provide a layer of abstraction between data and how you process it. As such, a Stream itself does not need to know anything about the relative ordering of its elements. Finding the last element in a Stream is O(n); if you had a different data structure, it could be O(1).

Comments:

- An infinite stream has no meaningful count() either, but Stream still has a count() method. Really, that argument applies to any non-short-circuiting terminal operation on infinite streams.
- So that's no argument against a last() method. There could be a survey on April 1st about how it should be defined for infinite streams. I would propose: "It never returns and it utilizes at least one processor core at 100%. On parallel streams it is required to utilize all cores at 100%."
- If the last element is defined by an ordering rather than by position, you can use the max() method, as in stream()...max(Comparator...).
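If "last" really is defined by an ordering rather than by encounter order, the max() suggestion from the final comment can be sketched like this (the natural-order comparator is an assumed example):

```java
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

public class MaxAsLast {
    public static void main(String[] args) {
        // "Last" by natural ordering, independent of encounter order:
        Optional<Integer> last = Stream.of(3, 1, 2).max(Comparator.naturalOrder());
        System.out.println(last.orElse(null)); // 3
    }
}
```

Unlike the spliterator-based solutions above, this is well-defined even for unordered streams, because it does not depend on encounter order at all.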