
If I have a Stream<T>, I can easily use skip(long) to skip the first few elements of a stream. However, there seems to be no equivalent for skipping a given number of elements at the end of the stream.

The most obvious solution is to use limit(originalLength - elementsToRemoveAtEnd), but that requires knowing the initial length beforehand, which isn't always the case.

Is there a way to remove the last few elements of a stream of unknown length without having to collect it into a Collection, count the elements and stream it again?
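For reference, the collect-and-re-stream workaround I'd like to avoid looks roughly like this (just a sketch with made-up variable names):

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

Stream<String> source = Stream.of("a", "b", "c", "d", "e");
int elementsToRemoveAtEnd = 2;

// Buffer the whole stream just to learn its length, then re-stream with limit().
List<String> buffered = source.collect(Collectors.toList());
buffered.stream()
        .limit(Math.max(0, buffered.size() - elementsToRemoveAtEnd))
        .forEach(System.out::println);   // prints a, b, c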

  • The reason there is no such method is probably the one you are giving: there is normally no way to know when a Stream will end during streaming, so you can't really remove the last few elements until you are done. Commented Oct 16, 2014 at 11:38
  • How do you determine the point at which you want to skip the remaining elements? Couldn't you implement your own Stream that wraps around the original Stream? Commented Oct 16, 2014 at 11:50
  • How do you know an element is the last one if you don't know the length of the stream? Commented Oct 16, 2014 at 13:59

3 Answers


There is no general storage-free solution for Streams that may have an unknown length. However, you don’t need to collect the entire stream; you only need a buffer as large as the number of elements you want to skip:

import java.util.ArrayDeque;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

static <T> Stream<T> skipLastElements(Stream<T> s, int count) {
    if (count <= 0) {
        if (count == 0) return s;
        throw new IllegalArgumentException(count + " < 0");
    }
    ArrayDeque<T> pending = new ArrayDeque<>(count + 1);
    Spliterator<T> src = s.spliterator();
    return StreamSupport.stream(new Spliterator<T>() {
        public boolean tryAdvance(Consumer<? super T> action) {
            // Fill the buffer until it holds count+1 elements or the source is exhausted.
            while (pending.size() <= count && src.tryAdvance(pending::add));
            if (pending.size() > count) {
                action.accept(pending.remove());
                return true;
            }
            return false;
        }
        public Spliterator<T> trySplit() {
            return null;
        }
        public long estimateSize() {
            return src.estimateSize() - count;
        }
        public int characteristics() {
            return src.characteristics();
        }
    }, false);
}

public static void main(String[] args) {
    skipLastElements(Stream.of("foo", "bar", "baz", "hello", "world"), 2)
        .forEach(System.out::println);
}

1 Comment

The above has a serious hidden problem! Calling Stream.spliterator() can return a StreamSpliterators.WrappingSpliterator. The method initPartialTraversalState() stores all of the stream elements into a Collection (i.e. SpinedBuffer)! This could blow up the heap for streams based on file inputs.

The following code uses an ArrayDeque to buffer n elements, where n is the number of elements to skip at the end. The trick is the combination of peek(queue::add) and skip(n): every element is appended to the deque, but the first n are not passed downstream. Once n elements have been buffered, each element that makes it past skip(n) pops the head of the deque, i.e. the element that arrived n positions earlier. When the end of the stream is reached, the last n elements are still sitting in the ArrayDeque and are simply discarded.

ArrayDeque does not allow null elements. The code below maps null into NULL_VALUE before adding to ArrayDeque and then maps NULL_VALUE back to null after popping from ArrayDeque.

import java.util.ArrayDeque;
import java.util.stream.Stream;

private static final Object NULL_VALUE = new Object();

@SuppressWarnings("unchecked")
public static <T> Stream<T> skipLast(Stream<T> input, int n) {
    if (n <= 0)
        return input;
    ArrayDeque<T> queue = new ArrayDeque<>(n + 1);
    return input
        .map(item -> item != null ? item : (T) NULL_VALUE)  // ArrayDeque rejects null
        .peek(queue::add)                                    // buffer every element
        .skip(n)                                             // drop the first n downstream
        .map(item -> queue.pop())                            // emit the element seen n positions earlier
        .map(item -> item != NULL_VALUE ? item : null);      // restore null
}
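For example, a quick usage sketch (my own, not from the answer) that also exercises the null handling:

// Drops the last two elements; prints a, b, then null.
skipLast(Stream.of("a", "b", null, "d", "e"), 2)
    .forEach(System.out::println);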

3 Comments

Well, I prefer a solution that might be inefficient, depending on how Streams are implemented, over a solution that might break, depending on how Streams are implemented. Your solution relies on the current implementation’s inability to fuse a skip with the stream source. A future version might be able to truly skip all preceding steps that do not change the stream size, just like Java 9’s count() may not evaluate these functions…
I have heard those rumors but those optimizations will have to be limited in scope of application or a lot of code will end up broken. peek() should stop such optimizations because its lambda is almost always going to have side-effects. Thus, skip() and count() cannot optimize it away.
Java 9’s count() does already break such code. As already discussed in “In Java streams is peek really only for debugging?”, you should not use a method contrary to its purpose. Since this method’s primary purpose is debugging, i.e. detecting whether elements are processed at this stage, it would be nonsensical to force processing at this stage just because of this method. You can use the method not only to detect the presence of such an optimization; you may also find that streams sometimes process more elements than necessary, under both Java 8 and Java 9.
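(To illustrate the count() behavior these comments refer to, here is a sketch of my own, not part of either answer: under Java 9 and later the peek action below may never run, because count() can be computed directly from the sized source; under Java 8 it always runs.)

import java.util.stream.Stream;

long n = Stream.of("a", "b", "c")
        .peek(e -> System.out.println("processing " + e))  // may be skipped on Java 9+
        .count();
System.out.println(n);  // always prints 3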

Here's a solution using the Stream Gatherers feature in the upcoming Java 24:

List<T> result = myStream.gather(skipLast(5)).toList();
import java.util.LinkedList;
import java.util.Queue;
import java.util.stream.Gatherer;

private static <T> Gatherer<T, ?, T> skipLast(int n) {
    return Gatherer.<T, Queue<T>, T>ofSequential(
        LinkedList::new,
        Gatherer.Integrator.ofGreedy((state, element, downstream) -> {
            state.add(element);
            if (state.size() == n + 1) {
                return downstream.push(state.remove());
            } else {
                return true;
            }
        }));
}

This gatherer uses a queue as its state, keeping the most recent n elements in it. Once the queue already holds n elements, each additional element causes the oldest element to be removed from the queue and pushed downstream.

Once the entire stream has been sent to the gatherer, the queue will contain the last n elements of the stream, none of which will have been pushed downstream. The net result is that all but the last n elements are included (in order) in the resulting stream.
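For instance (my own illustration of the usage line above):

import java.util.List;
import java.util.stream.Stream;

// Drops the trailing two elements.
List<String> result = Stream.of("a", "b", "c", "d", "e")
        .gather(skipLast(2))
        .toList();
System.out.println(result);  // [a, b, c]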

2 Comments

warning: ArrayDeque does not accept null elements - if these are needed, a LinkedList can be used instead (same code)
@user85421 Thanks; I did not know that. I'll update my solution.
