4

What would be a suitable candidate for a high-performance concurrent collection with the following requirements:

  1. The number of elements in the collection is small (a few elements, usually less than 10) and rarely changes.
  2. The main use case is iterating through the elements. This happens a lot and must be super fast (i.e. should be lock free).
  3. Occasionally an element will be removed by use of the iterator remove() method. This should preferably also work very fast (but it's less significant than the iterator next() method).
  4. The order of the elements is insignificant so it doesn't matter how elements are inserted to the collection.
  5. Preferably something from the standard Java library.

I considered using ConcurrentLinkedQueue<> for this, but found mentions of it leaking memory (by design) if you never call the poll() method. I'm not sure if this is still the case (the posts mentioning this are from ~ 2011 and I found some mentions that this may have been addressed).

I also considered ConcurrentSkipListSet<>, but I'm not sure what is the effect of the sorting on the performance (since I don't care about the order).

8
  • 1
    If your main use case is to find an element, an HashMap will be best. As for CopyOnWriteArrayList, as javadoc: Element-changing operations on iterators themselves (remove, set, and add) are not supported. These methods throw UnsupportedOperationException. Commented Apr 5, 2015 at 14:56
  • 2
    Except for removing the element using the iterator, which is not supported, A CopyOnWriteArrayList looks to me like what you're looking for. You can still call the list's remove() method directly though, even during the iteration. But then you'd have no guarantee that you're removing the right element. Have you tried simply synchronizing on an ArrayList? Have you proven it was not fast enough? Commented Apr 5, 2015 at 14:57
  • +1 to the CopyOnWriteArrayList, provided that the "small number of elements" and "rarely changes" are assumptions you can make. If they're not -- if, for instance, a malicious user can cause there to be a lot of elements and frequent updates -- then I would avoid it, since said malicious user could make a DOS-style attack (create a list with thousands of elements, and then frequently add and remove elements from it, each operation being O(n)). Commented Apr 5, 2015 at 15:05
  • ConcurrentHashMap should give good performance, especially if you use optimal init params for your use case. Commented Apr 5, 2015 at 16:09
  • @JFPicard the main use case is to iterate through the elements and perform an action on each one - not to find a specific element. Commented Apr 6, 2015 at 8:24

2 Answers 2

6

If you are looking for a "fast enough" solution, you can use ConcurrentLinkedQueue. The well-known problem of memory leak in its Iterator.remove() was fixed as a part of http://bugs.java.com/view_bug.do?bug_id=6785442. Now removed ConcurrentLinkedQueue$Node's should be GC'ed successfully. But if you are looking for the most performant solution, then...

  1. Don't use iterators at all (and for-each over Collection, therefore), since Collection.iterator() prepares new instance of Iterator and, btw, its next() method isn't for free :) Each time when you iterate a collection with Iterator, you spend CPU time for at least: about 10 instructions to allocate memory for new object + a number of instructions of constructor (see sources of ConcurrentLinkedQueue$Itr) + ConcurrentLinkedQueue$Itr.next() + removing of the object from Eden on minor GC.

  2. Plain array+direct indexing is the fastest technique of iterating. So, use CopyOnWriteArrayList or implement your own collection to use a plain array to iterate over a number of items. For example, if you add/remove items very rarely and you'd prefer to remove them while iterating rather than to remove by index, you could try something like the following:

    public enum IterationResult { NEXT, REMOVE, BREAK; } public interface CollectionIterator<T> { IterationResult onObject(T object); } public interface CollectionModification<T> { CollectionModification<T> add(T item); CollectionModification<T> remove(T item); } public class MyCollection<T> { private volatile State state; private final ReentrantLock modificationLock = new ReentrantLock(); private State currentModification; public MyCollection() { this(10); } public MyCollection(int initialSize) { state = new State(initialSize); } public CollectionModification<T> startModification() { modificationLock.lock(); currentModification = new State(state); return currentModification; } public void finishModification() { state = currentModification; modificationLock.unlock(); } @SuppressWarnings("unchecked") public void iterate(CollectionIterator<T> it) { final State currentState = state; State modifiedState = null; try { out_: for (int i = 0; i < currentState.size; i++) { final T item = (T) currentState.items[i]; // unchecked final IterationResult result = it.onObject(item); switch (result) { case BREAK: break out_; case REMOVE: if (modifiedState == null) { modifiedState = (State) startModification(); } modifiedState.removeExactly(item); break; default: break; } } } finally { if (modifiedState != null) { finishModification(); } } } private class State implements CollectionModification<T> { private Object[] items; private int size; private State(int initialSize) { items = new Object[initialSize]; } private State(State from) { items = new Object[from.items.length]; size = from.size; System.arraycopy(from.items, 0, items, 0, size); } @Override public CollectionModification<T> add(T item) { if (size == items.length) { final Object[] newItems = new Object[size + size >>> 1]; System.arraycopy(items, 0, newItems, 0, size); items = newItems; } items[size] = item; size++; return this; } @Override public CollectionModification<T> remove(T item) { for (int i = 0; i < size; i++) { if (Objects.equals(item, items[i])) { removeItem(i); break; } } return this; } private void removeExactly(T item) { for (int i = 0; i < size; i++) { if (item == items[i]) { removeItem(i); break; } } } private void removeItem(int index) { if (index < items.length - 1) { System.arraycopy(items, index + 1, items, index, size - 1); } size--; } } } 

Usage:

 CollectionIterator<Integer> remove2 = new CollectionIterator<Integer>() { @Override public IterationResult onObject(Integer object) { return object == 2 ? IterationResult.REMOVE : IterationResult.NEXT; } }; MyCollection<Integer> col = new MyCollection<>(); CollectionModification<Integer> mod = col.startModification(); try { mod.add(new Integer(1)) .add(new Integer(2)) .add(new Integer(3)); } finally { col.finishModification(); } col.iterate(remove2); 

This is very similar to CopyOnWriteArrayList. BTW, if you have only one thread which modifies the collection (single writer) and many readers, you can get rid off the lock, since volatile is enough to guarantee visibility of all changes between the writer and all readers. Also, you could replace the classic lock by a busy-wait to get lock-free collection if latency is important for you.

The main thing you should understand is that in many cases the most performant solution for very specific requirements is to write a piece of your own fine tuned code. That's the way to don't pay for things you don't really use. That's why high performance/low-latency apps usually don't use common 3rd party libs in their main paths

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks for the elaborate answer, it's obvious you invested time and thought into it! :) My attention has been diverted to other issues for the time being, but I appreciate it a lot and will also invest time and thought looking into it in the following days as soon as I can get back to this.
1

I ran some tests. It's not a perfect test but it gives a notion on the performance differences.

The program:

import java.util.*; import java.util.concurrent.*; public class Main { public static void main(String[] args) { testCollection(new ArrayList<Integer>()); testCollection(Collections.synchronizedList(new ArrayList<Integer>())); testCollection(new CopyOnWriteArrayList<Integer>()); testCollection(new LinkedList<Integer>()); testCollection(Collections.synchronizedList(new LinkedList<Integer>())); testCollection(Collections.newSetFromMap(new ConcurrentHashMap<Integer, Boolean>())); testCollection(new ConcurrentLinkedQueue<Integer>()); testCollection(new ConcurrentSkipListSet<Integer>()); } static void testCollection(Collection<Integer> collection) { testCollection(collection, 3); } static void testCollection(Collection<Integer> collection, int count) { Test t = new Test(collection); for (int i = 0; i < 10; i++) System.gc(); while (count > 0) { long time = 0, iterationTime; for (int x = 0; x < 1010; x++) { iterationTime = t.timeIteration(); if (x >= 10) { // skip first 10 runs results to reduce the effect of runtime optimizations time += iterationTime; } } System.out.println(collection.getClass() + ": " + time / 1000000.0 + " milliseconds"); count--; } } static class Test { private Collection<Integer> list; public Test(Collection<Integer> list) { list.addAll(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)); this.list = list; } long timeIteration() { Iterator<Integer> iterator; long start = System.nanoTime(); for (int i = 0; i < 10000; i++) { for (iterator = list.iterator(); iterator.hasNext(); ) { Integer x = iterator.next(); if (x > 20) System.out.println("this doesn't happen"); } } return System.nanoTime() - start; } } } 

The results: (Formatted for clarity)

╔══════════════════════════════════════════╦══════════════╗ ║ class (java.util.) ║ milliseconds ║ ╠══════════════════════════════════════════╬══════════════╣ ║ ArrayList ║ 138.242208 ║ ║------------------------------------------║--------------║ ║ ArrayList ║ 135.795334 ║ ║------------------------------------------║--------------║ ║ ArrayList ║ 160.516023 ║ ║------------------------------------------║--------------║ ║ Collections$SynchronizedRandomAccessList ║ 371.29873 ║ ║------------------------------------------║--------------║ ║ Collections$SynchronizedRandomAccessList ║ 318.442416 ║ ║------------------------------------------║--------------║ ║ Collections$SynchronizedRandomAccessList ║ 335.079316 ║ ║------------------------------------------║--------------║ ║ concurrent.CopyOnWriteArrayList ║ 299.203427 ║ ║------------------------------------------║--------------║ ║ concurrent.CopyOnWriteArrayList ║ 361.800762 ║ ║------------------------------------------║--------------║ ║ concurrent.CopyOnWriteArrayList ║ 316.672923 ║ ║------------------------------------------║--------------║ ║ LinkedList ║ 776.843317 ║ ║------------------------------------------║--------------║ ║ LinkedList ║ 807.458514 ║ ║------------------------------------------║--------------║ ║ LinkedList ║ 793.940155 ║ ║------------------------------------------║--------------║ ║ Collections$SynchronizedList ║ 793.125156 ║ ║------------------------------------------║--------------║ ║ Collections$SynchronizedList ║ 782.003326 ║ ║------------------------------------------║--------------║ ║ Collections$SynchronizedList ║ 790.937425 ║ ║------------------------------------------║--------------║ ║ Collections$SetFromMap ║ 1452.049046 ║ ║------------------------------------------║--------------║ ║ Collections$SetFromMap ║ 1457.275127 ║ ║------------------------------------------║--------------║ ║ Collections$SetFromMap ║ 1450.322531 ║ ║------------------------------------------║--------------║ ║ concurrent.ConcurrentLinkedQueue ║ 976.803439 ║ ║------------------------------------------║--------------║ ║ concurrent.ConcurrentLinkedQueue ║ 855.64423 ║ ║------------------------------------------║--------------║ ║ concurrent.ConcurrentLinkedQueue ║ 845.05258 ║ ║------------------------------------------║--------------║ ║ concurrent.ConcurrentSkipListSet ║ 926.423234 ║ ║------------------------------------------║--------------║ ║ concurrent.ConcurrentSkipListSet ║ 924.370203 ║ ║------------------------------------------║--------------║ ║ concurrent.ConcurrentSkipListSet ║ 933.504106 ║ ╚══════════════════════════════════════════╩══════════════╝ 

Comments are welcome.

2 Comments

I have formatted your table for easier reading. If you want, you can edit your answer : pastebin.com/uheN0N9y
This is useless without understanding the code as each item appears thrice. If those are for separate operations, the table needs reformatting.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.