13

The following code produces the out put [1,2]even though hashset is not sorted.

Set set = new HashSet(); set.add(new Integer(2)); set.add(new Integer(1)); System.out.println(set); 

Why is that?

1
  • Use multiple test cases. Include 20 numbers and see if the result is the same. Commented Sep 6, 2013 at 1:49

5 Answers 5

17

EDIT: As of Java 8 and later, the following is no longer applicable. This proves that you shouldn't rely on undocumented Java behaviours.


This behaviour is caused by several separate reasons:

  • Integers hash to themselves
  • in Java, HashMaps and HashSets are backed up by an array
  • they also modify hashes using the higher bits to modify the lower bits; if the hash is in range 0..15, it's therefore not modified
  • what bucket an object goes depends on the lower bits of the modified hash
  • when iterating over a map or a set, the inner table is scanned sequentially

So if you add several small (<16) integers to a hashmap/hashset, this is what happens:

  • integer i has hashcode i
  • since it's less than 16, it's modified hash is also i
  • it lands in the bucket no. i
  • when iterating, the buckets are visited sequentially, so if all you stored there are small integers, they will be retrieved in ascending order

Note that if the initial number of buckets is too small, the integers may land in buckets not numbered after them:

HashSet<Integer> set = new HashSet<>(4); set.add(5); set.add(3); set.add(1); for(int i : set) { System.out.print(i); } 

prints 153.

Sign up to request clarification or add additional context in comments.

2 Comments

I'm getting it as 513.
@RajRajeshwarSinghRathore Good to know. My answer was based upon implementation details of Java 7 and therefore could have been invalidated at literally any moment.
9

A HashSet as per the documentation does not guarantee any concept of order, so what you're seeing could very well change in a future update of Java.

However, if you're wondering why Java's (as of now) specific implementation of HashSet produces the result you're seeing: it's because the Integer of value 1 hashes to a location in the internal entry table of a HashMap that comes before the location to which 2 hashes (note that a HashSet is really backed by a HashMap with arbitrary values). This makes sense since the hash code of an Integer object is just its value.

In fact, you can see this even if you add even more numbers (within a certain range: the size of the entry table which is 16 by default):

Set<Integer> set = new HashSet<>(); set.add(2); set.add(1); set.add(4); set.add(3); set.add(0); System.out.println(set); 
 [0, 1, 2, 3, 4] 

Iteration over a HashSet takes place by iterating over the internal entry table, which means items earlier in the table come first.

Comments

1

A HashSet is an unordered collection. It has no guarantees and no concepts of "ordering". See this answer for more details: What is the difference between Set and List?

You can consider using a TreeSet if you need ordered, sorted Set.

There is also a LinkedHashSet for an ordered Set that is not sorted.

1 Comment

@superEb I actually just added that blurb to my response. Looks like we realized it at the same time!
0

A set in java is not supposed to be ordered list. Use ArrayList instead. Also check Java Collection API for further reference.

2 Comments

LinkedHashSet is usually a better choice if you want fast (constant-time) lookups but with predictable iteration order. Only use ArrayList if you're trying to save space, and can tolerate slow (linear-time) lookups.
@Ashwin Note that you can get logarithm time O(log n) look up times for a sorted List by using the Collections.binarySearch() method.
0

Got a conclusion after several trial and error. This is purely based on hashing order. And that is why we call HASH set. Hashing is based on length of the input. Try 4 single digit numbers with 4 double digit numbers. You will get result.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.