0

How does set compare objects in java?

I was reading java documentation for equals method for arrays and list and from the documentation it looks similar. Is there a better way to differentiate.

This post just recommends to use list but I wanted to understand how do I know for other data structures? https://stackoverflow.com/questions/7488098/storing-arrays-in-set-and-avoiding-duplicates

For example:

 int[] arr1 = new int[] {1, 2}; int[] arr2 = new int[] {1, 2}; if (arr1 == arr2) { // doesn't get printed System.out.println("hello"); } Set<int[]> set1 = new HashSet<>(); set1.add(arr1); set1.add(arr2); System.out.println(set1.size()); // Prints 2 
 Set<List<Integer>> set = new HashSet<>(); List<Integer> list1 = Arrays.asList(1, 2, 3, 4); List<Integer> list2 = Arrays.asList(1, 2, 3, 4); set.add(list1); set.add(list2); if (list1 == list2) { // doesn't get printed System.out.println("hello"); } System.out.println(set); System.out.println(set.size()); // Prints // [[1, 2, 3, 4]] // 1 
3
  • 1
    Where did you read equals documentation for arrays? In any event, what matters is .equals, not ==. Commented Nov 28, 2023 at 3:08
  • 2
    This might be a partial duplicate of What is the difference between == and equals() in Java?. Commented Nov 28, 2023 at 3:13
  • Always search Stack Overflow thoroughly before posting. When drafting your Question, note how similar existing questions do not address your issue. Commented Nov 28, 2023 at 21:35

2 Answers 2

1

In Java, double equals (==) on objects compares pointer equality. So only two pointers to the same object instance will return true.

int[] arr1 = new int[] { 1, 2 }; int[] arr2 = new int[] { 1, 2 }; int[] arr3 = arr2; System.out.println(arr1 == arr2); // false System.out.println(arr3 == arr2); // true 

By default Object.equals() does the same thing. You can override it to do something sensible for a class you define. Obviously java.util.ArrayList does this.

A set is supposed to ensure that no two members are the same according to Object.equals(). According to the documentation on java.util.Set: "More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2)" However, this doesn't work the way you might expect.

For example, a java.util.HashSet doesn't call Object.equals() on its members when they're added unless the Object.hashCode() method returns the same value. Here's an example of how you would create a custom class that avoids duplicating entries using HashSet:

import java.util.Set; import java.util.HashSet; public class CustomArray { protected int[] _a; public CustomArray(int[] a) { _a = a; } public int hashCode() { int result = _a.length; for (int ii = 0; ii < _a.length; ++ii) result += ((Integer)_a[ii]).hashCode(); return result; } public boolean equals(Object o) { if ((o != null) && (o instanceof CustomArray) && (((CustomArray)o)._a.length == _a.length)) { boolean result = true; for (int ii = 0; ii < _a.length; ++ii) if (((CustomArray)o)._a[ii] != _a[ii]) result = false; } else result = false; return result; } public static void main(String[] args) { Set<CustomArray> s = new HashSet<>(); s.add(new CustomArray(new int[] {1, 2})); s.add(new CustomArray(new int[] {1, 2})); System.out.println(s.size()); // prints "1" } } 

In practice you might want to implement hashCode() differently to make collisions less likely (a linear congruential generator might work). I also recommend making your custom class immutable if possible, since changing the contents of a set member may invalidate the set assumption with surprising results.

Sign up to request clarification or add additional context in comments.

Comments

1

"... How does set compare objects in java? ..."

There are different implementations of the Set interface.

In your particular example, you're using the HashSet, which utilizes a Map, or more accurately, a HashMap.
Of which, in turn, generates a "hash-code" value of the supplied key, upon entry.
Meaning, no two keys may have the same hash-code.

Here is the source code, for HashMap#put.
GitHub – OpenJDK – HashMap.java – put(K, V).

/** * Associates the specified value with the specified key in this map. * If the map previously contained a mapping for the key, the old * value is replaced. * * @param key key with which the specified value is to be associated * @param value value to be associated with the specified key * @return the previous value associated with {@code key}, or * {@code null} if there was no mapping for {@code key}. * (A {@code null} return can also indicate that the map * previously associated {@code null} with {@code key}.) */ public V put(K key, V value) { return putVal(hash(key), key, value, false, true); } 

And, here is HashMap#putVal.
GitHub – OpenJDK – HashMap.java – putVal(int, K, V, boolean, boolean).

It appears to be the typical Set abstraction; allowing only unique values.
Here is a relevant Wikipedia article.
Wikipedia – Set (abstract data type).

/** * Implements Map.put and related methods. * * @param hash hash for key * @param key the key * @param value the value to put * @param onlyIfAbsent if true, don't change existing value * @param evict if false, the table is in creation mode. * @return previous value, or null if none */ final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) { Node<K,V>[] tab; Node<K,V> p; int n, i; if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length; if ((p = tab[i = (n - 1) & hash]) == null) tab[i] = newNode(hash, key, value, null); else { Node<K,V> e; K k; if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))) e = p; else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value); else { for (int binCount = 0; ; ++binCount) { if ((e = p.next) == null) { p.next = newNode(hash, key, value, null); if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st treeifyBin(tab, hash); break; } if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) break; p = e; } } if (e != null) { // existing mapping for key V oldValue = e.value; if (!onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; } } ++modCount; if (++size > threshold) resize(); afterNodeInsertion(evict); return null; } 

"... This post just recommends to use list but I wanted to understand how do I know for other data structures? ..."

Here is a Wikipedia article, listing the different types of data structures.
Wikipedia – List of data structures.

6 Comments

Meaning, no two keys may have the same hash-code: sorry but this is nonsense. The Strings "aa" and "bB" have the same hashcode (3104) but I can easily add both of them to a HashSet.
@ThomasKläger, putVal(hash(key), key, value, false, true).
@ThomasKläger, and, why don't you just write an answer?
The condition in putVal() is p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))) which means that not only the hash values must match, it must also be true that the keys are either the same object (p.key == key) or if key is not null key.equals(k)
I would write something like "no two equal keys (as defined by the equals(Object) method) may be added. Note that objects that are equal according to equals(Object) must have the same hashCode()".
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.