I have a list of objects of type Person and I want to get rid of elements that have the same name, using streams. I have found on the internet a suggestion to use a Wrapper class and my code looks like this so far:

    List<Person> people = Arrays.asList(
            new Person("Kowalski"),
            new Person("Nowak"),
            new Person("Big"),
            new Person("Kowalski"));

    List<Person> distPeople = people.stream()
            .map(Wrapper::new)
            .distinct()
            .map(Wrapper::unwrap)
            .collect(Collectors.toList());

The documentation says that distinct():

Returns a stream consisting of the distinct elements (according to Object.equals(Object)) of this stream.

Implementation of Wrapper that doesn't work (I get the same stream with two Kowalski):

    public class Wrapper {
        private final Person person;

        Wrapper(Person p) {
            person = p;
        }

        public Person unwrap() {
            return person;
        }

        public boolean equals(Object other) {
            if (other instanceof Wrapper)
                return ((Wrapper) other).person.getName().equals(person.getName());
            else
                return false;
        }
    }

Implementation of Wrapper class works after adding this:

    @Override
    public int hashCode() {
        return person.getName().hashCode();
    }

Can someone explain why after overriding hashCode() in the Wrapper class distinct() works?

  • Because distinct() likely uses a HashSet for best performance. You should always implement hashCode() when you implement equals(). Commented Jun 26, 2018 at 18:29
  • Irrespective of whether you are doing this with your objects, equals and hashCode should always be overridden together so as to yield consistent results. Otherwise you are violating the contract of Object. Commented Jun 26, 2018 at 18:41

3 Answers


The answer lies in the class DistinctOps. Its makeRef method returns a ReferencePipeline stage that yields only distinct elements, and it uses a LinkedHashSet in a reduce operation to collect them. LinkedHashSet extends HashSet, which stores its elements in a HashMap. For a HashMap to work properly, hashCode() must honour its contract with equals(): equal objects must return equal hash codes. That is why you have to implement hashCode() for Stream#distinct() to work properly.
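The effect can be reproduced without streams. The following is a minimal sketch, not taken from the JDK sources: the class names EqualsOnly and EqualsAndHash are made up for illustration, and a plain LinkedHashSet stands in for whatever set distinct() uses internally.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class; all names here are illustrative.
final class HashSetDedupDemo {

    // Overrides equals() only, like the first Wrapper in the question.
    static final class EqualsOnly {
        private final String name;

        EqualsOnly(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof EqualsOnly && ((EqualsOnly) other).name.equals(name);
        }
    }

    // Overrides both equals() and hashCode(), based on the same field.
    static final class EqualsAndHash {
        private final String name;

        EqualsAndHash(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof EqualsAndHash && ((EqualsAndHash) other).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Adds two "equal" elements that inherit hashCode() from Object.
    static int sizeWithEqualsOnly() {
        Set<EqualsOnly> seen = new LinkedHashSet<>();
        seen.add(new EqualsOnly("Kowalski"));
        seen.add(new EqualsOnly("Kowalski"));
        return seen.size();
    }

    // Adds two equal elements whose hash codes also match.
    static int sizeWithEqualsAndHash() {
        Set<EqualsAndHash> seen = new LinkedHashSet<>();
        seen.add(new EqualsAndHash("Kowalski"));
        seen.add(new EqualsAndHash("Kowalski"));
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("equals() only: " + sizeWithEqualsOnly());
        System.out.println("equals() and hashCode(): " + sizeWithEqualsAndHash());
    }
}
```

With both methods overridden the set always ends up with one element. With equals() alone it will almost always hold two, because the set compares hash codes before it ever calls equals().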


From the Javadoc of equals():

It is generally necessary to override the hashCode method whenever equals method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

Please read the details about contract here
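Applied to the question, honouring that contract means deriving equals() and hashCode() from the same field. Below is a sketch under a few assumptions: the Person class is not shown in the question, so a minimal version with getName() is assumed here, and DistinctByNameDemo and distinctNames are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical demo class wrapping the question's pipeline.
final class DistinctByNameDemo {

    // Assumed shape of Person; the question does not show this class.
    static final class Person {
        private final String name;

        Person(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // Wrapper whose equals() and hashCode() both depend only on the name.
    static final class Wrapper {
        private final Person person;

        Wrapper(Person p) {
            person = p;
        }

        Person unwrap() {
            return person;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Wrapper
                    && Objects.equals(((Wrapper) other).person.getName(), person.getName());
        }

        @Override
        public int hashCode() {
            // Same field as equals(), so equal wrappers get equal hash codes.
            return Objects.hashCode(person.getName());
        }
    }

    // Returns the names of the people that survive distinct(), in encounter order.
    static List<String> distinctNames(List<Person> people) {
        return people.stream()
                .map(Wrapper::new)
                .distinct()
                .map(Wrapper::unwrap)
                .map(Person::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Kowalski"),
                new Person("Nowak"),
                new Person("Big"),
                new Person("Kowalski"));
        System.out.println(distinctNames(people)); // prints [Kowalski, Nowak, Big]
    }
}
```

Using Objects.equals and Objects.hashCode is a design choice that also tolerates a null name; the question's original name.equals(...) and name.hashCode() calls work just as well when names are never null.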

2 Comments

Seems you quoted the javadoc of equals(), so you should link to it.
But you didn't link to the source of the quote.

The distinct() operation uses a HashSet internally to check whether it already processed a certain element. The HashSet in turn relies on the hashCode() method of its elements to sort them into buckets.

If you don't override hashCode(), it falls back to the default inherited from Object, which is typically derived from the object's identity and therefore usually differs between two objects even though they are equal according to equals(). The HashSet then usually puts them into different buckets and can no longer determine that they're the 'same' object.
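The fallback can be observed directly with System.identityHashCode, which returns the value the default hashCode() would return. This is a small sketch; NameOnly and usesIdentityHash are hypothetical names introduced for the example.

```java
// Hypothetical demo class; names are illustrative.
final class IdentityHashDemo {

    // Overrides equals() but inherits hashCode() from Object.
    static final class NameOnly {
        private final String name;

        NameOnly(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof NameOnly && ((NameOnly) other).name.equals(name);
        }
    }

    // True when the object's hashCode() is the identity-based default.
    static boolean usesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        NameOnly a = new NameOnly("Kowalski");
        NameOnly b = new NameOnly("Kowalski");

        System.out.println(a.equals(b));         // prints true
        System.out.println(usesIdentityHash(a)); // prints true

        // Almost always false for two separate objects, but not guaranteed.
        System.out.println(a.hashCode() == b.hashCode());
    }
}
```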
