118

I have a List of objects in C#. All of the objects contain a property ID. There are several objects that have the same ID property.

How can I trim the List (or make a new List) where there is only one object per ID property?

[Any additional duplicates are dropped out of the List]


5 Answers

235

If you want to avoid using a third-party library, you could do something like:

var bar = fooArray.GroupBy(x => x.Id).Select(x => x.First()).ToList(); 

That will group the array by the Id property, then select the first entry in each group, so exactly one object per Id survives.
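To see the grouping approach end to end, here is a minimal, self-contained sketch. The Foo record and the FirstPerId helper are hypothetical names introduced only for this illustration (records need C# 9 or later; a plain class works just as well).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type, used only for this illustration.
public record Foo(int Id, string Name);

public static class GroupByDemo
{
    // Groups by Id and keeps the first element of each group.
    // LINQ to Objects yields groups in order of first key appearance,
    // and elements inside a group in their original order.
    public static List<Foo> FirstPerId(IEnumerable<Foo> source) =>
        source.GroupBy(x => x.Id)
              .Select(g => g.First())
              .ToList();
}
```

Called with items whose Ids are 1, 2, 1, 3, 2 (names a, b, c, d, e), FirstPerId returns the items named a, b and d.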


4 Comments

This worked perfectly. Here is my implementation: List<InputRow> uniqueRows = inputRows.GroupBy(x => x.Id).Select(x => x.First()).ToList<InputRow>();
Glad to help! One note: The <InputRow> on your ToList() is redundant. You should be able to just do .ToList()
You are right, it works with just ToList() instead of ToList<InputRow>().
A good alternative to trying to figure out why Distinct and IEquatable are not working.
39

MoreLINQ's DistinctBy() will do the job; it lets you use an object property to determine distinctness. Unfortunately, the built-in LINQ Distinct() is not flexible enough.

var uniqueItems = allItems.DistinctBy(i => i.Id); 

DistinctBy()

Returns all distinct elements of the given source, where "distinctness" is determined via a projection and the default equality comparer for the projected type.
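If you would rather not take the dependency, the behavior described above can be approximated in a few lines. This is a hand-rolled sketch, not MoreLINQ's actual source; the name DistinctByKey is hypothetical and chosen so it does not clash with MoreLINQ's or .NET 6's DistinctBy.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctBySketch
{
    // Yields each element whose projected key has not been seen before,
    // using the default equality comparer for TKey.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
        return Iterate(source, keySelector);
    }

    // Separate iterator method so the argument checks above run eagerly.
    private static IEnumerable<TSource> Iterate<TSource, TKey>(
        IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (seen.Add(keySelector(item)))
                yield return item;
        }
    }
}
```

Usage would look like allItems.DistinctByKey(i => i.Id). The sequence is evaluated lazily, so add ToList() if you need a materialized list.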

PS: Credits to Jon Skeet for sharing this library with the community.

2 Comments

I think this is a great solution but am trying to avoid using a 3rd party library for this. Thank You.
Fortunately you can see how it is implemented
17

Starting from .NET 6, a new DistinctBy LINQ operator is available:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector);

Returns distinct elements from a sequence according to a specified key selector function.

Usage example:

List<Item> distinctList = listWithDuplicates
    .DistinctBy(i => i.Id)
    .ToList();

There is also an overload that has an IEqualityComparer<TKey> parameter.
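As a rough illustration of that comparer overload, the sketch below treats string Ids that differ only by case as duplicates. The Item record with a string Id and the DistinctIgnoringCase helper are assumptions made for this example; it requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type with a string key, used only for this illustration.
public record Item(string Id, string Name);

public static class DistinctByComparerDemo
{
    // Passes StringComparer.OrdinalIgnoreCase as the IEqualityComparer<string>,
    // so "ab1" and "AB1" count as the same Id.
    public static List<Item> DistinctIgnoringCase(IEnumerable<Item> items) =>
        items.DistinctBy(i => i.Id, StringComparer.OrdinalIgnoreCase)
             .ToList();
}
```

With Ids "ab1", "AB1" and "cd2", the result contains two items.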


Update in-place: In case creating a new List<T> is not desirable, here is a RemoveDuplicates extension method for the List<T> class:

/// <summary>
/// Removes all the elements that are duplicates of previous elements,
/// according to a specified key selector function.
/// </summary>
/// <returns>
/// The number of elements removed.
/// </returns>
public static int RemoveDuplicates<TSource, TKey>(
    this List<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> keyComparer = null)
{
    ArgumentNullException.ThrowIfNull(source);
    ArgumentNullException.ThrowIfNull(keySelector);
    HashSet<TKey> hashSet = new(keyComparer);
    return source.RemoveAll(item => !hashSet.Add(keySelector(item)));
}

This method is efficient (O(n)) but also a bit dangerous, because it is based on the potentially corruptive List<T>.RemoveAll method¹. If the keySelector lambda succeeds for some elements and then throws for another element, the partially modified List<T> will neither be restored to its initial state, nor will it be in a state recognizable as the result of successful individual Removes. Instead it will transition to a corrupted state that includes duplicate occurrences of existing elements. So if the keySelector lambda is not fail-proof, the RemoveDuplicates method should be invoked in a try block whose catch block discards the potentially corrupted list.

Alternatively, you could substitute the dangerous built-in RemoveAll with a safe custom implementation that offers predictable behavior.
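One possible shape for such a safe variant is sketched below. It is an assumption about what "safe" could mean here, not the answer author's implementation: every key is computed before the list is touched, so an exception from keySelector (or from the comparer) leaves the list exactly as it was. The name RemoveDuplicatesSafe is hypothetical, and the price is O(n) extra memory for the temporary list.

```csharp
using System;
using System.Collections.Generic;

public static class ListExtensionsSketch
{
    // Removes elements whose key duplicates that of an earlier element.
    // Returns the number of elements removed.
    public static int RemoveDuplicatesSafe<TSource, TKey>(
        this List<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> keyComparer = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        var seen = new HashSet<TKey>(keyComparer);
        var kept = new List<TSource>(source.Count);
        foreach (var item in source)
        {
            // keySelector may throw here; the original list is still unmodified.
            if (seen.Add(keySelector(item)))
                kept.Add(item);
        }

        int removed = source.Count - kept.Count;
        if (removed > 0)
        {
            // All keys were computed successfully, so replace the contents.
            source.Clear();
            source.AddRange(kept);
        }
        return removed;
    }
}
```

The trade-off is deliberate: the extra allocation buys an all-or-nothing update, which is usually easier to reason about than wrapping the call in a try/catch and discarding the list.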

¹ For all .NET versions and platforms, including the latest .NET 7. I have submitted a proposal on GitHub to document the corruptive behavior of the List<T>.RemoveAll method, and the feedback that I received was that the behavior should not be documented and the implementation should not be fixed.


7
var list = GetListFromSomeWhere();
var list2 = GetListFromSomeWhere();
list.AddRange(list2);
.... ...
var distinctedList = list.DistinctBy(x => x.ID).ToList();

MoreLINQ on GitHub

Or, if you don't want to use external DLLs for some reason, you can use this Distinct overload:

public static IEnumerable<TSource> Distinct<TSource>(
    this IEnumerable<TSource> source,
    IEqualityComparer<TSource> comparer)

Usage:

public class FooComparer : IEqualityComparer<Foo>
{
    // Two Foo objects are equal if their IDs are equal.
    public bool Equals(Foo x, Foo y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.ID == y.ID;
    }

    // Required by IEqualityComparer<Foo>. It must agree with Equals,
    // so only the ID is hashed.
    public int GetHashCode(Foo obj)
    {
        return Object.ReferenceEquals(obj, null) ? 0 : obj.ID.GetHashCode();
    }
}

list.Distinct(new FooComparer());


4

Not sure if anyone is still looking for any additional ways to do this. But I've used this code to remove duplicates from a list of User objects based on matching ID numbers.

private ArrayList RemoveSearchDuplicates(ArrayList SearchResults)
{
    ArrayList TempList = new ArrayList();
    foreach (User u1 in SearchResults)
    {
        bool duplicatefound = false;
        foreach (User u2 in TempList)
            if (u1.ID == u2.ID)
                duplicatefound = true;
        if (!duplicatefound)
            TempList.Add(u1);
    }
    return TempList;
}

Call: SearchResults = RemoveSearchDuplicates(SearchResults);

1 Comment

This is pointlessly O(n^2) when a regular GroupBy is just O(n)...
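For comparison, the same explicit loop can be kept while avoiding the nested scan by tracking seen IDs in a HashSet, which gives expected O(n) time. This is only a sketch: the User class below is a hypothetical stand-in, ID is assumed to be an int, and a generic List<User> is swapped in for the answer's ArrayList.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the answer's User type; only ID matters here.
public class User
{
    public int ID { get; set; }
}

public static class SearchResultsSketch
{
    // Keeps the first User seen for each ID, preserving the original order.
    public static List<User> RemoveSearchDuplicates(IEnumerable<User> searchResults)
    {
        var seenIds = new HashSet<int>();
        var unique = new List<User>();
        foreach (User u in searchResults)
        {
            // Add returns false if this ID was already recorded.
            if (seenIds.Add(u.ID))
                unique.Add(u);
        }
        return unique;
    }
}
```

The call site stays the same as in the answer: SearchResults = RemoveSearchDuplicates(SearchResults), with SearchResults declared as List<User>.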
