C# LINQ find duplicates in List

Question

Using LINQ, from a List<int>, how can I retrieve a list that contains entries repeated more than once and their values?

Vadim Ovchinnikov · Accepted Answer · 2018-03-28 05:05:32Z

911

The easiest way to solve the problem is to group the elements by value and then pick a representative of each group that has more than one element. In LINQ, this translates to:

var query = lst.GroupBy(x => x)
              .Where(g => g.Count() > 1)
              .Select(y => y.Key)
              .ToList();

If you want to know how many times the elements are repeated, you can use:

var query = lst.GroupBy(x => x)
              .Where(g => g.Count() > 1)
              .Select(y => new { Element = y.Key, Counter = y.Count() })
              .ToList();

This will return a List of an anonymous type, and each element will have the properties Element and Counter, to retrieve the information you need.

And lastly, if it's a dictionary you are looking for, you can use

var query = lst.GroupBy(x => x)
              .Where(g => g.Count() > 1)
              .ToDictionary(x => x.Key, y => y.Count());

This will return a dictionary, with your element as key, and the number of times it's repeated as value.
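For reference, here is a minimal self-contained sketch that runs the list and dictionary queries above on a sample list. The class name `DuplicateDemo`, the method names, and the sample data are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateDemo
{
    // Values that occur more than once, each reported once.
    public static List<int> DuplicateValues(IEnumerable<int> lst) =>
        lst.GroupBy(x => x)
           .Where(g => g.Count() > 1)
           .Select(g => g.Key)
           .ToList();

    // Values that occur more than once, mapped to their occurrence count.
    public static Dictionary<int, int> DuplicateCounts(IEnumerable<int> lst) =>
        lst.GroupBy(x => x)
           .Where(g => g.Count() > 1)
           .ToDictionary(g => g.Key, g => g.Count());

    public static void Main()
    {
        var lst = new List<int> { 1, 2, 3, 1, 4, 2, 1 }; // sample data (assumed)

        Console.WriteLine(string.Join(", ", DuplicateValues(lst))); // prints "1, 2"

        foreach (var pair in DuplicateCounts(lst))
            Console.WriteLine($"{pair.Key} occurs {pair.Value} times");
    }
}
```

GroupBy yields groups in the order their keys first appear in the source, which is why the list comes back as 1, 2 here.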

edited Mar 28, 2018 at 5:05

Vadim Ovchinnikov

answered Aug 31, 2013 at 10:58

Save

11 Comments

Mirko Arcese Over a year ago

A follow-up question: suppose the duplicated ints are distributed across n int arrays. I'm using a dictionary and a for loop to work out which arrays contain a duplicate, and then I remove it according to some distribution logic. Is there a faster way (perhaps with LINQ) to achieve that result? Thank you in advance.

Mirko Arcese Over a year ago

I'm doing something like this:

for (int i = 0; i < duplicates.Count; i++)
{
    int duplicate = duplicates[i];
    duplicatesLocation.Add(duplicate, new List<int>());
    for (int k = 0; k < hitsList.Length; k++)
    {
        if (hitsList[k].Contains(duplicate))
        {
            duplicatesLocation.ElementAt(i).Value.Add(k);
        }
    }
    // remove duplicates according to some rules.
}

Save Over a year ago

If you want to find duplicates in a list of arrays, take a look at SelectMany.

Mirko Arcese Over a year ago

I'm searching for duplicates in an array of lists, but I don't see how SelectMany can help me work it out.
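One possible reading of the SelectMany suggestion, as a sketch: flatten every list into (value, list index) pairs, group by value, and keep the values that occur in more than one list. The names `CrossListDemo` and `DuplicateLocations` and the sample data are hypothetical, and the sketch assumes that a value repeated inside a single list should count only once for that list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrossListDemo
{
    // Maps each value found in more than one list to the indexes of the lists containing it.
    public static Dictionary<int, List<int>> DuplicateLocations(IReadOnlyList<List<int>> hitsList) =>
        hitsList
            // Flatten to (value, index) pairs; Distinct() ignores repeats inside one list.
            .SelectMany((list, index) => list.Distinct().Select(value => new { value, index }))
            .GroupBy(pair => pair.value, pair => pair.index)
            .Where(g => g.Count() > 1)
            .ToDictionary(g => g.Key, g => g.ToList());

    public static void Main()
    {
        var hitsList = new List<List<int>>
        {
            new List<int> { 1, 2 },
            new List<int> { 2, 3 },
            new List<int> { 4, 2, 1 },
        };

        foreach (var entry in DuplicateLocations(hitsList))
            Console.WriteLine($"{entry.Key} is in lists {string.Join(", ", entry.Value)}");
    }
}
```

With the sample data, the value 1 maps to lists 0 and 2, and the value 2 maps to lists 0, 1 and 2. The removal rules themselves are left out because they depend on the distribution logic.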

Harald Coppoolse Over a year ago

To check whether a collection has more than one element, it is more efficient to use Skip(1).Any() instead of Count(). Imagine a collection with 1000 elements. Skip(1).Any() will detect that there is more than one element as soon as it finds the 2nd element. Using Count() requires accessing the complete collection.

maxbeaudoin · Accepted Answer · 2024-07-04 14:29:27Z

228

Find out if an enumerable contains any duplicate :

var anyDuplicate = enumerable.GroupBy(x => x.Key).Any(g => g.Count() > 1);

Find out if values in an enumerable are all unique :

var allUnique = enumerable.GroupBy(x => x.Key).All(g => g.Count() == 1);

edited Jul 4, 2024 at 14:29

answered Dec 1, 2014 at 17:34

maxbeaudoin

3 Comments

Geoduck Over a year ago

Is there any possibility these are not always boolean opposites? anyDuplicate == !allUnique in all cases.

Caltor Over a year ago

@GarrGodfrey They are always boolean opposites

Ariwibawa Over a year ago

to get what were duplicated, just change Any to Where.

Flater · Accepted Answer · 2022-06-22 09:34:08Z

37

To find the duplicate values only:

var duplicates = list.GroupBy(x => x).Where(g => g.Count() > 1);

E.g.

var list = new[] {1,2,3,1,4,2};

GroupBy groups the numbers by key, and each group exposes how many times its value occurs. After that, we keep only the groups whose value occurs more than once.

To find the unique values only:

var unique = list.GroupBy(x => x).Where(g => g.Count() == 1);

E.g.

var list = new[] {1,2,3,1,4,2};

GroupBy groups the numbers by key, and each group exposes how many times its value occurs. After that, we keep only the groups whose value occurs exactly once, which means the value is unique.
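To make the terminology concrete, the following sketch contrasts "values that occur exactly once" with what Distinct() returns, using the sample array from this answer. It groups by the element itself (x => x) because the elements are plain ints. The class and method names are illustrative assumptions.

```csharp
using System;
using System.Linq;

public static class UniqueVersusDistinctDemo
{
    // Values that occur more than once (each reported once).
    public static int[] Duplicated(int[] list) =>
        list.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToArray();

    // Values that occur exactly once.
    public static int[] OccurringOnce(int[] list) =>
        list.GroupBy(x => x).Where(g => g.Count() == 1).Select(g => g.Key).ToArray();

    public static void Main()
    {
        var list = new[] { 1, 2, 3, 1, 4, 2 };

        Console.WriteLine(string.Join(", ", Duplicated(list)));      // prints "1, 2"
        Console.WriteLine(string.Join(", ", OccurringOnce(list)));   // prints "3, 4"
        Console.WriteLine(string.Join(", ", list.Distinct()));       // prints "1, 2, 3, 4"
    }
}
```

Note that the GroupBy(...).Where(...) queries on their own return groups, so a Select(g => g.Key) is needed to get back the values.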

edited Jun 22, 2022 at 9:34

Flater

answered Nov 9, 2018 at 5:47

Lav Vishwakarma

6 Comments

Malu MN Over a year ago

Below code will also find unique items. var unique = list.Distinct(x => x)

Silviu Preda Over a year ago

Your ANY syntax will NOT return the duplicates, it will merely tell you if there are any. Use the ALL syntax in the first example as well, and that should sort it!

DarkBarbarian Over a year ago

Both examples only return booleans which is not what the OP asked.

Flater Over a year ago

@MaluMN: The answer uses "unique values only" to mean "only the values which appear only once". Distinct works differently, in that it will not just return the values which appear only once, but also the values which appear multiple times (but it will return them only once instead of all of the multiple times); which is different from what the answer was referring to.

Flater Over a year ago

.All(g => g.Count() == 1) should be .Where(g => g.Count() == 1). All would not "find the unique values" as you suggest, it would confirm that there are no duplicates in the entire list (= that all groups have a count of 1)

AlexMelw · Accepted Answer · 2019-12-04 16:49:31Z

33

Another way is using HashSet:

var hash = new HashSet<int>();
var duplicates = list.Where(i => !hash.Add(i));

If you want unique values in your duplicates list:

var myhash = new HashSet<int>();
var mylist = new List<int>(){1,1,2,2,3,3,3,4,4,4};
var duplicates = mylist.Where(item => !myhash.Add(item)).Distinct().ToList();

Here is the same solution as a generic extension method:

public static class Extensions
{
  public static IEnumerable<TSource> GetDuplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector, IEqualityComparer<TKey> comparer)
  {
    var hash = new HashSet<TKey>(comparer);
    return source.Where(item => !hash.Add(selector(item))).ToList();
  }

  public static IEnumerable<TSource> GetDuplicates<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
  {
    return source.GetDuplicates(x => x, comparer);      
  }

  public static IEnumerable<TSource> GetDuplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
  {
    return source.GetDuplicates(selector, null);
  }

  public static IEnumerable<TSource> GetDuplicates<TSource>(this IEnumerable<TSource> source)
  {
    return source.GetDuplicates(x => x, null);
  }
}
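The deferred-execution issue discussed in the comments on this answer can be reproduced with a small sketch. Without ToList, the Where query is re-run on every enumeration against a HashSet that is already full, so the second pass reports every element as a duplicate. The names `DeferredDemo` and `CountTwice` are hypothetical; the input is the list from the comments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Enumerates the same lazy query twice and returns both counts.
    public static (int First, int Second) CountTwice(IEnumerable<int> source)
    {
        var hash = new HashSet<int>();
        IEnumerable<int> lazy = source.Where(i => !hash.Add(i)); // nothing runs yet

        int first = lazy.Count();  // first pass fills the set and finds the real duplicates
        int second = lazy.Count(); // second pass: every Add fails, so every element matches
        return (first, second);
    }

    public static void Main()
    {
        var (first, second) = CountTwice(new List<int> { 1, 2, 3, 4, 5, 2 });
        Console.WriteLine(first);  // prints 1
        Console.WriteLine(second); // prints 6
    }
}
```

Materializing the result once (for example with ToList, as the extension method above does) avoids the second pass over a stale set, at the cost of running the query eagerly.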

edited Dec 4, 2019 at 16:49

AlexMelw

answered Oct 20, 2013 at 10:00

HuBeZa

6 Comments

BCA Over a year ago

This does not work as expected. Using List<int> { 1, 2, 3, 4, 5, 2 } as the source, the result is an IEnumerable<int> with one element having the value of 1 (where the correct duplicate value is 2)

HuBeZa Over a year ago

@BCA yesterday, I think you're wrong. Check out this example: dotnetfiddle.net/GUnhUl

BCA Over a year ago

Your fiddle prints out the correct result. However, I added the line Console.WriteLine("Count: {0}", duplicates.Count()); directly below it and it prints 6. Unless I'm missing something about the requirements for this function, there should only be 1 item in the resulting collection.

HuBeZa Over a year ago

@BCA yesterday, it's a bug caused by LINQ deferred execution. I've added ToList in order to fix the issue, but it means that the method is executed as soon as it is called, and not when you iterate over the results.

solid_luffy Over a year ago

var hash = new HashSet<int>(); var duplicates = list.Where(i => !hash.Add(i)); will lead to a list that includes all occurrences of duplicates. So if you have four occurrences of 2 in your list, then your duplicate list will contain three occurrences of 2, since only one of the 2's can be added to the HashSet. If you want your list to contain unique values for each duplicate, use this code instead: var duplicates = mylist.Where(item => !myhash.Add(item)).ToList().Distinct().ToList();

hunch_hunch · Accepted Answer · 2018-05-16 21:43:27Z

15

You can do this:

var list = new[] {1,2,3,1,4,2};
var duplicateItems = list.Duplicates();

With these extension methods:

public static class Extensions
{
    public static IEnumerable<TSource> Duplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        var grouped = source.GroupBy(selector);
        var moreThan1 = grouped.Where(i => i.IsMultiple());
        return moreThan1.SelectMany(i => i);
    }

    public static IEnumerable<TSource> Duplicates<TSource>(this IEnumerable<TSource> source)
    {
        return source.Duplicates(i => i);
    }

    public static bool IsMultiple<T>(this IEnumerable<T> source)
    {
        // Dispose the enumerator once the check is done
        using (var enumerator = source.GetEnumerator())
        {
            return enumerator.MoveNext() && enumerator.MoveNext();
        }
    }
}

Using IsMultiple() in the Duplicates method is faster than Count() because this does not iterate the whole collection.

edited May 16, 2018 at 21:43

hunch_hunch

answered Aug 31, 2013 at 13:28

Alex Siepman

7 Comments

Johnbot Over a year ago

If you look at the reference source for Grouping you can see that Count() is pre computed and your solution is likely slower.

Alex Siepman Over a year ago

@Johnbot. You are right, in this case it is faster and the implementation is likely to never change... but it depends on an implementation detail of the implementation class behind IGrouping. With my implementation, you know it will never iterate the whole collection.

Jogi Over a year ago

so counting [Count()] is basically different than iterating the whole list. Count() is pre-computed but iterating the whole list is not.

Alex Siepman Over a year ago

@rehan khan: I do not understand the difference between Count() and Count()

Alex Siepman Over a year ago

@RehanKhan: IsMultiple is NOT doing a Count(), it stops immediately after 2 items. Just like Take(2).Count() >= 2;

Tshilidzi Mudau · Accepted Answer · 2017-04-20 10:13:40Z

I created an extension in response to this that you could include in your projects. I think it covers the most common cases when you search for duplicates in a List or with LINQ.

Example:

//Dummy class to compare in list
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
    public Person(int id, string name, string surname)
    {
        this.Id = id;
        this.Name = name;
        this.Surname = surname;
    }
}


//The extension static class
public static class Extention
{
    public static IEnumerable<T> getMoreThanOnceRepeated<T>(this IEnumerable<T> extList, Func<T, object> groupProps) where T : class
    { //Return only the second and subsequent repetitions
        return extList
            .GroupBy(groupProps)
            .SelectMany(z => z.Skip(1)); //Skip the first occurrence and return all the others that repeat
    }
    public static IEnumerable<T> getAllRepeated<T>(this IEnumerable<T> extList, Func<T, object> groupProps) where T : class
    {
        //Get all the items that have repetitions
        return extList
            .GroupBy(groupProps)
            .Where(z => z.Count() > 1) //Keep only the groups with more than one element
            .SelectMany(z => z);//Return every element of the remaining groups
    }
}

//how to use it:
void DuplicateExample()
{
    //Populate List
    List<Person> PersonsLst = new List<Person>(){
    new Person(1,"Ricardo","Figueiredo"), //first duplicate in the example
    new Person(2,"Ana","Figueiredo"),
    new Person(3,"Ricardo","Figueiredo"),//second duplicate in the example
    new Person(4,"Margarida","Figueiredo"),
    new Person(5,"Ricardo","Figueiredo")//third duplicate in the example
    };

    Console.WriteLine("All:");
    PersonsLst.ForEach(z => Console.WriteLine("{0} -> {1} {2}", z.Id, z.Name, z.Surname));
    /* OUTPUT:
        All:
        1 -> Ricardo Figueiredo
        2 -> Ana Figueiredo
        3 -> Ricardo Figueiredo
        4 -> Margarida Figueiredo
        5 -> Ricardo Figueiredo
        */

    Console.WriteLine("All lines with repeated data");
    PersonsLst.getAllRepeated(z => new { z.Name, z.Surname })
        .ToList()
        .ForEach(z => Console.WriteLine("{0} -> {1} {2}", z.Id, z.Name, z.Surname));
    /* OUTPUT:
        All lines with repeated data
        1 -> Ricardo Figueiredo
        3 -> Ricardo Figueiredo
        5 -> Ricardo Figueiredo
        */
    Console.WriteLine("Only Repeated more than once");
    PersonsLst.getMoreThanOnceRepeated(z => new { z.Name, z.Surname })
        .ToList()
        .ForEach(z => Console.WriteLine("{0} -> {1} {2}", z.Id, z.Name, z.Surname));
    /* OUTPUT:
        Only Repeated more than once
        3 -> Ricardo Figueiredo
        5 -> Ricardo Figueiredo
        */
}

Consider using Skip(1).Any() instead of Count(). If you have 1000 duplicates, then Skip(1).Any() will stop after it finds the 2nd one. Count() will access all 1000 elements.
If you add this extension method, consider using HashSet.Add instead of GroupBy, as suggested in one of the other answers. As soon as HashSet.Add finds a duplicate it will stop. Your GroupBy will continue grouping all elements, even if a group with more than one element has already been found.
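The difference between the two checks can be observed on a lazily generated sequence. The sketch below counts how many elements each check pulls from a 1000-element iterator; `EarlyExitDemo` and `ElementsPulled` are hypothetical names. As a comment elsewhere in this thread points out, the groups produced by GroupBy in LINQ to Objects may already know their count, so the gap shown here applies to sequences that have to be enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EarlyExitDemo
{
    // Runs a check against a lazy 1000-element sequence and reports how many elements it pulled.
    public static int ElementsPulled(Func<IEnumerable<int>, bool> check)
    {
        int pulled = 0;

        IEnumerable<int> Source()
        {
            for (int i = 0; i < 1000; i++)
            {
                pulled++;       // counted each time the consumer asks for an element
                yield return i;
            }
        }

        check(Source());
        return pulled;
    }

    public static void Main()
    {
        Console.WriteLine(ElementsPulled(s => s.Skip(1).Any())); // prints 2
        Console.WriteLine(ElementsPulled(s => s.Count() > 1));   // prints 1000
    }
}
```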

Aykut Gündoğdu · Accepted Answer · 2020-03-29 16:12:16Z

3

There is an answer that I could not get to work for my case, and I did not understand why:

var anyDuplicate = enumerable.GroupBy(x => x.Key).Any(g => g.Count() > 1);

My solution in this situation is:

var duplicates = model.list
                    .GroupBy(s => s.SAME_ID)
                    .Where(g => g.Count() > 1).Count() > 0;
if(duplicates) {
    doSomething();
}

answered Mar 29, 2020 at 16:12

Aykut Gündoğdu

1 Comment

Silviu Preda Over a year ago

The first syntax doesn't work because it's actually a boolean extension: the ANY method will return true if at least one element satisfies the predicate, and false otherwise. So your code will tell you only IF you have duplicates, not WHICH they are.

fthtnrvr · Accepted Answer · 2022-11-09 13:42:55Z

3

Just another approach:

For just HasDuplicate:

bool hasAnyDuplicate = list.Count > list.Distinct().Count();

For duplicate values

List<string> duplicates = new List<string>();
duplicates.AddRange(list);
list.Distinct().ToList().ForEach(x => duplicates.Remove(x));

// for unique duplicate values:
duplicates.Distinct();

answered Nov 9, 2022 at 13:42

fthtnrvr

user1785960 · Accepted Answer · 2020-07-16 10:26:59Z

2

Linq query:

var query = from s2 in (from s in someList group s by new { s.Column1, s.Column2 } into sg select sg) where s2.Count() > 1 select s2;
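A runnable sketch of the same idea in query syntax, flattened into a single query for readability. The `Row` class with its `Column1` and `Column2` properties, the helper name `DuplicateGroups`, and the sample rows are assumptions made for illustration; the original answer does not show the element type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type standing in for the items of someList.
public sealed class Row
{
    public int Column1 { get; set; }
    public string Column2 { get; set; }
}

public static class QuerySyntaxDemo
{
    // Returns every group of rows that share the same (Column1, Column2) pair.
    public static List<List<Row>> DuplicateGroups(IEnumerable<Row> someList)
    {
        var query = from s in someList
                    group s by new { s.Column1, s.Column2 } into sg
                    where sg.Count() > 1
                    select sg.ToList();
        return query.ToList();
    }

    public static void Main()
    {
        var someList = new List<Row>
        {
            new Row { Column1 = 1, Column2 = "a" },
            new Row { Column1 = 2, Column2 = "b" },
            new Row { Column1 = 1, Column2 = "a" },
        };

        foreach (var group in DuplicateGroups(someList))
            Console.WriteLine($"{group[0].Column1}/{group[0].Column2} appears {group.Count} times");
    }
}
```

Grouping by an anonymous type works here because anonymous types compare by the values of their properties.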

answered Jul 16, 2020 at 10:26

user1785960

GeoB · Accepted Answer · 2018-09-11 08:03:07Z

A complete set of LINQ to SQL extension methods for finding duplicates, checked against MS SQL Server. They do not use .ToList() or IEnumerable, so the queries execute in SQL Server rather than in memory; only the results are returned to memory.

public static class Linq2SqlExtensions {

    public class CountOfT<T> {
        public T Key { get; set; }
        public int Count { get; set; }
    }

    public static IQueryable<TKey> Duplicates<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> groupBy)
        => source.GroupBy(groupBy).Where(w => w.Count() > 1).Select(s => s.Key);

    public static IQueryable<TSource> GetDuplicates<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> groupBy)
        => source.GroupBy(groupBy).Where(w => w.Count() > 1).SelectMany(s => s);

    public static IQueryable<CountOfT<TKey>> DuplicatesCounts<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> groupBy)
        => source.GroupBy(groupBy).Where(w => w.Count() > 1).Select(y => new CountOfT<TKey> { Key = y.Key, Count = y.Count() });

    public static IQueryable<Tuple<TKey, int>> DuplicatesCountsAsTuble<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> groupBy)
        => source.GroupBy(groupBy).Where(w => w.Count() > 1).Select(s => Tuple.Create(s.Key, s.Count()));
}

Don Feto · Accepted Answer · 2021-08-08 00:18:41Z

This is a simpler way that does not use groups: get the distinct elements, iterate over them, and check each one's count in the list. If the count is > 1, the element appears more than once, so add it to Repeteditemlist.

var mylist = new List<int>() { 1, 1, 2, 3, 3, 3, 4, 4, 4 };
var distList = mylist.Distinct().ToList();
var Repeteditemlist = new List<int>();
foreach (var item in distList)
{
    if (mylist.Count(e => e == item) > 1)
    {
        Repeteditemlist.Add(item);
    }
}
foreach (var item in Repeteditemlist)
{
    Console.WriteLine(item);
}

Expected output:

1 3 4

nawfal · Accepted Answer · 2022-06-20 08:02:01Z

All the GroupBy answers are the simplest but won't be the most efficient. They're especially bad for memory performance as building large inner collections has allocation cost.

A decent alternative is HuBeZa's HashSet.Add based approach. It performs better.

If you don't care about nulls, something like this is the most efficient (both CPU and memory) as far as I can think:

public static IEnumerable<TProperty> Duplicates<TSource, TProperty>(
    this IEnumerable<TSource> source,
    Func<TSource, TProperty> duplicateSelector,
    IEqualityComparer<TProperty> comparer = null)
{
    comparer ??= EqualityComparer<TProperty>.Default;

    Dictionary<TProperty, int> counts = new Dictionary<TProperty, int>(comparer);

    foreach (var item in source)
    {
        TProperty property = duplicateSelector(item);
        counts.TryGetValue(property, out int count);

        switch (count)
        {
            case 0:
                counts[property] = ++count;
                break;

            case 1:
                counts[property] = ++count;
                yield return property;
                break;
        }
    }
}

The trick here is to avoid additional lookup costs once the duplicate count has reached 1. Of course you could keep updating the dictionary with count if you also want the number of duplicate occurrences for each item. For nulls, you just need some additional handling there, that's all.
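The closing remark about also keeping the number of occurrences could be sketched as follows. This is a variant under stated assumptions, not the method above: it always updates the dictionary, returns the counts only after the whole source has been read, and (like the original) does not handle null keys. The names `CountingDemo` and `DuplicateCounts` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountingDemo
{
    // Returns each key that occurs more than once together with its number of occurrences.
    public static Dictionary<TKey, int> DuplicateCounts<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> selector,
        IEqualityComparer<TKey> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<TKey>.Default;
        var counts = new Dictionary<TKey, int>(comparer);

        foreach (var item in source)
        {
            TKey key = selector(item);
            counts.TryGetValue(key, out int count); // count stays 0 when the key is new
            counts[key] = count + 1;
        }

        // Keep only the keys seen more than once.
        return counts.Where(pair => pair.Value > 1)
                     .ToDictionary(pair => pair.Key, pair => pair.Value, comparer);
    }

    public static void Main()
    {
        var words = new[] { "a", "b", "A", "c", "b" };
        var result = words.DuplicateCounts(w => w, StringComparer.OrdinalIgnoreCase);

        foreach (var pair in result)
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

With the case-insensitive comparer, "a" and "A" count as the same key, so both "a" and "b" end up with a count of 2.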

Masoud Darvishian · Accepted Answer · 2025-07-28 16:06:11Z

In case anyone is interested, an easy and simple way without using LINQ would be:

struct MyData {
    public int id;
    public string name;
}

var myList = new List<MyData> {
    new MyData { id = 1, name = "a" },
    new MyData { id = 2, name = "b" },
    new MyData { id = 1, name = "c" },
    new MyData { id = 3, name = "d" },
    new MyData { id = 2, name = "e" },
    new MyData { id = 1, name = "f" }
};

// map it to a dictionary with the key you want
// and increase the count when a duplicate key is found
var dic = new Dictionary<int, int>();
foreach (var item in myList) {
    if (dic.ContainsKey(item.id))
        dic[item.id]++;
    else 
        dic.Add(item.id, 1);
}

// display result
foreach (var item in dic) {
    Console.WriteLine($"itemId: {item.Key}, count: {item.Value}");
}

Output:

itemId: 1, count: 3
itemId: 2, count: 2
itemId: 3, count: 1

John · Accepted Answer · 2021-06-25 13:22:43Z

-2

Remove duplicates by key

myTupleList = myTupleList.GroupBy(tuple => tuple.Item1).Select(group => group.First()).ToList();

edited Jun 25, 2021 at 13:22

answered Jun 25, 2021 at 12:50

John

1 Comment

Gert Arnold Over a year ago

The question is not about removing duplicates.
