How to find whether a List<string> has duplicate values [duplicate]

Question

How can I find out whether a List<string> contains duplicate values?

I tried the code below. Is there a better way to achieve this?

var lstNames = new List<string> { "A", "B", "A" };

if (lstNames.Distinct().Count() != lstNames.Count())
{
    Console.WriteLine("List contains duplicate values.");
}

Sorry, guys. I missed the simple logic.

– Prasad Kanaparthi, Commented Jan 16, 2013 at 16:59
Please don't say sorry. We are all here to learn.

– Soner Gönül, Commented Jan 16, 2013 at 17:02

Soner Gönül · Accepted Answer · 2013-01-16 16:57:04Z

126

Try using GroupBy and Any, like this:

lstNames.GroupBy(n => n).Any(c => c.Count() > 1);

The GroupBy method:

Groups the elements of a sequence according to a specified key selector function and projects the elements for each group by using a specified function.

The Any method, which returns a boolean:

Determines whether any element of a sequence exists or satisfies a condition.

edited Jan 16, 2013 at 16:57

answered Jan 16, 2013 at 16:50

Soner Gönül


7 Comments

Servy Over a year ago

How is this better than the code in the OP? You still need to group all of the items, so you don't really have any short circuiting here.

Rawling Over a year ago

This not only has to iterate through all the elements to build the groups, it then has to iterate through potentially all of the groups too. Your original coffee will be faster.

Rawling Over a year ago

... Damn you, autocorrect.

GorkemHalulu Over a year ago

According to my tests, the original code is at least 1.5 times faster than this (depending on inputs).

Oliver Over a year ago

How about replacing c.Count() > 1 with c.Skip(1).Any()?
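
Oliver's suggestion could be sketched as below. This is only an illustration: the class and method names are hypothetical and do not come from the thread. Note that GroupBy still has to read the whole list and build every group before Any inspects them, so the change affects only the per-group check, not the overall cost that Servy and Rawling point out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCheckSketch
{
    // Hypothetical helper name, used here only for illustration.
    public static bool HasDuplicatesByGrouping(IEnumerable<string> names)
    {
        // A group that still has an element after skipping one
        // contains at least two items, so its key is duplicated.
        return names.GroupBy(n => n).Any(g => g.Skip(1).Any());
    }

    public static void Main()
    {
        var lstNames = new List<string> { "A", "B", "A" };
        Console.WriteLine(HasDuplicatesByGrouping(lstNames));                      // True
        Console.WriteLine(HasDuplicatesByGrouping(new List<string> { "A", "B" })); // False
    }
}
```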


Rawling · Accepted Answer · 2016-06-08 06:46:53Z

51

If you're looking for the most efficient way of doing this,

var lstNames = new List<string> { "A", "B", "A" };
var hashset = new HashSet<string>();
foreach(var name in lstNames)
{
    if (!hashset.Add(name))
    {
        Console.WriteLine("List contains duplicate values.");
        break;
    }
}

will stop as soon as it finds the first duplicate. You can wrap this up in a method (or extension method) if you'll be using it in several places.

edited Jun 8, 2016 at 6:46

answered Jan 16, 2013 at 16:55

Rawling


4 Comments

Ilya Ivanov Over a year ago

+1. Performance is ten times better in the worst case than with GroupBy.

Servy Over a year ago

@IlyaIvanov Actually, in the worst case (no duplicates), it's about the same, maybe just a tad faster. In the best case (the first two items are duplicates) it's 100% faster, as it will be O(1) not O(n). In the general case it will be dependent on the actual rate of duplicates in the underlying data, while GroupBy and Distinct take the same time regardless of the underlying data.

John Shedletsky Over a year ago

"O" means "worst case" by the way. There is no "in the best case it will be O(x)"

Flonk Over a year ago

@JohnShedletsky 'O(f)' represents the set of functions that don't grow faster than f, that is to say, g(x) <= f(x) * C for g in O(f) and some constant C, if x is large enough. It doesn't imply anything about best or worst cases.
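
Servy's point about early exit can be made concrete by counting how many list elements each approach reads. The sketch below is an illustration under assumptions: the `Counted` wrapper and both method names are hypothetical, and it measures elements read, not wall-clock time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShortCircuitSketch
{
    // Hypothetical wrapper: passes items through and reports each one that is read.
    private static IEnumerable<T> Counted<T>(IEnumerable<T> source, Action onItem)
    {
        foreach (var item in source)
        {
            onItem();
            yield return item;
        }
    }

    // Number of elements the HashSet loop reads before it can answer.
    public static int ItemsReadByHashSet(IEnumerable<string> names)
    {
        int read = 0;
        var seen = new HashSet<string>();
        foreach (var name in Counted(names, () => read++))
        {
            if (!seen.Add(name)) break; // first duplicate found, stop reading
        }
        return read;
    }

    // Number of elements the Distinct().Count() approach reads.
    public static int ItemsReadByDistinct(IEnumerable<string> names)
    {
        int read = 0;
        Counted(names, () => read++).Distinct().Count();
        return read;
    }

    public static void Main()
    {
        var early = new List<string> { "A", "A", "B", "C", "D" };
        var none = new List<string> { "A", "B", "C", "D", "E" };

        Console.WriteLine(ItemsReadByHashSet(early));  // 2
        Console.WriteLine(ItemsReadByDistinct(early)); // 5
        Console.WriteLine(ItemsReadByHashSet(none));   // 5
        Console.WriteLine(ItemsReadByDistinct(none));  // 5
    }
}
```

With no duplicates both approaches read the whole list; with a duplicate near the front, only the HashSet loop stops early.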

Zoltán Tamási · Accepted Answer · 2013-11-11 16:42:08Z

29

A generalized and compact extension version of the answer based on hash technique:

public static bool AreAnyDuplicates<T>(this IEnumerable<T> list)
{
    var hashset = new HashSet<T>();
    return list.Any(e => !hashset.Add(e));
}

answered Nov 11, 2013 at 16:42

Zoltán Tamási


5 Comments

Eluvatar Over a year ago

cool, I've added it to my linq extensions, I added an overload to provide a comparer though.

curiousBoy Over a year ago

I know this is pretty old, and creating an extension method is cool, but this is really bad from a performance perspective. It should use GroupBy rather than trying to insert every list object into a HashSet.

Zoltán Tamási Over a year ago

@curiousBoy I'm pretty sure that GroupBy is implemented using some kind of hashed structure internally, so it should have about the same performance. To the best of my knowledge, adding elements to a HashSet is "cheap" in terms of computation and uses at most the same amount of memory as the original list. Also, I'm not sure, but chaining GroupBy and Any might not be very lazy, while it's obvious that this solution will stop at the first duplicate item. Could you please clarify why you think it has poor performance?

Erick Brown Over a year ago

@Eluvatar I know you answered this a long time ago, but, I'd love to see your code if you're willing to share. Thanks!

Zoltán Tamási Over a year ago

@ErickBrown The constructor of a HashSet<T> does accept a custom comparer, I think @Eluvatar meant to expose that as a parameter of this extension.
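
Eluvatar's code was never posted, so the following is only a guess at what such an overload might look like, built on the fact that HashSet<T> has a constructor taking an IEqualityComparer<T>. The class name and the second overload are assumptions, not code from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableDuplicateExtensions
{
    // Original form from the answer: uses the default equality comparer.
    public static bool AreAnyDuplicates<T>(this IEnumerable<T> list)
    {
        return list.AreAnyDuplicates(EqualityComparer<T>.Default);
    }

    // Assumed overload: the caller decides what counts as "equal".
    public static bool AreAnyDuplicates<T>(this IEnumerable<T> list, IEqualityComparer<T> comparer)
    {
        var hashset = new HashSet<T>(comparer);
        // Add returns false when an equal item is already present,
        // so Any stops at the first duplicate.
        return list.Any(e => !hashset.Add(e));
    }
}

public static class Program
{
    public static void Main()
    {
        var names = new List<string> { "apple", "Banana", "APPLE" };
        Console.WriteLine(names.AreAnyDuplicates());                                 // False
        Console.WriteLine(names.AreAnyDuplicates(StringComparer.OrdinalIgnoreCase)); // True
    }
}
```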

Nasmi Sabeer · Accepted Answer · 2013-01-16 16:49:40Z

12

var duplicateExists = lstNames.GroupBy(n => n).Any(g => g.Count() > 1);

answered Jan 16, 2013 at 16:49

Nasmi Sabeer


3 Comments

Prasad Kanaparthi Over a year ago

Hmmmm.. I missed the simple logic.

Nasmi Sabeer Over a year ago

I think Any() is preferred over Count(), but I don't know the performance difference between Distinct() and GroupBy().

Prasad Kanaparthi Over a year ago

I think in the case of List<someClass> you need to group all of the items and then apply Any() to all of the groups. I am not sure how that compares with just using Count() as in my example.

SUNIL DHAPPADHULE · Accepted Answer · 2019-02-14 06:32:55Z

using System.Collections.Generic;
using static System.Console;

class Program
{
    static void Main(string[] args)
    {
        var listFruits = new List<string> { "Apple", "Banana", "Apple", "Mango" };
        if (FindDuplicates(listFruits)) { WriteLine("Yes, we found a duplicate"); }
        ReadLine();
    }

    public static bool FindDuplicates(List<string> array)
    {
        // Count how many times each value occurs.
        var dict = new Dictionary<string, int>();
        foreach (var value in array)
        {
            if (dict.ContainsKey(value))
                dict[value]++;
            else
                dict[value] = 1;
        }
        // Return true as soon as any count exceeds 1. Returning false inside
        // the loop would check only the first entry and miss later duplicates.
        foreach (var pair in dict)
        {
            if (pair.Value > 1)
                return true;
        }
        return false;
    }
}
