I want to optimize this code for counting occurrences in a list of strings. Specifically, I have two lists:
1) cat: a huge list of strings that is guaranteed to contain duplicates.
2) cat_unq: the distinct elements of cat.
My code currently loops over every unique element in cat_unq and counts how many times it occurs in the list with duplicates. The search runs on a mobile device.
I already tried switching from lists to arrays, but the performance was only slightly better and still not sufficient.
I also tried a parallel search using a parallel foreach, but the performance was not stable.
Here is the code I am currently using:
private List<int> GetCategoryCount(List<string> cat, List<string> cat_unq)
{
    List<int> cat_count = new List<int>();
    for (int i = 0; i < cat_unq.Count; i++)
        cat_count.Add(cat.Where(x => x.Equals(cat_unq[i])).Count());
    return cat_count;
}
Does cat_unq have all the unique values from cat, or only a subset? If the former, you can just do a grouping on cat and get the count of each occurrence. Even if it's the latter, it would be better to get those counts in one pass over the cat list and then use them to produce your counts in the desired order.

One way is to build a dictionary keyed on the values of cat_unq and then iterate the values in cat.

Presumably cat_unq represents some type of desired order. But really, it almost sounds like they might be getting the distinct values up front and then doing this, which is even more work than is needed.
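A single-pass version along the lines suggested above could look like the following sketch. It keeps the original GetCategoryCount signature; the TryGetValue fallback to 0 is an assumption covering the case where cat_unq is only a subset of (or contains values missing from) cat:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class CategoryCounter
{
    // One pass over cat builds a frequency dictionary; counts are then
    // read back out in the order given by cat_unq. This is O(n + m)
    // instead of O(n * m) for the nested Where/Count approach.
    public static List<int> GetCategoryCount(List<string> cat, List<string> cat_unq)
    {
        var counts = new Dictionary<string, int>();
        foreach (var s in cat)
        {
            counts.TryGetValue(s, out int c); // c is 0 when s is not yet present
            counts[s] = c + 1;
        }
        // Strings in cat_unq that never appear in cat get a count of 0.
        return cat_unq.Select(s => counts.TryGetValue(s, out int c) ? c : 0).ToList();
    }

    static void Main()
    {
        var cat = new List<string> { "a", "b", "a", "c", "a", "b" };
        var cat_unq = new List<string> { "a", "b", "c" };
        Console.WriteLine(string.Join(",", GetCategoryCount(cat, cat_unq))); // prints 3,2,1
    }
}
```

The same counts can be obtained with LINQ's GroupBy (cat.GroupBy(x => x).ToDictionary(g => g.Key, g => g.Count())), but the explicit loop avoids the intermediate grouping allocations, which may matter on a mobile device.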