I have a list for example List<string> ListProviderKeys
that has some values in it. I also have a second list from a class below, for example List<ChangesSummary> SecondList;
public class ChangesSummary
{
public string TableName { get; set; }
public string ProviderKey { get; set; }
public string ProviderAdrsKey { get; set; }
public string ProviderSpecialtyKey { get; set; }
public string FieldName{ get; set; }
}
Imagine the values that first list holds is the same kind of values we put in ProviderKey
field in the second list. Now What I want is to trim down the second list to only have values that their ProviderKey
IS NOT already in the first list. How Can I do that? I know the operator Except but not sure how to apply it in this situation!
The best I can think of is :
A) Create dictionary and use its fast lookups
B) Use LINQ .Where
method with .ContainsKey()
on this dictionary which internally uses Hashtable
and performs quick lookups.
This should reduce search complexity to almost O(1)
rather than O(N)
ro worse (when we use LINQ .Where()
with .Any()
or .Contains()
and that leads to nested loops).
From MSDN page :
The Dictionary generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table.
So what we can do is :
Dictionary<string, string> dict = ListProviderKeys.ToDictionary(s => s);
var newList = SecondList.Where(e => !dict.ContainsKey(e.ProviderKey)).ToList();
Here is a very simple, short, but complete example illustrating it and also testing its performance :
class Person
{
public int Id { get; set; }
}
class Program
{
static void Main(string[] args)
{
List<int> ints = new List<int>();
List<Person> People = new List<Person>(1000);
for (int i = 0; i < 7500; i++)
{
ints.Add(i);
ints.Add(15000 - i - 1);
}
for (int i = 0; i < 45000; i++)
People.Add(new Person() { Id = i });
Stopwatch s = new Stopwatch();
s.Start();
// code A (feel free to uncomment it)
//Dictionary<int, int> dict = ints.ToDictionary(p => p);
//List<Person> newList = People.Where(p => !dict.ContainsKey(p.Id)).ToList();
// code B
List<Person> newList = People.Where(p => !ints.Contains(p.Id)).ToList();
s.Stop();
Console.WriteLine(s.ElapsedMilliseconds);
Console.WriteLine("Number of elements " + newList.Count);
Console.ReadKey();
}
On release mode results are :
Both code A & code B outputs 30 000 elements but :
It took more than 2000 ms with code B and only 5 ms with code A
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments