Excluding items from a list that exist in another list

Bohn Published at Dev

Bohn

I have a list for example List<string> ListProviderKeys that has some values in it. I also have a second list from a class below, for example List<ChangesSummary> SecondList;

public class ChangesSummary
{
    public string TableName { get; set; }
    public string ProviderKey { get; set; }
    public string ProviderAdrsKey { get; set; }
    public string ProviderSpecialtyKey { get; set; }
    public string FieldName{ get; set; }
}

Imagine the values that first list holds is the same kind of values we put in ProviderKey field in the second list. Now What I want is to trim down the second list to only have values that their ProviderKey IS NOT already in the first list. How Can I do that? I know the operator Except but not sure how to apply it in this situation!

Fabjan

The best I can think of is :

A) Create dictionary and use its fast lookups

B) Use LINQ .Where method with .ContainsKey() on this dictionary which internally uses Hashtable and performs quick lookups.

This should reduce search complexity to almost O(1) rather than O(N) ro worse (when we use LINQ .Where() with .Any() or .Contains() and that leads to nested loops).

From MSDN page :

The Dictionary generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table.

So what we can do is :

Dictionary<string, string> dict = ListProviderKeys.ToDictionary(s => s);

var newList = SecondList.Where(e => !dict.ContainsKey(e.ProviderKey)).ToList();

Here is a very simple, short, but complete example illustrating it and also testing its performance :

class Person
{
    public int Id { get; set; }        
}

class Program
{
    static void Main(string[] args)
    {
        List<int> ints = new List<int>();
        List<Person> People = new List<Person>(1000);

        for (int i = 0; i < 7500; i++)
        {
            ints.Add(i);
            ints.Add(15000 - i - 1);
        }

        for (int i = 0; i < 45000; i++)
            People.Add(new Person() { Id = i });

        Stopwatch s = new Stopwatch();

        s.Start();

        // code A (feel free to uncomment it)         
        //Dictionary<int, int> dict = ints.ToDictionary(p => p);

        //List<Person> newList = People.Where(p => !dict.ContainsKey(p.Id)).ToList();

        // code B
        List<Person> newList = People.Where(p => !ints.Contains(p.Id)).ToList();

        s.Stop();

        Console.WriteLine(s.ElapsedMilliseconds);
        Console.WriteLine("Number of elements " + newList.Count);

        Console.ReadKey();
    }

On release mode results are :

Both code A & code B outputs 30 000 elements but :

It took more than 2000 ms with code B and only 5 ms with code A

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at2021-02-23

Comments

0 comments

From Dev

Related Related

Article

Excluding items from a list that exist in another list

Excluding items from a list that exist in another list

Checking if List contains all items from another list

Using LINQ to remove items from a list that do not apear in another list

Optimized way to select items from a Collection excluding a list of certain member using Linq

Remove Items from sub list matching another sub list with Linq

Create a list with items from another list at indices specified in a third list

Remove a sublist from a list in which the numbers alreadys exist in another sublist

Copy List to another List, but excluding certain properties of List type

How to filter one list of items from another list of items?

removing items from list that exist in another list based on an ID member

Remove all objects from list that does not exist in another list

Copy list of items to another list

Python removing items from a list that exist in another list but keeping duplicates that aren't in that intersect

Remove items from a List which are in another List by a specific property?

Excluding grep from process list

Optimized way to select items from a Collection excluding a list of certain member using Linq

Remove Items from sub list matching another sub list with Linq

Create list from list items

Remove list of items from another file in bash

removing items from list that exist in another list based on an ID member

Remove list items if it contains text from another list?

How to move Items from one list to the another list in python?

Python removing items from a list that exist in another list but keeping duplicates that aren't in that intersect

Remove items from a List which are in another List by a specific property?

How to update all duplicates from List<Objects> if they exist in another list?

Delete items from list of list

How to create list of items from another list

find lists that start with items from another list

Finding indices of items from a list in another list even if they repeat

picking items from list of list