LINQ .Where query takes 5+ minutes to execute

Casey Crookston

I need to filter a List<object> so that I remove any items where a string property does not exist inside another List<string>.

I created this console app just to make sure I had the LINQ syntax correct:

class FooBar
{
    public int Id { get; set; }
    public string ValueName { get; set; }
}

and then...

List<FooBar> foobars = new List<FooBar>
{
    new FooBar { Id = 1, ValueName = "Val1" },
    new FooBar { Id = 2, ValueName = "Val2" },
    new FooBar { Id = 3, ValueName = "Val3" },
    new FooBar { Id = 4, ValueName = "Val4" }
};

List<string> myStrings = new List<string>
{
    "Val1",
    "Val3"
};

// Only keep records where ValueName is found in `myStrings`
foobars = foobars.Where(f => myStrings.Contains(f.ValueName)).ToList();

So, this line:

foobars = foobars.Where(f => myStrings.Contains(f.ValueName)).ToList();

does exactly what I want, and it gives me back these two records:

{ Id = 1, ValueName = "Val1" }
{ Id = 3, ValueName = "Val3" }

All's well. BUT... in the actual application, foobars has over 200k items, and myStrings has about 190k. And when that LINQ line gets executed, it takes upwards of 5 minutes to complete.

I'm clearly doing something wrong. 200k records isn't THAT big. And the real FooBar isn't all that complex (no nested objects, and only 9 properties).

What's going on here?

Julian

The problem here is that you're doing foobars.Where(f => myStrings.Contains(f.ValueName)), so for every item in foobars you are potentially checking every item of myStrings.

That scales quadratically, also written O(n^2). If each list has 10 items, you do up to 100 checks (10*10), and if each list has 10,000 items, you do up to 100,000,000 checks. In your case you're doing up to 38,000,000,000 checks (200,000 * 190,000) ;)
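The arithmetic above can be reproduced with a few lines of C#. This is only a sketch: the `WorstCaseChecks` helper is a hypothetical name introduced for illustration, and it counts the worst case, where List<T>.Contains scans the whole inner list before giving up.

```csharp
using System;

static class CheckCount
{
    // Worst-case number of comparisons when every item of the outer list
    // is looked up in the inner list with a linear scan (List<T>.Contains).
    // Hypothetical helper, not part of any library.
    public static long WorstCaseChecks(long outerCount, long innerCount)
    {
        return outerCount * innerCount;
    }

    static void Main()
    {
        Console.WriteLine(WorstCaseChecks(10, 10));           // 100
        Console.WriteLine(WorstCaseChecks(10_000, 10_000));   // 100000000
        Console.WriteLine(WorstCaseChecks(200_000, 190_000)); // 38000000000
    }
}
```

Note the `long` return type: 38,000,000,000 does not fit in a 32-bit `int`.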

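Solution: create a HashSet<string> from myStrings and use its Contains method. A HashSet lookup is a hash lookup that takes roughly constant time on average, instead of a scan through the whole list.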

e.g. replace with:

var myStringsSet = new HashSet<string>(myStrings);
foobars = foobars.Where(f => myStringsSet.Contains(f.ValueName)).ToList();

Now with 10,000 items in each list, you will do about 10,000 lookups instead of up to 100,000,000 checks. In your case that will be about 200,000 lookups instead of up to 38,000,000,000 checks.
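To try this at roughly the sizes from the question, here is a minimal, self-contained sketch. The data is synthetic (names of the form "Val0", "Val1", ... are an assumption, not the real data), and `FilterByNames` is a hypothetical helper name that wraps the two lines above. Timings will vary by machine, so none are quoted here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class FooBar
{
    public int Id { get; set; }
    public string ValueName { get; set; }
}

static class Program
{
    // Keeps only the items whose ValueName appears in allowedNames.
    // Hypothetical helper: builds the HashSet once, then does one
    // hash lookup per item instead of one list scan per item.
    public static List<FooBar> FilterByNames(List<FooBar> items, IEnumerable<string> allowedNames)
    {
        var allowed = new HashSet<string>(allowedNames);
        return items.Where(f => allowed.Contains(f.ValueName)).ToList();
    }

    static void Main()
    {
        // Synthetic stand-ins for the real lists (assumed shape and sizes).
        List<FooBar> foobars = Enumerable.Range(0, 200_000)
            .Select(i => new FooBar { Id = i, ValueName = "Val" + i })
            .ToList();
        List<string> myStrings = Enumerable.Range(0, 190_000)
            .Select(i => "Val" + i)
            .ToList();

        var stopwatch = Stopwatch.StartNew();
        List<FooBar> filtered = FilterByNames(foobars, myStrings);
        stopwatch.Stop();

        Console.WriteLine($"Kept {filtered.Count} of {foobars.Count} items in {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

Both List<string>.Contains and a HashSet<string> created this way use the default string equality (ordinal, case-sensitive), so the filtered result should be the same as with the original query. Only the lookup cost changes.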
