How to improve query performance on a single node in MonetDB?

Tomáš Repík

I've installed the latest (MonetDB 5 server v11v.27.5 "Jul2017-SP1") on Windows 2012 Server and I'm trying to query large table 1,4 billion rows in a reasonable time 2-3s.

Is this even possible with MonetDB? What could I do to improve the performance?

Detailed description of what I've done so far:

  1. Created table:

    CREATE TABLE t939ba ( id INT, xa INT, xb INT, ya INT, yb INT, a1 TINYINT, a2 TINYINT, a3 TINYINT, a4 TINYINT, a5 TINYINT, a6 TINYINT, a7 TINYINT, a8 TINYINT, a9 TINYINT);
    
  2. Loaded the data:

    COPY 1450000000 OFFSET 2 RECORDS INTO tbl FROM 'D:\\es_export\\file.csv'
    USING DELIMITERS ',' NULL AS '' LOCKED;
    
  3. Run the query:

    SELECT COUNT(DISTINCT id) FROM tbl WHERE a1=22
    AND xb>=143455 AND yb>=90911 AND xa<=143615 AND ya<=91007
    AND a2 IN (2, 3, 4) AND a3 IN (0, 1, 2, 3, 4) AND a4 IN (0, 1, 2)
    AND a5 IN (-1, 1, 2, 3, 4, 5, 6, 7) AND a6 IN (-1, 11, 12, 13, 14);
    

When I run the query for the 1st time it took (14m 52s), 2nd run of the same query took (3m 23s), 3rd consecutive run of the same query took (14s) and a slightly rearranged query took (3m 11s).

Stefan Manegold

Tomas,

thanks for the plans and traces. I see that you used the revised query with range predicates instead of IN predicates, and that this query now runs in "a mere" ~39 s (compared to ~15 min) --- either because the range predicates are evaluated more efficiently than the IN predicates, or because, as Martin indicated, a later run of the query benefits from indexes that where automatically built by MonetDB when evaluating the first query, or because of both.

In any case, running a/each query (-version) more than once is a good idea to see the possible effect of automatically built indexes.

Further, I see that either you indeed have a 34-core machine, or your machine has "only" 2 GB RAM per core --- not too much given that you have a ~42 GB data set, where each column is ~1.5 GB to ~6 GB in size ...

Hence, the main reason for the query not running faster than ~39 s might be I/O activity due to "lack" of memory.

Best,

Stefan

ps:
You can check whether for this specific query, reducing (or even avoiding) multi-core parallelism helps to reduce I/O thrashing:
Try running your query after disabling MonetDB's "mitosis" optimizer using

set optimizer='no_mitosis_pipe';

You can re-enable full multi-core parallelism using

set optimzer='default_pipe';

Best,
Stefan

この記事はインターネットから収集されたものであり、転載の際にはソースを示してください。

侵害の場合は、連絡してください[email protected]

編集
0

コメントを追加

0

関連記事

分類Dev

How to improve performance of this SQL Server query?

分類Dev

Improve SimpleMembership query performance?

分類Dev

How to improve the text column performance in below mentioned SQL query

分類Dev

Improve INSERT INTO SELECT Query performance?

分類Dev

Bitwise operation performance, how to improve

分類Dev

How to improve PostgreSQL performance on INSERT?

分類Dev

How to improve notes database performance?

分類Dev

How to improve the performance of the regular expression?

分類Dev

Improve Neo4j query performance

分類Dev

How can $facet improve $lookup performance

分類Dev

How can I improve performance of this function?

分類Dev

how to accelerate Flex Mobile and improve its performance

分類Dev

Why does adding OrderBy to LINQ to EF query improve its performance?

分類Dev

how to improve mysql query speedy with indexes?

分類Dev

How to improve my natural language search query

分類Dev

How to check performance of mysql query?

分類Dev

Unity runs slow, how can I improve its performance?

分類Dev

Expanding window (cumulative calculation) in data.table: how to improve performance

分類Dev

How to Improve Performance of Kafka Producer when used in Synchronous Mode

分類Dev

How to improve performance method GetThumbnailAsync in window phone 8.1

分類Dev

Improve performance of autograd jacobian

分類Dev

skSpriteKit improve Performance

分類Dev

Improve performance of elasticsearch signals

分類Dev

Improve Xquery performance in SQL Server

分類Dev

Improve Performance of VBA String Comparison

分類Dev

Index doesn't improve performance

分類Dev

Can someone Improve on this Query?

分類Dev

Improve SQL Server query

分類Dev

How much would a RAM increase + 64bit OS improve a system's performance?

Related 関連記事

  1. 1

    How to improve performance of this SQL Server query?

  2. 2

    Improve SimpleMembership query performance?

  3. 3

    How to improve the text column performance in below mentioned SQL query

  4. 4

    Improve INSERT INTO SELECT Query performance?

  5. 5

    Bitwise operation performance, how to improve

  6. 6

    How to improve PostgreSQL performance on INSERT?

  7. 7

    How to improve notes database performance?

  8. 8

    How to improve the performance of the regular expression?

  9. 9

    Improve Neo4j query performance

  10. 10

    How can $facet improve $lookup performance

  11. 11

    How can I improve performance of this function?

  12. 12

    how to accelerate Flex Mobile and improve its performance

  13. 13

    Why does adding OrderBy to LINQ to EF query improve its performance?

  14. 14

    how to improve mysql query speedy with indexes?

  15. 15

    How to improve my natural language search query

  16. 16

    How to check performance of mysql query?

  17. 17

    Unity runs slow, how can I improve its performance?

  18. 18

    Expanding window (cumulative calculation) in data.table: how to improve performance

  19. 19

    How to Improve Performance of Kafka Producer when used in Synchronous Mode

  20. 20

    How to improve performance method GetThumbnailAsync in window phone 8.1

  21. 21

    Improve performance of autograd jacobian

  22. 22

    skSpriteKit improve Performance

  23. 23

    Improve performance of elasticsearch signals

  24. 24

    Improve Xquery performance in SQL Server

  25. 25

    Improve Performance of VBA String Comparison

  26. 26

    Index doesn't improve performance

  27. 27

    Can someone Improve on this Query?

  28. 28

    Improve SQL Server query

  29. 29

    How much would a RAM increase + 64bit OS improve a system's performance?

ホットタグ

アーカイブ