SQLServerクエリの断続的なパフォーマンスの問題

debugcn 投稿 Dev

ベン

最近、SQL Server（2016）の特定のクエリでパフォーマンスの問題が発生しました。私が見ている問題は、パフォーマンスの問題が信じられないほど一貫性がなく、これを改善する方法がわからないことです。

表の詳細：

CREATE TABLE ContactRecord 
(
  ContactSeq BIGINT NOT NULL 
, ApplicationCd VARCHAR(2) NOT NULL 
, StartDt DATETIME2 NOT NULL 
, EndDt DATETIME2 
, EndStateCd VARCHAR(3) 
, UserId VARCHAR(10) 
, UserTypeCd VARCHAR(2) 
, LineId VARCHAR(3) 
, CallingLineId VARCHAR(20) 
, DialledLineId VARCHAR(20) 
, ChannelCd VARCHAR(2) 
, SubChannelCd VARCHAR(2) 
, ServicingAgentCd VARCHAR(7) 
, EucCopyTimestamp VARCHAR(30) 
, PRIMARY KEY (ContactSeq)
, FOREIGN KEY (ApplicationCd) REFERENCES ApplicationType(ApplicationCd)
, FOREIGN KEY (EndStateCd) REFERENCES EndStateType(EndStateCd)
, FOREIGN KEY (UserTypeCd) REFERENCES UserType(UserTypeCd)
)

CREATE TABLE TransactionRecord 
(
  TransactionSeq BIGINT NOT NULL 
, ContactSeq BIGINT NOT NULL 
, TransactionTypeCd VARCHAR(3) NOT NULL 
, TransactionDt DATETIME2 NOT NULL 
, PolicyId VARCHAR(10) 
, ProductId VARCHAR(7) 
, EucCopyTimestamp VARCHAR(30) 
, Detail VARCHAR(1000) 
, PRIMARY KEY (TransactionSeq)
, FOREIGN KEY (ContactSeq) REFERENCES ContactRecord(ContactSeq)
, FOREIGN KEY (TransactionTypeCd) REFERENCES TransactionType(TransactionTypeCd)
)

現在のレコード数：

ContactRecord 2000万
TransactionRecord 9000万

私の質問は：

select
   UserId,
   max(StartDt) as LastLoginDate 
from
   ContactRecord 
where
   ContactSeq in 
   (
      select
         ContactSeq 
      from
         TransactionRecord 
      where
         ContactSeq in 
         (
            select
               ContactSeq 
            from
               ContactRecord 
            where
               UserId in 
               (
                  '1234567890',
                  '1234567891' -- Etc.
               )
         )
         and TransactionRecord.TransactionTypeCd not in 
         (
            '122'
         )
   )
   and ApplicationCd not in 
   (
      '1',
      '4',
      '5'
   )
group by
   UserId;

現在、クエリは優れておらず、結合を使用して改善できますが、基本的には機能します。私が抱えている問題は、データジョブが約7100のuserIdの入力を受け取ることです。次に、これらは500のグループに分割されます。500ごとに、これらはINこのクエリの句で使用されます。IN句に500項目を含むこのクエリの最初の14回の実行は、正常に実行されます。結果は、それぞれ約15〜20秒で返されます。

問題は、このクエリの最後の実行に対する残りの100のギブまたはテイクにあります。それは決して完了しないようです。ハングするだけです。私たちのデータジョブでは、10分後にタイムアウトになります。理由がわかりません。私はSQLServerの専門家ではないので、これをデバッグする方法がよくわかりません。各サブクエリを個別に実行してから、サブクエリの内容を返されたデータに置き換えました。サブクエリごとにこれを行うと正常に機能します。

これが大量のパラメータで一貫してどのように機能するのか途方に暮れているので、ここでどんな助けも本当にありがたいですが、ほんの一部では機能しません。

編集

ここに実行計画の3つの例があります。これらのそれぞれはテストサーバーで実行され、このテストに相当するデータがほとんどないため、すべてほぼ瞬時に実行されることに注意してください。

これは、本番環境で正常に実行され、約15〜20秒で返される500個の引数の実行プランです。

これは、10分後にデータジョブでタイムアウトする119個の引数の実行プランです。

これは、正常に実行される5つの引数の実行プランです。このクエリは、データジョブで明示的に実行されているのではなく、比較のために実行されています。

すべての場合において、SSMSは次の警告を出しました。

/*
Missing Index Details from SQLQuery2.sql
The Query Processor estimates that implementing the following index could improve the query cost by 26.3459%.
*/

/*
USE [CloasIvr]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[TransactionRecord] ([TransactionTypeCd])
INCLUDE ([ContactSeq])
GO
*/

これがこの問題の根本的な原因ですか？

ショーンブ

何が起こっているのかを見ずに、何が起こっているのかを正確に知ることは困難です-特に失敗しているものでは。「良い」実行の実行プランは少し役立ちますが、悪い実行で何がうまくいかないかを推測しているだけです。

私の最初の推測（私のコメントと同様）は、それが期待するものの見積もりが非常に間違っており、それは非常に悪い計画を作成するということです。

Your TransactionRecord table in particular, with the detail column that is 1000 characters, would could have big issues with an unexpected large number of nested loops.

Indexes

The first thing I would suggest is indexing - particularly to a) only include a subset of the data you need for these, and b) to have them ordered in a useful manner.

I suggest the following two indexes would appear to help

CREATE INDEX IX_ContactRecord_User ON ContactRecord 
    (UserId, ContactSeq) 
    INCLUDE (ApplicationCD, Startdt);

CREATE INDEX IX_TransactionRecord_ContactSeq ON TransactionRecord 
    (ContactSeq, TransactionTypeCd);

These are both 'covering indexes', as well as being sorted in ways that can help. Alternatively, you could replace the first one with a slightly modified version (sorting first on ContactSeq) but I think the above version would be more useful.

CREATE INDEX IX_ContactRecord_User2 ON ContactRecord 
    (ContactSeq) 
    INCLUDE (ApplicationCD, Startdt, UserId);

Also, regarding the index on TransactionRecord - if this is the only query that would be using that index, you could improve it by creating the following index instead

CREATE INDEX IX_TransactionRecord_ContactSeq_Filtered ON TransactionRecord
    (ContactSeq, TransactionTypeCd) 
    WHERE (TransactionTypeCD <> '122');

The above is a filtered index that matches what's specified in the WHERE clause of your statement. The big thing about this is that it has already a) removed the records where the type <> '122', and b) has sorted the records already on ContactSeq so it's then easy to look them up.

By the way - given you asked about adding indexes on Foreign Keys on principle - the use of these really depends on how you read the data. If you are only ever referring to the referenced table (e.g., you have an FK to a status table, and only ever use it to report, in English, the statuses) then an index on the original table's Status_ID wouldn't help. On the other hand, if you want to find all the rows with Status_ID = 4, then it would help.

To help understanding indexes, I strongly recommend Brent Ozar's How to think like an SQL Server Engine - it really helped me to understand how indexes work in practice.

Use a sorted temp table

This may help but is unlikely to be the primary fix. If you pre-load the relevant UserIDs into a temporary table (with a primary key on UserID) then it may help with the relevant JOIN. It may also be easier for you to modify each run rather than have to modify the middle of the query.

CREATE TABLE #Users (UserId VARCHAR(10) PRIMARY KEY);
INSERT INTO #Users (UserID) VALUES
('1234567890'),
('1234567891');

Then replace the middle section of your query with

      where
         ContactSeq in 
         (
            select
               ContactSeq 
            from
               ContactRecord CR
               INNER JOIN #Users U ON CR.UserID = U.UserID 
         )
         and TransactionRecord.TransactionTypeCd not in 
         (
            '122'
         )

Simplify the query

I had a go at simplifying the query, and got it to this:

select  CR.UserId,
        max(CR.StartDt) as LastLoginDate 
from    ContactRecord CR
        INNER JOIN TransactionRecord TR ON CR.ContactSeq = TR.ContactSeq
where   TR.TransactionTypeCd not in ('122')
        AND CR.ApplicationCd not in ('1', '4', '5')
        AND CR.UserId in ('1234567890', '1234567891') -- etc
group by UserId;

or alternatively (with the temp table)

select  CR.UserId,
        max(CR.StartDt) as LastLoginDate 
from    ContactRecord CR
        INNER JOIN #Users U ON CR.UserID = U.UserID 
        INNER JOIN TransactionRecord TR ON CR.ContactSeq = TR.ContactSeq
where   TR.TransactionTypeCd not in ('122')
        AND CR.ApplicationCd not in ('1', '4', '5')
group by UserId;

One advantage of simplifying the query, is that it also helps SQL Server get good estimates; which in turn help it get good execution plans.

もちろん、上記が状況に応じてまったく同じレコードを返すことをテストする必要があります。テストするデータセットがないため、これらの簡略化されたバージョンが元のバージョンと100％一致するかどうかはわかりません。

この記事はインターネットから収集されたものであり、転載の際にはソースを示してください。

侵害の場合は、連絡してください[email protected]

編集2021-06-13

コメントを追加

サインイン

分類Dev

Related 関連記事

記事

SQLServerクエリの断続的なパフォーマンスの問題

SQLServerクエリの断続的なパフォーマンスの問題

RegexクエリのMongoDBパフォーマンスの問題

再帰クエリのパフォーマンスの問題

MongoDBクエリのパフォーマンスの問題

linqクエリのパフォーマンスの問題

単純なクエリでのパフォーマンスの問題

OpenJDKは断続的なパフォーマンスとUIの問題を示しています

パフォーマンスの問題の診断

mongodbの大きなネストされたデータのクエリパフォーマンスの問題

DECENDANTS使用時のMDXクエリのパフォーマンスの問題

集計クエリでのMySQLViewのパフォーマンスの問題

クエリでのMySqlのパフォーマンスの問題

「select」SQLiteリクエストのパフォーマンスの問題

9.6アップグレード後のpostgresqlINクエリでの奇妙なパフォーマンスの問題

SQLServerとASP.NETMVCのパフォーマンスの問題

NOTINクエリでのSparkパフォーマンスの問題

パフォーマンスの問題のクエリを特定する方法 - MYSQL

JOIN FETCHパフォーマンスの問題があるクエリ

SQL結合サブクエリの問題/パフォーマンス

サブクエリを使用したMYSQLクエリのパフォーマンスの問題

アスタリスクcdrmysqlクエリのパフォーマンスの問題

System Health OpenJDKは、断続的なパフォーマンスとUIの問題を示しています。Oracle JRE / JDKの使用をお勧めします

パーティションと最大のクエリパフォーマンスの問題が遅い

SQLServer-データが膨大な場合のIN演算子の使用中のクエリのパフォーマンス

SQLServerのパフォーマンスの問題。「パラメータ内」のレコード数が増えると、クエリのパフォーマンスが大幅に低下します。

SQLクエリのパフォーマンスの問題の全表スキャン

Javaリフレクションのパフォーマンスの問題

再帰的なORMクラスでのSpringリポジトリのパフォーマンスの問題

SQLServerのクエリパフォーマンスチューニング

パフォーマンスの問題