I'm running the following Select query:
SELECT "entry"."id" AS "entry_id",
"entry"."input" AS "entry_input",
"entry"."output" AS "entry_output",
"entry"."numOfWords" AS "entry_numOfWords",
"entry"."times_seen" AS "entry_times_seen",
"word_class"."value" AS "word_class_value",
"dominant_noun"."noun" AS "dominant_noun_noun",
"dominant_noun"."article" AS "dominant_noun_article",
"dominant_noun"."isPluaral" AS "dominant_noun_isPluaral",
"subject"."subjectIndex" AS "subject_subjectIndex",
"last_time_visited"."value" AS "last_time_visited_value"
FROM "entry" "entry"
LEFT JOIN "word_class" "word_class" ON "word_class"."entryId"="entry"."id"
LEFT JOIN "dominant_noun" "dominant_noun" ON "dominant_noun"."entryId"="entry"."id"
LEFT JOIN "subject_entries_entry" "subject_entry" ON "subject_entry"."entryId"="entry"."id"
LEFT JOIN "subject" "subject" ON "subject"."id"="subject_entry"."subjectId"
LEFT JOIN "last_time_visited" "last_time_visited" ON "last_time_visited"."entryId"="entry"."id"
WHERE "entry"."inputLang" = 31
AND ("entry"."input" like '% hilfe %' OR "entry"."input" like 'hilfe %' OR "entry"."input" like '% hilfe')
ORDER BY "word_class"."value" DESC, "entry"."numOfWords" ASC;
Time result:
real 0m15.100s
user 0m14.072s
sys 0m1.024s
Against this Database schema:
CREATE TABLE sqlite_sequence(name,seq);
CREATE TABLE IF NOT EXISTS "subject" ("id" integer PRIMARY KEY AUTOINCREMENT NOT NULL, "subjectIndex" tinyint NOT NULL);
CREATE TABLE IF NOT EXISTS "entry" ("id" integer PRIMARY KEY AUTOINCREMENT NOT NULL, "inputLang" tinyint NOT NULL, "outputLang" tinyint NOT NULL, "input"
varchar NOT NULL, "output" varchar NOT NULL, "numOfWords" tinyint NOT NULL, "times_seen" integer NOT NULL DEFAULT (0));
CREATE TABLE IF NOT EXISTS "abbr" ("id" integer PRIMARY KEY AUTOINCREMENT NOT NULL, "value" varchar NOT NULL, "entryId" integer, CONSTRAINT "REL_ca935aaf7
66cba1e7bfbe90275" UNIQUE ("entryId"), CONSTRAINT "FK_ca935aaf766cba1e7bfbe902757" FOREIGN KEY ("entryId") REFERENCES "entry" ("id"));
CREATE TABLE IF NOT EXISTS "word_class" ("id" integer PRIMARY KEY AUTOINCREMENT NOT NULL, "value" integer NOT NULL, "entryId" integer, CONSTRAINT "REL_941
45442deb2b2209bd943a787" UNIQUE ("entryId"), CONSTRAINT "FK_94145442deb2b2209bd943a7874" FOREIGN KEY ("entryId") REFERENCES "entry" ("id"));
CREATE TABLE IF NOT EXISTS "dominant_noun" ("id" integer PRIMARY KEY AUTOINCREMENT NOT NULL, "noun" varchar NOT NULL, "article" tinyint NOT NULL, "isPluar
al" boolean NOT NULL, "entryId" integer, CONSTRAINT "REL_f493eeedea653d8a89f595c82c" UNIQUE ("entryId"), CONSTRAINT "FK_f493eeedea653d8a89f595c82c4" FOREI
GN KEY ("entryId") REFERENCES "entry" ("id"));
CREATE TABLE IF NOT EXISTS "last_time_visited" ("id" integer PRIMARY KEY AUTOINCREMENT NOT NULL, "value" datetime NOT NULL DEFAULT (CURRENT_TIMESTAMP), "e
ntryId" integer, CONSTRAINT "REL_e631a6f55d59214f8e6aaa6447" UNIQUE ("entryId"), CONSTRAINT "FK_e631a6f55d59214f8e6aaa64478" FOREIGN KEY ("entryId") REFER
ENCES "entry" ("id"));
CREATE TABLE IF NOT EXISTS "subject_entries_entry" ("subjectId" integer NOT NULL, "entryId" integer NOT NULL, CONSTRAINT "FK_d2eaa7a84a7963ed94e472cef0b"FOREIGN KEY ("subjectId") REFERENCES "subject" ("id") ON DELETE CASCADE, CONSTRAINT "FK_5f940450dd4c681a9fecf0b14b2" FOREIGN KEY ("entryId") REFERENCES "entry" ("id") ON DELETE CASCADE, PRIMARY KEY ("subjectId", "entryId"));
CREATE INDEX "IDX_3091789786b922bee00bbb44b1" ON "entry" ("inputLang") ;
CREATE INDEX "IDX_36ab3550b9e3ef647d1230affc" ON "entry" ("outputLang") ;
CREATE INDEX "IDX_1b0f6266dffb9a7e6343e7faa4" ON "entry" ("input") ;
CREATE INDEX "IDX_a77c7936ea412ec1958007154a" ON "entry" ("numOfWords") ;
CREATE INDEX "IDX_b32699a03d36223ff9bad94ea6" ON "entry" ("times_seen") ;
Explain result:
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 109 0 00 Start at 109
1 SorterOpen 6 14 0 k(2,-B,B) 00
2 OpenRead 0 12 0 7 00 root=12 iDb=0; entry
3 OpenRead 1 2 0 3 00 root=2 iDb=0; word_class
4 OpenRead 7 3 0 k(2,,) 02 root=3 iDb=0; sqlite_autoindex_word_class_1
5 OpenRead 2 5 0 5 00 root=5 iDb=0; dominant_noun
6 OpenRead 8 6 0 k(2,,) 02 root=6 iDb=0; sqlite_autoindex_dominant_noun_1
7 OpenRead 3 10 0 2 00 root=10 iDb=0; subject_entries_entry
8 OpenRead 4 9 0 2 00 root=9 iDb=0; subject
9 OpenRead 5 7 0 3 00 root=7 iDb=0; last_time_visited
10 OpenRead 9 8 0 k(2,,) 02 root=8 iDb=0; sqlite_autoindex_last_time_visited_1
11 Rewind 0 92 0 00
12 Column 0 1 1 00 r[1]=entry.inputLang
13 Ne 2 91 1 (BINARY) 54 if r[1]!=r[2] goto 91
14 Column 0 3 4 00 r[4]=entry.input
15 Function0 1 3 1 like(2) 02 r[1]=func(r[3..4])
16 If 1 23 0 00
17 Column 0 3 6 00 r[6]=entry.input
18 Function0 1 5 1 like(2) 02 r[1]=func(r[5..6])
19 If 1 23 0 00
20 Column 0 3 8 00 r[8]=entry.input
21 Function0 1 7 1 like(2) 02 r[1]=func(r[7..8])
22 IfNot 1 91 1 00
23 Integer 0 9 0 00 r[9]=0; init LEFT JOIN no-match flag
24 Rowid 0 10 0 00 r[10]=rowid
25 SeekGE 7 87 10 1 00 key=r[10]
26 IdxGT 7 87 10 1 00 key=r[10]
27 DeferredSeek 7 0 1 00 Move 1 to 7.rowid if needed
28 Integer 1 9 0 00 r[9]=1; record LEFT JOIN hit
29 Integer 0 11 0 00 r[11]=0; init LEFT JOIN no-match flag
30 Rowid 0 12 0 00 r[12]=rowid
31 SeekGE 8 83 12 1 00 key=r[12]
32 IdxGT 8 83 12 1 00 key=r[12]
33 DeferredSeek 8 0 2 00 Move 2 to 8.rowid if needed
34 Integer 1 11 0 00 r[11]=1; record LEFT JOIN hit
35 Once 0 44 0 00
36 OpenAutoindex 10 3 0 k(3,B,,) 00 nColumn=3; for subject_entries_entry
37 Rewind 3 44 0 00
38 Column 3 1 13 00 r[13]=subject_entries_entry.entryId
39 Column 3 0 14 00 r[14]=subject_entries_entry.subjectId
40 Rowid 3 15 0 00 r[15]=rowid
41 MakeRecord 13 3 1 00 r[1]=mkrec(r[13..15])
42 IdxInsert 10 1 0 10 key=r[1]
43 Next 3 38 0 03
44 Integer 0 16 0 00 r[16]=0; init LEFT JOIN no-match flag
45 Rowid 0 17 0 00 r[17]=rowid
46 SeekGE 10 80 17 1 00 key=r[17]
47 IdxGT 10 80 17 1 00 key=r[17]
48 Integer 1 16 0 00 r[16]=1; record LEFT JOIN hit
49 Integer 0 18 0 00 r[18]=0; init LEFT JOIN no-match flag
50 Column 10 1 19 00 r[19]=subject_entries_entry.subjectId
51 SeekRowid 4 76 19 00 intkey=r[19]
52 Integer 1 18 0 00 r[18]=1; record LEFT JOIN hit
53 Integer 0 20 0 00 r[20]=0; init LEFT JOIN no-match flag
54 Rowid 0 21 0 00 r[21]=rowid
55 SeekGE 9 72 21 1 00 key=r[21]
56 IdxGT 9 72 21 1 00 key=r[21]
57 DeferredSeek 9 0 5 00 Move 5 to 9.rowid if needed
58 Integer 1 20 0 00 r[20]=1; record LEFT JOIN hit
59 Rowid 0 24 0 00 r[24]=rowid
60 Column 0 3 25 00 r[25]=entry.input
61 Column 0 4 26 00 r[26]=entry.output
62 Column 0 6 27 0 00 r[27]=entry.times_seen
63 Column 2 1 28 00 r[28]=dominant_noun.noun
64 Column 2 2 29 00 r[29]=dominant_noun.article
65 Column 2 3 30 00 r[30]=dominant_noun.isPluaral
66 Column 4 1 31 00 r[31]=subject.subjectIndex
67 Column 5 1 32 00 r[32]=last_time_visited.value
68 Column 1 1 22 00 r[22]=word_class.value
69 Column 0 5 23 00 r[23]=entry.numOfWords
70 MakeRecord 22 11 35 00 r[35]=mkrec(r[22..32])
71 SorterInsert 6 35 22 11 00 key=r[35]
72 IfPos 20 76 0 00 if r[20]>0 then r[20]-=0, goto 76
73 NullRow 5 0 0 00
74 NullRow 9 0 0 00
75 Goto 0 58 0 00
76 IfPos 18 79 0 00 if r[18]>0 then r[18]-=0, goto 79
77 NullRow 4 0 0 00
78 Goto 0 52 0 00
79 Next 10 47 0 00
80 IfPos 16 83 0 00 if r[16]>0 then r[16]-=0, goto 83
81 NullRow 10 0 0 00
82 Goto 0 48 0 00
83 IfPos 11 87 0 00 if r[11]>0 then r[11]-=0, goto 87
84 NullRow 2 0 0 00
85 NullRow 8 0 0 00
86 Goto 0 34 0 00
87 IfPos 9 91 0 00 if r[9]>0 then r[9]-=0, goto 91
88 NullRow 1 0 0 00
89 NullRow 7 0 0 00
90 Goto 0 28 0 00
91 Next 0 12 0 01
92 OpenPseudo 11 36 14 00 14 columns in r[36]
93 SorterSort 6 108 0 00
94 SorterData 6 36 11 00 r[36]=data
95 Column 11 10 34 00 r[34]=last_time_visited_value
96 Column 11 9 33 00 r[33]=subject_subjectIndex
97 Column 11 8 32 00 r[32]=dominant_noun_isPluaral
98 Column 11 7 31 00 r[31]=dominant_noun_article
99 Column 11 6 30 00 r[30]=dominant_noun_noun
100 Column 11 0 29 00 r[29]=word_class_value
101 Column 11 5 28 00 r[28]=entry_times_seen
102 Column 11 1 27 00 r[27]=entry_numOfWords
103 Column 11 4 26 00 r[26]=entry_output
104 Column 11 3 25 00 r[25]=entry_input
105 Column 11 2 24 00 r[24]=entry_id
106 ResultRow 24 11 0 00 output=r[24..34]
107 SorterNext 6 94 0 00
108 Halt 0 0 0 00
109 Transaction 0 0 348 0 01 usesStmtJournal=0
110 Integer 31 2 0 00 r[2]=31
111 String8 0 3 0 % hilfe % 00 r[3]='% hilfe %'
112 String8 0 5 0 hilfe % 00 r[5]='hilfe %'
113 String8 0 7 0 % hilfe 00 r[7]='% hilfe'
114 Goto 0 1 0 00
Explain Query Plan output:
QUERY PLAN
|--SCAN TABLE entry AS entry
|--SEARCH TABLE word_class AS word_class USING INDEX sqlite_autoindex_word_class_1 (entryId=?)
|--SEARCH TABLE dominant_noun AS dominant_noun USING INDEX sqlite_autoindex_dominant_noun_1 (entryId=?)
|--SEARCH TABLE subject_entries_entry AS subject_entry USING AUTOMATIC COVERING INDEX (entryId=?)
|--SEARCH TABLE subject AS subject USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE last_time_visited AS last_time_visited USING INDEX sqlite_autoindex_last_time_visited_1 (entryId=?)
`--USE TEMP B-TREE FOR ORDER BY
Analyze output:
subject||1437631
entry|IDX_b32699a03d36223ff9bad94ea6|2348382 2348382
entry|IDX_a77c7936ea412ec1958007154a|2348382 67097
entry|IDX_1b0f6266dffb9a7e6343e7faa4|2348382 2
entry|IDX_36ab3550b9e3ef647d1230affc|2348382 1174191
entry|IDX_3091789786b922bee00bbb44b1|2348382 1174191
abbr|sqlite_autoindex_abbr_1|42575 1
dominant_noun|sqlite_autoindex_dominant_noun_1|823071 1
word_class|sqlite_autoindex_word_class_1|2005516 1
subject_entries_entry|sqlite_autoindex_subject_entries_entry_1|1437631 1 1
It often takes more than 10 seconds to get the results. Although this is my first time working with SQLite but 20 seconds reply time seems strange. Please add a comment if i should provide extra info in order to resolve the problem?
The factor to the long query times you're seeing is the subject_entries_entry
table. It's a standard junction table used to relate rows in the entry
table to rows in the subject
table. The table definition uses a primary key that puts the subject id first, followed by the entry id (PRIMARY KEY ("subjectId", "entryId")
).
Your query, on the other hand, first joins the entry id on the table, and then the subject id - the opposite of the order in the key. Sqlite can and does reorder the tables in a join to try to be as efficient as possible, but in this case it's not doing so. Going to the EXPLAIN QUERY PLAN
output:
SEARCH TABLE subject_entries_entry AS subject_entry USING AUTOMATIC COVERING INDEX (entryId=?)
The SEARCH
means it's looking up specific rows in an index instead of looking at every single one (SCAN
), which is what you want, but the AUTOMATIC COVERING INDEX
part is bad. AUTOMATIC
means that the query planner hasn't found an existing index it can use, but thinks using an index will be better than having to scan the table - so it builds a temporary one that exists just for that query. It looks like the subject_entries_entry
table has a lot of rows, so this can take a while.
Recreating the table with the primary key columns in the same order they're used in the join cut the time down by a lot (As would a separate index with the columns flipped, at the expense of more disk space used).
My other advise for this table is making it a WITHOUT ROWID one. Normal sqlite tables use a 64-bit integer primary key (Known as rowid) regardless of what the table definition uses; a non-INTEGER PRIMARY KEY
is just a normal UNIQUE
index in such a table. With WITHOUT ROWID
, the primary key is the actual primary key of the table, saving space in cases like this where there's no real use for the actual rowid. Instead of a table and an index that duplicates the contents of each row, it just has a table. This optimization won't affect query speed, though, since it's using a covering index that has all needed information in the index already; the actual table isn't even looked at in the query as it is now.
I'm not sure about further speedups - looking at the query plan, it's using pre-existing indexes for the rest of the tables, and the join clauses are all straightforward. I'm a little surprised it's not using the index on entry(inputLang)
to do a search instead of a scan on that table. Maybe if you can rebuild the sqlite library with SQLITE_ENABLE_STAT4 turned on and then a PRAGMA optimize to rebuild the statistics tables, but that's getting into pretty advanced stuff depending on what language you're using (Easy in C or C++, harder in others).
Edit:
Some other things to explore:
この記事はインターネットから収集されたものであり、転載の際にはソースを示してください。
侵害の場合は、連絡してください[email protected]
コメントを追加