Slow queries: SQL indexes not used
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Zeitgeist Framework |
Fix Released
|
High
|
Markus Korn |
Bug Description
THE PROBLEM:
I am seeing query times around 200ms (and 150ms with my latest performance tweak in trunk), which surprised me as quite slow since my queries where quite simple. I had expected times around 1-2ms.
Reading up on the sqlite documentation I see that the queries we generate are pretty far from optimized in an sqlite world [1]. The case is that when ever you use an OR sqlite will no longer use an index. Thus this query is NOT using the indexes:
SELECT * FROM event WHERE interpretation=1 OR interpretation=2
But if we rewrite it using IN instead the indexes will be used:
SELECT * FROM event WHERE interpretation IN (1, 2)
This also explains the case where Michal where seeing tremendously slow query times when searching for a big range of mimetypes.
Looking in _zeitgeist.
THE SOLUTION
I *definitely* don't think we should panic and feverishly start rewriting our query compilation. Here's what I propose:
1) Implement an envvar ZEITGEIST_
2) Collect some useful intelligence with this new tool, and generally learn more about how we can optimize sqlite queries. A big question here is how the event_view VIEW impacts the query plan.
3) Write a new template -> SQL compilation engine that generates SQL optimized for sqlite. We can actually be quite clever about grouping our OR statements into IN clauses - but it will be tricky to get right.
NOTE: That this doesn't imply any change in the public API or event template system. That would be the wrong solution imho. Our current API is nice and simple by my standards. Let's keep it that way.
[1]: See fx the section "Using Indexes To Speed Searching" in http://
Related branches
- Markus Korn: Approve
- Diff: 0 lines
Changed in zeitgeist: | |
milestone: | 0.6 → 0.7 |
Changed in zeitgeist: | |
status: | New → Confirmed |
Changed in zeitgeist: | |
milestone: | 0.7.0 → none |
Changed in zeitgeist: | |
milestone: | none → 0.8.0 |
Changed in zeitgeist: | |
status: | Fix Committed → Fix Released |
My concern is which time frame we should take to solve this issue... How
important is it for your work?
On Tue, Sep 7, 2010 at 2:21 PM, Mikkel Kamstrup Erlandsen <
<email address hidden>> wrote:
> Public bug reported: engine. sql.WhereClause .add_text_ condition( ) I am DEBUG_QUERY_ PLANS which will spit out www.sqlite. org/vdbe. html /bugs.launchpad .net/bugs/ 632363
>
> THE PROBLEM:
> I am seeing query times around 200ms (and 150ms with my latest performance
> tweak in trunk), which surprised me as quite slow since my queries where
> quite simple. I had expected times around 1-2ms.
>
> Reading up on the sqlite documentation I see that the queries we
> generate are pretty far from optimized in an sqlite world [1]. The case
> is that when ever you use an OR sqlite will no longer use an index. Thus
> this query is NOT using the indexes:
>
> SELECT * FROM event WHERE interpretation=1 OR interpretation=2
>
> But if we rewrite it using IN instead the indexes will be used:
>
> SELECT * FROM event WHERE interpretation IN (1, 2)
>
> This also explains the case where Michal where seeing tremendously slow
> query times when searching for a big range of mimetypes.
>
> Looking in _zeitgeist.
> also pretty sure we are not using the indexes for prefix queries (eg
> file://home/*).
>
>
> THE SOLUTION
> I *definitely* don't think we should panic and feverishly start rewriting
> our query compilation. Here's what I propose:
>
> 1) Implement an envvar ZEITGEIST_
> all our SQL calls and the query plans for each of our calls. The query
> plan will tell us how the db is queried and which indexes are used if
> any. The query plan is obtained by prefixing the SQL statement with
> EXPLAIN QUERY PLAN.
>
> 2) Collect some useful intelligence with this new tool, and generally
> learn more about how we can optimize sqlite queries. A big question here
> is how the event_view VIEW impacts the query plan.
>
> 3) Write a new template -> SQL compilation engine that generates SQL
> optimized for sqlite. We can actually be quite clever about grouping our
> OR statements into IN clauses - but it will be tricky to get right.
>
> NOTE: That this doesn't imply any change in the public API or event
> template system. That would be the wrong solution imho. Our current API
> is nice and simple by my standards. Let's keep it that way.
>
> [1]: See fx the section "Using Indexes To Speed Searching" in
> http://
>
> ** Affects: zeitgeist
> Importance: Undecided
> Status: New
>
> --
> Slow queries: SQL indexes not used
> https:/
> You received this bug notification because you are subscribed to The
> Zeitgeist Project.
>
> Status in Zeitgeist Framework: New
>
> Bug description:
> THE PROBLEM:
> I am seeing query times around 200ms (and 150ms with my latest performance
> tweak in trunk), which surprised me as quite slow since my queries where
> quite simple. I had expected times around 1-2ms.
>
> Reading up on the sqlite documentation I see that the queries we generate
> are pretty far from optimized in an sqlite world [1]. The case is that when
> ever you use an OR sqlite will no longer use an index. Thus this query is
> NOT using the indexes:
>
> SELECT * FROM event WHERE inte...