/bugs/230350/+bug-portlet-subscribers-content has complex queries - structural subscriptions have poor queries on bugs with many bugtasks

Bug #731099 reported by Robert Collins
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
High
Gary Poster

Bug Description

I was doing scaling testing on bug 724033 and found the following query:

SELECT DISTINCT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname, Person.hide_email_addresses, Person.homepage_content,
Person.icon, Person.id, Person.logo, Person.mailing_list_auto_subscribe_policy, Person.merged, Person.mugshot, Person.name, Person.personal_standing, Person.personal_standing_reason, Person.registrant, Person.renewal_policy, Person.subscri
ptionpolicy, Person.teamdescription, Person.teamowner, Person.verbose_bugnotifications, Person.visibility FROM StructuralSubscription LEFT JOIN BugSubscriptionFilter ON BugSubscriptionFilter.structuralsubscription = StructuralSubscription.
id JOIN Person ON Person.id = StructuralSubscription.subscriber WHERE (BugSubscriptionFilter.id IN (SELECT BugSubscriptionFilter.id FROM StructuralSubscription JOIN BugSubscriptionFilter ON BugSubscriptionFilter.structuralsubscription = St
ructuralSubscription.id LEFT JOIN BugSubscriptionFilterStatus ON BugSubscriptionFilterStatus.filter = BugSubscriptionFilter.id LEFT JOIN BugSubscriptionFilterImportance ON BugSubscriptionFilterImportance.filter = BugSubscriptionFilter.id L
EFT JOIN BugSubscriptionFilterTag ON BugSubscriptionFilterTag.filter = BugSubscriptionFilter.id WHERE StructuralSubscription.id IN (1) AND ((StructuralSubscription.product = 4 OR StructuralSubscription.project = 4) AND (BugSubscriptionFilt
erImportance.importance = 20 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.product = 130) AND (BugSubscriptionF
ilterImportance.importance = 5 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.product = 131) AND (BugSubscriptio
nFilterImportance.importance = 5 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.product = 132) AND (BugSubscript
ionFilterImportance.importance = 5 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.product = 133) AND (BugSubscri
ptionFilterImportance.importance = 5 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.product = 134) AND (BugSubsc
riptionFilterImportance.importance = 5 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.product = 135) AND (BugSub
scriptionFilterImportance.importance = 5 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.product = 136) AND (BugS
ubscriptionFilterImportance.importance = 5 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.product = 137) AND (Bu
....

NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.distribution = 1 AND StructuralSubscription.sourcepackagename = 1 OR StructuralSubscription.distribution = 1 AND StructuralSubscription.sourcepackagename IS NULL) AND (BugSubscriptionFilterImportance.importance = 30 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 10 OR BugSubscriptionFilterStatus.status IS NULL) OR (StructuralSubscription.distribution = 3 AND StructuralSubscription.sourcepackagename = 1 OR StructuralSubscription.distribution = 3 AND StructuralSubscription.sourcepackagename IS NULL) AND (BugSubscriptionFilterImportance.importance = 20 OR BugSubscriptionFilterImportance.importance IS NULL) AND (BugSubscriptionFilterStatus.status = 20 OR BugSubscriptionFilterStatus.status IS NULL)) AND NOT BugSubscriptionFilter.include_any_tags GROUP BY BugSubscriptionFilter.id HAVING COUNT((CASE WHEN BugSubscriptionFilterTag.include THEN BugSubscriptionFilterTag.tag END)) = 0) OR StructuralSubscription.id IN (1) AND BugSubscriptionFilter.id IS NULL)

Related branches

Revision history for this message
Robert Collins (lifeless) wrote :

The actual query is about 40 times longer.

Changed in launchpad:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Gary Poster (gary) wrote :

This query is long because we are trying to do complex things, some of which are dependent on the bugtasks in a bug, so the query will necessarily expand as more and more bugtasks are involved. We aggregate them so that the "inner loops" are done within a single SQL query rather than in Python + multiple SQL queries.

As noted in the MP for my changes, the query now grows at a much slower rate than the previous approach.

The current version of the query increases in size at twice the rate that it would ideally, in my estimation. To get rid of the "twice", we need to either make an additional intermediate query or introduce the use of WITH statements. My approach is already breaking out a separate query for what I believe is the significantly bigger win. I avoided making the third query because I didn't think it was necessary, and in my experience making separate queries is a cost to be balanced with other considerations. Makin the third q. I would expect it to make the query roughly 20 times longer than your excerpt, rather than 40 times longer.

We could also remove a part of the feature to reduce the query to 20 times longer rather than 40 times.

Another alternative is to go back to the drawing board for fundamental design of the feature, which I think is unnecessary and would need to involve discussion with Francis about timeline.

In addition to being unwieldy, does an equivalent query take a long time on staging? I'm hopeful that it is actually pretty fast, because it is working with a constrained set of pre-calculated rows that mitigates the cost of the other checks. If not, I'd again first look at the "third query" option I mentioned above. I'd also return to the suggestion that we push this to the cron job: the arguments against it are surmountable, I believe.

In sum, I believe we are pushing against a challenge inherent in our goals: trying to reduce the number of SQL queries will cause some queries to grow, as we push inner loops from Python + multiple SQL queries to within the SQL queries themselves. We can slow the growth, but we can't eliminate it without changing the questions we ask. I believe we are asking reasonable questions now.

Revision history for this message
Gary Poster (gary) wrote :

Meh, somehow my browser thought I was ready to submit when I was in the middle of a review/edit. In particular "Makin the third q." was supposed to be "Making the third query would be relatively easy to do, if desired."

Gary Poster (gary)
tags: added: story-better-bug-notification
Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 731099] Re: structural subscriptions have poor queries on bugs with many bugtasks

Thanks Gary. When I looked at it it looked like there were entirely
duplicate sections - so perhaps this is case 0) the query builder just
needs a little tuning ?

Bug https://launchpad.net/bugs/230350 should generate a similarly
complex query for you on qastaging or staging, and if either have the
large number of filter rows you're adding as part of your work, we
should be able to see the query.

I would directly visit the subscribers portlet view and add
++profile++show to your url so that you can just hit end to get to the
sql queries : the long one should be visible there and have its time
recorded.

-Rob

Revision history for this message
Gary Poster (gary) wrote :

On Mar 8, 2011, at 2:50 PM, Robert Collins wrote:

> Thanks Gary. When I looked at it it looked like there were entirely
> duplicate sections - so perhaps this is case 0) the query builder just
> needs a little tuning ?

The duplicate sections are the "twice the rate that it would ideally" I was mentioning. ...They are there for the reasons I described, and can be removed using the methods I mentioned.

> Bug https://launchpad.net/bugs/230350 should generate a similarly
> complex query for you on qastaging or staging, and if either have the
> large number of filter rows you're adding as part of your work, we
> should be able to see the query.

Staging is the only place that has the filters, unfortunately, and it has been down for 15 hours so far (see my note to canonical-launchpad). qastaging times out on that page without even getting to my code, AFAICT. :-/

Gary

Revision history for this message
Gary Poster (gary) wrote : Re: structural subscriptions have poor queries on bugs with many bugtasks

To be clear, I have looked at the queries, on my local instance. I analyzed them for the MP under various circumstances (including increasing bugtasks), and presented the SQL to Danilo as well.

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 731099] Re: structural subscriptions have poor queries on bugs with many bugtasks

For timings - are you looking at the subscription portlet or the main
bug page. The main bug page will timeout for sure, but the portlet
shouldn't.

summary: + /bugs/230350/+bug-portlet-subscribers-content has complex queries -
structural subscriptions have poor queries on bugs with many bugtasks
Gary Poster (gary)
Changed in launchpad:
assignee: nobody → Gary Poster (gary)
status: Triaged → In Progress
Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
Changed in launchpad:
milestone: none → 11.05
tags: added: qa-needstesting
Changed in launchpad:
status: In Progress → Fix Committed
Revision history for this message
Gary Poster (gary) wrote :

The query is still long (33773 chars), but much shorter than it was. Unfortunately, I didn't save a snapshot of the full query before the change, but if Robert's comment 1 was not too hyperbolic, then the original was about 180000, so the new version is around 1/5 the size of the old.

There is still some duplication (In particular, " OR StructuralSubscription.distribution = 1 AND" is found 310 times in the query) but as I mentioned in my MP, this is because we are using an abstraction to identify bug targets, and I'd prefer to keep it that way. That said, if we were to get rid of those clauses, the query would be half the size again. Stuart mentioned a DB-level abstraction today (a BugTarget table) in another context; switching from an app-level abstraction to a db-level abstraction would be a nice way to get this last shrink in too. I'm marking this qa-ok without that change though: I think that's a separate bug, and reducing the query to 1/5 the size seems respectable enough.

Performance for the query in this edge case seems to have improved, going from around 400ms before to around 200ms after. The sample size for those comparisons is, um, small, but seems roughly stable.

tags: added: qa-ok
removed: qa-needstesting
Revision history for this message
Robert Collins (lifeless) wrote :

\o/

Brad Crittenden (bac)
Changed in launchpad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.