Comment 2 for bug 731099

Revision history for this message
Gary Poster (gary) wrote : Re: structural subscriptions have poor queries on bugs with many bugtasks

This query is long because we are trying to do complex things, some of which are dependent on the bugtasks in a bug, so the query will necessarily expand as more and more bugtasks are involved. We aggregate them so that the "inner loops" are done within a single SQL query rather than in Python + multiple SQL queries.

As noted in the MP for my changes, the query now grows at a much slower rate than the previous approach.

The current version of the query increases in size at twice the rate that it would ideally, in my estimation. To get rid of the "twice", we need to either make an additional intermediate query or introduce the use of WITH statements. My approach is already breaking out a separate query for what I believe is the significantly bigger win. I avoided making the third query because I didn't think it was necessary, and in my experience making separate queries is a cost to be balanced with other considerations. Makin the third q. I would expect it to make the query roughly 20 times longer than your excerpt, rather than 40 times longer.

We could also remove a part of the feature to reduce the query to 20 times longer rather than 40 times.

Another alternative is to go back to the drawing board for fundamental design of the feature, which I think is unnecessary and would need to involve discussion with Francis about timeline.

In addition to being unwieldy, does an equivalent query take a long time on staging? I'm hopeful that it is actually pretty fast, because it is working with a constrained set of pre-calculated rows that mitigates the cost of the other checks. If not, I'd again first look at the "third query" option I mentioned above. I'd also return to the suggestion that we push this to the cron job: the arguments against it are surmountable, I believe.

In sum, I believe we are pushing against a challenge inherent in our goals: trying to reduce the number of SQL queries will cause some queries to grow, as we push inner loops from Python + multiple SQL queries to within the SQL queries themselves. We can slow the growth, but we can't eliminate it without changing the questions we ask. I believe we are asking reasonable questions now.