@Francis: To be honest, I have no idea whether this should be classified as a regression. This code predates the Yellow squad, but it was developed for this story. It's conceivable that we aggravated something somehow, but it isn't immediately clear to me how. I'd suggest using another metric to decide whether this should be addressed immediately/sooner.
@Robert: A few replies to your lengthier comment.
- Showing the subscribers abstractly--for the web UI, in particular--is a different question than the one this query is asking. The latter takes structural subscription filters into account, and the former does not. The code that is timing out is exclusively about figuring out who should be notified for a particular notification, not about who is subscribed in the abstract. The notification code could be run offline.
- Offline tasks need to be fast, but they can be slower than in-browser tasks, or we have yet another problem IMO. [A side thought: in this particular case if a 1.5 second transaction were too long (and I am aware of the dangers of long write transactions) then we also might be able to divide up the work into a slave-based read transaction and a gracefully-recover-if-it-fails write transaction.]
- My immediate suspicion is that the filters need an index we do not have, but investigation should point us to this relatively quickly.
- It's possible that a quick run through the query planner will turn up a smoking gun that points to an immediate remediation. I tried to do that just now, but I'm not clear on how to quickly resolve the query variables from the OOPS, not having done this before. If Robert/Stuart can get data from the query planner that would be cool, though if we are told to drop everything and focus on this I suspect someone else on the Yellow squad will know how to do it.
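To illustrate the kind of check I mean by running the query through the planner, here is a minimal sketch. It is not the real query or schema: the table `subscription_filter`, its columns, and the index name are all made up for illustration, and SQLite is used only so the sketch is self-contained. The production database's own EXPLAIN output will look different.

```python
import sqlite3


def plan_for(conn, sql, params=()):
    """Return the planner's description of each step for the given query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]


def demo_plans():
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for the structural subscription filters.
    conn.execute(
        "CREATE TABLE subscription_filter ("
        "id INTEGER PRIMARY KEY, subscription_id INTEGER, description TEXT)"
    )
    query = (
        "SELECT description FROM subscription_filter "
        "WHERE subscription_id = ?"
    )

    # Without an index, the planner has to scan the whole table.
    before = plan_for(conn, query, (1,))

    # Hypothetical index on the column used in the WHERE clause.
    conn.execute(
        "CREATE INDEX subscription_filter__subscription_id__idx "
        "ON subscription_filter (subscription_id)"
    )
    after = plan_for(conn, query, (1,))
    return before, after


if __name__ == "__main__":
    before, after = demo_plans()
    print("before:", before)
    print("after: ", after)
```

If the plan for the real query shows a full scan on a filter table where the second form was expected, that would support the missing-index theory. If not, the problem is somewhere else.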
In sum, I hope to assign myself (so I can learn) with Danilo's assistance (so he can teach me) when we tackle this task. For obvious reasons, I would prefer not to treat this as a burning task that requires us to drop our current feature work.
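For reference, the side thought above about splitting the work could look roughly like the sketch below. It assumes the recipient calculation can be done read-only (so it could run against a slave) and that only recording the result needs a short write transaction. The tables, columns, and function names are hypothetical, and SQLite stands in for the real database so the sketch runs on its own.

```python
import sqlite3


def compute_recipients(read_conn, bug_id):
    """Read-only phase: the slow part, which could run against a slave."""
    rows = read_conn.execute(
        "SELECT DISTINCT person FROM subscription "
        "WHERE bug_id = ? ORDER BY person",
        (bug_id,),
    ).fetchall()
    return [person for (person,) in rows]


def record_notifications(write_conn, bug_id, recipients):
    """Short write phase; reports failure instead of raising."""
    try:
        # The connection context manager commits on success and rolls
        # back if the block raises.
        with write_conn:
            write_conn.executemany(
                "INSERT INTO notification (bug_id, person) VALUES (?, ?)",
                [(bug_id, person) for person in recipients],
            )
    except sqlite3.Error:
        # Graceful recovery: the caller can retry or requeue the work.
        return False
    return True


def demo():
    # One in-memory connection plays both roles here; in practice the
    # read and write connections would point at different servers.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE subscription (bug_id INTEGER, person TEXT);
        CREATE TABLE notification (
            bug_id INTEGER, person TEXT, UNIQUE (bug_id, person));
        INSERT INTO subscription VALUES
            (1, 'alice'), (1, 'bob'), (2, 'carol');
        """
    )
    recipients = compute_recipients(conn, 1)
    first = record_notifications(conn, 1, recipients)
    # Recording the same rows again violates UNIQUE and is rolled back.
    second = record_notifications(conn, 1, recipients)
    count = conn.execute("SELECT COUNT(*) FROM notification").fetchone()[0]
    return recipients, first, second, count
```

The point of the split is that the expensive query never holds a write transaction open. The write phase only inserts rows that were already computed, so it stays short, and a failure there loses nothing that cannot be recomputed.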