LoginToken:+accountmerge with many translations cannot complete in-webapp-request

Bug #677713 reported by Robert Collins
This bug affects 1 person
Affects: Launchpad itself
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

OOPS-1784A1984

https://launchpad.net/token/XXXXX/+accountmerge

User: Sense Hofstede
DB id: 455815
Branch: launchpad-rev-9972
Revno: 9972
SQL time: 18854 ms
Non-sql time: 345 ms
Total time: 19199 ms
Statement Count: 253

id Statement
1 17070.0 1 SQL-launchpad-main-master UPDATE TranslationMessage SET submitter=455815 WHERE submitter=3368841

This is an example of a systemic problem with active translators. This user isn't a high-volume user; some translators have hundreds of thousands of translations. The OOPS probably reflects table contention with the message merging script that was running at the time, but we expect failures like this whenever very busy or long-lived accounts are merged.

Robert Collins (lifeless) wrote :

launchpad_qastaging=> select count(*) from translationmessage where submitter = 3368841;
 count
-------
  2279
(1 row)

launchpad_qastaging=> select count(*) from translationmessage where submitter = 455815;
 count
-------
  1147
(1 row)

Robert Collins (lifeless) wrote :

3.5 thousand occurrences on Sunday.

Robert Collins (lifeless) wrote :

The frequency seems to have diminished now: either they succeeded, have given up, or whatever was causing that update to stall (perhaps a batch job holding a long transaction?!) has stopped.

Данило Шеган (danilo) wrote :

I can imagine cases where it'd be even worse. Much, much worse. For instance, I alone have around 134000 TranslationMessages: updating them in a live web page request is basically impossible, for more reasons than one (it means updating 134000 rows in a single request, and it will cause replication lag to grow substantially). There are other translators with many more translations (for instance, Andre Gondim has >300k TranslationMessages). So we simply can't fix the problem while account merging works the way it does: it has to be decoupled from the web UI. Do note that this is not a "rare" or "unusual" occurrence: many translations come from upstream, and translators who are only now starting to use LP may well want to merge the accounts that LP has created for, e.g., their GNOME and KDE emails, each of which might have thousands of translations attached.

I am simply marking this as 'won't fix' for translations, recognizing it as a problem, but admitting defeat without reworking account merging. If account merging is reworked, nothing would need to be fixed regarding translations anyway. The only thing we may do is try to optimize this harder, but there's no way we can avoid timeouts completely.
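
To make the batching idea concrete, here is a minimal sketch of moving the submitter reassignment out of a single large UPDATE and into small, separately committed chunks, as a background job might do. It is not Launchpad's actual merge code: it uses Python's stdlib sqlite3 as a stand-in database, and the function name merge_submitter_in_batches, the integer id primary key, and the batch_size parameter are all assumptions made for illustration.

```python
import sqlite3


def merge_submitter_in_batches(conn, from_id, to_id, batch_size=1000):
    """Reassign TranslationMessage rows from from_id to to_id in chunks.

    Hypothetical helper: assumes the table has an integer primary key
    column named id. Each chunk is committed on its own, so no single
    transaction touches more than batch_size rows.
    Returns the total number of rows updated.
    """
    if from_id == to_id:
        # Otherwise the loop below would never run out of matching rows.
        raise ValueError("from_id and to_id must differ")
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE TranslationMessage SET submitter = ? "
            "WHERE id IN (SELECT id FROM TranslationMessage "
            "             WHERE submitter = ? LIMIT ?)",
            (to_id, from_id, batch_size),
        )
        conn.commit()  # end the transaction after every chunk
        if cur.rowcount == 0:
            break  # nothing left to move
        total += cur.rowcount
    return total
```

Short transactions are the point of the design: each chunk holds its row locks only briefly, which is the kind of behaviour a job decoupled from the web request would want. The sketch does not address replication lag or the other tables an account merge touches.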

Changed in rosetta:
status: New → Won't Fix
Robert Collins (lifeless) wrote :

Thanks for the clear explanation, Danilo. That means that we want to fix this eventually, and there is work on making account merges batched anyway. That won't necessarily fix this on its own, so I'm going to reopen this and triage it down to low. It should be an open bug because:
 - it is a defect
 - we want to support account merge
 - thus we want to fix it even if we don't know how.

Changed in rosetta:
status: Won't Fix → Triaged
importance: High → Low
summary: - LoginToken:+accountmerge with many translations
+ LoginToken:+accountmerge with many translations cannot complete in-webapp-request
description: updated
Changed in rosetta:
importance: Low → High
Robert Collins (lifeless) wrote :

Actually, putting it back to high - if we need exceptions to the zero oops policy we should discuss that and make it part of the policy. (I think we probably do, but I'd like some team thinking on this, rather than making it harder for other folk to triage).

Данило Шеган (danilo) wrote :

I actually think we should open a registry task for this instead. We can't fix this inside translations. It's simply not doable, imho. IOW, if we fix it in translations, we would be silly not to fix it completely and make it batched for everything else (i.e. we could batch only the translations part and then call that a translations fix, but that would be very silly imo).

I am not going to play with statuses here because I don't want a "status war" :), but if you are in agreement, I'll let you open a registry bug task and close the rosetta task. If not, then I suppose you have a different translations-only solution in mind.

Francis J. Lacoste (flacoste) wrote :

This bug is going to be a 'Launchpad' bug tomorrow, so opening another bug is moot at this point :-)

Данило Шеган (danilo) wrote :

Oh, I was thinking of another bug task, or perhaps even reassigning to registry — then it would get the proper tag :)
