GitRefScanJob fails to back off and retry when hosting backend returns 502/503
Bug #1797532 reported by Tom Haddon
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Launchpad itself | Triaged | Critical | Unassigned |
Bug Description
We've seen a number of instances over the last few weeks of merge proposals against git branches not including a commit that is in the repo in question. The branch scan is failing and never being retried.
This bug report is too general to be actionable on its own. This comment is an attempt to make it actionable.
1) ceph rebalancing activity means that git.launchpad.net has very little I/O to play with. Many of the problems are due to this.
2) I've filed https://bugs.launchpad.net/turnip/+bug/1797534 for scaling issues on the turnip side.
3) Other issues with Launchpad's job runners (https://bugs.launchpad.net/launchpad/+bug/1783315 and https://bugs.launchpad.net/launchpad/+bug/1792920) can cause scan jobs to fail or be delayed.
4) Launchpad should back off and retry when the Git hosting backend returns 502/503 in response to its requests. That would make it more robust against scaling/resource issues on the hosting backend.
I suggest that we confine this bug to 4).
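To make 4) concrete, here is a minimal sketch of the back-off-and-retry behaviour, not Launchpad's actual code. The names `HostingError`, `run_with_backoff` and `RETRYABLE_STATUSES` are hypothetical, as are the delay values; the only part taken from this report is that 502 and 503 responses should be treated as transient.

```python
import time

# Statuses treated as transient, per point 4) of this report.
RETRYABLE_STATUSES = {502, 503}


class HostingError(Exception):
    """Hypothetical error carrying the HTTP status returned by the backend."""

    def __init__(self, status):
        super().__init__(f"hosting backend returned HTTP {status}")
        self.status = status


def run_with_backoff(operation, max_attempts=5, base_delay=1.0,
                     max_delay=60.0, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on 502/503.

    Any other status, or a 502/503 on the final attempt, is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except HostingError as e:
            last_attempt = attempt == max_attempts - 1
            if e.status not in RETRYABLE_STATUSES or last_attempt:
                raise
            # Delay doubles on each retry: base, 2*base, 4*base, ... capped.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

A caller would wrap the request that currently fails outright, for example `run_with_backoff(lambda: scan_refs(repo))`, where `scan_refs` is again a placeholder. In a job runner it is probably preferable to reschedule the job for a later time rather than sleep in-process, so the sketch only illustrates the retry policy, not where it should live.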