Failure during initial branch scan increases scan time enormously
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Launchpad itself | Triaged | Critical | Unassigned |
Bug Description
Excerpted from https:/
...here's some diagnosis from experiments with lp:~wallyworld/launchpad/create-aag-privacy-transitions-1009364, a Launchpad branch which is now permanently cursed and unable to scan within the timeout.
wallyworld pushed the branch at around 01:04. It starts to scan, but times out 5 minutes later.
2012-06-18 01:05:11 INFO Running <SCAN_BRANCH branch job (4134693) for ~wallyworld/
2012-06-18 01:10:15 INFO Job resulted in OOPS: OOPS-415b4f54e93cb4a66345cf3807221142
Another scan is attempted at 01:11:10, timing out at 01:16:23 with OOPS-d214e0439183a7ec5ccb2bf73c34efec. wallyworld pushed it to a new name at 01:22, and the new branch scanned successfully in 109s.
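As a quick check on the timings quoted above, the elapsed time of each failed scan can be worked out from the log timestamps. This is only a sketch: the `elapsed_seconds` helper is a hypothetical name, and the second pair of timestamps is assumed to fall on the same date as the first.

```python
from datetime import datetime

# Timestamp layout used in the scanner log lines quoted above.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> int:
    """Return the whole seconds between two log timestamps."""
    delta = datetime.strptime(end, LOG_FORMAT) - datetime.strptime(start, LOG_FORMAT)
    return int(delta.total_seconds())


# First attempt: job started 01:05:11, OOPS logged 01:10:15.
first_scan = elapsed_seconds("2012-06-18 01:05:11", "2012-06-18 01:10:15")
# Second attempt: started 01:11:10, timed out 01:16:23 (date assumed).
second_scan = elapsed_seconds("2012-06-18 01:11:10", "2012-06-18 01:16:23")

print(first_scan, second_scan)
```

Both failed attempts ran for a little over five minutes, against 109s for the successful scan of the same content pushed under a new name.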
I noticed the accursed branch and coordinated with wallyworld and spm to debug it a little. A scan was forced by pushing a new rev at 03:33, and the scan timed out as expected with OOPS-d83500d8b2cd4f2355127bb914508982. The branch state after this was https:/
https:/
summary:
- Failure to scan branch increases scan time enormously
+ Failure during initial branch scan increases scan time enormously
Branches hit by this bug can now be unstuck by running 'scripts/unscan-branch.py --rescan lp:~PATH/TO/BRANCH' as bzrsyncd@ackee (or anywhere with access to the branchscanner DB user). It erases the BranchRevisions, last_scanned_id, and some other stuff, and requests a rescan from the normal branch scanner. For really big branches the DB cache may be too cold for the first scan to complete, so it might need to be retried a couple of times.
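The retry step could be wrapped roughly as follows. This is a sketch under assumptions, not part of Launchpad: `rescan_with_retries`, the `is_scanned` callback, the attempt count and the wait time are all hypothetical, and the caller has to supply its own way of telling whether the scan completed. Only the command line itself comes from the comment above, and it still has to be run from a Launchpad tree as a user with access to the branchscanner DB user.

```python
import subprocess
import time
from typing import Callable, List


def unscan_command(branch: str) -> List[str]:
    """Build the unscan-and-rescan command quoted in the comment above."""
    return ["scripts/unscan-branch.py", "--rescan", branch]


def rescan_with_retries(
    branch: str,
    is_scanned: Callable[[str], bool],  # hypothetical caller-supplied check
    attempts: int = 3,                  # assumed: "a couple of times"
    wait_seconds: float = 600.0,        # assumed wait for the scanner to run
    run: Callable[..., object] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Request a rescan until is_scanned(branch) is true or attempts run out."""
    for _ in range(attempts):
        # check=True raises CalledProcessError if the script itself fails.
        run(unscan_command(branch), check=True)
        # Give the normal branch scanner time to pick up the request.
        sleep(wait_seconds)
        if is_scanned(branch):
            return True
    return False
```

Each retry erases the scan state again before requesting the rescan, so the value of retrying comes only from the DB cache being warmer on later attempts.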