importer repeatedly requeues recently published packages
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Distributed Development | New | Undecided | Unassigned |
Bug Description
The importer currently requeues recently published packages many times over.
Specifically, every 5 minutes, it queues a job to import everything that was newly published from 10 minutes *BEFORE* the last package it previously knew to be published.
Note the *BEFORE* meaning that subsequent runs of the queue adder script deliberately overlap with what the previous run processed.
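As a sketch of that windowing behaviour (the function and variable names here are illustrative, not the importer's actual code), the start of each poll is pushed back behind the newest publication already seen:

```python
from datetime import datetime, timedelta

# The deliberate overlap between consecutive queue-adder runs.
OVERLAP = timedelta(minutes=10)

def next_window_start(last_known_published: datetime) -> datetime:
    """Start the next poll 10 minutes before the newest publication we
    have already seen, so consecutive runs deliberately overlap."""
    return last_known_published - OVERLAP

# If the newest known publication was at 12:00, the next run re-fetches
# everything published since 11:50, re-queuing anything in between.
start = next_window_start(datetime(2012, 4, 28, 12, 0))
```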
Usually this isn't a big deal - the importer can keep up with a few extra redundant jobs.
The problem comes during a mass requeue or the opening of a new Ubuntu series. In this situation, very many spurious queue entries can build up, delaying the importer in catching up with a new series opening or processing a backlog of requeues.
Case in point: we're currently in the early days of Quantal Quetzal, and I've just manually disabled the queue-addition cronjob for a while and marked ~25000 importer jobs to be skipped, because they were redundant with the other ~10000 I left alone.
Note: the way the web UI presents the queue, you can't see the buildup, because the number presented in the web UI is *unique* queue entries. A `sqlite3 meta.db 'SELECT COUNT(*) FROM jobs WHERE active'` shows the truth, as does inspecting the driver logs to see how many times some packages get re-processed in a single day.
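The gap between the two numbers can be demonstrated against a stand-in database (the `package` column is an assumption here; only the `jobs` table and its `active` column appear in this report):

```python
import sqlite3

# In-memory stand-in for the importer's meta.db; the real schema may differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (package TEXT, active INTEGER)")
db.executemany("INSERT INTO jobs VALUES (?, 1)",
               [("hello",), ("hello",), ("hello",), ("dpkg",)])

# Roughly what the web UI shows: unique active entries.
unique = db.execute(
    "SELECT COUNT(DISTINCT package) FROM jobs WHERE active").fetchone()[0]

# The real queue depth: every active row, duplicates included.
total = db.execute(
    "SELECT COUNT(*) FROM jobs WHERE active").fetchone()[0]

print(unique, total)  # prints "2 4": the duplicates are invisible in the UI
```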
On Sat, 28 Apr 2012 09:52:09 -0000, Max Bowsher <email address hidden> wrote:
> Public bug reported:
>
> The importer currently repeatedly requeues recently published packages
> many times.
>
> Specifically, every 5 minutes, it queues a job to import everything that
> was newly published from 10 minutes *BEFORE* the last package it
> previously knew to be published.
>
> Note the *BEFORE* meaning that subsequent runs of the queue adder script
> deliberately overlap with what the previous run processed.
This was a conscious decision when I wrote the code, because I wasn't
sure what guarantees LP would provide about the API responses and the
timestamps they contain.
I unfortunately have no data about whether this check has ever actually
caught something that would have been missed.
I think that we could reduce the impact of this overlap by not adding a
job if a previous run created a job for the id of the publishing record
(or for the (package, version) combo). That would still have a bit of
overlap in case the API isn't monotonic, without causing the issue you
describe.
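A minimal sketch of that dedupe check (all names here are hypothetical; the queue adder's actual schema isn't shown in this bug):

```python
import sqlite3

def should_enqueue(db, publishing_id, package, version):
    """Skip a job if an earlier run already queued one for the same
    publishing record or the same (package, version) combination."""
    row = db.execute(
        "SELECT 1 FROM jobs WHERE active AND "
        "(publishing_id = ? OR (package = ? AND version = ?))",
        (publishing_id, package, version)).fetchone()
    return row is None

# Demo against an in-memory stand-in for the importer's meta.db.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs "
           "(publishing_id INTEGER, package TEXT, version TEXT, active INTEGER)")
db.execute("INSERT INTO jobs VALUES (101, 'hello', '2.8-1', 1)")

dup = should_enqueue(db, 101, "hello", "2.8-1")  # seen before: skipped
new = should_enqueue(db, 102, "hello", "2.8-2")  # genuinely new: queued
```

Keying on both the publishing-record id and the (package, version) pair keeps the monotonicity safety net while suppressing the redundant jobs.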
Alternatively, given that the importer loops over all packages anyway
when it has nothing else to do, it is eventually consistent, so the overlap
could be dropped with only a small chance that a branch stays out of
date for longer than it currently would if an oddity happens.
> Note: the way the web UI presents the queue, you can't see the buildup,
> because the number presented in the web UI is *unique* queue entries. A
> "sqlite3 meta.db 'SELECT COUNT(*) FROM jobs WHERE active'" shows the
> truth, as does inspecting the driver logs to see how many times some
> packages get re-processed in a single day.
Do you think that the unique constraint should be dropped from the queue
display on the status page?
Thanks,
James