Launchpad fails to run parallel builds for lpcraft

Bug #2007650 reported by Alex Murray
This bug affects 2 people
Affects: Launchpad itself
Status: Triaged
Importance: Low
Assigned to: Unassigned
Milestone: (none set)

Bug Description

As per the lpcraft docs (https://lpcraft.readthedocs.io/en/latest/configuration.html), a pipeline can specify a list of jobs, or a list of list of jobs, in which case the jobs in each inner list should be executed in parallel.

However, this does not seem to be supported, as can be seen at https://code.launchpad.net/~alexmurray/qa-regression-testing/+git/qa-regression-testing/+ref/lpcraft-ci on commit https://git.launchpad.net/~alexmurray/qa-regression-testing/commit/?id=cea273d3efb01c1e67721de7bd2969942f4f05a7 - no CI/lpcraft job was triggered in this case after changing the .launchpad.yaml to specify a list of list of jobs.

Jürgen Gmach (jugmac00) wrote :

The documentation may not be clear enough, but where did you read that a pipeline could specify a list of list of jobs?

A pipeline can consist of either a single job or a list of jobs, and you can have multiple pipelines.

From the specification (https://docs.google.com/document/d/14MXIbRTBpp8bvtxJBOR12S9JcLisdNJi9r3taJx96LE/edit):

```
pipeline:
  - [test, lint] # this stage is a list, which means jobs are executed in parallel
  - build-wheel # this stage will only execute if previous steps in the pipeline passed
```
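As a rough illustration of the stage semantics in the spec excerpt above, the following Python sketch runs stages in order, runs the jobs of a list-valued stage concurrently, and stops once a stage has a failing job. It is not how lpcraft or Launchpad execute pipelines; `run_pipeline` and the `run_job` callable are hypothetical names introduced only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(pipeline, run_job):
    """Run stages in order; jobs inside a list-valued stage run concurrently.

    `run_job` is a hypothetical callable that takes a job name and returns
    True on success. Returns the names of all jobs that were started.
    """
    started = []
    for stage in pipeline:
        # A bare string is a single-job stage; a list is a parallel stage.
        jobs = [stage] if isinstance(stage, str) else list(stage)
        started.extend(jobs)
        with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
            results = list(pool.map(run_job, jobs))
        if not all(results):
            break  # later stages only run if every job in this stage passed
    return started


# The pipeline from the spec excerpt, written as already-parsed data.
pipeline = [["test", "lint"], "build-wheel"]

all_pass = run_pipeline(pipeline, lambda job: True)
lint_fails = run_pipeline(pipeline, lambda job: job != "lint")
```

With every job passing, all three jobs are started; when `lint` fails, `build-wheel` is never started.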

As already mentioned in the ~lpcrafters channel, parallel job / pipeline execution is not yet implemented. [needs to be documented]

AFAIK we only ever intended to implement the parallel execution on the Launchpad level, not in lpcraft. [needs to be clarified and documented]

Not executing jobs in parallel is one thing, a stalled CI pipeline is another.

Having a look at the commit history, it seems that the issues started with the introduction of a list of lists for a single pipeline.

https://git.launchpad.net/~alexmurray/qa-regression-testing/commit/?id=cea273d3efb01c1e67721de7bd2969942f4f05a7

So the first thing to check would be the YAML parser, and to write a failing test for that case in Launchpad, since we had to use a different parser for Launchpad because Launchpad is on an older version.

The most interesting bit is ... why is CI stalled even after reverting the changes?

Alex Murray (alexmurray) wrote :

> The documentation may not be clear enough, but where did you read that a pipeline could specify a list of list of jobs?

In your example, if you remove the `build-wheel` job then your pipeline is:

```
pipeline:
  - [test, lint]
```

which is the same as:
```
pipeline:
  -
    - test
    - lint
```

which is what I mean by a list of list of jobs.
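To make the equivalence concrete, here is a small Python sketch. It writes out by hand the data structure a YAML parser is expected to produce for the flow-style and block-style spellings above (no YAML library is used), and the helper `parallel_stages` is a hypothetical name introduced only for illustration.

```python
# Hand-written parse results for the two spellings shown above.
flow_style = {"pipeline": [["test", "lint"]]}   # - [test, lint]
block_style = {"pipeline": [["test", "lint"]]}  # nested "- test" / "- lint"


def parallel_stages(config):
    """Return the pipeline stages that are themselves lists of jobs."""
    return [stage for stage in config["pipeline"] if isinstance(stage, list)]


same_structure = flow_style == block_style
nested = parallel_stages(flow_style)
```

Both spellings yield a list whose single element is itself a list, which is the "list of list of jobs" shape.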

> The most interesting bit is ... why is CI stalled even after reverting the changes?

Indeed - any ideas how we can get this restarted now that the change has been reverted?

Colin Watson (cjwatson) wrote :

It's currently failing because of a typo in your `.launchpad.yaml`:

  [2023-02-17 07:27:56,217: ERROR/ForkPoolWorker-3] Failed to request CI builds for 4fdac1049532b186e814f18b1857bb7667ba44ee: No job definition for 'glib-security'

You have `glib-security` in `pipeline`, but `glibc-security` under `jobs`. Pick one of those.

(It's unfortunate that this error isn't surfaced!)
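A check of this kind can be sketched in a few lines of Python. This is only an illustration operating on an already-parsed configuration, not Launchpad's actual validation code; the function name `undefined_jobs`, the extra `lint` job, and the empty job bodies are assumptions made for the example.

```python
def undefined_jobs(config):
    """Return job names used in `pipeline` that have no entry under `jobs`."""
    defined = set(config.get("jobs", {}))
    missing = []
    for stage in config.get("pipeline", []):
        # A stage is either a single job name or a list of job names.
        names = [stage] if isinstance(stage, str) else stage
        for name in names:
            if name not in defined and name not in missing:
                missing.append(name)
    return missing


# Mirrors the mismatch described above: the pipeline says `glib-security`,
# while the job is defined as `glibc-security`.
config = {
    "pipeline": [["glib-security", "lint"]],
    "jobs": {"glibc-security": {}, "lint": {}},
}

missing = undefined_jobs(config)
for name in missing:
    print(f"No job definition for '{name}'")
```

Running such a check locally before pushing would surface the mismatch even when the server-side error is not shown.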

Alex Murray (alexmurray) wrote :

Gah, thanks @cjwatson.

Jürgen Gmach (jugmac00) wrote :

So, what is left is that we need to update the documentation and we need to implement parallel job execution.

Changed in launchpad:
status: New → Triaged
importance: Undecided → Low
Alex Murray (alexmurray) wrote :

So it turns out that after fixing the typo in the .launchpad.yaml (thanks again for finding that, cjwatson), the parallel jobs did execute as expected; see https://code.launchpad.net/~alexmurray/qa-regression-testing/+git/qa-regression-testing/+build/21636 for an example. So I think this can be closed?

Ines Almeida (ines-almeida) wrote :

FYI, I updated the lpci documentation (https://lpci.readthedocs.io/en/latest/configuration.html) to mention that we don't yet allow parallel job execution (but plan to in the future).
