Initialize Device transaction incorrect "Done" status

Bug #1728076 reported by Kyle Nitzsche
Affects  Status   Importance  Assigned to  Milestone
snapd    Triaged  Medium      Unassigned

Bug Description

In a seeded classic image with a gadget whose prepare-device hook points at a serial vault URL that does not exist, the "Initialize device" change ends up with a status of "Done" once it finally times out.

Expectation: a final status indicating that the change did not complete and will therefore run again, for example Error or Hold.

Note: Even though the "Initialize device" change ended up with a status of Done after it timed out, it did in fact run again. In the meantime, a corrected gadget pointing at the correct serial vault URL had been installed. The new change also ended up with a status of Done, and this time a serial assertion was obtained.

Changes:

ubuntu@ubuntu:~$ snap changes
ID Status Spawn Ready Summary
2 Done 2017-10-27T05:33:27Z 2017-11-15T18:26:23Z Initialize device
4 Done 2017-11-15T18:26:23Z 2017-11-15T18:31:30Z Initialize device

ubuntu@ubuntu:~$ snap change 2
Status Spawn Ready Summary
Done 2017-10-27T05:33:27Z 2017-11-15T18:26:23Z Run prepare-device hook
Done 2017-10-27T05:33:27Z 2017-11-15T18:26:23Z Generate device key
Hold 2017-10-27T05:33:27Z 2017-11-15T18:26:23Z Request device serial

......................................................................
Request device serial

2017-10-27T05:36:34Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:18:12Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:19:13Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:20:14Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:21:14Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:22:15Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:23:15Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:24:16Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:25:17Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid
2017-11-15T18:26:17Z ERROR cannot retrieve request-id for making a request for a serial: Post https://serial-vault.com/v1/request-id: x509: certificate has expired or is not yet valid

ubuntu@ubuntu:~$ snap change 4
Status Spawn Ready Summary
Done 2017-11-15T18:26:23Z 2017-11-15T18:31:23Z Run prepare-device hook
Done 2017-11-15T18:26:23Z 2017-11-15T18:31:23Z Generate device key
Done 2017-11-15T18:26:23Z 2017-11-15T18:31:30Z Request device serial

description: updated
Changed in snappy:
importance: Undecided → Medium
Samuele Pedroni (pedronis) wrote :

I looked at our state machine again. We have the following transitions that end up in Hold, all related to trying to abort a change and its tasks:

i) Do ---> Hold (aborting a never started task)

ii) Doing ---> Abort -- (run tryUndo after current task run is terminated in a retry state or observed from outside the running goroutine) -> Hold (if the task doesn't support undo)

related transitions not ending in Hold:

iii) Done ---> Undo ---> Done (if the task doesn't support undo)

iv) Doing ---> Abort -- (we don't get to run tryUndo, the task terminates before then in a non-retry state) -> Undo --> Done (if the task doesn't support undo)

So we never reach Hold from a task that was actually fully Done. It therefore seems it would not be incorrect to assign the status Hold to a Change that has a combination of Done and Hold tasks.

In fact, the current decision seems to stem from some confusion over time about transitions i/ii vs iii/iv and the actual use of Hold.

Kyle Nitzsche (knitzsche) wrote :

Does that mean the bug Status should be Confirmed?

Kyle Nitzsche (knitzsche) wrote :

Should this bug therefore be confirmed?

Michael Vogt (mvo)
Changed in snappy:
status: New → Triaged
Michael Vogt (mvo)
affects: snappy → snapd