Canonical Juju

Bug #1999758
Comment #3

Simon Richardson (simonrichardson) wrote on 2023-01-17:

Digging into this, it seems there is a sequencing problem when enqueuing onto the scheduler. The code expects that a filesystem change always precedes a filesystem attachment change[1]. In normal day-to-day operation this holds, but after a hard restart it isn't always the case.

Unfortunately, the case statements in a Go select are shuffled upon every entry, so when several cases are ready at once, which one runs first is pseudo-random. When we receive all the information at once, after the restart has completed, there is no guarantee of the correct order[2]. For example, in most cases the ordering is valid:

filesystems changed: []string{"0/0"}
filesystem attachments changed: []watcher.MachineStorageId{watcher.MachineStorageId{MachineTag:"machine-0", AttachmentTag:"filesystem-0-0"}}

When an attachment is not successful, the attachment change comes before the filesystem change:

filesystem attachments changed: []watcher.MachineStorageId{watcher.MachineStorageId{MachineTag:"machine-0", AttachmentTag:"filesystem-0-0"}}
filesystems changed: []string{"0/0"}
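
The ordering behaviour can be reproduced in isolation. The following is a minimal sketch, not the storageprovisioner code: the channel names and the countFirstReady helper are hypothetical stand-ins for the two watcher channels. It makes both events pending before each select, as would happen after a restart, and counts which case Go picks first.

```go
package main

import "fmt"

// countFirstReady makes both channels ready before each select and records
// which case Go picks. The channel names are hypothetical stand-ins for the
// two watcher channels; they are not the real storageprovisioner fields.
func countFirstReady(trials int) (filesystemsFirst, attachmentsFirst int) {
	for i := 0; i < trials; i++ {
		filesystemsChanged := make(chan []string, 1)
		attachmentsChanged := make(chan []string, 1)

		// Simulate a restart: both events are already pending
		// by the time the loop reaches the select.
		filesystemsChanged <- []string{"0/0"}
		attachmentsChanged <- []string{"filesystem-0-0"}

		// With more than one case ready, Go chooses one pseudo-randomly,
		// regardless of the order the cases are written in.
		select {
		case <-filesystemsChanged:
			filesystemsFirst++
		case <-attachmentsChanged:
			attachmentsFirst++
		}
	}
	return filesystemsFirst, attachmentsFirst
}

func main() {
	fs, att := countFirstReady(1000)
	fmt.Printf("filesystems first: %d, attachments first: %d\n", fs, att)
}
```

Over many trials both counters end up non-zero, which is why listing the filesystem case first in the select does not guarantee that it is handled first.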

The solution is non-obvious, as you can't always guarantee that a filesystem change will follow an attachment change.

1. https://github.com/juju/juju/blob/2.9/worker/storageprovisioner/filesystem_events.go#L165
2. https://github.com/juju/juju/blob/2.9/worker/storageprovisioner/storageprovisioner.go#L304-L358