Comment 3 for bug 1999758

Simon Richardson (simonrichardson) wrote :

Digging into this, it seems like there is a sequencing problem when enqueuing onto the scheduler. The code expects that a filesystem change always precedes a filesystem attachment change[1]. In normal day-to-day operation this holds, but after a hard restart it isn't always the case.

Unfortunately, when more than one case in a Go select statement is ready, one of them is chosen at random on every entry. When all the information arrives at once after the restart has completed, there is no guarantee that the changes are handled in the correct order[2]. In most cases the ordering is valid, for example:

    filesystems changed: []string{"0/0"}
    filesystem attachments changed: []watcher.MachineStorageId{watcher.MachineStorageId{MachineTag:"machine-0", AttachmentTag:"filesystem-0-0"}}

When an attachment is not successful, the attachment change has arrived before the filesystem change:

    filesystem attachments changed: []watcher.MachineStorageId{watcher.MachineStorageId{MachineTag:"machine-0", AttachmentTag:"filesystem-0-0"}}
    filesystems changed: []string{"0/0"}
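
The non-determinism can be reproduced outside of Juju with a minimal sketch. This is not the storageprovisioner code: the channel names and the firstEvent helper are hypothetical stand-ins for the worker's two watchers, and both events are queued before the select runs to mimic everything arriving at once after a restart.

```go
package main

import "fmt"

// firstEvent reports which of two simultaneously ready channels a select
// statement services first. The channels are hypothetical stand-ins for the
// filesystem and filesystem-attachment watchers; this is not Juju code.
func firstEvent() string {
	filesystems := make(chan []string, 1)
	attachments := make(chan []string, 1)

	// Both events are already pending before the select is entered,
	// as they would be once a hard restart has completed.
	filesystems <- []string{"0/0"}
	attachments <- []string{"filesystem-0-0"}

	// With both cases ready, Go picks one pseudo-randomly.
	select {
	case <-filesystems:
		return "filesystems"
	case <-attachments:
		return "attachments"
	}
}

func main() {
	counts := map[string]int{}
	for i := 0; i < 1000; i++ {
		counts[firstEvent()]++
	}
	// Both orderings occur; the exact split varies from run to run.
	fmt.Println(counts)
}
```

Any run where "attachments" wins corresponds to the second ordering above, where the attachment is seen before the filesystem it refers to.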

The solution is non-obvious, as you can't always guarantee that a filesystem change will follow an attachment.

 1. https://github.com/juju/juju/blob/2.9/worker/storageprovisioner/filesystem_events.go#L165
 2. https://github.com/juju/juju/blob/2.9/worker/storageprovisioner/storageprovisioner.go#L304-L358