Comment 4 for bug 1867812

Revision history for this message
Dougal Matthews (d0ugal) wrote :

I have spent a bunch of time trying to resolve this bug and I've gotten a little stuck. I wanted to write up a bit about what I have learned so far.

There is a method in maasserver.models.node, get_boot_disk which returns the boot disk for a node. However, if there isn't one specified then it will default to the first physical disk. This seems like a confusing behaviour and I think it is the cause of the issue here. There is also another suspicious method is_boot_disk in maasserver.models.blockdevice, which uses get_boot_disk. This means that if there is no specified boot disk, but if there is a physical disk that can be found then is_boot_disk will return True.

I was able to replicate this bug with this unit test:

+ def test_create_without_creating_partitions(self):
+ """Verify we don't create a partition when creating a bcache cache set
+
+ Test for LP 1867812
+ """
+ self.become_admin()
+ node = factory.make_Node(
+ status=NODE_STATUS.READY, with_boot_disk=False
+ )
+ uri = get_bcache_cache_sets_uri(node)
+ cache_device = factory.make_PhysicalBlockDevice(node=node)
+ response = self.client.post(uri, {"cache_device": cache_device.id})
+ self.assertEqual(
+ http.client.OK, response.status_code, response.content
+ )
+ parsed_device = json_load_bytes(response.content)
+ self.assertFalse(
+ parsed_device["cache_device"]["partitions"],
+ )

This test causes get_boot_disk to be called from three different places.

  File "/home/ubuntu/code/canonical/maas/src/maasserver/models/blockdevice.py", line 302, in create_partition_if_boot_disk

  File "/home/ubuntu/code/canonical/maas/src/maasserver/models/partition.py", line 223, in get_partition_number

  File "/home/ubuntu/code/canonical/maas/src/maasserver/storage_layouts.py", line 59, in __init__

When trying to remove these three uses I broke a test in maasserver.forms.tests.test_cachset; TestCreateCacheSetForm.test_cache_set_creation_with_boot_disk. This test seems to verify the opposite behaviour and expects partitions to be created. Despite the node not having a explicit boot disk assigned.

Looking at when this test had that behaviour introduced, it was in this 2015 change: https://git.launchpad.net/maas/commit/?id=035e8d44c07aef638dae9579627e67129aa3cb27

This makes me wonder how the 2.6 behaviour was ever possible. It might be that I am missing something or my reproduction is incomplete.

From a brief discussion on mattermost it sounds like the best solution here would be to make everything much more explicit. By having users specify which disks should be bootable and we should be clear when partitions will be created.

I have also spent some time diving into the git history to try and track down whatever the breaking change might be. No luck there. I'd really like to better understand why/when the behaviour changed.