Comment 4 for bug 1386702

Dennis Dmitriev (ddmitriev) wrote : Re: [System tests] Need to fix destructive tests disconnect controllers

It looks like the tests 'ha_destroy_controllers' and 'ha_disconnect_controllers' don't cover the cases we want to test.
These tests only check that pacemaker can mark a controller as 'offline'; they don't check whether the cluster is still operational.

I suggest rewriting these tests to follow this scenario:

1) Revert a snapshot.
2) Destroy or disconnect (depending on the test) the first controller.
3) Use assert_pacemaker() to check that the controller is marked as 'offline'.
4) On a different controller, wait for the pacemaker resources to become operational and for the vip__* resources to migrate to the working controllers.
5) Run the 'smoke' OSTF tests to make sure that the cluster is still operational.
6) Start the first controller or restore connectivity to it.
7) Wait until pacemaker reports the controller as 'online' (with assert_pacemaker()).
8) Wait for the pacemaker resources to become operational on all controllers.
9) Run the 'sanity' and 'smoke' OSTF tests.
10) Repeat steps 1) to 9) for the second controller.
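
The scenario above could be sketched roughly as follows. This is only an illustration of the step order, not the actual test code: the `env` object and every method called on it (`revert_snapshot`, `assert_pacemaker`, `resources_ok`, `vips_hosted_on`, `is_online`, `run_ostf`), as well as the helper names `wait_until` and `check_controller_failover`, are hypothetical placeholders for whatever the test framework really provides.

```python
import time


def wait_until(predicate, timeout=600.0, interval=10.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns a true value or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return
        if clock() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        sleep(interval)


def check_controller_failover(env, controllers, targets, disrupt, restore):
    """Run steps 1-9 once per target controller (the loop is step 10).

    env         -- hypothetical environment object (all method names assumed)
    controllers -- all controllers in the cluster
    targets     -- controllers to take down, one at a time
    disrupt     -- callable(node): destroy or disconnect the node
    restore     -- callable(node): start the node or restore connectivity
    """
    for node in targets:
        survivors = [c for c in controllers if c != node]

        env.revert_snapshot()                                    # step 1
        disrupt(node)                                            # step 2
        env.assert_pacemaker(survivors[0],                       # step 3
                             online=survivors, offline=[node])
        wait_until(lambda: env.resources_ok(survivors)           # step 4
                   and env.vips_hosted_on(survivors))
        env.run_ostf(["smoke"])                                  # step 5

        restore(node)                                            # step 6
        wait_until(lambda: env.is_online(node))                  # step 7
        wait_until(lambda: env.resources_ok(controllers))        # step 8
        env.run_ostf(["sanity", "smoke"])                        # step 9
```

Passing `disrupt` and `restore` as callables lets the same loop serve both the 'destroy' and the 'disconnect' variant of the test.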

This will test:
1) Whether the cluster continues working without connectivity to the primary or secondary controller.
2) How the cluster recovers when the lost controller comes back online after booting or restoring the connection, and how the returning controller may affect the cluster of the two remaining controllers.