That is the real scenario and real cause of that problem:
Facts:
1. We have primary-openstack-controller task.
2. That job is repeatable. If it fail it will be re-triggered.
3. That job triggers Nova installation and configuration
4. Installation and configuration triggers sync-db.
5. sync-db can't repeat step if something goes wrong with connection to DB.
So that is the fail scenario:
1. sync-db was working while wsrep was starting and re-balancing nodes.
2. mysql node on primary controller had failed while sync-db was running
3. sync-db was not able to recover and failed
4. primary-openstack-controller task was failed too.
5. primary-openstack-controller task was triggered one more time.
6. nova installation was not triggered because nova is already installed and configured.
7. sync-db was not triggered too because it depends on nova installation process.
8. primary-openstack-controller task was succeed because there where not fails.
9. so deployment has success state too
10. but nova database had not filled with data and was broken.
Possible solutions:
1. Allways trigger sync-db on primary-openstack-controller.
2. Add check about current db state.
3. Somehow fix wrep and mysql starting procedure.
4. Fix sync-db to repeat steps.
That is the real scenario and real cause of that problem:
Facts:
1. We have primary- openstack- controller task.
2. That job is repeatable. If it fail it will be re-triggered.
3. That job triggers Nova installation and configuration
4. Installation and configuration triggers sync-db.
5. sync-db can't repeat step if something goes wrong with connection to DB.
So that is the fail scenario:
1. sync-db was working while wsrep was starting and re-balancing nodes. openstack- controller task was failed too. openstack- controller task was triggered one more time. openstack- controller task was succeed because there where not fails.
2. mysql node on primary controller had failed while sync-db was running
3. sync-db was not able to recover and failed
4. primary-
5. primary-
6. nova installation was not triggered because nova is already installed and configured.
7. sync-db was not triggered too because it depends on nova installation process.
8. primary-
9. so deployment has success state too
10. but nova database had not filled with data and was broken.
Possible solutions: openstack- controller.
1. Allways trigger sync-db on primary-
2. Add check about current db state.
3. Somehow fix wrep and mysql starting procedure.
4. Fix sync-db to repeat steps.