But I don’t see any evidence of a network disconnect that would cause this. Right before this happens, controller-1 has just come online and finished DRBD syncing. It's possible we have a stale TCP connection to the helm postgres DB in the tiller container. The Postgres logs show nothing unusual. Maybe we are running out of connections? Looking at the tiller process running in the container, there are quite a few threads running. I'm not sure if this is normal behavior.
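To tell whether the thread count is actually growing rather than just high, it could be sampled over time. This is only a rough sketch, assuming a Linux host and that the tiller PID is already known (looking it up is not shown); the function names are made up for illustration. It reads the `Threads:` field from `/proc/<pid>/status`.

```python
import re
from pathlib import Path


def thread_count_from_status(status_text: str) -> int:
    """Extract the 'Threads:' field from the text of /proc/<pid>/status."""
    match = re.search(r"^Threads:\s+(\d+)\s*$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Threads: field found in status text")
    return int(match.group(1))


def thread_count(pid: int) -> int:
    """Read the current thread count for a PID (Linux only)."""
    status_text = Path(f"/proc/{pid}/status").read_text()
    return thread_count_from_status(status_text)
```

Calling `thread_count(pid)` every few minutes and comparing the values would show whether the number keeps climbing. The Go runtime creates OS threads as needed, so a moderately large but stable count may well be normal.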
Attaching key logs from this.
I could not reproduce this in my local setup.