Hi, I am seeing the same situation on a customer deployment:
caused by: Post https://URL/compute/v2.1/SOME_UUID/servers: EOF
Full logs here: https://pastebin.canonical.com/p/4TtR2zm8wg/
After discussing this with the Juju team, it seems that EOF errors are out of scope for retries. However, I only see this issue during bootstrap; enable_ha and deploy seem to be fine. As far as I can see, those commands retry 10 times by default, which I think gives enough attempts to get a VM despite this issue. The bootstrap operation has no such retry.
This is annoying because I have to relaunch the bootstrap command over and over until it succeeds.
If what stops us from retrying on EOF is that we don't know whether the request was eventually accepted, why don't we sleep for, say, 10s and then run a list afterwards? Juju machines always follow a particular naming scheme: juju-UUID-X, where X is the machine number.
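To make the suggestion concrete, here is a rough sketch of the sleep-then-list idea. It is written in Python for brevity and is not Juju's actual code. Every name in it (AmbiguousCreateError, create_with_reconcile, and the create and list_names callables) is hypothetical. The callables stand in for the provider's "create server" and "list servers" calls, and the 10 attempts and 10s wait are simply the numbers mentioned above.

```python
import time
from typing import Callable, Iterable, Optional


class AmbiguousCreateError(Exception):
    """Hypothetical stand-in for a connection closed (EOF) before any response."""


def create_with_reconcile(
    name: str,
    create: Callable[[str], str],
    list_names: Callable[[], Iterable[str]],
    attempts: int = 10,
    settle_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Create a server called `name`, checking by name after each EOF.

    Returns the server name once the server is known to exist.
    """
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return create(name)
        except AmbiguousCreateError as err:
            last_error = err
            # We cannot tell whether the request was accepted, so wait
            # and then look for a server with the expected name.
            sleep(settle_seconds)
            if name in set(list_names()):
                # The request went through; retrying would duplicate it.
                return name
            # Not listed after the wait: loop around and try again.
    raise RuntimeError(
        f"could not create {name} after {attempts} attempts"
    ) from last_error
```

This only works because the name is predictable (juju-UUID-X), so a server listed under that name can be assumed to come from our own request. A fixed 10s wait is a guess; if the cloud takes longer to show an accepted request in the list, the retry could still create a duplicate.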