charm stays in "docker login failed, see juju debug-log" status even after later success
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Docker Subordinate Charm | Triaged | High | Unassigned |
Bug Description
UPDATE: After investigation, the problem seems to be that the charm is not doing a status-set after things are OK again. The docker-registry itself seems to be working. See comment #1 for details.
------
I want to add a private docker registry with the Docker Registry charm (#173) from https:/
I use a relation from docker-registry to easyrsa to get SSL certificates as per https:/
The docker-registry charm has a default binding to an oam space but all the other bindings are set to another space (internal-space):
bindings:
  "": *oam-space
  cert-
  docker-
  dockerhost: *internal-space
  nrpe-
  sdn-plugin: *internal-space
  website: *internal-space
With the following options:
options:
  auth-
  auth-
  http_proxy: *http-proxy
  https_proxy: *http-proxy
  no_proxy: *no-proxy
I tried no-proxy with some CIDRs ("
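As an aside on the CIDR attempt: classic no_proxy matching is hostname/suffix based, and a lot of software doesn't understand CIDR entries at all, which is presumably what the charm's NO_PROXY warning further down is about. A quick illustration with Python's stdlib (my own sketch; the CIDR and IP are just examples from my ranges):

import os
import urllib.request

# A CIDR entry in no_proxy is treated as a literal hostname suffix, so an
# IP inside that range is NOT bypassed.
os.environ['no_proxy'] = '172.31.0.0/20'
print(urllib.request.proxy_bypass_environment('172.31.246.81'))  # False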
After deploying docker-registry and relating it to docker, I can do a docker login from the docker-registry unit to its oam private IP with “docker login -u admin -p password 172.31.222.132:5000”.
My k8s workers can’t reach that private oam IP so I need to add the internal space IP. If I try to reach the private oam IP from the k8s workers I get the following error, as expected:
“Error response from daemon: Get https:/
From the workers, a docker login to the internal-space IP with “docker login -u admin -p password 172.31.246.81:5000” gives me the same error about the SSL cert not being valid for that IP, which is what I expect since I see the same thing locally on the docker-registry unit.
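As a side note, a quick way to confirm which names/IPs the easyrsa-issued cert is actually valid for is to pull the cert and read its subjectAltName entries. This is just my own debugging sketch (it assumes the python cryptography package is installed; the address is my internal-space IP), not something from the charm:

import ssl

from cryptography import x509

# Fetch the registry's certificate without verifying it, then list the
# subjectAltName entries that docker validates the target address against.
pem = ssl.get_server_certificate(('172.31.246.81', 5000))
cert = x509.load_pem_x509_certificate(pem.encode())
san = cert.extensions.get_extension_for_class(x509.SubjectAlternativeName)
print(san.value.get_values_for_type(x509.IPAddress))  # IP SANs
print(san.value.get_values_for_type(x509.DNSName))    # DNS SANs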
Then I set the http-host of docker-registry to my internal space IP:
juju config -m kubernetes docker-registry http-host="https://172.31.246.81:5000"
Now “docker login -u admin -p password 172.31.246.81:5000” works both from the docker-registry unit and from the k8s workers. I do hit the following bug, though I can work around it by apt-installing the pass package:
https:/
https:/
https:/
But still, all docker units are blocked.
In the juju debug logs the following two lines always appear, but I doubt they are responsible for this problem:
2021-02-10 13:26:54 WARNING juju-log docker-
2021-02-10 13:27:35 WARNING docker-
I also see this error:
unit-docker-0: 14:06:00 ERROR juju.worker.
Looking at the docker charm code, I see that the "docker login failed" error happens when the docker login call (['docker', 'login', netloc, '-u', registry.basic_user, ...]) fails:
# handle auth data
if registry.has_auth_basic():
    cmd = ['docker', 'login', netloc,
           '-u', registry.basic_user, '-p', registry.basic_password]
    try:
        check_output(cmd, stderr=STDOUT)
    except CalledProcessError as e:
        if b'http response' in e.output.lower():
            # non-tls login with basic auth will error like this:
            # Error response ... server gave HTTP response to HTTPS client
            msg = 'docker login requires a TLS-enabled registry'
        elif b'unauthorized' in e.output.lower():
            # invalid creds will error like this:
            # Error response ... 401 Unauthorized
            msg = 'Incorrect credentials for docker registry'
        else:
            msg = 'docker login failed, see juju debug-log'
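To see which of these branches my failure actually takes, the same call can be reproduced by hand. A sketch using values from my deployment (the netloc and credentials are mine, not anything the charm hardcodes):

from subprocess import CalledProcessError, STDOUT, check_output

cmd = ['docker', 'login', '172.31.222.132:5000',
       '-u', 'admin', '-p', 'password']
try:
    check_output(cmd, stderr=STDOUT)
except CalledProcessError as e:
    # this is the output the charm's substring checks run against
    print(e.output.decode())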
I'm trying to find out what the value of netloc is here. If it's the oam private IP, it's going to fail because I have to use the internal space IP.
I'm not sure if it's the http-host config or the unit's private-address. From the docker-registry charm code it looks like netloc should come from http-host, but I'm not sure that really applies to me:
def get_netloc():
    '''Get the network location (host:port) for this registry.

    If http-host config is present, return the netloc for that config.
    If related to a proxy, return the proxy netloc. Otherwise, return
    our private_address:port.
    '''
    charm_config = hookenv.config()
    netloc = None
    if charm_config.get('http-host'):
        netloc = urlparse(charm_config['http-host']).netloc
    else:
        ...
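If the http-host branch is the one taken, netloc is just the host:port part of the configured URL, so with my setting it should come out as the internal-space address. For example:

from urllib.parse import urlparse

print(urlparse('https://172.31.246.81:5000').netloc)  # '172.31.246.81:5000'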
After investigation, I think there isn't really any problem with the docker registry itself. I think the problem is just that the charm does not do a status-set once things are OK again. When I had the wrong IP (the oam private IP), it did do a status-set:
unit-docker-0: 17:02:19 INFO unit.docker/0.juju-log docker-registry:105: Logging into docker registry: 172.31.222.132:5000.
0.juju-log docker-registry:105: Invoking reactive handler: reactive/docker.py:679:docker_restart
0.juju-log docker-registry:105: Passing NO_PROXY string that includes a cidr. This may not be compatible with software you are running in your shell.
0.juju-log docker-registry:105: Restarting docker service.
0.juju-log docker-registry:105: Setting runtime to {'apt': ['docker.io'], 'upstream': ['docker-ce'], 'nvidia': ['docker-ce', 'nvidia-docker2', 'nvidia-container-runtime', 'nvidia-container-runtime-hook']}
0.juju-log docker-registry:105: Reloading system daemons.
0.docker-registry-relation-changed WARNING: No swap limit support
0.juju-log docker-registry:105: status-set: blocked: docker login failed, see juju debug-log
unit-docker-0: 17:02:49 INFO unit.docker/
unit-docker-0: 17:02:49 WARNING unit.docker/
unit-docker-0: 17:03:19 INFO unit.docker/
unit-docker-0: 17:03:19 INFO unit.docker/
unit-docker-0: 17:03:19 INFO unit.docker/
unit-docker-0: 17:03:31 WARNING unit.docker/
unit-docker-0: 17:03:31 INFO unit.docker/
I think we just never re-set the status, which would also explain why my docker subordinates all stay in blocked state after I remove the docker-registry relation.
To push the investigation further, I ran “juju debug-hooks -m kubernetes docker/0” and then ran “juju run -m kubernetes --unit docker/0 'hooks/update-status'” in another window, which triggers the update-status hook in my debug-hooks session.
Then I run “charms.reactive -p get_flags” and see the following:
['docker.available',
 'docker.ready',
 'docker.registry.configured',
 'endpoint.docker-registry.changed.basic_password',
 'endpoint.docker-registry.changed.basic_user',
 'endpoint.docker-registry.changed.egress-subnets',
 'endpoint.docker-registry.changed.ingress-address',
 'endpoint.docker-registry.changed.private-address',
 'endpoint.docker-registry.changed.registry_netloc',
 'endpoint.docker-registry.changed.registry_url',
 'endpoint.docker-registry.changed.tls_ca',
 'endpoint.docker-registry.joined',
 'endpoint.docker-registry.ready',
 'endpoint.docker.available',
 'endpoint.docker.changed',
 'endpoint.docker.changed.egress-subnets',
 'endpoint.docker.changed.ingress-address',
 'endpoint.docker.changed.private-address',
 'endpoint.docker.changed.sandbox_image',
 'endpoint.docker.joined',
 'endpoint.docker.reconfigure']
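Individual flags can also be checked from the debug-hooks shell via the charms.reactive Python API, e.g. (my own sketch):

from charms.reactive import is_flag_set

print(is_flag_set('docker.available'))                # True on my unit
print(is_flag_set('endpoint.docker-registry.ready'))  # also True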
The only place in the docker charm code where status.active('Container runtime available.') and set_state('docker.available') are set is the signal_workloads_start function:
@when('docker.ready')
@when_not('docker.available')
def signal_workloads_start():
    ...
    status.active('Container runtime available.')
    set_state('docker.available')
But right now, in my debug-hooks session during update-status, docker.available is set, so this handler won't run.
Am I understanding this right?
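If I am, then a possible shape for a fix (purely my sketch of the idea, reusing the attribute names implied by the relation flags above, not the charm's actual code) would be to re-run the login whenever the registry data changes and explicitly status-set on success:

from subprocess import CalledProcessError, STDOUT, check_output

from charmhelpers.core import hookenv
from charms.reactive import endpoint_from_flag, when


@when('endpoint.docker-registry.changed')
def relogin_registry():
    registry = endpoint_from_flag('endpoint.docker-registry.ready')
    if not registry:
        return
    # registry_netloc/basic_user/basic_password mirror the relation data
    # keys seen in the flags above (my assumption about the interface API)
    cmd = ['docker', 'login', registry.registry_netloc,
           '-u', registry.basic_user, '-p', registry.basic_password]
    try:
        check_output(cmd, stderr=STDOUT)
    except CalledProcessError:
        hookenv.status_set('blocked', 'docker login failed, see juju debug-log')
    else:
        # the part that seems to be missing today: go back to active
        hookenv.status_set('active', 'Container runtime available.')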