Confusing error message if registration-key is set to the empty string: Need computer-title and juju-info to proceed

Bug #1637421 reported by Paul Gear
This bug affects 21 people
Affects: landscape-client-charm
Status: Confirmed
Importance: High
Assigned to: Unassigned

Bug Description

New deploys of cs:xenial/landscape-client-1 result in the charm going into maintenance status with message "Need computer-title and juju-info to proceed". This occurs despite being related to the primary charm via juju-info and a computer-title being present in /etc/landscape/client.conf. Example status at https://pastebin.canonical.com/169176/

Revision history for this message
Adam Collard (adam-collard) wrote :

It would be useful to see "juju show-status-log landscape-client/1 -n 10000" so we get the full history, and "juju config landscape-client" (with secrets scrubbed!) to see the config.

The computer-title prompt is about the juju config, not /etc/landscape/client.conf (the charm will write it there)

Changed in landscape-client-charm:
status: New → Incomplete
Revision history for this message
Adam Collard (adam-collard) wrote :

Sorry, ignore the second paragraph in comment #1.

The charm requires the account-name configuration value to be set. Seeing the status log and configuration as per first paragraph remains useful.

Revision history for this message
Paul Gear (paulgear) wrote :

Unit log of next attempt: https://pastebin.canonical.com/169185/; status log of same: https://pastebin.canonical.com/169186/

python -c 'import socket; print(socket.gethostname())' and curl https://landscape.canonical.com/message-system return their expected results.

(FTR, computer-title is not a charm configuration option)

Changed in landscape-client-charm:
status: Incomplete → New
Revision history for this message
Paul Gear (paulgear) wrote :

Workaround: setting ping-url & url explicitly to their default values rather than to empty strings.

tags: added: canonical-is
Revision history for this message
Paul Gear (paulgear) wrote :

It seems this might be a juju 1 vs. juju 2 incompatibility; I compared with a previously-deployed version of the charm, and the only difference was in the saved persistent config: in the juju 1 config, "registration-key" was absent, whereas in juju 2 it was set to the empty string. I'm not sure if this is really a bug or should be considered a configuration error on my part, but it certainly is surprising.
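
A minimal sketch of the difference described above, assuming (this is inferred from the comparison, not confirmed from the charm source) that an option which is present but empty is handled differently from an option which is absent. The helper name drop_empty_values is hypothetical and is not part of the charm.

```python
def drop_empty_values(config):
    """Return a copy of config without keys whose value is an empty string.

    Hypothetical helper, not part of the charm: it makes a juju 2 style
    config (unset options stored as "") look like a juju 1 style config
    (unset options absent).
    """
    return {key: value for key, value in config.items() if value != ""}


juju1_config = {"account-name": "standalone"}
juju2_config = {"account-name": "standalone", "registration-key": ""}

# A membership test sees the empty registration-key; a truthiness test does not.
present = "registration-key" in juju2_config
truthy = bool(juju2_config.get("registration-key"))
same_after_cleanup = drop_empty_values(juju2_config) == juju1_config
```

Code that checks "key in config" and code that checks "config.get(key)" will disagree on the juju 2 style config, which is one plausible way such a difference could surface.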

summary: + Confusing error message if registration-key is set to the empty string:
Need computer-title and juju-info to proceed
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I also spent about 40min debugging a similar case.

My charm config had two mistakes:
a) registration_key set to "secret", but the server didn't require one. I think usually this would just be ignored and not be a fatal error
b) account_name was incorrect. This is indeed a fatal registration error.

Yet the error message I saw in juju status was the very much misleading "Need computer-title and juju-info" text, which sent me on a wild goose chase for quite some time. I even found RUN=0 in /etc/default/landscape-client and thought that was the problem.

Turns out all I had to do was set that to 1 and try "landscape-config --silent", which correctly diagnosed the issue for me:

root@clipper:~# landscape-config --silent
[ ok ] Restarting landscape-client (via systemctl): landscape-client.service.
Please wait...
Invalid account name or registration key.

That status message "Need computer-title and juju-info to proceed" is just a last resort "else:" clause and doesn't really mean what it says. It really means "Sorry, something else happened and I don't know what it is."

        if is_configured_enough:
            exit_code = self.try_to_register()
            if exit_code == 0:
                self.status_set("active", "System successfully registered")
        else:
            if not self.config.get("account_name"):
                self.status_set(
                    "blocked", "Need account-name/registration-key to proceed")
            else:
                self.status_set(
                    "maintenance",
                    "Need computer-title and juju-info to proceed")
            return 0
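
For illustration only, here is a sketch of how the branches could name the actual failure instead of falling through to a generic message. It is not the charm's code: choose_status, has_juju_info and the register callable are hypothetical names, and the config keys are assumed to match the ones visible in the snippet above and in client.conf.

```python
def choose_status(config, has_juju_info, register):
    """Pick a (state, message) pair that names the actual failure.

    Hypothetical sketch, not the charm's code. `config` is a dict,
    `has_juju_info` is a bool, and `register` is a callable returning
    (exit_code, output) from a registration attempt.
    """
    if not config.get("account_name"):
        return ("blocked", "Need account-name/registration-key to proceed")
    if not config.get("computer_title"):
        return ("maintenance", "Need computer-title to proceed")
    if not has_juju_info:
        return ("maintenance", "Need juju-info relation to proceed")

    exit_code, output = register()
    if exit_code == 0:
        return ("active", "System successfully registered")

    # Report the last non-empty output line rather than a generic message.
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    reason = lines[-1] if lines else "exit code %d" % exit_code
    return ("blocked", "Registration failed: " + reason)
```

With a register callable that returns a non-zero exit code and the landscape-config output quoted earlier, the message would end in "Invalid account name or registration key." rather than the computer-title text.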

Revision history for this message
Riccardo Magrini (riccardo-magrini) wrote :

Same issue. After filling in the charm configuration as suggested on the official site (https://jujucharms.com/landscape-client/28) and deploying it, I connected to the node via ssh

$: juju ssh ubuntu@node

and ran the command shown on the Landscape server:

$: sudo landscape-config --computer-title "My Web Server" --account-name standalone -p 128-qosk-7382 --url https://10.20.81.5/message-system --ping-url http://10.20.81.5/ping

After a few seconds the node appeared on the Landscape server. I also followed this guide

https://dor.ky/ubuntu-server-management-with-landscape/

but I only followed it from the SSL cert part onward.

Xav Paice (xavpaice)
tags: added: canonical-bootstack
Revision history for this message
Xav Paice (xavpaice) wrote :

I was getting this same problem, with a juju 2.2.3 setup and cs:landscape-client.

Found that setting url and ping-url by hand fixed it, but I had already added a relation between landscape-server and landscape-client, and I would fully expect the relation to populate that info without needing to configure after deployment.

Revision history for this message
Xav Paice (xavpaice) wrote :

Just one update to that: only one computer actually registered OK. The rest stayed in the same state, but when I tried locally I wound up with:

landscape-config --silent
Restarting landscape-client (via systemctl): landscape-client.service.
Please wait...

The server's SSL information is incorrect, or fails signature verification!
If the server is using a self-signed certificate, please ensure you supply it with the --ssl-public-key parameter.

I would also expect the ssl cert to be passed by the relation.

Revision history for this message
Drew Freiberger (afreiberger) wrote :

FYI, it appears that the relation of landscape-client to landscape-server for the promulgated charm versions is only to create a client on the landscape-server for registration, not to relate the peer landscape-clients to the registration info of landscape-server. For this, we'd want to build and promulgate the new landscape-server code into an updated charm to provide the registration relation interface.

Jacek Nykis (jacekn)
Changed in landscape-client-charm:
importance: Undecided → High
Revision history for this message
Vladimir Grevtsev (vlgrevtsev) wrote :

Any updates on this?
It looks like this is still an issue.

Vern Hart (vern)
tags: added: cpe-onsite
Revision history for this message
Vern Hart (vern) wrote :

We're seeing this on site at a customer (several customers, actually).

Checking /etc/landscape/client.conf, it seems to be configured correctly:

  [client]
  log_level = info
  url = https://10.1.1.42/message-system
  ping_url = http://10.1.1.42/ping
  data_path = /var/lib/landscape/client
  account_name = standalone
  disable_unattended_upgrades = False
  monitor_plugins = ALL
  computer_title = compute01
  egress_subnets = 10.1.1.40/32
  ingress_address = 10.1.1.40
  private_address = 10.1.1.40
  ssl_public_key = /etc/ssl/certs/landscape_server_ca.crt

In the crt file I see an ascii-armored cert, as I would expect.

However when I try to connect to that https url with curl I get an error with the cert:

  $ curl --cacert /etc/ssl/certs/landscape_server_ca.crt https://10.1.1.42/message-system
  curl: (60) SSL certificate problem: self-signed certificate
  More details here: https://curl.haxx.se/docs/sslcerts.html

  curl failed to verify the legitimacy of the server and therefore could not
  establish a secure connection to it. To learn more about this situation and
  how to fix it, please visit the web page mentioned above.

Checking that cert file on other units I see that it is the same on landscape-server/0 and landscape-server/2 but is different on landscape-server/1 and landscape-haproxy/0

In auditing the landscape-client units I find at least 4 different versions of landscape_server_ca.crt.

When I query with the following openssl command, I get yet ANOTHER certificate!

  $ openssl s_client -showcerts -servername 10.1.1.42 -connect 10.1.1.42:443

I've no idea where this latest certificate came from but if I put it in /etc/ssl/certs/landscape_server_ca.crt and run:

  $ landscape-config --silent
  [ ok ] Restarting landscape-client (via systemctl): landscape-client.service
  Please wait...
  System successfully registered.

It would seem the relationship between landscape-server and landscape-client is passing around invalid certs. The haproxy charm (related to both landscape-server and landscape-client) is responsible for creating the cert and I only have one landscape-haproxy unit so I'd expect there to be only one certificate.

Perhaps landscape-server is also generating a cert? I'm not sure. What I do know is that there are at least 5 different certs in this deployment and none of the landscape-client or landscape-server or landscape-haproxy units have the one that works against the landscape url. Where does the working cert come from?
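
A rough way to compare the certificate a server actually presents with the file on a unit, sketched with the Python standard library only. The function names are made up for this example, and it assumes the file holds exactly one PEM certificate (the self-signed case discussed here), not a chain or a separate CA.

```python
import hashlib
import ssl


def pem_fingerprint(pem_text):
    """SHA-256 hex digest of the DER bytes of a single PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    return hashlib.sha256(der).hexdigest()


def served_cert_matches(host, port, cert_path):
    """True if the certificate presented by host:port equals the one in cert_path.

    The certificate is fetched without verification, so this only compares
    bytes; it says nothing about trust or expiry.
    """
    served_pem = ssl.get_server_certificate((host, port))
    with open(cert_path) as handle:
        local_pem = handle.read()
    return pem_fingerprint(served_pem) == pem_fingerprint(local_pem)


# Example call, using the address and path from the comment above:
# served_cert_matches("10.1.1.42", 443, "/etc/ssl/certs/landscape_server_ca.crt")
```

Running the same comparison on each landscape-client, landscape-server and landscape-haproxy unit would show which units, if any, hold the certificate that is actually being served.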

Revision history for this message
Vern Hart (vern) wrote :

I suspect the predominant feeling is that this is not important since in a production deployment, a non-self-signed certificate will be created and assigned to landscape-haproxy, landscape-server, and landscape-client -- thereby bypassing this bug/issue. I would contend that if the certificate were shared correctly, the deployment experience would be better and less confusing.

At the very least, can we change the status message when the certificate is invalid?

Changed in landscape-client-charm:
status: New → Confirmed
tags: added: sts
Revision history for this message
David O Neill (dmzoneill) wrote :

This may help someone resolve landscape-client stuck in maintenance mode:

JUJU_MODEL=k8s-controller:lma
HAPROXY=landscape-haproxy/0

# Note landscape does not have HA Proxy VIP as in baremetal HA setup.
IP=$( juju status -m $JUJU_MODEL $HAPROXY --format json | jq -r '.machines[]."dns-name"' )

juju switch lma

juju ssh $HAPROXY "sudo openssl x509 -in /var/lib/haproxy/default.pem > /tmp/landscape.crt; sudo chmod ugo+r /tmp/landscape.crt"
juju scp $HAPROXY:/tmp/landscape.crt /tmp

# Run below for lma and k8s-tenant-1 models:

juju config -m <JUJU_MODEL> landscape-client \
        url="https://$IP/message-system" \
        ping-url="https://$IP/ping" \
        ssl-public-key="base64:$(cat /tmp/landscape.crt | openssl base64 -e)"
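
The ssl-public-key value in the command above is the certificate, base64-encoded, with a "base64:" prefix. A small sketch of building the same kind of value in Python, in case it is being scripted. The function name is hypothetical. Unlike "openssl base64 -e", b64encode does not wrap the output into 64-character lines; that the charm accepts either form is an assumption, not something confirmed in this thread.

```python
import base64


def ssl_public_key_value(pem_bytes):
    """Build a "base64:<data>" string from PEM certificate bytes."""
    return "base64:" + base64.b64encode(pem_bytes).decode("ascii")


# Example, using the file produced by the juju scp step above:
# with open("/tmp/landscape.crt", "rb") as handle:
#     value = ssl_public_key_value(handle.read())
```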

Revision history for this message
Ponnuvel Palaniyappan (pponnuvel) wrote :

This issue still occurs as of juju version 2.7.6, landscape-client revision 33.

Revision history for this message
Vern Hart (vern) wrote :

It would be super awesome if the landscape-client charm checked the url and/or ping_url for a valid ssl certificate and, if there's a problem with the cert, report the problem instead of "Need computer-title and juju-info to proceed".

Revision history for this message
Paul Goins (vultaire) wrote :

I had this issue recently due to a DNS issue which caused one unit to not be able to reach the landscape server. Spent a lot of time tracing it down to find out that it was the register() function which was failing, and that I needed to check the landscape client's broker.log to identify what was going wrong.

Revision history for this message
Paul Goins (vultaire) wrote :

Had this again. Two other failure conditions to consider:
* Certificate expired. Easily checkable by: curl $(juju config landscape-client url)
* Global proxy variables defined in juju model-config. The landscape client code (not the charm, but the actual landscape client) pulls http_proxy and https_proxy from the environment, and these will get injected into the landscape client config file. no_proxy is not respected.
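
A quick check for the second condition, sketched in Python: list the proxy variables that are set in the environment a process runs in. The function name is made up for this example, and it only looks at the two variables named above.

```python
import os


def proxy_variables(environ=None):
    """Return the http_proxy/https_proxy variables that are set and non-empty."""
    environ = os.environ if environ is None else environ
    names = ("http_proxy", "https_proxy")
    return {name: environ[name] for name in names if environ.get(name)}


# Example: print(proxy_variables()) on the unit, in the same environment as the hook.
```

An empty result suggests the proxy settings are not the cause; a non-empty one is worth comparing against the proxy entries in /etc/landscape/client.conf.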

Revision history for this message
Heather Lemon (hypothetical-lemon) wrote :

This was resolved for me by running

juju run --unit=landscape-client/7 hooks/config-changed

Revision history for this message
Barry Price (barryprice) wrote :

FWIW, ran into this on a juju 3.1/jammy deploy today, and the usual config-changed run was not enough.

Had to 'juju ssh' to each unit and then manually 'sudo systemctl enable landscape-client.service' and then 'sudo systemctl start landscape-client.service' first.

Then 'juju exec --unit landscape-client/x hooks/config-changed' for each unit finally cleared it.

Note that 'juju run' is now 'juju exec' for 3.1 at least.

Revision history for this message
Loïc Gomez (kotodama) wrote :

For future travelers, when you need to do that on many machines:
juju exec --application landscape-client -- 'systemctl enable landscape-client; systemctl start landscape-client'
juju exec --application landscape-client hooks/config-changed

Revision history for this message
Junien Fridrick (axino) wrote :

Thanks @barryprice, running that fixed my environment as well (juju 3.1/jammy)

Revision history for this message
Tom Haddon (mthaddon) wrote :

Similar for me, Loic's two liner worked nicely.

Revision history for this message
Dagmawi Biru (dagbiru) wrote :

This is a long running issue and unfortunately one that needs some attention I think.

Numerous errors can only be discovered by reading /var/log/landscape/broker.log. They are not bubbled up to the charm as errors, and the charm seems to just default to the "computer-title" message mentioned above.

Hence, if comment #21 did not resolve the issue, look into
/var/log/landscape/broker.log in the related unit to see the actual error preventing the client from
connecting.
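
A small sketch for pulling the interesting lines out of that log. It assumes the level name (ERROR, WARNING) appears as its own whitespace-separated token on each line, which is an assumption about the log format rather than something stated in this report. The function name is hypothetical.

```python
def problem_lines(log_text, levels=("ERROR", "WARNING")):
    """Return the log lines that contain one of the level names as a token."""
    return [
        line
        for line in log_text.splitlines()
        if any(level in line.split() for level in levels)
    ]


# Example, run on the affected unit:
# with open("/var/log/landscape/broker.log") as handle:
#     for line in problem_lines(handle.read())[-20:]:
#         print(line)
```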
