DNS resolution issues during enlistment with Bionic ephemeral environment.

Bug #1770201 reported by John George
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
MAAS | Invalid | Undecided | Unassigned | (none)

Bug Description

Package installation fails during enlistment with Bionic because of DNS errors. Specifically, ipmitool does not get installed.

In the attached logs see logs-2018-05-09-01.40.46/10.244.40.32/var/log/maas/rsyslog/maas-enlisting-node/2018-05-09/10.244.40.137

https://pastebin.canonical.com/p/CZnyNNwV4P/

This surfaces in cpe-solution CI tests during the enlistment step when unexpected power parameters are returned:
https://pastebin.canonical.com/p/SMt2b7dYqC/

Full details of one example of this failure can be seen here:
https://solutions.qa.canonical.com/#/qa/testRun/8a9ae88e-896b-4b01-9571-120aa4a651df

John George (jog) wrote :
description: updated
description: updated
Changed in maas:
status: New → Incomplete
Andres Rodriguez (andreserl) wrote :

Hi John,

I don't fully understand your issue. It is completely expected that package installation fails if you cannot resolve the archive.

This is the case with MAAS and on machines that don't have MAAS at all. In other words, if you have resolution issues on any Ubuntu machine, you won't be able to install packages.

Can you please clarify?

Marking as incomplete.

Christian Reis (kiko) wrote :

Is this only on Bionic and netplan? If so we should update summary/description.

John George (jog)
summary: - During enlistment DNS resolution failures leave required packages
- uninstalled
+ During enlistment with Bionic DNS resolution failures leave required
+ packages uninstalled
John George (jog) wrote : Re: During enlistment with Bionic DNS resolution failures leave required packages uninstalled

Hi Andres,

This is happening during enlistment of machines in our CI lab, which we've noted is now using Bionic as the commissioning OS. The rsyslog for the enlisting node has the following warnings, which point to the IPs of the MAAS servers serving DNS. This seems unexpected.

May 9 01:40:33 maas-enlisting-node systemd-resolved[988]: Using degraded feature set (UDP) for DNS server 10.244.40.33.
May 9 01:40:33 maas-enlisting-node systemd-resolved[988]: Using degraded feature set (TCP) for DNS server 10.244.40.31.
May 9 01:40:33 maas-enlisting-node systemd-resolved[988]: Using degraded feature set (UDP) for DNS server 10.244.40.32.

We are only seeing these DNS issues during enlistment and not during other test setup phases, which leads me to believe that there is something wrong specifically with MAAS.
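
The warnings quoted above can be pulled out of a larger rsyslog file mechanically. The following is a minimal sketch, assuming the message format is exactly as in the three lines quoted in this comment; the names DEGRADED_RE, degraded_servers and SAMPLE are hypothetical and not part of MAAS or systemd.

```python
import re

# Matches systemd-resolved warnings in the form quoted in this report, e.g.
#   "... systemd-resolved[988]: Using degraded feature set (UDP) for DNS server 10.244.40.33."
# Assumption: the message format is the same as in the quoted log lines.
DEGRADED_RE = re.compile(
    r"systemd-resolved\[\d+\]: Using degraded feature set \((?P<mode>[^)]+)\) "
    r"for DNS server (?P<server>[0-9a-fA-F:.]+?)\.?$"
)


def degraded_servers(lines):
    """Return {dns_server: [feature_set, ...]} for every degraded-mode warning."""
    result = {}
    for line in lines:
        match = DEGRADED_RE.search(line.strip())
        if match:
            result.setdefault(match.group("server"), []).append(match.group("mode"))
    return result


# The three log lines quoted in this comment.
SAMPLE = [
    "May 9 01:40:33 maas-enlisting-node systemd-resolved[988]: Using degraded feature set (UDP) for DNS server 10.244.40.33.",
    "May 9 01:40:33 maas-enlisting-node systemd-resolved[988]: Using degraded feature set (TCP) for DNS server 10.244.40.31.",
    "May 9 01:40:33 maas-enlisting-node systemd-resolved[988]: Using degraded feature set (UDP) for DNS server 10.244.40.32.",
]

if __name__ == "__main__":
    for server, modes in sorted(degraded_servers(SAMPLE).items()):
        print(server, ",".join(modes))
```

Grouping by server makes it easier to see whether every DNS server offered to the enlisting node is affected or only some of them.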

John George (jog)
Changed in maas:
status: Incomplete → New
John George (jog)
description: updated
Andres Rodriguez (andreserl) wrote : Re: [2.4] DNS resolution issues during enlistment with Bionic env.

OK, so I think we have a few things to investigate:

1. MAAS hasn't changed how DNS is served between Xenial and Bionic, so the first thing we need to identify is how the environment differs between Xenial and Bionic.
2. The second thing is to identify what configuration is being sent to the machine.
3. The third thing is to determine the state of the DNS server.

That said, if we can determine that Xenial works and Bionic doesn't in the same cluster, we can be pretty confident that this is a regression in the netplan configuration or in how the machine ends up being configured.

For that I think we will need to have a live environment.

@John, next time this issue happens, could you please leave a live environment so we can investigate? I'll mark this as incomplete in the meantime as the logs don't really give me what I need to investigate.

summary: - During enlistment with Bionic DNS resolution failures leave required
- packages uninstalled
+ [2.4] DNS resolution issues during enlistment with Bionic env.
Changed in maas:
status: New → Incomplete
milestone: none → 2.4.0rc2
summary: - [2.4] DNS resolution issues during enlistment with Bionic env.
+ [2.4] DNS resolution issues during enlistment with Bionic ephemeral
+ environment.
summary: - [2.4] DNS resolution issues during enlistment with Bionic ephemeral
+ [2.3] DNS resolution issues during enlistment with Bionic ephemeral
environment.
summary: - [2.3] DNS resolution issues during enlistment with Bionic ephemeral
- environment.
+ DNS resolution issues during enlistment with Bionicphemeral environment.
summary: - DNS resolution issues during enlistment with Bionicphemeral environment.
+ DNS resolution issues during enlistment with Bionic ephemeral
+ environment.
Jason Hobbs (jason-hobbs) wrote :

Andres, leaving the environment up isn't really a helpful approach - this is a CI environment and this happens sporadically. We do not pause on failures.

What extra logs do you need?

Andres Rodriguez (andreserl) wrote :

@Jason,

I think I've spotted the actual issue.

Looking at the latest reproduction of this issue [1], I've noticed that it was testing MAAS 2.3.0.

I then looked at the make_foundation jenkins job [2], and noticed that 'edit_named_options' is being called without '--migrate-conflicting-options'. Since the fix for [3] has only been made available in MAAS 2.3.3, '--migrate-conflicting-options' is still required. It is because of this that DNS fails to be configured correctly in your 2 secondary region controllers, which causes the DNS resolution issues in the ephemeral environment.

Please note that in 2.3.3 you /can/ continue to use '--migrate-conflicting-options', given that 2.3.3 includes the fix to correctly configure the DNS server regardless of whether the parameter is being passed or not.

[1]: https://solutions.qa.canonical.com/#/qa/testRun/24da8ae0-64b8-45c1-b11f-2ddc1b1674bb
[2]: https://oil-jenkins.canonical.com/job/make_foundation/1148/console
[3]: https://bugs.launchpad.net/maas/+bug/1513775
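
The rule described in this comment (the flag is required before 2.3.3 and optional from 2.3.3 onward) could be guarded in a CI job with a small pre-flight check. This is only a sketch under stated assumptions: the helper names are hypothetical, the version parser handles plain dotted numeric versions such as "2.3.0" only (not strings like "2.4.0rc2"), and the argument list stands in for however the job actually builds its 'edit_named_options' call.

```python
# Per the comment above: the fix for bug 1513775 is available from MAAS 2.3.3.
FIXED_IN = (2, 3, 3)
FLAG = "--migrate-conflicting-options"


def parse_version(version):
    """Turn '2.3.0' into (2, 3, 0). Plain dotted numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def missing_migrate_flag(maas_version, command_args):
    """Return True if the flag is still required for this version but absent."""
    return parse_version(maas_version) < FIXED_IN and FLAG not in command_args


if __name__ == "__main__":
    # Hypothetical argument list for the call made by the CI job.
    args = ["edit_named_options"]
    if missing_migrate_flag("2.3.0", args):
        print("MAAS < 2.3.3: add", FLAG, "to the edit_named_options call")
```

Passing the flag unconditionally is the simpler choice, since the comment states it remains accepted in 2.3.3; the check is only useful if the job wants to warn rather than change behaviour.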

Changed in maas:
status: Incomplete → Invalid