NM-controlled dnsmasq prevents other DNS servers from starting
- Precise (12.04)
- Bug #959037
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
djbdns (Ubuntu) | Confirmed | Undecided | Unassigned |
djbdns (Ubuntu), Precise | Won't Fix | Undecided | Unassigned |
dnsmasq (Ubuntu) | Fix Released | Undecided | Unassigned |
dnsmasq (Ubuntu), Precise | Won't Fix | High | Mathieu Trudel-Lapierre |
network-manager (Ubuntu) | Confirmed | Low | Unassigned |
network-manager (Ubuntu), Precise | Won't Fix | High | Mathieu Trudel-Lapierre |
pdns-recursor (Ubuntu) | Invalid | Undecided | Unassigned |
pdns-recursor (Ubuntu), Precise | Invalid | Undecided | Unassigned |
pdnsd (Ubuntu) | Invalid | Undecided | Unassigned |
pdnsd (Ubuntu), Precise | Invalid | Undecided | Unassigned |
Bug Description
As described in https:/
That breaks the default bind9 and dnsmasq installations, for people that actually want to install a DNS server.
Having to manually comment out "#dns=dnsmasq" in /etc/NetworkMan
Please make network-manager smarter so that it checks if bind9 or dnsmasq are installed, so that it doesn't start the local resolver in that case.
Mathieu Trudel-Lapierre (cyphermox) wrote : | #1 |
Mathieu Trudel-Lapierre (cyphermox) wrote : | #2 |
I don't think we'll cover this particular use case for Precise. I understand your requirement and how the need to change the settings in /etc/NetworkMan
There's another possibility to make this easier by making sure Bind always starts before NetworkManager, but most cases will not actually see bind and NetworkManager installed on the same system; and fixing this would require migrating bind from a sysvinit script to a new upstart job.
I'm keeping the task open as it's absolutely a valid request, we just won't have time to focus on fixing this for the Precise release. (Sorry)
Changed in network-manager (Ubuntu): | |
status: | New → Triaged |
importance: | Undecided → Low |
Alkis Georgopoulos (alkisg) wrote : | #3 |
> I don't think we'll cover this particular use case for Precise.
Excuse me, but how is installing bind9 or dnsmasq a "particular use case"?
I'm talking about the default installation, not some corner case...
> most cases will not actually see bind and NetworkManager installed on the same system
We have 250 schools here that use NetworkManager and dnsmasq as the DNS server, are there any stats that show that this is actually rare?
And, actually more rare than the split VPN need that the local resolver addresses?
Since the local resolver implementation seems a bit immature and needs to break two packages in order to work, one of them in main, wouldn't it be better if it was postponed and not be applied in an LTS release until it's more cooperative?
Kind regards,
Alkis Georgopoulos
Mathieu Trudel-Lapierre (cyphermox) wrote : | #4 |
I think I've been unclear. Using NetworkManager with *bind* is a relatively unusual use case. dnsmasq with NetworkManager for resolution is what we're aiming for *by default*, and that's also part of the default install. Everything has been put in place so that split VPN and such are correctly addressed with NetworkManager spawning dnsmasq as necessary, which is what dns=dnsmasq achieves.
I'm not sure in this case what you mean by breaks two packages. There's a lot of benefits to having a local resolver other than the libc one (split DNS, faster and more efficient resolution, etc.).
I do feel we've tested this well, thoroughly, and that it's very cooperative and efficient. Please, tell me more about your setup so we can make sure we cater for this use case before release.
Mathieu Trudel-Lapierre (cyphermox) wrote : | #5 |
What I mean here is that default installs normally don't involve installing a local DNS server, except perhaps as a caching resolver. The caching resolver use case is covered by spawning dnsmasq from NetworkManager; the local DNS server isn't. We do think that there are relatively few such installs of a server that depends on NetworkManager running, and that's definitely not the default setup for Ubuntu Server (where NetworkManager isn't installed by default).
Alkis Georgopoulos (alkisg) wrote : | #6 |
> Please, tell me more about your setup so we can make sure we cater for this use case before release.
1) Install precise-
2) Install dnsmasq. Fails to start. OK, annoying but let's see if the problem goes away after reboot.
3) Reboot. Try to `dig @some-pc ubuntu.com` from *another* PC.
Here's the problem. It *sometimes* works. The "caching resolver" implementation introduced a race condition.
So if the nm-spawned dnsmasq starts first, then the dnsmasq package is broken, and doesn't fulfill its stated goal to "provide DNS to a small network" out of the box and without manual editing of nm conffiles.
If the real dnsmasq starts first, then the "caching resolver" is broken instead.
Because of time constraints, I think that checking if [ -d /etc/dnsmasq.d ] before spawning dnsmasq from nm would satisfy most dnsmasq users. I don't think there are many users that want to keep the nm-spawned dnsmasq when they install the real one. Maybe something similar can be done for bind too.
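The check proposed above could look roughly like the following. This is only a sketch in Python (NetworkManager itself is written in C), and the function name `should_spawn_local_resolver` is hypothetical; the only thing taken from the comment is the test for the /etc/dnsmasq.d directory, used as a hint that the standalone dnsmasq package is installed.

```python
import os


def should_spawn_local_resolver(marker_dir="/etc/dnsmasq.d"):
    """Return False when marker_dir exists, i.e. a standalone dnsmasq looks installed.

    Hypothetical helper: the directory test mirrors `[ -d /etc/dnsmasq.d ]`.
    """
    return not os.path.isdir(marker_dir)


if __name__ == "__main__":
    if should_spawn_local_resolver():
        print("no standalone dnsmasq detected: spawn the local resolver")
    else:
        print("standalone dnsmasq detected: leave port 53 alone")
```

A directory test is a heuristic only: it says nothing about whether the standalone daemon is enabled or which addresses it listens on.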
In the future, maybe the "caching resolver" implementation can start using /etc/dnsmasq.d itself, along with the KVM-spawned instances too, so that people only have one dnsmasq instance instead of multiple ones?
(The reason we're using the desktop iso instead of the server one, is that we need a desktop environment in our servers for our LTSP thin clients, and because teachers work on our servers, they're not headless).
Alkis Georgopoulos (alkisg) wrote : | #7 |
Another idea would be to create a "spawn-
Alkis Georgopoulos (alkisg) wrote : | #8 |
And yet another idea would be to make a package out of the local resolver configuration, and declare that it Breaks: dnsmasq, bind9.
That way anyone installing dnsmasq or bind9 would get rid of the local resolver package and its conflicting configuration.
Mathieu Trudel-Lapierre (cyphermox) wrote : | #9 |
If you're installing dnsmasq on top of the standard desktop install, why is it such an issue to edit the NetworkManager configuration to cater it to your needs? Wouldn't it make sense in this case to go a step further and make sure the network connection is set up in /etc/network/
I don't think adding complexity by creating new virtual packages for configurations is a sensible thing to do; and setting up a special upstart job to spawn a local resolver won't work (NM spawns it itself, using a custom configuration on purpose).
Since NM relies on dnsmasq-base for the standalone binary rather than the 'dnsmasq' package itself; I guess a workable solution would be to check for /etc/default/
Setting to Triaged; we've got a way to possibly deal with this use case...
Changed in network-manager (Ubuntu): | |
importance: | Low → Medium |
Mathieu Trudel-Lapierre (cyphermox) wrote : | #10 |
Does it help any if the daemon dnsmasq is configured to only listen on the interface meant for the ltsp clients, if there's a specific interface for this?
Mathieu Trudel-Lapierre (cyphermox) wrote : | #11 |
There are probably other, far simpler (and safer) workarounds. What's your configuration for dnsmasq like?
Upstream mentions some configurations at the dnsmasq level that are very relevant for this particular case:
in /etc/dnsmasq.conf:
#except-interface=
# Or which to listen on by address (remember to include 127.0.0.1 if
# you use this.)
#listen-address=
The problem is that listen-address probably shouldn't contain 127.0.0.1 if dnsmasq is meant to be used to resolve things for ltsp clients; also, except-interface=lo may be a good idea here to avoid listening on the loopback interface. That way both instances should start fine.
Alkis Georgopoulos (alkisg) wrote : | #12 |
Hi Mathieu,
> If you're installing dnsmasq on top of the standard desktop install, why is it such an issue to edit the NetworkManager configuration to cater it to your needs?
> except-interface=lo may be a good idea here to avoid listening on the loopback interface
It's not about me; it's that the default dnsmasq/bind installations are now broken on desktop installations.
For the needs of our schools here in every LTS release we're making repositories with custom packages for automated installation + configuration, so the nm configuration editing is just a sed away, much less trouble than even reporting the bug in the first place.
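For illustration, the "sed" mentioned above amounts to commenting out a single line. A rough Python equivalent is sketched below; it works on the configuration text rather than on a file so that no path is assumed, and the function name and the sample contents are made up for this example.

```python
import re

# Matches an active "dns=dnsmasq" line, allowing blanks around the "=".
_DNS_LINE = re.compile(r"^([ \t]*)(dns[ \t]*=[ \t]*dnsmasq[ \t]*)$", re.MULTILINE)


def disable_nm_dnsmasq(config_text):
    """Comment out an active dns=dnsmasq line; leave everything else unchanged."""
    return _DNS_LINE.sub(r"\1#\2", config_text)


if __name__ == "__main__":
    # Illustrative contents only, not copied from a real installation.
    sample = "[main]\nplugins=ifupdown,keyfile\ndns=dnsmasq\n"
    print(disable_nm_dnsmasq(sample))
```

The substitution is idempotent, so running it from a postinst more than once leaves an already commented line alone. The running NetworkManager does not pick up the change by itself (see comment #21 below).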
> Wouldn't it make sense in this case to go a step further and make sure the network connection is set up in /etc/network/
No, network manager supports static IPs (even though we don't always need them even on LTSP servers) and doing it without /etc/network/
> and setting up a special upstart job to spawn a local resolver won't work (NM spawns it itself, using a custom configuration on purpose).
Right, that's why I'm saying that the local resolver implementation is immature, it doesn't integrate with the rest of the distro, but it breaks other packages by launching a DNS server from hardcoded C code instead of a regular sysvinit/upstart script like all the other daemons.
> I guess a workable solution would be to check for /etc/default/
That would indeed be workable, please do implement it.
> listen-address probably shouldn't contain 127.0.0.1 if dnsmasq is meant to be used to resolve things for ltsp clients
Thin client sessions run on the server, and would be resolved from the nm-spawned dnsmasq instance without caching, while LTSP fat client sessions would be resolved from the normal dnsmasq instance with caching.
Having one DNS server for half of the clients and another for the other half is bound to cause confusion and problems.
Anyway, I think I've made my point: if it's too difficult to do for Precise, just postpone it until the next release. To work around the problem for Greek schools I'll make an ltsp-server-dnsmasq package and sed the nm configuration in its postinst.
Cheers,
Alkis
Mathieu Trudel-Lapierre (cyphermox) wrote : | #13 |
The parsing of /etc/default/
Please, do post your dnsmasq configuration so we can try to figure out the right way to integrate this with the current setup.
As for the set of resolvers on the network, that's not exactly the "plan": all systems used to have the libc resolver. Now any system that runs NetworkManager will also be running a local dnsmasq instance since that handles a bunch of issues (more than three servers, split DNS, broken IPv6 DNS, etc) far better than libc. Then they can easily speak to a network DNS server if necessary or resolve directly to the internet.
I don't understand how your systems are set up, and I think that's where the confusion comes from. What I'm expecting is that the LTSP server also runs a dnsmasq daemon to provide resolving to all the LTSP clients; with none of the clients running dnsmasq "locally". Isn't that the case?
I do think there are simpler ways to fix this than doing a sed of the nm configuration.
Alkis Georgopoulos (alkisg) wrote : | #14 |
> Please, do post your dnsmasq configuration so we can try to figure out the right way to integrate this with the current setup.
Just assume the default dnsmasq configuration, any other settings we have there are completely unrelated to this problem.
When one installs dnsmasq, it's supposed to start listening on 0.0.0.0:53, without manually editing any configuration files at all, i.e. with the stock /etc/dnsmasq.conf.
Now with the local resolver listening on 127.0.0.1:53, dnsmasq complains that the port is in use and fails to start.
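The conflict described here can be reproduced without touching port 53. The sketch below is not dnsmasq or NetworkManager code; it assumes a Linux host, uses UDP only, and lets the kernel pick an unprivileged port. The helper names `try_bind` and `second_bind_succeeds` are invented for the example.

```python
import errno
import socket


def try_bind(address, port):
    """Bind a UDP socket to (address, port); return it, or None if the address is in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None
        raise
    return sock


def second_bind_succeeds(first_address, second_address):
    """Bind first_address on a free port, then report whether second_address can share it."""
    first = try_bind(first_address, 0)  # port 0: the kernel chooses a free port
    port = first.getsockname()[1]
    second = try_bind(second_address, port)
    ok = second is not None
    if second is not None:
        second.close()
    first.close()
    return ok


if __name__ == "__main__":
    # Local resolver on the loopback address first, then a wildcard listener.
    print(second_bind_succeeds("127.0.0.1", "0.0.0.0"))
    # The reverse start order.
    print(second_bind_succeeds("0.0.0.0", "127.0.0.1"))
```

On Linux both calls should report False: whichever process binds first wins, which matches the "sometimes works" behaviour reported in comment #6.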
> Now any system that runs NetworkManager will also be running a local dnsmasq
Let's step back a bit and talk about that. You're launching a DNS server without using a sysvinit or upstart job. So you're bypassing update-rc.d, policy-rc.d, upstart .override files, package Conflicts:, Provides: etc, all the standard framework for managing services.
Why wouldn't it be more reasonable to start the local resolver service normally like all the other daemons?
Even make a package out of it, and declare that it Conflicts: bind9, dnsmasq, so that people installing those automatically get rid of the local resolver and its conflicting configuration?
If you assume that "network-manager contains a hardcoded DNS server", then the network-manager package itself should conflict with other DNS servers... But that shouldn't be the case, people should be allowed to install any DNS server they want alongside network-manager, and that could be done seamlessly and without editing any configuration files at all if:
network-manager recommended the local-resolver package,
and the local-resolver package conflicted with the other dns server packages.
Then, when I install dnsmasq over the desktop installation, the local-resolver package would be automatically uninstalled, and I wouldn't have to edit any configuration file at all to resolve the conflict, it would be resolved by the package manager.
> I don't understand how your systems are set up, and I think that's where the confusion comes from. What I'm expecting is that the LTSP server also runs a dnsmasq daemon to provide resolving to all the LTSP clients; with none of the clients running dnsmasq "locally".
The problem isn't LTSP specific, it applies to anyone that wants to use dnsmasq as a DNS server for his local network.
But yes, for LTSP labs that use dnsmasq, it is exactly as you described it. Now, LTSP clients are all diskless and netbooted, but of two kinds: thin and fat clients. Imagine thin clients like XDMCP clients, i.e. many users working remotely on the same server. So those would be using the local resolver, and miss the caching feature and the speed up that it offers.
Imagine fat clients like regular machines that have nameserver=the LTSP server in their resolv.conf. In the solution you proposed above, those would be using the real dnsmasq instance, with caching and everything.
Mathieu Trudel-Lapierre (cyphermox) wrote : | #15 |
Then at this point the issue is that dnsmasq is shipped with a default configuration that, while technically "correct", binds on all interfaces and should normally be modified by the admin to suit the needs of their network. That configuration will break with NM making use of dnsmasq-base as a local resolver, and most likely also bombs with qemu/kvm virtual machines.
I want to make this easy for people in your situation, but having a system-wide instance isn't going to work. Not only is it way too complex for what we're trying to achieve (let alone confusing to users to see packages get removed by metapackages), but you always risk that someone modifying the system-wide config meant for use with NetworkManager then causes totally unwanted behavior when NetworkManager tries to add nameservers to the configuration. That's without counting that this still doesn't fix the issue of resolving for virtual machines, which you'll almost certainly want to resolve separately from anything else (and come to think of it, installing virt-manager and virtual machines on your setup probably breaks just as badly as NM).
I've been trying hard to offer solutions and I've proposed configuration changes to the shipped config which cover the issue nicely for your case. If you don't want to apply these changes, that's fine; you're obviously free to implement a fix however you see fit :)
For Precise+1 there may be a way to move dnsmasq initialization in NM to use 127.0.1.1, and allow this in dnsmasq with upstream's help, but that's not even going to solve this particular issue.
Reducing the priority since we won't look at this until Precise+1 and there aren't many reports about such issues.
Changed in network-manager (Ubuntu): | |
importance: | Medium → Low |
Marco Menardi (mmenaz) wrote : | #16 |
I run ltsp also, and even if I remove NM completely, I think that Alkis's setup is interesting and would love to be able to use it also in the near future, so this "breakage" will affect me too.
As a general consideration, I find it scary that installing a package can bring such problems "just because we think that usually is not used often". I really want GNU/Linux to keep being a predictable system and apt packaging a very good one, so please consider fixing this issue before release.
Thanks in advance
Asmo Koskinen (asmok) wrote : | #17 |
Me, too. Fix this one. '#dns=dnsmasq' is an ugly hack, not for real humans who run an ltsp server at school.
Here is my bug report:
https:/
Best Regards Asmo Koskinen.
Mathieu Trudel-Lapierre (cyphermox) wrote : | #18 |
Please read the whole thread and see the various other workarounds provided; granted the default shipped configuration for dnsmasq doesn't play well with NetworkManager, but it's easy to adjust to your particular needs and work around this issue, which also only happens if the system acting as a server locally runs both dnsmasq and NetworkManager.
We've clearly identified that having dnsmasq bind to particular interfaces is an easy way to work around this and is a very good idea anyway. Please make sure your dnsmasq configuration sets interface= to the interface on which it should listen, and possibly also uncomment bind-interfaces in /etc/dnsmasq.conf. At that point the changes to /etc/NetworkMan
There isn't just a simple fix for this; the default shipped configuration for dnsmasq is just as "guilty" as network-manager for assuming it should bind on all addresses and all interfaces.
Alkis Georgopoulos (alkisg) wrote : | #19 |
> There isn't just a simple fix for this; the default shipped configuration for dnsmasq is just as "guilty" as network-manager for assuming it should bind on all addresses and all interfaces.
I disagree; most system services bind to all addresses and interfaces by default (sshd, cupsd, bind, dnsmasq, dhcp, tftp, nbd, inetd, rpc...). And I do want DNS services for my thin client sessions running on the server, so I do want dnsmasq listening on all addresses.
Alkis Georgopoulos (alkisg) wrote : | #20 |
Mathieu, some help please?
After my ltsp-pnp package comments out dns=dnsmasq in /etc/NetworkMan
invoke-rc.d dnsmasq restart from its postinst,
but that fails as the nm-spawned dnsmasq instance is still listening on port 53.
And if I kill it before starting the normal dnsmasq, that leaves the DNS configuration broken...
How can I tell resolv.conf and network-manager to reload their configurations?
Is it necessary to restart the network-manager service? And if it is, is that enough? I'd hate to have to tell the users that they need to restart their servers... :(
Thanks!
Mathieu Trudel-Lapierre (cyphermox) wrote : | #21 |
You need to restart network-manager after changing the configuration value.
It's unfortunate that the configuration needs to be changed, but it's needed. I sympathize with your use case, but there is sufficient benefit in using NM together with dnsmasq and resolvconf to solve other DNS resolution issues to inconvenience those who use dnsmasq separately as a standalone daemon (to have to change the config to suit their needs).
We won't be fixing this for Precise, but I've started discussion with dnsmasq upstream to possibly deal differently with the binding and allow running instances on other IP addresses (such as 127.0.1.1 or so). It's still going to need a fair amount of work to fix dnsmasq's method of binding to interfaces and how NM starts and interfaces with dnsmasq (though I already have patches for NM, but they're useless without the fixes in dnsmasq). At this point though, the simplest way to deal with this remains to edit interface= to map to the relevant external interfaces (eth0, wlan0, etc.) and let the NM-spawned instance get started on lo.
Alkis Georgopoulos (alkisg) wrote : | #22 |
> At this point though, the simplest way to deal with this remains to edit interface= to map to the relevant external interfaces (eth0, wlan0, etc.) and let the NM-spawned instance get started on lo.
We can't do that; we need DNS caching for thin client sessions which run on the server with DNS=127.0.0.1. We need to completely disable the nm dnsmasq spawning.
> You need to restart network-manager after changing the configuration value.
Thank you, I think that's too much to do from a postinst so I'll probably document it as part of the installation process.
For the record, I think that the proper way to solve the problem is from libc itself. Ask Simon to allow calling dnsmasq like a library, or communicate with it via a socket, whatever's needed, but no :53 port hooking, this is reserved for real DNS servers, not for helpers for libc shortcomings.
Thanks again for all the feedback,
Alkis
Mathieu Trudel-Lapierre (cyphermox) wrote : | #23 |
caching> That's one good reason where this is currently failing. The NM instance won't cache. That's disabled on purpose, but we'll re-enable for 12.10 or later once we can have per-user caches and something secure.
library> unfortunately, that won't help. library use, with not being able to keep state (e.g. have I tried this server yet? did it respond?) is one of the issues we're fixing with dnsmasq, which can't be tackled by a library.
Using dnsmasq via dbus is likely a good way to fix this, but there are countless possible issues with assuming that the centrally running instance of dnsmasq is the one you also want to use for resolving your own stuff, and to update with information from DHCP.
Alkis Georgopoulos (alkisg) wrote : | #24 |
Since this won't be fixed for Precise from the network-manager side, the dnsmasq package now is broken by default in desktop installations.
So I've added the dnsmasq package in the "Affects:" list, to make it easier for people to locate the cause of the problem so that fewer duplicate bug reports are filed (it's an LTS release, I suppose many people will be bitten by it in the next 5 years).
Also, even though it's not the correct place to solve the problem, the dnsmasq.postinst could be temporarily modified to disable the local resolver. I can propose a patch for it if the maintainer is interested.
Mathieu Trudel-Lapierre (cyphermox) wrote : | #25 |
That wouldn't be the right process though. The configuration itself shipped by default should be patched, that can be done with a simple patch to the dnsmasq package.
Alkis Georgopoulos (alkisg) wrote : | #26 |
> The configuration itself shipped by default should be patched
If you mean something like:
except-interface=lo
bind-interfaces
...I just tested them and they do allow both dnsmasq instances to run.
But of course those settings won't be acceptable to most dnsmasq users, as listening on "lo" is usually desired too (local DNS cache; DHCP/TFTP for VMs etc). So I don't think that crippling the default dnsmasq functionality is a good way to solve this problem. DNS clients shouldn't hook port 53; it's reserved for DNS servers.
Launchpad Janitor (janitor) wrote : | #27 |
Status changed to 'Confirmed' because the bug affects multiple users.
Changed in dnsmasq (Ubuntu): | |
status: | New → Confirmed |
Bert Voegele (bertvoegele-deactivatedaccount) wrote : | #28 |
Just as a short reminder, there are more DNS resolvers/servers available as packages out there than just bind and dnsmasq, e.g. djbdns and its derivatives. Until I removed the annoying dns=dnsmasq line in /e/N/Nconf, NM disconnected the WLAN after a couple of minutes, throwing an error about dnsmasq not being able to bind to 127.0.0.1.
I'm puzzled about the default inclusion of dnsmasq as a local resolver for standard users. If a connection is to be shared, it might be useful to bind dnsmasq to the shared iface to provide DHCP and DNS, like it's done with libvirt-bin.
summary: |
- Don't start local resolver if a DNS server is installed
+ Standalone dnsmasq is not compatible out of the box with NM+dnsmasq |
summary: |
- Standalone dnsmasq is not compatible out of the box with NM+dnsmasq
+ Don't start local resolver if a DNS server is installed |
Alkis Georgopoulos (alkisg) wrote : | #29 |
@jdthood: the "Standalone dnsmasq is not compatible out of the box with NM+dnsmasq" title hints that the problem is caused by the dnsmasq package, i.e. that it should be crippled and not listen on "lo" by default in order to coexist with the local resolver implementation.
I don't think this is the case, I don't think the dnsmasq package does anything wrong; I just cross-linked the bug report in case other people hit the problem and try to find it in the dnsmasq bug page.
The problem should be fixed from the network-manager side.
Otherwise, similar bug reports should be filed against all other DNS server packages, not just dnsmasq. But I really think that people do want their DNS servers to listen on "lo" by default. They wouldn't want to break that just to help the local resolver implementation.
Mathieu Trudel-Lapierre (cyphermox) wrote : | #30 |
Listening on lo is fine; and blocking other DNS servers from being started isn't. I think we're in violent agreement there. The problem is how to fix this.
I'm not saying dnsmasq should be crippled, but that it should special-case lo and not just listen on 0.0.0.0, because a wildcard bind claims port 53 on every address and so locks out any other process that might legitimately want to listen on port 53.
That's pretty much how the solution is shaping up: when listening on all interfaces, listen on each interface separately, binding to the IP address attached to the interface (or via any other means). We should then be able to have dnsmasq listen on 127.0.1.1:53 to satisfy the need for a local resolver.
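The idea of binding each address separately can be sketched as follows. This is an illustration only, not the dnsmasq patch under discussion: it assumes Linux (where every 127.x.y.z address is available on lo), uses UDP on a kernel-chosen unprivileged port, and the names `bind_each` and `resolver_can_bind` are hypothetical.

```python
import socket


def bind_each(addresses, port=0):
    """Open one UDP socket per address; with port=0 the first bind picks the port."""
    sockets = []
    for address in addresses:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.bind((address, port))
        except OSError:
            sock.close()
            for opened in sockets:
                opened.close()
            raise
        port = sock.getsockname()[1]
        sockets.append(sock)
    return sockets, port


def resolver_can_bind(server_addresses, resolver_address="127.0.1.1"):
    """Return True if resolver_address is still free on the port the server uses."""
    sockets, port = bind_each(server_addresses)
    try:
        resolver_sockets, _ = bind_each([resolver_address], port)
    except OSError:
        return False
    finally:
        for sock in sockets:
            sock.close()
    for sock in resolver_sockets:
        sock.close()
    return True


if __name__ == "__main__":
    print(resolver_can_bind(["127.0.0.1"]))  # server bound per address
    print(resolver_can_bind(["0.0.0.0"]))    # server bound to the wildcard
```

With per-address sockets the server can keep 127.0.0.1 while a second instance takes 127.0.1.1; with a single wildcard socket the second bind is refused.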
Thomas Hood (jdthood) wrote : | #31 |
@Alkis: Your title "Dont..." is not a description of a problem.
Alkis Georgopoulos (alkisg) wrote : Re: Local resolver prohibits DNS servers from running | #32 |
@Thomas: cool, I hope this one's better.
summary: |
- Don't start local resolver if a DNS server is installed
+ Local resolver prohibits DNS servers from running |
Thomas Hood (jdthood) wrote : | #33 |
I just re-read the whole discussion and thought it would be useful (for me, at least) to summarize it.
The original bug report was that NM+dnsmasq and standalone dnsmasq are incompatible because they have overlapping network socket address ranges, 0.0.0.0:53 and 127.0.0.1:53.
One solution is for the administrator to comment out "dns=dnsmasq" in /etc/NetworkMan
Another solution is as described by the submitter's title: "[Hey NetworkManager,] Don't start local resolver if a DNS server is installed".
Another solution favored by Mathieu is for the NM-enslaved dnsmasq and the standalone dnsmasq to use disjoint network socket address ranges.
Early on, Mathieu said that solving this problem would not be a top priority because not many users want to combine the DNS server role (running bind or dnsmasq) with the DNS client role (running NM+dnsmasq).
Alkis argued that the incompatibility is a serious bug that should be prevented using package dependencies or eliminated automatically by maintainer scripts or other means. The administrator shouldn't have to search the web to figure out how to make the dnsmasq package work. Troublesome is the fact that standalone dnsmasq sometimes works, sometimes doesn't, in the presence of NM+dnsmasq.
Along the way Alkis levelled some fundamental criticisms against the design of NM+dnsmasq.
I think that there is a clash of civilizations here: the Debian way (modular components that just work together in any combination allowed by package dependencies) versus the RedHat way (big daemons with limited options that own subsystems).
Thomas Hood (jdthood) wrote : | #34 |
Alkis: Why do you need the dnsmasq package at all? You want NM and dnsmasq. Why not just use the NM-enslaved dnsmasq?
If the latter doesn't meet your needs, could it be adapted somehow to meet your needs?
Assuming that there are good reasons for using NM and standalone dnsmasq, I'd be inclined to agree with Alkis (if I understood him correctly) that a good solution would be to put the NM-dnsmasq integration stuff into a package and make this conflict with the standalone dnsmasq package.
Thomas Hood (jdthood) wrote : | #35 |
Hmm, I wasn't very clear. What I meant in my questions above (#34) was this. If NM+dnsmasq is the best solution for name service for the local host, isn't it also a better solution than NM-together-
Alkis Georgopoulos (alkisg) wrote : | #36 |
Thomas, that was a very good summary at comment #33!
> Why do you need the dnsmasq package at all? You want NM and dnsmasq. Why not just use the NM-enslaved dnsmasq?
The NM-enslaved dnsmasq uses hardcoded options (in C) that provide extremely limited functionality.
* It doesn't listen on ethX (--listen-
* It doesn't cache requests (--cache-size=0). No caching ==> no DNS queries speedup. This again is very significant for LANs as there are many concurrent users.
* Finally, we also need the DHCP and TFTP functionality of dnsmasq, so even if NM+dnsmasq included a real DNS server, we'd have to run another dnsmasq instance (without a DNS service in that case) for its 2 other services.
> a good solution would be to put the NM-dnsmasq integration stuff into a package and make this conflict with the standalone dnsmasq package.
I completely agree, and to also conflict with bind9 and any other DNS server packages.
Thomas Hood (jdthood) wrote : | #37 |
What lies behind the problem being discussed here is the simple fact that there exists no single adequate network configuration utility for GNU/Linux. I am most familiar with Debian. From Debian we inherit ifupdown which was designed for static configuration. Debian developers have known for more than ten years that ifupdown needed to be replaced, but have never managed to come up with a replacement. From RedHat we get NetworkManager which was never intended to be a general network configurer but in the absence of any alternative continues to be enhanced with new features. Considerable effort has obviously been spent in Ubuntu just to get NM to coexist with other networking packages. It still doesn't fully cooperate with them (see #47379 for another example) and will probably never be well integrated with them.
So we are still forced to choose between two network configuration approaches, NM-oriented in the desktop version and ifupdown-oriented in the server version. Each one has its limitations. If you try to combine the two, as you (Alkis) want to do, then you are confronted with these limitations. You are lucky that all you have to do is comment out one line in a configuration file to get things to work!
We can continue playing around with the existing tools so that they work better in particular use cases but what we really need is a properly designed network configuration utility to supersede both ifupdown and NM.
I am vaguely aware of the Wicd project. Must go read up on that.
Thomas Hood (jdthood) wrote : | #38 |
* Some thinking about[0][1], if not much coding of[2], a successor to ifupdown was done in the netconf project[3] led by Debian Developer martin krafft[4][5].
[0]http://
[1]http://
[2]http://
[3]https:/
[4]madduck AT debian.org
[5]http://
* One small step toward harmonizing desktop network configuration and server network configuration was taken with the introduction of resolvconf in both versions of 12.04. But there again, NM integrates bare-minimally with resolvconf; NM doesn't let resolvconf prioritize nameserver information according to interface-order(5) but sends resolvconf one big lump of nameserver information called "NetworkManager".
* If Ubuntu doesn't switch to wicd or netconf or something else then another possibility to be explored is to break up NM into components that can be better integrated with other parts of the distro. This is, of course, rather difficult without cooperation from upstream.
Thomas Hood (jdthood) wrote : Re: NM-controlled dnsmasq prevents other DNS servers from running | #39 |
Based on comment #28, marked as affecting djbdns.
summary: |
- Local resolver prohibits DNS servers from running
+ NM-controlled dnsmasq prevents other DNS servers from running |
summary: |
- NM-controlled dnsmasq prevents other DNS servers from running
+ NM-controlled dnsmasq prevents other DNS servers from running, yet
+ network-manager doesn't Conflict with their packages |
Thomas Hood (jdthood) wrote : Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #40 |
But enough dreaming. Given the world as it is, the immediate challenge is to make NM+dnsmasq compatible with standalone nameservers. (Otherwise network-manager should Conflict with those nameservers' packages.)
Solutions mentioned earlier:
* Tell the administrator to comment out "dns=dnsmasq" in /etc/NetworkMan
* Change NM so that it acts as if "dns=dnsmasq" is absent if a DNS server package is installed.
* Change standalone dnsmasq such that it doesn't listen on 0.0.0.0:53, doesn't listen on 127.0.1.1:53 and change NM so that its dnsmasq listens only on 127.0.1.1:53.
Here's a new idea.
* Enhance the resolver(3) so that nameservers can be specified in resolv.conf using the <address>:<port> notation
* Change NM such that it causes its slave dnsmasq to listen on another (than 53) port number P and sends "nameserver 127.0.0.1:P" to resolvconf.
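To make the "new idea" concrete, here is a minimal sketch of how a resolver might parse an <address>:<port> nameserver entry. The bracketed form for IPv6 and the function names are assumptions made for illustration only; the GNU C Library resolver does not accept this syntax.

```python
import ipaddress

DEFAULT_DNS_PORT = 53

def parse_nameserver(value):
    """Split a hypothetical 'address[:port]' value into (address, port)."""
    value = value.strip()
    if value.startswith("["):
        # Assumed bracketed IPv6 form with optional port, e.g. "[::1]:5353".
        host, _, rest = value[1:].partition("]")
        port = int(rest[1:]) if rest.startswith(":") else DEFAULT_DNS_PORT
    elif value.count(":") == 1:
        # Exactly one colon: IPv4 address with a port, e.g. "127.0.0.1:5353".
        host, port_text = value.split(":")
        port = int(port_text)
    else:
        # No colon (plain IPv4) or several colons (bare IPv6): default port.
        host, port = value, DEFAULT_DNS_PORT
    ipaddress.ip_address(host)  # raises ValueError if not a valid address
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return host, port

def parse_resolv_conf(text):
    """Collect (address, port) pairs from the nameserver lines of resolv.conf text."""
    servers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(parse_nameserver(fields[1]))
    return servers
```

With such a parser, NM could hand resolvconf a line like "nameserver 127.0.0.1:P" and plain entries would keep defaulting to port 53.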
Thomas Hood (jdthood) wrote : | #41 |
I mentioned Wicd and Netconf above. While investigating another problem I stumbled across Connman, which appears to be another alternative to NetworkManager worth watching.
Thomas Hood (jdthood) wrote : | #42 |
Another idea:
* Change NM such that it causes its slave dnsmasq to listen on ::1 instead of 127.0.0.1
But I guess the problem will just arise again if the standalone dnsmasq is changed to listen on the wildcard IPv6 address.
Alkis Georgopoulos (alkisg) wrote : | #43 |
> * Change NM such that it causes its slave dnsmasq to listen on ::1 instead of 127.0.0.1
Personally, when I install dnsmasq, I *don't want* to use the NM-spawned dnsmasq, because it disables caching etc etc. So it wouldn't matter if it listened on another address, on a socket or wherever else; I wouldn't want it as my default resolver.
I still think that the best idea is to make the local resolver a separate package that conflicts with all DNS servers, so that it's automatically uninstalled when one of them gets installed, until the problem is solved within libc. There are a lot of methods for interprocess communication, and I'm sure some of them can solve the problem properly. That way everyone would benefit; right now the solution only helps those using Ubuntu + NetworkManager + resolvconf.
Thomas Hood (jdthood) wrote : | #44 |
Alkis wrote:
> I wouldn't want it as my default resolver.
But some people might. It's better to eliminate the behavioral conflict, if we can, than to formalize that conflict as a packaging dependency.
Alkis Georgopoulos (alkisg) wrote : | #45 |
> It's better to eliminate the behavioral conflict, if we can, than to formalize that conflict as a packaging dependency.
I was about to say this:
But then the main problem which caused me to report this bug would remain:
When I install the dnsmasq package, it wouldn't work.
I'd configure my dnsmasq, then put 127.0.0.1 as the DNS server for my LTSP client sessions on the server, and another dnsmasq would answer which wouldn't contain my configuration, my A or MX records or whatever else I'd put in my configuration file.
...but then I thought of this, which if it worked, I wouldn't have problems:
If nm + resolvconf managed to properly chain the 2 dnsmasq instances,
so that the NM-spawned dnsmasq was contacted first in another address or port or IPv6 or whatever,
and then the NM-spawned dnsmasq contacted my real dnsmasq at 127.0.0.1, since it's the DNS server I declared at my connection properties,
then it would at least work as expected, with just a small additional overhead. I wouldn't mind about that. And it would work with any DNS server too, not with just dnsmasq.
Thomas Hood (jdthood) wrote : | #46 |
Alkis wrote:
> If nm + resolvconf managed to properly chain the 2 dnsmasq instances so that the NM-spawned dnsmasq was contacted first
I think that this configuration should be supported, whether or not it's the best solution to the present problem (#959037).
Resolvconf can handle this with a little tweaking. The "general" local nameserver registers its listen-address, 127.0.0.1, with resolvconf under a name like "lo.dnsmasq" which has a high priority according to interface-order(5). NM currently registers its slave nameserver's address under the name "NetworkManager" which has a very low priority. To implement Alkis's idea, the ordering would have to be adjusted so that the NM address has a higher priority than the other addresses. If we decide to implement Alkis's idea then I'll change the Debian package to add something like "lo.nm-dnsmasq" before the other lo.* patterns in the default interface-order. Then network-manager should be changed so that it registers its slave's address under that name.
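The ordering described above can be sketched as follows. This is only a model, assuming that interface-order(5) is a list of shell glob patterns and that a record's priority is the position of the first pattern it matches; the pattern list and the record names other than "lo.dnsmasq" and "NetworkManager" are hypothetical.

```python
from fnmatch import fnmatchcase

def priority(record_name, patterns):
    """Index of the first pattern matching the record; unmatched records sort last."""
    for index, pattern in enumerate(patterns):
        if fnmatchcase(record_name, pattern):
            return index
    return len(patterns)

def order_records(record_names, patterns):
    """Sort record names by interface-order priority (stable for ties)."""
    return sorted(record_names, key=lambda name: priority(name, patterns))

# Hypothetical interface-order with the proposed pattern placed first.
PROPOSED_ORDER = ["lo.nm-dnsmasq", "lo.*", "eth*", "*"]

records = ["NetworkManager", "lo.dnsmasq", "eth0.dhclient", "lo.nm-dnsmasq"]
ordered = order_records(records, PROPOSED_ORDER)
```

Under these assumptions the NM-controlled instance sorts ahead of "lo.dnsmasq", while a record named "NetworkManager" only matches the catch-all pattern and lands at the end.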
But, second, there is a problem connecting the resolver to the NM-controlled dnsmasq such that the latter stays out of the way of the general local nameserver which currently wants to listen on the IPv4 wildcard address. Using address ::1 for nm-dnsmasq is a quick hack which might work without further ado
But even if it works it clearly isn't a permanent solution. More satisfactory would be to use a port other than 53 for the special purpose of connecting the resolver with nm-dnsmasq.
Currently the GNU C Library resolver doesn't support using another port.
Interestingly, OpenBSD does support it. Here's an extract from the OpenBSD resolv.conf man page.
nameserver IPv4 address (in dot notation) or IPv6 address (in hex-and-
Mac OS X has a similar feature.
That doesn't immediately help us, of course, but it does set a precedent for a similar enhancement of the GNU C Library. Perhaps we could implement ::1 for now and also start work on getting the aforementioned enhancement into glibc.
Simon Kelley (simon-thekelleys) wrote : Re: [Bug 959037] Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #47 |
On 11/06/12 19:57, Thomas Hood wrote:
> But, second, there is a problem connecting the resolver to the NM-
> controlled dnsmasq such that the latter stays out of the way of the
> general local nameserver which currently wants to listen on the IPv4
> wildcard address. Using address ::1 for nm-dnsmasq is a quick hack
> which might work without further ado
>
> But even if it works it clearly isn't a permanent solution. More
> satisfactory would be to use an another port than 53 for the special
> purpose of connecting the resolver with nm-dnsmasq.
>
Another option is to use another address in 127.0.0.0/8, any will work.
You'll need dnsmasq 2.61 or later for this to be a viable option.
You could have the nm-dnsmasq run with --bind-interfaces
--listen-
Another instance of dnsmasq will run without interfering with that,
providing only that --bind-interfaces is set.
Simon.
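The socket-level behaviour behind this suggestion can be tried without dnsmasq at all. The sketch below assumes a Linux host, where every address in 127.0.0.0/8 is local; it uses 127.0.1.1 purely as an example second address and an unprivileged port chosen by the kernel instead of port 53.

```python
import errno
import socket

def bind_udp(address, port):
    """Bind a UDP socket to (address, port) without SO_REUSEADDR."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
    except OSError:
        sock.close()
        raise
    return sock

def demo():
    """Bind two loopback addresses on one port, then try the wildcard address."""
    nm_like = bind_udp("127.0.1.1", 0)         # kernel picks a free port
    port = nm_like.getsockname()[1]
    standalone_like = bind_udp("127.0.0.1", port)  # same port, different address
    wildcard_errno = None
    try:
        wildcard = bind_udp("0.0.0.0", port)   # what a wildcard listener would do
    except OSError as exc:
        wildcard_errno = exc.errno
    else:
        wildcard.close()
    finally:
        standalone_like.close()
        nm_like.close()
    return port, wildcard_errno
```

Two sockets bound to specific, different loopback addresses share the port without trouble. On Linux I would expect the wildcard bind to be refused with EADDRINUSE, which is the same collision a wildcard-listening nameserver runs into when another process already holds port 53 on a loopback address.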
Thomas Hood (jdthood) wrote : Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #48 |
Aha, I had tried this and it didn't work... in version 2.57. But I see that quantal already has 2.62.
> Another instance of dnsmasq will run without interfering with that, providing only that --bind-interfaces is set.
Just to make sure I understand correctly: Do you mean here that --bind-interfaces has to be set on both instances of dnsmasq? Or will one instance (the NM-controlled one) with --bind-interfaces coexist nicely with another (the standalone dnsmasq) which doesn't use that option and listens on 0.0.0.0?
NM already runs dnsmasq with --bind-interfaces and --listen-address (specifically, --listen-
Mathieu mentioned earlier the possibility of using 127.0.1.1 which happens to be the address assigned (in /etc/hosts) to the system hostname on some (but not all) systems. Is there any advantage to using 127.0.1.1 as opposed to another 127.* address?
Simon Kelley (simon-thekelleys) wrote : Re: [Bug 959037] Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #49 |
On 11/06/12 20:41, Thomas Hood wrote:
> Aha, I had tried this and it didn't work... in version 2.57. But I see
> that quantal already has 2.62.
>
>> Another instance of dnsmasq will run without interfering with that,
> providing only that --bind-interfaces is set.
>
> Just to make sure I understand correctly: Do you mean here that --bind-
> interfaces has to be set on both instances of dnsmasq? Or will one
> instance (the NM-controlled one) with --bind-interfaces coexist nicely
> with another (the standalone dnsmasq) which doesn't use that option and
> listens on 0.0.0.0?
It has to be set in both instances of dnsmasq.
dnsmasq started as a system daemon reads config from
/etc/dnsmasq.d/*
so dropping a file there containing "bind-interfaces" and doing the
relevant restart in postinst should make this automatic in most cases.
>
> NM already runs dnsmasq with --bind-interfaces and --listen-address
> (specifically, --listen-
> the address.
>
> Mathieu mentioned earlier the possibility of using 127.0.1.1 which
> happens to be the address assigned (in /etc/hosts) to the system
> hostname on some (but not all) systems. Is there any advantage to using
> 127.0.1.1 as opposed to another 127.* address?
>
I don't think so: they're all equivalent.
Simon.
Thomas Hood (jdthood) wrote : Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #50 |
> so dropping a file there containing "bind-interfaces"
> and doing the relevant restart in postinst should
> make this automatic in most cases.
I notice that libvirt has just used this mechanism to solve a comparable problem (see #928524). Libvirt includes the file /etc/dnsmasq.
bind-interfaces
except-
Alkis Georgopoulos (alkisg) wrote : | #51 |
Note that while bind-interfaces can be specified multiple times, defining except-interfaces more than once is a syntax error in my dnsmasq 2.59-4.
Thomas Hood (jdthood) wrote : | #52 |
It just occurred to me that if we are going to change someone's listen address then it might be better to give 127.0.0.1 to nm-dnsmasq and 127.0.1.1 to the standalone nameserver.
Consider the case where nm-dnsmasq is running on a machine, nemo, that happens to run the nameserver for the LAN. /etc/hosts on nemo contains either
127.0.0.1 localhost
127.0.1.1 nemo
or
127.0.0.1 localhost
10.1.2.3 nemo
where 10.1.2.3 is nemo's external IP address.
Other machines in the LAN access nemo via 10.1.2.3 for their general name service. If they are Ubuntu machines they also access their local nm-dnsmasq instances via the loopback address. It's nicely symmetrical if processes on nemo itself also use the loopback address to access the local nm-dnsmasq and use either the public address, 10.1.2.3 or its substitute, 127.0.1.1, for general name service.
Perhaps this is only an aesthetic question.
Simon: Can we arrange by means of the file in /etc/dnsmasq.d/ that the standalone dnsmasq listens on 127.0.1.1 rather than 127.0.0.1?
Thomas Hood (jdthood) wrote : | #53 |
Alkis: Suppose your host, foo, has external IP address 10.1.2.3 and runs a standalone nameserver which listens on eth0. Configure things such that nm-dnsmasq on foo uses 10.1.2.3 as its upstream nameserver; configure the standalone nameserver on foo not to listen on lo. If it's dnsmasq, start it with --except-
If so then this may be a very simple way to deal with #959037, at least with respect to dnsmasq. Network-manager simply drops a file with
except-
into /etc/dnsmasq.d/. NM can still use the local standalone dnsmasq via the external network interface, the address of which it may receive from the DHCP server, for example.
Thomas Hood (jdthood) wrote : | #54 |
Hmm, just tested this myself. You can't use "except-
Thomas Hood (jdthood) wrote : | #55 |
Aha, you have to use "except-
Alkis Georgopoulos (alkisg) wrote : | #56 |
I tested bind-interfaces and except-interface=lo in the past (comment #26), it worked as advertised. I haven't yet tested them in the chained dnsmasq mode, but I guess it would work if I'm using a static IP (which isn't always the case for LTSP servers, some teachers use their laptops for LTSP servers). With a dynamic IP, I'd have to open the NM-connection editor all the time to update my primary DNS server. So a loopback address for the dnsmasq service would be preferred locally, while the non-loopback one would again be required for the LTSP clients.
> Network-manager simply drops a file with
> except-interface=lo
What I meant in comment #51 is that libvirt then won't be able to ship with
> except-
because it's a syntax error to specify 2 except-interface directives. Maybe it could be allowed in some future dnsmasq version, with the list of declared interfaces being merged.
Thomas Hood (jdthood) wrote : | #57 |
Alkis wrote in #51:
> Note that while bind-interfaces can be specified multiple times,
> defining except-interfaces more than once is a syntax error in
> my dnsmasq 2.59-4.
Multiple except-interface options are accepted by dnsmasq 2.62-2.
Simon Kelley (simon-thekelleys) wrote : Re: [Bug 959037] Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #58 |
On 12/06/12 10:05, Alkis Georgopoulos wrote:
> Note that while bind-interfaces can be specified multiple times,
> defining except-interfaces more than once is a syntax error in my
> dnsmasq 2.59-4.
>
Are you sure? That should be allowed.
Simon.
Simon Kelley (simon-thekelleys) wrote : | #59 |
On 12/06/12 11:24, Thomas Hood wrote:
> Hmm, just tested this myself. You can't use "except-
> seems you have to use "listen-
> better way.
>
If you want to listen on an address which doesn't appear on an interface
(ie 127.0.1.1) then you have to use --listen-address.
The rules for 127.0.0.1 are slightly arcane too: If you use --interface
and --except-interface, then dnsmasq will assume that you want it to
listen on the address of any loopback interfaces it finds as well. In
practice that means 127.0.0.1.
So
dnsmasq --interface=eth0
will listen on the address(es) of eth0 and 127.0.0.1.
If you use --listen-address, then dnsmasq assumes you want more control
and only uses the addresses you actually give
so
dnsmasq --listen-
will _not_ listen on 127.0.0.1
Given this, it makes sense to use 127.0.1.1 (or any address in
127.0.0.0/8 that doesn't appear on lo) for nm-dnsmasq. Because 127.0.1.1
doesn't appear on lo, another dnsmasq instance will not try and listen
on it, and the only thing required to get the two dnsmasq instances to
co-exist is --bind-interfaces.
Cheers,
Simon.
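A toy model of the rules Simon describes, for readers who want to check a configuration on paper. It is not dnsmasq's actual implementation: the function names, the dictionary of interfaces, and the addresses are all illustrative assumptions, and mixing both option styles is deliberately left out.

```python
def interface_mode(system, interfaces, loopbacks=("lo",)):
    """--interface style: the named interfaces' addresses plus loopback addresses."""
    wanted = list(interfaces) + [name for name in loopbacks if name not in interfaces]
    return [addr for name in wanted for addr in system.get(name, [])]

def listen_address_mode(listen_addrs):
    """--listen-address style: only the addresses actually given."""
    return list(listen_addrs)

def can_coexist(first, second):
    """With bind-interfaces on both, instances coexist if no address is shared."""
    return set(first).isdisjoint(second)

# Hypothetical host: lo carries only 127.0.0.1, eth0 carries 10.1.2.3.
system = {"lo": ["127.0.0.1"], "eth0": ["10.1.2.3"]}

standalone = interface_mode(system, ["eth0"])   # 10.1.2.3 and 127.0.0.1
nm_dnsmasq = listen_address_mode(["127.0.1.1"])  # only 127.0.1.1
```

Because 127.0.1.1 does not appear on lo in this model, the two address sets never overlap, which matches the conclusion that --bind-interfaces alone is enough for the two instances to coexist.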
Thomas Hood (jdthood) wrote : Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #60 |
(Executive summary of the following: I think we should fix this by making nm-dnsmasq listen at ::1.)
Thanks for your much-needed help, Simon.
It is good to know that the "except-interface" avenue is available. We want, however, to be able to enjoy the advantages of non-bind-interfaces mode ("unbound mode"??) in standalone dnsmasq insofar as we can. Certainly standalone dnsmasq should continue to run in unbound mode when n-m is not installed or when nm-dnsmasq is not in use; so ideally we would ensure that /etc/NetworkMan
In any case it would be better if we never had to force dnsmasq into bind-interfaces mode.
So instead of switching the nm-dnsmasq listen address from 127.0.0.1 to 127.0.1.1 it seems better to switch that address to ::1: no more difficult, yet in the latter case standalone dnsmasq can continue to run in unbound mode as it has traditionally done (unless forced into bind-interfaces mode by something like libvirt-bin, of course).
Implementing the change to ::1 shouldn't be too hard.
* It's a one-line change to network-manager where it starts dnsmasq and another one-line change where it registers the latter's address with resolvconf.
On a system with n-m and no standalone dnsmasq this will result in /etc/resolv.conf containing "nameserver ::1" and the resolver will connect to nm-dnsmasq. On a system with standalone dnsmasq and no n-m this will be no different from the traditional state of affairs, with /etc/resolv.conf containing "nameserver 127.0.0.1" and the resolver connecting to standalone dnsmasq.
On a system with both n-m and standalone dnsmasq this will *also* result in /etc/resolv.conf containing "nameserver 127.0.0.1" and the resolver connecting to standalone dnsmasq. This is probably unwanted, but is easily fixed by
* changing network-manager so that it registers the ::1 address under the name "nm-dnsmasq" (name open to discussion) instead of under the name "NetworkManager" (which can still be used for registering external nameserver information in the dns!=dnsmasq case);
* changing resolvconf so that it includes the pattern "nm-dns" at the top of /etc/resolvconf
Then on a system with both n-m and dnsmasq, /etc/resolv.conf will contain "nameserver ::1" and the resolver will use nm-dnsmasq.
The remaining challenge then is to see to it that NM sends the address 127.0.0.1 to nm-dnsmasq via /var/run/
Simon Kelley (simon-thekelleys) wrote : Re: [Bug 959037] Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #61 |
On 12/06/12 20:31, Thomas Hood wrote:
> (Executive summary of the following: I think we should fix this by
> making nm-dnsmasq listen at ::1.)
>
> Thanks for your much-needed help, Simon.
>
> It is good to know that the "except-interface" avenue is available. We
> want, however, to be able to enjoy the advantages of non-bind-interfaces
> mode ("unbound mode"??) in standalone dnsmasq insofar as we can.
> Certainly standalone dnsmasq should continue to run in unbound mode when
> n-m is not installed or when nm-dnsmasq is not in use; so ideally we
> would ensure that /etc/NetworkMan
> dns=dnsmasq if and only if /etc/dnsmasq.
> interfaces except-
> this.
>
> In any case it would be better if we never had to force dnsmasq into
> bind-interfaces mode.
>
> So instead of switching the nm-dnsmasq listen address from 127.0.0.1 to
> 127.0.1.1 it seems better to switch that address to ::1: no more
> difficult, yet in the latter case standalone dnsmasq can continue to run
> in unbound mode as it has traditionally done (unless forced into bind-
> interfaces mode by something like libvirt-bin, of course).
I don't think that's true. In unbound mode, the standalone dnsmasq will
bind the IPv6 wildcard address, which will stop the nm-dnsmasq from
binding ::1. There's no escape in IPv6 land. Indeed the situation is
worse, because as far as I know, you can't use any address in the defined
subnet for loopback, it has to be ::1, so except-interface=lo is required.
I think the 127.0.1.1 (or whatever) answer is the best. Unfortunately
there's no way round having to set --bind-interfaces on the standalone
dnsmasq, but except-interface=lo is not required as long as the
127.0.0.0/8 address in use by nm-dnsmasq doesn't appear on the lo interface.
Simon.
Thomas Hood (jdthood) wrote : Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #62 |
OK, so the ::1 idea fails as a quick hack. The alternatives seem to be as follows.
1. Either we accept that nm-dnsmasq is incompatible with every standalone nameserver and enforce this in a better way;
2. or we force every standalone nameserver into bind-interfaces mode and move nm-dnsmasq's listen address to something other than 127.0.0.1;
3. or we make nm-dnsmasq listen on another port number (using the --port option) and enhance glibc to support accessing nameservers at ports other than 53.
Have I forgotten any?
#3 is the most attractive option but requires the most work and won't happen soon. In the short term the choice is between #1 and #2.
Simon Kelley (simon-thekelleys) wrote : Re: [Bug 959037] Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #63 |
On 13/06/12 11:07, Thomas Hood wrote:
> OK, so the ::1 idea fails as a quick hack. The alternatives seem to be
> as follows.
>
> 1. Either we accept that nm-dnsmasq is incompatible with every standalone nameserver and enforce this in a better way;
> 2. or we force every standalone nameserver into bind-interfaces mode and move nm-dnsmasq's listen address to something other than 127.0.0.1;
> 3. or we make nm-dnsmasq listen on another port number (using the --port option) and enhance glibc to support accessing nameservers at ports other than 53.
>
> Have I forgotten any?
>
> #3 is the most attractive option but requires the most work and won't
> happen soon. In the short term the choice is between #1 and #2.
>
For completeness, there's a #4 which is to dump
bind-interfaces
except-interface=lo
into /etc/dnsmasq.d, but that won't work for other nameservers (though
something analogous would, I expect)
If you can make #2 happen without breaking things, that would seem to be
worth doing, I guess the main problem is that you need dnsmasq 2.61 or a
backport of the relevant code to 2.59.
Simon.
Simon Kelley (simon-thekelleys) wrote : | #64 |
On 13/06/12 11:07, Thomas Hood wrote:
> OK, so the ::1 idea fails as a quick hack. The alternatives seem to be
> as follows.
>
> 1. Either we accept that nm-dnsmasq is incompatible with every standalone nameserver and enforce this in a better way;
> 2. or we force every standalone nameserver into bind-interfaces mode and move nm-dnsmasq's listen address to something other than 127.0.0.1;
> 3. or we make nm-dnsmasq listen on another port number (using the --port option) and enhance glibc to support accessing nameservers at ports other than 53.
>
> Have I forgotten any?
>
> #3 is the most attractive option but requires the most work and won't
> happen soon. In the short term the choice is between #1 and #2.
>
Further to #2 and getting dnsmasq support. I found a bug last night that
means that dnsmasq --listen-
an interface, will listen on port 69 of <ip addr> even if tftp is not
enabled. The fix is in git but not a release, but should be backported
if you do #2. It's trivial: one line.
Simon.
Alkis Georgopoulos (alkisg) wrote : Re: NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages | #65 |
In reply to #58, sorry, defining multiple except-interface= directives works fine in my 2.59-4 after all, I think I might have used "except-
Solution #2 sounds good to me too. :)
If I understand correctly, a dnsmasq-base SRU is in order for 12.04 anyway to fix the tftp issue, so why not fix it in quantal and backport 2.62-3 to precise?
Thomas Hood (jdthood) wrote : | #66 |
Simon:
> If you can make #2 happen without breaking things, that would seem to be worth doing
Indeed, primum non nocere. Standalone dnsmasq works fine in the absence of NM+dnsmasq and vice versa and this must continue to be the case when we are done. :)
> I guess the main problem is that you need dnsmasq 2.61
As this issue has low importance I imagine it will only be fixed in quantal?
Simon:
> Further to #2 and getting dnsmasq support. I found a bug last night
> that means that dnsmasq --listen-
> is not on an interface, will listen on port 69 of <ip addr> even if
> tftp is not enabled
I just changed the lines in NetworkManager C code: s/127.0.
Once they have both been started in this way they both work --- standalone dnsmasq forwarding to nm-dnsmasq and the latter forwarding to the upstream nameservers.
The reason they cascade in this order is that dnsmasq registers 127.0.0.1 under the name "lo.dnsmasq" which has a high priority according to /etc/resolvconf
summary: |
- NM-controlled dnsmasq prevents other DNS servers from running, yet network-manager doesn't Conflict with their packages
+ NM-controlled dnsmasq prevents other DNS servers from starting |
Thomas Hood (jdthood) wrote : | #67 |
With the latest dnsmasq code the two dnsmasq instances appear to work correctly in all combinations. I just tested as follows.
* With both dnsmasqs running, nm-dnsmasq forwards to the upstream nameservers and listens on 127.0.0.2; standalone dnsmasq forwards to 127.0.0.2 and listens on 127.0.0.1; the resolver consults 127.0.0.1. (That is, /etc/resolv.conf contains "nameserver 127.0.0.1".)
* Stop standalone dnsmasq (/etc/init.
* Comment out "dns=dnsmasq" and restart network-manager. With neither dnsmasq running the resolver consults the upstream nameservers.
* Start standalone dnsmasq (/etc/init.
In all cases name resolving works fine.
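A check like the one described above can be scripted by sending a minimal RFC 1035 query to each listener in turn. This is a sketch only: the function names, the query ID, and the default query name are assumptions, and the addresses to probe depend on the setup under test.

```python
import socket
import struct

def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query (RFC 1035) for `name`; qtype 1 is an A record."""
    # Header: ID, flags with RD set, QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return header + question

def probe(server, name="localhost", port=53, timeout=2.0):
    """Return True if `server` sends back a response carrying our query ID."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # timeout, refused, unreachable
            return False
    # Same ID as the query and the QR (response) bit set.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)
```

Calling probe("127.0.0.1") and probe("127.0.0.2") after each step would show which of the two instances is answering, independently of what /etc/resolv.conf currently contains.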
Summary of what was required:
* Get the latest dnsmasq from Simon's git repo.
* Patch two lines in n-m: (1) listen on 127.0.0.2 instead of 127.0.0.1 and (2) register 127.0.0.2 instead of 127.0.0.1 with resolvconf.
* Do something to cause standalone dnsmasq to be started with the bind-interfaces option.
The latter "something" could be to include a /etc/dnsmasq.d/ file in the network-manager package, but this is less than ideal because the file will continue to exist even if the admin comments out "dns=dnsmasq" in /etc/NetworkMan
standalone dnsmasq in unbound mode
and
standalone dnsmasq in bind-interfaces mode and nm-dnsmasq
Alkis Georgopoulos (alkisg) wrote : | #68 |
I was reading about bind-interfaces at http://
* Suppose a teacher boots her laptop (==LTSP server) without a network cable plugged in. Dnsmasq starts (will it only listen on lo?), and after a few minutes she decides to plug in the network cable and boot the LTSP clients. Will dnsmasq be listening on eth0 in that case?
* The same can happen if she has configured network manager with a user-connection, not a system-wide one. Dnsmasq will start on boot; eth0 will get an IP only after she logs in. Will she have to restart dnsmasq after login to make it work?
* Also, what if she was previously using a dynamic IP and she decides to set up a static IP? Will she have to restart dnsmasq after changing her IP?
Thomas Hood (jdthood) wrote : | #69 |
@Alkis: IIUC dnsmasq in bind-interfaces mode will not start to listen on any addresses assigned to interfaces after dnsmasq has started. So, yes, she would have to restart standalone dnsmasq if she wants it to listen on those newly assigned addresses.
IIUC the only way to avoid this is to run dnsmasq in non-bind-interfaces mode. (Simon will correct me if I'm wrong.) But that is incompatible with running nm-dnsmasq. So if you want dnsmasq in non-bind-interfaces mode you will have to disable NM's dns=dnsmasq mode.
To escape the dilemma we'd have to enhance dnsmasq and/or the resolver as we have discussed earlier.
Simon Kelley (simon-thekelleys) wrote : Re: [Bug 959037] Re: NM-controlled dnsmasq prevents other DNS servers from starting | #71 |
On 14/06/12 16:06, Thomas Hood wrote:
> @Alkis: IIUC dnsmasq in bind-interfaces mode will not start to listen on
> any addresses assigned to interfaces after dnsmasq has started. So,
> yes, she would have to restart standalone dnsmasq if she wants it to
> listen on those newly assigned addresses.
That's correct. The chief advantage of listening on the wildcard address
is that it all works as interfaces come and go. You lose that with
--bind-interfaces.
It would be possible these days to have dnsmasq detect extra interfaces
and bind to their addresses automatically, but the code to do that isn't
portable.
Simon.
Alkis Georgopoulos (alkisg) wrote : | #70 |
Thanks, so until the #3 idea is implemented (if ever), I'll be disabling the NM-spawned dnsmasq.
But of course the #2 idea is good enough for many cases, thank you all for your work on this.
Thomas Hood (jdthood) wrote : | #72 |
Regarding #3, I've filed a wish in upstream's bugzilla: http://
#2 is easy to implement and does solve the problem of standalone dnsmasq not starting on installation in the presence of NM+dnsmasq. What I am now wondering is how useful the resulting nameserver cascade is.
resolver ---[127.
It offers caching and service on ports other than lo, which nm-dnsmasq alone does not, but would this be better implemented by making nm-dnsmasq more configurable?
Alkis Georgopoulos (alkisg) wrote : | #73 |
What if NM always dropped whatever configuration file it needs in /etc/dnsmasq.
and when NM was started, it would check if /etc/init.d/dnsmasq exists,
* if yes, dnsmasq is installed, so it reads the configuration file and there's no need to do anything,
* if not, dnsmasq-base is installed, so spawn `dnsmasq -C /etc/dnsmasq.
A nice side effect is that if the user did install dnsmasq but disabled it in /etc/default/
And, using a configuration file rather than a hardcoded command line would be appreciated by many users.
Finally, a single dnsmasq instance would be used in all cases, saving resources.
Thomas Hood (jdthood) wrote : | #74 |
Alkis: This relies on the assumption that NM's configuration text can be dropped in alongside whatever other configuration text is present and that dnsmasq will still work properly. This assumption is, er, questionable.
And this is also one answer to my question in #72. The "dnsmasq cascade" may waste resources but it has maintenance advantages. One dnsmasq process is under the control of NM. The other is under the control of the admin. They communicate with each other via a well defined protocol, RFC 1035.
(Another minor problem with your proposal as you phrased it is the following. The existence of /etc/init.d/dnsmasq does not entail that the dnsmasq is installed. The package could have been removed and not purged.)
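One way around the removed-but-not-purged case is to ask dpkg for the package status instead of testing for the init script. The sketch below assumes a Debian-style system with dpkg-query available; the helper names are made up for illustration.

```python
import subprocess

def is_installed_status(status):
    """dpkg's Status field is 'want flag state'; only state 'installed' counts."""
    fields = status.split()
    return len(fields) == 3 and fields[2] == "installed"

def package_installed(package):
    """Return True if dpkg reports `package` as fully installed."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${Status}", package],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:  # no dpkg-query on this system
        return False
    return result.returncode == 0 and is_installed_status(result.stdout)
```

A package that was removed but not purged reports "deinstall ok config-files", so it is correctly treated as absent even though /etc/init.d/dnsmasq is still on disk.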
Alkis Georgopoulos (alkisg) wrote : | #75 |
> This assumption is, er, questionable.
True, but if you don't mind, let's examine that question a bit.
This is the NM-spawned command line:
/usr/sbin/dnsmasq --no-resolv --keep-
I assume that NM can be fixed to successfully do the following:
* Properly detect if dnsmasq is installed or not,
* Use a different command line if it isn't, i.e. the one above,
* And only include the configuration options it *really* requires in /etc/dnsmasq/
So let's see which are the dnsmasq configuration options needed by NM:
--cache-size=0
Not needed; it won't have any effect in chained dnsmasq mode if #2 is implemented either. If the user does want that, he'd have to put it manually in the main dnsmasq configuration in both solutions (#2 and this one).
--bind-interfaces
Not needed, there's only one dnsmasq.
--no-resolv, --keep-
Not needed.
--proxy-dnssec
Do I assume correctly that this is the only configuration option needed to be dropped in /etc/dnsmasq.
Alkis Georgopoulos (alkisg) wrote : | #76 |
> (Another minor problem with your proposal as you phrased it is the following. The existence of /etc/init.d/dnsmasq does not entail that the dnsmasq is installed. The package could have been removed and not purged.)
Correct, but then I wonder what prevents dnsmasq from running even if it's removed (not purged), since its executable is provided by dnsmasq-base, which _is_ installed.
But anyway the same logic that prevents it from running can be reused by NM to detect if dnsmasq will indeed start.
Thomas Hood (jdthood) wrote : | #77 |
> --conf-file not needed
Well, this is used to make nm-dnsmasq read the configuration file that has been dynamically generated by NM. Without this you will have to do something like the following.
ln -s /var/run/
NM kills and starts a new dnsmasq process every time this file changes. Will that be a problem for your LTSP setup where dnsmasq is also the DHCP server?
Alkis Georgopoulos (alkisg) wrote : | #78 |
> NM kills and starts a new dnsmasq process every time this file changes. Will that be a problem for your LTSP setup where dnsmasq is also the DHCP server?
The most time consuming operation that dnsmasq does in our setups is sending the kernel/initrd via TFTP. That takes a few seconds. If the teacher activated a VPN connection at that time and dnsmasq was killed+respawned, the client wouldn't boot. But I think that problem would be too rare, so it sounds acceptable.
I don't quite understand why dnsmasq needs to be restarted though. /var/run/
Alkis Georgopoulos (alkisg) wrote : | #79 |
The "real" dnsmasq command line is:
/usr/sbin/dnsmasq -x /var/run/
I think that NM would just need to update /var/run/
Thomas Hood (jdthood) wrote : | #80 |
$ cat /run/nm-
server=
server=
server=...
The first "server=" line reflects the fact that I am connected to a VPN. This can't be expressed in resolv.conf syntax.
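To make the difference from resolv.conf concrete, here is a minimal sketch of how such a file could be generated. It relies on dnsmasq's "server=/domain/address" form, which routes queries for one domain to one nameserver. The function name, the domain and the addresses are made up for illustration and are not taken from NM's code.

```python
from typing import Dict, List


def dnsmasq_server_lines(split_dns: Dict[str, str], default_servers: List[str]) -> List[str]:
    """Build dnsmasq 'server=' lines: per-domain servers first, then the defaults."""
    lines = []
    for domain, address in split_dns.items():
        # server=/domain/address sends queries for that domain only to that address.
        lines.append(f"server=/{domain}/{address}")
    for address in default_servers:
        # A plain server=address line is used for everything else.
        lines.append(f"server={address}")
    return lines


if __name__ == "__main__":
    # Hypothetical values: one VPN-internal domain and two LAN nameservers.
    vpn_domains = {"corp.example": "10.8.0.1"}
    lan_servers = ["192.168.1.1", "192.168.1.2"]
    for line in dnsmasq_server_lines(vpn_domains, lan_servers):
        print(line)
```

A resolv.conf "nameserver" line has no place for the domain part, which is why the per-domain line cannot be expressed there.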
No doubt dnsmasq could be enhanced to poll its configuration files. But it remains a question whether it's advisable for NM to make use of the standalone dnsmasq for the purposes for which nm-dnsmasq was introduced. Effectively this revisits the discussion that led to the introduction of nm-dnsmasq in the first place. Part of that discussion (which I wasn't party to) can be read here:
https:/
Simon Kelley (simon-thekelleys) wrote : | #81 |
On 15/06/12 10:19, Thomas Hood wrote:
> $ cat /run/nm-
> server=
> server=
> server=...
>
> The first "server=" line reflects the fact that I am connected to a VPN.
> This can't be expressed in resolv.conf syntax.
FYI only,
It's possible to use the dnsmasq DBus interface to set servers/domains
with full generality and without restarting dnsmasq.
Simon.
>
> No doubt dnsmasq could be enhanced to poll its configuration files. But
> it remains a question whether it's advisable for NM to make use of the
> standalone dnsmasq for the purposes for which nm-dnsmasq was introduced.
> Effectively this revisits the discussion that led to the introduction of
> nm-dnsmasq in the first place. Part of that discussion (which I wasn't
> party to) can be read here:
>
> https:/
> resolving
>
Simon Kelley (simon-thekelleys) wrote : | #82 |
On 15/06/12 08:04, Thomas Hood wrote:
> Alkis: This relies on the assumption that NM's configuration text can be
> dropped in alongside whatever other configuration text is present and
> that dnsmasq will still work properly. This assumption is, er,
> questionable.
There was an attempt, some time ago, to provide a way to allow something
like libvirt to add its DHCP configuration to a system dnsmasq
configuration without interfering with the existing config. It's
basically a way to specify an interface and subnet for DHCP in a config
line which overrides other access control, so for instance if the
system dnsmasq config limits it to certain interfaces, then the
interface specified by libvirt would be added to that set.
To my knowledge this facility has never actually been used.
>
> And this is also one answer to my question in #72. The "dnsmasq
> cascade" may waste resources but it has maintenance advantages. One
> dnsmasq process is under the control of NM. The other is under the
> control of the admin. They communicate with each other via a well
> defined protocol, RFC 1035.
This is a good argument, I think.
Simon.
Thomas Hood (jdthood) wrote : | #83 |
"Dnsmasq cascade" (#72) has maintenance advantages. For example it makes it easy for the distromaestros to switch to other software to perform the same limited task as nm-dnsmasq now performs, without any chance of disturbing admins' standalone dnsmasq setups.
Does dnsmasq-cascade have drawbacks compared with "Single dnsmasq" as described by Alkis in #73?
Yes...
* Dnsmasq cascade requires that standalone dnsmasq run in bind-interfaces mode.
-- Solvable by moving nm-dnsmasq to another port: http://
* Dnsmasq cascade requires two processes rather than one.
-- but resource usage is low so this doesn't seem important
But are there other drawbacks?
Thomas Hood (jdthood) wrote : | #84 |
> -- Solvable by moving nm-dnsmasq to another port: http://
BTW, the required enhancement to glibc shouldn't be difficult to implement. I expect that all we'd have to do is change the following code (around line 313 in resolv/res_init.c) so that it could read a port numeral and save that, instead of NAMESERVER_PORT, in sin_port.
if ((fp = fopen(_
[...]
while (fgets_
[...]
if (MATCH(buf, "nameserver") && nservall < MAXNS) {
while (*cp == ' ' || *cp == '\t')
if ((*cp != '\0') && (*cp != '\n') && __inet_aton(cp, &a)) {
}
There's one more snippet after this dealing with the IPv6 case. That should be it. Any obvious problems I'm overlooking?
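The idea can be sketched outside glibc. The following Python is only an illustration of the parsing change, not glibc code. It assumes a hypothetical extended syntax in which an optional port numeral follows the address ("nameserver 127.0.0.1 35353"); no such syntax exists in resolv.conf today.

```python
import ipaddress
from typing import List, Tuple

NAMESERVER_PORT = 53  # the port used for every nameserver today
MAXNS = 3             # glibc's limit on the number of nameserver entries


def parse_nameservers(text: str) -> List[Tuple[str, int]]:
    """Return (address, port) pairs from resolv.conf-style text."""
    servers = []
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) < 2 or fields[0] != "nameserver" or len(servers) >= MAXNS:
            continue
        try:
            address = str(ipaddress.ip_address(fields[1]))
        except ValueError:
            continue  # not a valid IPv4 or IPv6 address, skip the line
        port = NAMESERVER_PORT
        if len(fields) >= 3:
            # Hypothetical extension: an optional port numeral after the address.
            try:
                candidate = int(fields[2])
            except ValueError:
                candidate = 0
            if 1 <= candidate <= 65535:
                port = candidate
        servers.append((address, port))
    return servers
```

Lines without a port keep the default, so existing files would parse exactly as before.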
Simon Kelley (simon-thekelleys) wrote : | #85 |
On 15/06/12 15:01, Thomas Hood wrote:
>> -- Solvable by moving nm-dnsmasq to another port:
> There's one more snippet after this dealing with the IPv6 case. That
> should be it. Any obvious problems I'm overlooking?
>
Applications that don't use the libc resolver? I don't know if such
exist but they might do.
Simon.
Thomas Hood (jdthood) wrote : | #86 |
> Applications that don't use the libc resolver?
Hmm, yes. There are several alternative resolver libraries (adns, firedns, djbdns, ...) and even if we fixed them all so that they could read the extended resolv.conf syntax then statically linked third party binaries would still break.
So having nm-dnsmasq listen on a different port, say 35353, is something that could be done ONLY when another nameserver was listening on 127.0.0.1:53 (and either forwarding to nm-dnsmasq at 127.0.0.1:35353 or not). In the absence of that other nameserver nm-dnsmasq would have to listen on 127.0.0.1:53 itself. That would probably be difficult to implement reliably.
So I guess the first drawback I mentioned in comment #83 can't be so easily eliminated.
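For illustration, the "use another port only when 127.0.0.1:53 is already taken" decision could look like the sketch below. The function names are invented here and nothing like this exists in NM. It probes with a UDP socket only; a real implementation would have to consider TCP as well.

```python
import errno
import socket


def udp_port_is_taken(address: str, port: int) -> bool:
    """Return True if binding a UDP socket to address:port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.bind((address, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # e.g. EACCES when probing port 53 without root privileges
        return False


def choose_listen_port(address: str = "127.0.0.1",
                       standard_port: int = 53,
                       fallback_port: int = 35353) -> int:
    """Pick the fallback port only when something else already owns the standard one."""
    if udp_port_is_taken(address, standard_port):
        return fallback_port
    return standard_port
```

The weakness is visible in the code: the probe socket is closed before the real listener binds, so another nameserver that starts in between (or later) still collides. That gap is one reason this would be hard to make reliable.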
Alkis Georgopoulos (alkisg) wrote : | #87 |
Would it be remotely possible in the future for the problem to be addressed inside libc itself?
Other people not using NM or dnsmasq would still welcome the split VPN resolving, right?
Should we file a wishlist bug request for it?
Thomas Hood (jdthood) wrote : | #88 |
I now agree (see Mathieu's comment #30) that the most expedient thing to do is
* update dnsmasq to a new release based on the latest code in Simon's git repo;
* patch the two lines in the n-m code such that (1) nm-dnsmasq listens on 127.0.0.2 instead of 127.0.0.1 and (2) NM registers 127.0.0.2 instead of 127.0.0.1 with resolvconf;
* add to n-m a /etc/dnsmasq.
and then, longer term
* enhance dnsmasq such that even in bind-interfaces mode it can be made to listen at all addresses on all interfaces. Dnsmasq would have to adapt dynamically to changes in network interfaces, opening and closing sockets in response to the addition and removal of interfaces or addresses. What would be the best way to implement this, Simon?
Thus in the short term we resolve the issue of standalone dnsmasq not starting when installed alongside network-manager. Installed alongside network-manager, dnsmasq will start in bind-interfaces mode and will forward its queries to nm-dnsmasq.
Alkis needs dnsmasq to start in unbound mode so he will still have to edit NetworkManager.conf to comment out "dns=dnsmasq" and edit /etc/dnsmasq.
Thomas Hood (jdthood) wrote : | #89 |
Relevant to my question above:
> What would be the best way to implement this, Simon?
is what Simon wrote in #928524 comment #12:
--- BEGIN QUOTATION ---
I'm wondering about adding a _third_ mode, which has a desirable
mixture of the properties of the current two (--bind-interfaces and NOT
--bind-interfaces). Essentially, dnsmasq would bind the addresses of
individual interfaces rather than the wildcard address, making it less
of a bully for other dnsmasq instances or DNS servers, but it would use
netlink to track the creation of new interfaces or the addition of new
addresses to existing interfaces, and automatically bind them as
required. This mode is inherently Linux-specific, since it needs netlink
to work.
You could either just use it as the default, or as a less problematic
alternative to --bind-interfaces to be dropped into the system dnsmasq
by networkmanager.
--- END QUOTATION ---
Thomas Hood (jdthood) wrote : | #90 |
@Simon: This is pretty much what I had in mind (comment #88) as a long-term solution. How difficult do you think that this would be?
(Moving nm-dnsmasq listening to another port than 53 is at best a veeery long-term solution since it requires first getting glibc enhanced, then getting all other resolver libraries enhanced, then waiting for third-party static binaries to be replaced by new versions built against enhanced libraries. That's a ten-year project.)
If "bind-interface
Simon Kelley (simon-thekelleys) wrote : | #91 |
On 18/06/12 21:08, Thomas Hood wrote:
> @Simon: This is pretty much what I had in mind (comment #88) as a long-
> term solution. How difficult do you think that this would be?
Don't know. I'm working on it now: seems to be behaving:
dnsmasq: new IPv4: 192.168.3.1
dnsmasq: new IPv6: fe80::f0f6:
>
> (Moving nm-dnsmasq listening to another port than 53 is at best a veeery
> long-term solution since it requires first getting glibc enhanced, then
> getting all other resolver libraries enhanced, then waiting for third-
> party static binaries to be replaced by new versions built against
> enhanced libraries. That's a ten-year project.)
>
> If "bind-interface
> shouldn't be the default mode of operation. Indeed, I see no reason why
> it shouldn't be the *only* mode on OSs with support for it.
I see reasons: I've been burned by releasing changes that "won't affect
anything" too many times, I like the idea of making the new behaviour
opt-in.
Simon.
>
Thomas Hood (jdthood) wrote : | #92 |
I can imagine that it will take a lot of care to avoid introducing races inside dnsmasq. Have I mentioned yet that Simon is a hero?
Do we have to worry about races outside of dnsmasq? Suppose someone was running dnsmasq in unbound mode and has now switched to the new improved dnsmasq in bind-interfaces
Thomas Hood (jdthood) wrote : | #93 |
Meanwhile my laptop has been working fine with two dnsmasq instances running in cascade. I'll try to subject this arrangement to more severe tests in the coming weeks.
# netstat -nl46p | grep :53
tcp 0 0 127.0.0.2:53 0.0.0.0:* LISTEN 7928/dnsmasq
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1256/dnsmasq
tcp6 0 0 ::1:53 :::* LISTEN 1256/dnsmasq
udp 0 0 127.0.0.2:53 0.0.0.0:* 7928/dnsmasq
udp 0 0 127.0.0.1:53 0.0.0.0:* 1256/dnsmasq
udp 0 0 0.0.0.0:5353 0.0.0.0:* 1097/avahi-daemon:
udp6 0 0 ::1:53 :::* 1256/dnsmasq
udp6 0 0 :::5353 :::* 1097/avahi-daemon:
# ps -elf|grep dnsmasq|grep -v grep
5 S dnsmasq 1256 1 0 80 0 - 8265 poll_s 10:03 ? 00:00:00 /usr/sbin/dnsmasq -x /var/run/
4 S nobody 7928 1090 0 80 0 - 8265 poll_s 12:13 ? 00:00:00 /usr/sbin/dnsmasq --no-resolv --keep-
# cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.0.1
# cat /var/run/
nameserver 127.0.0.2
# cat /var/run/
server=<LAN nameserver address>
server=<LAN nameserver address>
Simon Kelley (simon-thekelleys) wrote : | #94 |
On 20/06/12 10:56, Thomas Hood wrote:
> I can imagine that it will take a lot of care to avoid introducing races
> inside dnsmasq.
It's OK: notification of new interfaces comes via netlink, so it gets
synchronised via the select() call just like everything else.
> Have I mentioned yet that Simon is a hero?
:-)
New code is in git (and I released 2.63test1): change --bind-interfaces
to --bind-dynamic and see how it goes.
>
> Do we have to worry about races outside of dnsmasq? Suppose someone was
> running dnsmasq in unbound mode and has now switched to the new improved
> dnsmasq in bind-interfaces
> but there is a delay before dnsmasq notices this and starts listening on
> it. Problem?
Because it's using netlink rather than polling, the delay is pretty
short (I know that's not a solution to races, but it does help.)
Simon.
Thomas Hood (jdthood) wrote : | #95 |
Just checked pdnsd. I thought it would be affected since it also starts in "server_ip=any" mode by default; however the Debian package which is also in Universe includes "server_
Changed in pdnsd (Ubuntu): | |
status: | New → Invalid |
Thomas Hood (jdthood) wrote : | #96 |
@Bert: Can you provide more information about the conflict with djbdns?
The dnscache-run package, one of the binary packages built from djbdns source, is marked as Conflicting with resolvconf because it messes directly with /etc/resolv.conf --- see Debian bug report #582755. Its maintainers haven't addressed this problem for several years, so from the Ubuntu perspective we have to regard dnscache-run as a rough package for do-it-yourselfers and not something we need to worry much about here (#959037).
Are there other parts of djbdns we need to look at?
Changed in pdns-recursor (Ubuntu): | |
status: | New → Invalid |
Thomas Hood (jdthood) wrote : | #97 |
Next checked PowerDNS Recursor. The Debian package pdns-recursor is also available in Universe. Its default configuration is to listen only on 127.0.0.1:53 so it will also no longer conflict with nm-dnsmasq if the latter is moved to 127.0.0.2.
/etc/powerdns/
local-
local-port=53
(pdns-recursor is not to be confused with pdnsd.)
Thomas Hood (jdthood) wrote : | #98 |
Bug #928524 is related insofar as it is proposed there (see comment #18) to adopt the solution of forcing standalone dnsmasq into
bind-interfaces
except-
modes by means of /etc/dnsmasq.
Thomas Hood (jdthood) wrote : | #99 |
Simon wrote:
> change --bind-interfaces to --bind-dynamic as see how it goes.
As discussed in bug #928524 and bug #231060 various packages will be including files in /etc/dnsmasq.d/ with "bind-interfaces". I guess these will all later have to be changed to include "bind-dynamic" instead, unless dynamic binding becomes the default behavior in bind-interfaces mode.
Thomas Hood (jdthood) wrote : | #100 |
In bug #928524 Stéphane Graber has written that "even if Network Manager moves to using 127.0.0.2, which I believe is a good idea, it should still ship a dnsmasq.d config file containing 'bind-interface
Changed in dnsmasq (Ubuntu): | |
status: | Confirmed → Invalid |
status: | Invalid → Confirmed |
Changed in pdns-recursor (Ubuntu Precise): | |
status: | New → Invalid |
Changed in pdnsd (Ubuntu Precise): | |
status: | New → Invalid |
Changed in network-manager (Ubuntu Precise): | |
status: | New → Triaged |
Changed in dnsmasq (Ubuntu Precise): | |
status: | New → Confirmed |
Changed in network-manager (Ubuntu Precise): | |
importance: | Undecided → Low |
Thomas Hood (jdthood) wrote : | #101 |
Assuming that the plan in comment #88 will be implemented, the next step is to wait for dnsmasq 2.63 to get into the quantal repo.
Mathieu Trudel-Lapierre (cyphermox) wrote : | #102 |
Well, first we'll ship the file for /etc/dnsmasq.d; changing it to bind-dynamic after the fact is quick.
Launchpad Janitor (janitor) wrote : | #103 |
This bug was fixed in the package network-manager - 0.9.6.0~
---------------
network-manager (0.9.6.
* upstream snapshot 2012-07-16 12:59:59 (GMT)
+ 00297f49fbbe05c
[ Edward Donovan ]
* debian/
(LP: #1013171)
[ Mathieu Trudel-Lapierre ]
* debian/
patch. It adds unnecessary delays to things like detecting that hidden
networks are not in range, and since Jaunty drivers have changed a lot.
If we're still seeing timing issues with the supplicant, then perhaps the
drivers should be fixed instead, or we'll re-enable the patch. (LP: #446623)
* debian/
install a config file to /etc/dnsmasq.d to avoid system-wide instances of
dnsmasq to bind to 0.0.0.0 and the loopback interface, so that the NM-
spawned instance can claim an IP on lo and provide local resolution.
(LP: #959037)
* debian/
ethernet devices. Thanks to Stéphane Graber for the patch.
* debian/
* debian/
+ nm_utils_
* debian/control: move policykit-1 from Recommends to Depends: without it
calls to the backend (e.g. when starting nm-tool), will fail. Thanks to
Stéphane Graber for the testing and solution.
* debian/rules: fix clean to properly remove m4/intltool.m4.
* debian/
initial test to verify that NM works once installed.
* debian/control: add XS-Testsuite: autopkgtest.
-- Mathieu Trudel-Lapierre <email address hidden> Mon, 16 Jul 2012 17:17:51 -0400
Changed in network-manager (Ubuntu): | |
status: | Triaged → Fix Released |
Thomas Hood (jdthood) wrote : | #104 |
Note: the dnsmasq.d file included in the new n-m release includes both "bind-interfaces" and "except-
This is already a big improvement. It allows standalone dnsmasq to run on a system with NM and nm-dnsmasq: standalone dnsmasq listens on interfaces other than lo and forwards queries to nm-dnsmasq at 127.0.0.1.
$ dpkg -l dnsmasq network-
ii dnsmasq 2.62-3 Small caching DNS proxy and DHCP/TFTP server
ii network-manager 0.9.6.0~
$ cat /etc/dnsmasq.
# Tell any system-wide dnsmasq instance to not bind to the loopback interface.
# WARNING: changes to this file will get lost if network-manager is removed.
bind-interfaces
except-interface=lo
$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.0.1
search [redacted]
$ cat /var/run/
nameserver 127.0.0.1
$ cat /var/run/
server=
server=
server=
$ sudo netstat -nl4p |grep :53
tcp 0 0 192.168.1.20:53 0.0.0.0:* LISTEN 7039/dnsmasq
tcp 0 0 192.168.1.21:53 0.0.0.0:* LISTEN 7039/dnsmasq
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 6282/dnsmasq
udp 0 0 192.168.1.20:53 0.0.0.0:* 7039/dnsmasq
udp 0 0 192.168.1.21:53 0.0.0.0:* 7039/dnsmasq
udp 0 0 127.0.0.1:53 0.0.0.0:* 6282/dnsmasq
udp 0 0 0.0.0.0:5353 0.0.0.0:* 1103/avahi-daemon:
Thomas Hood (jdthood) wrote : | #105 |
Changing status to "in progress" in case we still want to implement the idea in comment #88.
Thomas Hood (jdthood) wrote : | #106 |
... would be what I suggest (but can't do myself). :)
Launchpad Janitor (janitor) wrote : | #107 |
Status changed to 'Confirmed' because the bug affects multiple users.
Changed in djbdns (Ubuntu Precise): | |
status: | New → Confirmed |
Changed in djbdns (Ubuntu): | |
status: | New → Confirmed |
Thomas Hood (jdthood) wrote : | #109 |
Is it really still a goal to get these fixes into Precise?
Mathieu Trudel-Lapierre (cyphermox) wrote : | #110 |
Yes, it is. I'll provide a package with a bunch of related changes from Quantal; namely:
- using dbus instead of a config file;
- using a different dbus name than the default for dnsmasq;
- restarting dnsmasq less often (fixed in using dbus, basically)
- avoid refreshing interface config on every route cache entry notification;
etc.
dnsmasq will still need to be updated to ship the dbus file in dnsmasq-base instead of dnsmasq, and the biggest, most time-consuming issue is that the dbus name changing patch needs to be adapted to apply to Precise's dnsmasq.
Changed in network-manager (Ubuntu Precise): | |
assignee: | nobody → Mathieu Trudel-Lapierre (mathieu-tl) |
Mathieu Trudel-Lapierre (cyphermox) wrote : | #111 |
AFAIK this is fixed in Quantal for dnsmasq as well as NetworkManager; barring a minor issue with NM that I'm about to upload a fix for...
Changed in dnsmasq (Ubuntu): | |
status: | Confirmed → Fix Released |
Changed in dnsmasq (Ubuntu Precise): | |
assignee: | nobody → Mathieu Trudel-Lapierre (mathieu-tl) |
importance: | Undecided → High |
status: | Confirmed → Triaged |
Changed in network-manager (Ubuntu Precise): | |
importance: | Low → High |
Robin Battey (zanfur) wrote : | #112 |
I just read this entire chain, and I'm surprised not to see mention of using an NSS plugin, like Avahi (and ldap and NIS and /etc/hosts and DNS itself). I expect it would be simple enough to write a small NSS plugin that merely calls the NM-dnsmasq (running on localhost on a port other than 53) and placing it in front of (or instead of) "dns" on the hosts line in /etc/nsswitch.conf. This would not conflict at *all* with any local DNS servers, and would work for anything that used the libc resolver. It's also vastly cleaner than the "let's change multiple upstream packages" options I see listed above.
For extra points, it's probably past time to make a "dbus" nss plugin, which could be configured to talk to NM, which in turn would ask its personal dnsmasq instance running on any available port, or however it decided to track such things in the future. This would be a clean interface, with all resolving going through libc, with a well-defined API chain (libc --NSS--> dbusplugin --DBUS--> NetworkManager --DNS--> dnsmasq), and allow for NetworkManager to change the last step (DNS protocol to dnsmasq) to whatever in the future without re-architecting anything underneath.
Or have the NSS plugin directly access dnsmasq and have NetworkManager manage its configuration, to follow dnsmasq port changes or what have you. It's not as future-proof, but it still gets the job done without conflicting with any resolvers.
Thomas Hood (jdthood) wrote : | #113 |
Yes, writing an NSS plugin would have been the next resort. It's certainly easier than getting glibc and all other resolver libraries to support ports other than 53. But it's more difficult than the solution that was actually adopted, namely, to make nm-dnsmasq listen at 127.0.1.1.
(BTW, I don't know if it has been mentioned earlier in this thread, but one drawback of the adopted solution (i.e., making nm-dnsmasq listen at another address than 127.0.0.1) is that it breaks name service on machines that have no /etc/resolv.conf. In that case the resolver acts as if "nameserver 127.0.0.1" were specified. Granted, Ubuntu Precise and higher machines should *not* lack /etc/resolv.conf.)
Robin Battey (zanfur) wrote : | #114 |
Another drawback is that you still need to manually configure bind (and others) to only listen on particular addresses. If you're using dhcp this presents a problem, because you don't actually know the address. With bind, this is okay, mostly, because you can say to listen on everything for a particular interface (but then you can't listen on 127.0.0.1, because it's the same interface as 127.0.1.1), but other servers only have per-address configurations. The NSS plugin idea is *exactly* what NSS was designed for, and literally doesn't conflict with any name resolver in any way.
Thomas Hood (jdthood) wrote : | #115 |
Yes, the 127.0.1.1:53 solution works so long as dnsmasq and others are run in bind-interfaces (or equivalent) mode.
NM-dnsmasq currently (12.04) listens at 127.0.0.1:53 which prevents others from listening on either ALL:53 or lo:53, i.e., 127.0.0.1:53. The new (12.10) behavior allows others to listen on 127.0.0.1:53, but still doesn't allow them to listen on ALL:53. Someone correct me if I'm wrong.
> With bind, this is okay, mostly, because you can say to listen
> on everything for a particular interface
Are you sure? I am only aware of named.conf's "listen-on { IP_ADDRESS; }". If there is a feature such as you describe then presumably named binds ALL:53 and then filters according to the addresses on the specified interfaces.
> (but then you can't listen on 127.0.0.1, because it's the same interface as 127.0.1.1)
You don't listen on an interface, you listen on a socket --- an address:port pair. So when nm-dnsmasq binds 127.0.1.1:53, others can still bind lo:53, i.e., 127.0.0.1:53.
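A small experiment shows the address:port point. This is a sketch that assumes a Linux host, where every 127.x.y.z address is local to lo. It uses a kernel-chosen high port instead of 53 so that it runs without root, and UDP sockets without SO_REUSEADDR.

```python
import socket


def try_bind(address: str, port: int):
    """Return a bound UDP socket, or None if the address:port pair is unavailable."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
        return sock
    except OSError:
        sock.close()
        return None


def demo():
    """Bind 127.0.1.1:port, then try 127.0.0.1:port and 0.0.0.0:port."""
    first = try_bind("127.0.1.1", 0)       # port 0 lets the kernel pick a free port
    port = first.getsockname()[1]
    second = try_bind("127.0.0.1", port)   # other address, same port
    wildcard = try_bind("0.0.0.0", port)   # wildcard overlaps both addresses
    result = (second is not None, wildcard is not None)
    for sock in (first, second, wildcard):
        if sock is not None:
            sock.close()
    return result


if __name__ == "__main__":
    print(demo())
```

On Linux this should report (True, False): the second specific address binds fine, while the wildcard bind is refused because it overlaps sockets that already exist. The same applies to port 53.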
A question about the NSS plugin idea. Will this work only for software that uses glibc? What about alternative resolver libraries? They all read resolv.conf, but do they all read nsswitch.conf too? The djbdns description
http://
for one doesn't mention this.
Robin Battey (zanfur) wrote : | #116 |
> Are you sure? I am only aware of named.conf's "listen-on { IP_ADDRESS; }". If there is a feature such as you describe then presumably named binds ALL:53 and then filters according to the addresses on the specified interfaces.
Nope, I just verified, you're quite correct. I hadn't heard of it either, but upon (mis)reading comments above I presumed without verifying. Bad on me.
> A question about the NSS plugin idea. Will this work only for software that uses glibc? What about alternative resolver libraries?
Anything that uses the gethostbyname(3) call uses the NSS chain. That means essentially everything that isn't a resolver itself uses nsswitch.conf. DNS resolver libraries won't use NSS by design, because they are the resolvers themselves that are *used* by NSS. This is why there are no names in their respective configuration files, save for what they're serving (remote addresses are specified by address). If any DNS resolver itself reads nsswitch.conf, it's doing something Very Wrong.
The idea of NSS is that the DNS resolvers aren't *supposed* to use it. They are the exporters of NSS services, not the consumers. I don't know of any of them that use NSS for their own resolution; they are just one link in the NSS chain that is used by the (libc) name resolver libraries. When you hit the DNS service itself, you really *don't* want it to start the NSS chain over, because that would just lead to a loop.
My proposal for using NSS in place of NetworkManager's dnsmasq is to create a new NSS plugin and place it earlier in the NSS chain than the standard DNS resolver. For instance, a line like so:
hosts: files mdns4_minimal [NOTFOUND=return] network_manager [NOTFOUND=return] dns mdns4
This is straight from my Precise install, with the addition of the "network_manager [NOTFOUND=return]" stanza. It says that first you check /etc/hosts (that's "files"), then a subset of avahi ("mdns4_minimal [NOTFOUND=
It would not conflict with any other NSS plugin, because they are all tried in turn until a match is found. If you place it directly in front of the DNS resolver plugin in nsswitch.conf, it will be used before the standard DNS lookup, allowing you to do all the fancy connection-specific magic you need to do, while returning "Try Next" for anything non-connection specific, thus allowing the normal DNS resolver plugin (which reads resolv.conf) to do things as normal. This is *instead* of hooking in at resolv.conf, as you do now. People can install any resolver they want, and it works as designed. This lets you listen on high-numbered ports as well, *and* lets you have per-user dnsmasq instances (per user vpns?), while still running Bind or a normal dnsmasq instance on *:53.
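To show how such a hosts line is read as an ordered chain, here is a minimal Python sketch of a parser for it. It is illustrative only: it handles single "[STATUS=action]" items without spaces inside the brackets, and "network_manager" is the proposed plugin name from this comment, not an existing NSS module.

```python
from typing import Dict, List, Tuple


def parse_hosts_line(line: str) -> List[Tuple[str, Dict[str, str]]]:
    """Split an nsswitch.conf 'hosts:' line into (service, {status: action}) pairs."""
    database, _, rest = line.partition(":")
    if database.strip() != "hosts":
        raise ValueError("not a hosts line")
    chain: List[Tuple[str, Dict[str, str]]] = []
    for token in rest.split():
        if token.startswith("[") and token.endswith("]"):
            if not chain:
                raise ValueError("action item without a preceding service")
            status, _, action = token[1:-1].partition("=")
            # The action applies to the service named just before it.
            chain[-1][1][status.upper()] = action.lower()
        else:
            chain.append((token, {}))
    return chain


if __name__ == "__main__":
    line = ("hosts: files mdns4_minimal [NOTFOUND=return] "
            "network_manager [NOTFOUND=return] dns mdns4")
    for service, actions in parse_hosts_line(line):
        print(service, actions)
```

The services come out in lookup order, with the proposed plugin sitting directly in front of "dns".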
Right now, the dnsmasq for NM basically hijacks resolv.conf, which means it's hooking into the DNS NSS plugin's resolution (it's the plugin that reads resolv.conf, not the applications, using code in libc). This is causing conflicts, because in order to use resolv.conf, you need to be running on port 53 -- and it would take re-writing ...
Svartalf (frank-earlconsult) wrote : | #117 |
This is a bad idea as it's been implemented, guys; there are tons of local installations that use internal DNS (My CenturyLink router or my day-job's setup, for example...) that this flatly breaks out of the box. You've got to do a bunch of manual interventions for MANY corporate desktop and home desktop situations. It doesn't honor lookups against the local, specified by DHCP, DNS servers; it goes out to the DNS roots and goes from there. Works FINE for JUST surfing the 'net. It's an EPIC FAIL for normal, typical DNS use right now because there's no honoring any internal only DNS entries with it as it is out of the box.
It's nice that you're trying to make it easier for VPN, etc. but in the corporate desktop story, you're using OpenVPN, PPTP, or something like Sonicwall's solution. This means it's going to re-direct DNS on you ANYHOW, defeating the nice thing you're attempting here. If you think you're changing their minds, think again.
As it stands, I'm going off to cripple this less than well thought out design decision so that things MIGHT work better on my setups. I suggest thinking through *ALL* prospective use-cases of things before implementing something like this in the future- it really, really ticks people off when it doesn't work like it's supposed to.
Thomas Hood (jdthood) wrote : | #118 |
@Svartalf: Can you please describe in more technical detail what fails to work on the machines in question, and share with us what you know about the causes of these malfunctions? Once we have some idea what you're talking about we can help you further.
You wrote:
> there's tons of local installations that use internal DNS
What do you mean by "internal DNS"?
> It doesn't honor lookups against the local, specified by DHCP, DNS servers [...]
Ubuntu 12.04 *does* use DNS nameserver addresses provided by DHCP. Can you please explain what you are talking about here?
> OpenVPN, PPTP, or something like Sonicwall's solution [is] going to re-direct DNS on you ANYHOW
> If you think you're changing their minds, think again.
Ubuntu software works properly in Ubuntu 12.04 (except where it doesn't --- see the BTS). Third party software may fail to work properly, but it's up to the third party to fix that.
Third parties who think they can dictate how free host operating systems work can go fly a kite. Just my personal view.
John Hupp (john.hupp) wrote : | #119 |
I don't know how my case enters this discussion, but it is certainly connected to the current default installation wherein network-manager starts an instance of dnsmasq to act as a DHCP, DNS and TFTP server.
I was troubleshooting an LTSP-PNP client boot problem under Lubuntu Quantal. I installed with a single NIC per https:/
The problem is that the LTSP client, after successfully getting DHCP assignments, fails to download the pxelinux boot image. It reports "PXE-E32: TFTP open timeout."
To be more specific on the DHCP assignments, it identifies my hardware router as the DHCP server and the default gateway. It identifies the LTSP server as proxy and boot server.
I can also run this on the server itself to get a similar failure:
$ cd /tmp
$ tftp 192.168.1.102 -v -m binary -c get /ltsp/i386/
mode set to octet
Connected to 192.168.1.102 (192.168.1.102), port 69
getting from 192.168.
Transfer timed out.
A CRITICAL NOTE: This is using the default network-manager configuration of the network interface (using the default DHCP configuration, and the connection is "Available to all users").
If I merely configure the network interface (again for DHCP) via /etc/network/
But it introduces a new problem on both server and client: DNS resolution fails.
I can fix the DNS resolution problem by creating /etc/resolvconf
nameserver (my nameserver 1)
nameserver (my nameserver 2)
But trying to identify and perhaps work around the problem with network-manager and dnsmasq, I undid the changes to /etc/network/
It turns out that if I merely
$ sudo service dnsmasq restart
then the LTSP client will boot normally.
Hunting for some diagnostic information, I ran this command before and after restarting dnsmasq:
$ sudo netstat -nap | grep dnsmasq
Relevant output before restarting:
udp 0 0 127.0.0.1:69 0.0.0.0:* 887/dnsmasq
After restarting:
udp 0 0 127.0.0.1:69 0.0.0.0:* 1967/dnsmasq
udp 0 0 192.168.1.102:69 0.0.0.0:* 1967/dnsmasq
(where 192.168.1.102 is the server IP)
So dnsmasq is not binding to my server IP during boot.
If I remove /etc/dnsmasq.
Thomas Hood (jdthood) wrote : | #120 |
> the current default installation wherein network-manager starts
> an instance of dnsmasq to act as a DHCP, DNS and TFTP server.
NetworkManager starts an instance of dnsmasq to act only as a non-caching DNS nameserver forwarder. This instance listens only on the loopback address 127.0.1.1.
If your client is DHCPing with a dnsmasq instance on an Ubuntu server then that dnsmasq instance is most probably a "standalone" instance, configured by means of files included in the "dnsmasq" package (not to be confused with the "dnsmasq-base" package which contains little more than the dnsmasq binary and which both the dnsmasq package and the network-manager package depend on) and started by an initscript, not by NetworkManager.
In reading further into your text my understanding is hampered by the fact that I am not entirely sure which machine you are referring to at different points in your text.
> The problem is that the LTSP client, after successfully getting
> DHCP assignments, fails to download the pxelinux boot image.
> It reports "PXE-E32: TFTP open timeout."
> To be more specific on the DHCP assignments, it identifies
> my hardware router as the DHCP server and the default gateway.
> It identifies the LTSP server as proxy and boot server.
Is your LTSP server running Ubuntu and standalone dnsmasq? Then shouldn't the client use your LTSP server as the DHCP server?
> So dnsmasq is not binding to my server IP during boot.
> If I remove /etc/dnsmasq.
> (which issues the sole dnsmasq directive to bind all the
> interfaces instead of listening on 0.0.0.0) and restart the
> server it allows the client to boot normally.
I think I know what is happening. The network-manager package causes (by means of the /etc/dnsmasq.
If you remove /etc/dnsmasq.
In the future we hope that standalone dnsmasq running in bind-interfaces mode will be enhanced such that it listens on interfaces that are brought up after it (dnsmasq) starts. The author of dnsmasq, Simon Kelley, has already implemented this enhancement experimentally. Once that work is done it will be possible to run dnsmasq in bind-interfaces mode without causing the problem that you ran into.
John Hupp (john.hupp) wrote : | #121 |
RE Thomas Hood's #120: That is very interesting, though I admit it is near the outer limits of my current understanding.
To address the only questions above:
>> The problem is that the LTSP client, after successfully getting
>> DHCP assignments, fails to download the pxelinux boot image.
>> It reports "PXE-E32: TFTP open timeout."
>> To be more specific on the DHCP assignments, it identifies
>> my hardware router as the DHCP server and the default gateway.
>> It identifies the LTSP server as proxy and boot server.
> Is your LTSP server running Ubuntu and standalone dnsmasq? Then shouldn't the client use your LTSP server as the DHCP server?
The LTSP server is running Lubuntu with the default network configuration, whatever that is. I understand you to be saying that this would be a standalone instance of dnsmasq started by an initscript, prepared to handle DHCP and TFTP. And apart from that, network-manager starts another instance of dnsmasq to handle DNS.
Regarding whether the client should use the LTSP server as the DHCP server: I imagine that it is prepared to handle DHCP, and probably does in a standard LTSP setup with 2 NICs and the client connected to the second NIC, but in this LTSP-PNP setup with a single NIC, the client is connected to the router, and the LTSP server defers to the router handling DHCP.
------------------
Your explanation is very interesting because it explains why my blindly-applied work-around is effective. (And kudos to Simon Kelley who is working to make it possible for everything to work as configured right out of the box.)
But I don't understand what you said about standalone dnsmasq conflicting with network-manager's instance of dnsmasq when /etc/dnsmasq.
Apart from not understanding how the conflict arises, I wonder: Should this conflict be manifesting itself somehow? Everything seems to be working right now.
And would disabling network-manager's DNS-handling instance of dnsmasq then result in the need to set up an alternative DNS handler?
I'm willing to apply another solution blindly, as I did in removing /etc/dnsmasq.
Thomas Hood (jdthood) wrote : | #122 |
> the LTSP server defers to the router handling DHCP.
OK, I get it.
> I don't understand what you said about standalone dnsmasq
> conflicting with network-manager's instance of dnsmasq
> when /etc/dnsmasq.
When /etc/dnsmasq.
Remove that file and standalone dnsmasq starts in a mode where it tries to listen at all addresses. But it can't do this if NM-dnsmasq is already listening at some address.
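The conflict described here is ordinary socket behaviour and can be reproduced without dnsmasq. The sketch below is an illustration only: it uses 127.0.0.1 and a kernel-chosen UDP port instead of 127.0.1.1:53 so that it runs without root, and the function name is invented for this example.

```python
import errno
import socket


def wildcard_bind_conflicts(specific_addr="127.0.0.1"):
    """Bind one UDP socket to a specific address, then try to bind a
    second socket to the wildcard address on the same port.

    Returns True if the wildcard bind fails with EADDRINUSE, which is
    what a listen-everywhere server runs into when another process
    already holds the port on one address.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind((specific_addr, 0))  # port 0: let the kernel pick a free port
        port = first.getsockname()[1]
        try:
            second.bind(("0.0.0.0", port))  # wildcard bind on the same port
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()
```

In bind-interfaces mode a server binds each interface address separately instead of the wildcard, so it never attempts the second bind shown here.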
> Should this conflict be manifesting itself somehow?
> Everything seems to be working right now.
Well, I am not sure which workaround, if any, you are currently relying on.
If you commented out "dns=dnsmasq" in /etc/NetworkMan
> And would disabling network-manager's DNS-handling
> instance of dnsmasq then result in the need to set up
> an alternative DNS handler?
No. If NM-dnsmasq is enabled then resolv.conf contains "nameserver 127.0.1.1" so that applications using the resolver library access NM-dnsmasq; NM-dnsmasq forwards queries to the upstream nameserver at the address A.A.A.A which was obtained via DHCP or otherwise. If NM-dnsmasq is disabled then resolv.conf simply contains "nameserver A.A.A.A".
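To check which of the two cases applies on a given machine, one can look at the nameserver lines of resolv.conf. A minimal sketch, assuming the file content has already been read into a string; the function names are made up, and 192.0.2.53 is a documentation address standing in for A.A.A.A.

```python
def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def uses_nm_dnsmasq(resolv_conf_text, nm_address="127.0.1.1"):
    """True if the resolver is pointed at NM-dnsmasq's listen address."""
    return nm_address in nameservers(resolv_conf_text)


with_nm = "nameserver 127.0.1.1\n"
without_nm = "# upstream server obtained via DHCP\nnameserver 192.0.2.53\n"
```

`uses_nm_dnsmasq(with_nm)` is true and `uses_nm_dnsmasq(without_nm)` is false; in both cases applications keep resolving names, only the first hop differs.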
> I'm willing to apply another solution blindly, as I did
> in removing /etc/dnsmasq.
> but it would be nice to understand more about it.
If you are running Ubuntu 12.04 then the best solution for now is to
* comment out the "bind-interfaces" line in /etc/dnsmasq.
* comment out the "dns=dnsmasq" line in /etc/NetworkMan
If you are running Ubuntu 12.10 and have dnsmasq version 2.63-1ubuntu1 then you can, instead,
* replace the "bind-interfaces" line in /etc/dnsmasq.
The "bind-dynamic" mode is the new mode that I referred to above and which Simon referred to earlier in comment #94. Please test it! If it works well then it should become the default, as mentioned above in comments #99 and #102.
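The 12.10 change amounts to swapping one directive for another. The sketch below works on the file's text rather than on a path, because the exact file name is cut off in this thread; the function name is hypothetical and only lines consisting of exactly "bind-interfaces" are touched.

```python
def use_bind_dynamic(config_text):
    """Replace a bare 'bind-interfaces' directive with 'bind-dynamic'.

    Comments and all other directives are passed through unchanged.
    """
    out = []
    for line in config_text.splitlines():
        if line.strip() == "bind-interfaces":
            out.append("bind-dynamic")
        else:
            out.append(line)
    trailing_newline = "\n" if config_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline
```

After writing the result back, standalone dnsmasq has to be restarted for the new mode to take effect.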
John Hupp (john.hupp) wrote : | #123 |
Thanks for the explanation of how removal of /etc/dnsmasq.
>> Should this conflict be manifesting itself somehow?
>> Everything seems to be working right now.
>Well, I am not sure which workaround, if any, you are currently relying on.
>If you commented out "dns=dnsmasq" in /etc/NetworkMan
My workaround was simply to remove /etc/dnsmasq.
I did not comment out "dns=dnsmasq" in /etc/NetworkMan
Thanks also for the explanation of how disabling NM-dnsmasq does not break DNS.
Since I have dnsmasq v2.63, I tried the experimental solution: I restored /etc/dnsmasq.
Thank you!
Thomas Hood (jdthood) wrote : | #124 |
Question: Why did everything work on your machine when standalone dnsmasq wasn't in bind-interfaces mode but /etc/NM/NM.conf contained "dns=dnsmasq"?
Hypothesis: Standalone dnsmasq started first; network-manager second. NM tried to start NM-dnsmasq but this failed because of the address conflict and NM fell back to non-dnsmasq mode, which works fine. If this hypothesis is correct then there may be lines in the syslog that look like this:
[date] [hostname] NetworkManager[
[date] [hostname] dnsmasq[pid]: failed to create listening socket for 127.0.1.1: Address already in use
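To test the hypothesis, the syslog can be searched for the dnsmasq message quoted above. This is a sketch only: the regular expression is built from that one quoted line, and the timestamp, hostname and pid in the sample are invented.

```python
import re

# Built from the message text quoted above; the prefix before "dnsmasq[" is ignored.
LISTEN_FAILURE = re.compile(
    r"dnsmasq\[(?P<pid>\d+)\]: failed to create listening socket "
    r"for (?P<address>\S+): Address already in use"
)


def listen_failures(syslog_text):
    """Return (pid, address) pairs for every failed dnsmasq listen attempt."""
    return [
        (int(m.group("pid")), m.group("address"))
        for m in LISTEN_FAILURE.finditer(syslog_text)
    ]


# Invented sample line in the format shown above.
sample = (
    "Oct  1 12:00:00 server dnsmasq[1234]: failed to create listening "
    "socket for 127.0.1.1: Address already in use\n"
)
```

A non-empty result for 127.0.1.1 would support the idea that NM-dnsmasq lost the race and NM fell back to non-dnsmasq mode.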
John Hupp (john.hupp) wrote : | #125 |
I thought I was done with this kind of issue, but I may be back for more.
It turns out that the only LTSP client that boots normally is the one that I was doing all of the above troubleshooting on. Others that I have tried in my little 2-PC setup all stop at a blank/black screen after successfully getting to the Lubuntu splash screen.
I have now set up forwarding of the client syslog messages to the server, and the log always ends with a string of ntpd items, the last of which is "ntpd[1314]: Listening on routing socket on fd #24 for interface updates"
I found this other Ubuntu Precise bug (#999725) https:/
Bug #999725 seems to involve some of the same issues as the ones dealt with here.
Comments? Troubleshooting? Workarounds?
Thomas Hood (jdthood) wrote : | #126 |
That the last syslog entries are made by ntpd doesn't necessarily mean that the machine is hanging because of ntpd. It could be hanging at the next step, for example.
Bug #999725 reports that ntp doesn't work properly when it is started before NIS, which is not to be confused with DNS. Probably not related.
Unfortunately I don't have any idea why the second client hangs whereas the first one doesn't.
John Hupp (john.hupp) wrote : | #127 |
Agreed. And I had hoped that I could eliminate ntpd as the source of the problem by using a simple switch in the LTSP configuration to turn it off for the client. Unfortunately that does not seem to be effective in disabling ntpd. Troubleshooting that elsewhere .....
Thomas Hood (jdthood) wrote : | #128 |
Belated reply to Robin Battey's #116.
My question in #115 was about alternative resolver libraries, not about DNS resolver libraries. There are libraries that play the same role as the whole glibc resolver. Generally these alternative resolver libraries include DNS resolvers and read /etc/resolv.conf for compatibility with the glibc resolver but I'd like to know whether or not, or to what extent, they also obey /etc/nsswitch.conf.
I believe I understand your basic idea well enough. Instead of using resolv.conf to direct name queries to nm-dnsmasq, use a new NSS module. This new NSS module, foo, would be like the existing dns "module" except that it would only talk to nm-dnsmasq, or would allow other ports than 53 to be specified so that nm-dnsmasq could be talked to over another port than 53. The new module would be named on the "hosts:" line in /etc/nsswitch.conf instead of "dns". (I don't see the point of listing both foo and dns, since foo *is* DNS.)
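To make the proposal concrete, here is a sketch of how the "hosts:" line would change. It only manipulates text; "foo" is the placeholder module name used above, the sample line is an assumption about a typical desktop configuration, and the helper names are invented.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the source names on the 'hosts:' line, without action items."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Drop bracketed action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def swap_source(sources, old, new):
    """Return a copy of the source list with one module replaced by another."""
    return [new if name == old else name for name in sources]


sample = "hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4\n"
proposed = swap_source(hosts_sources(sample), "dns", "foo")
```

With the sample line, `proposed` lists foo in the position where dns used to be, so the lookup order is otherwise unchanged.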
But how much less work would this be than adapting the glibc code so that ports other than 53 can be specified, e.g., via a new config file with enhanced semantics that (if present) overrides resolv.conf? And how much less is the risk of breaking software that uses alternative resolver libraries?
Robin Battey (zanfur) wrote : | #129 |
You've got the basic idea. The nsswitch.conf file is where Name Service services are configured, and "hosts" is one of them. DNS is *one* way to look up hosts, but so is "files" (/etc/hosts) and "mdns4" (avahi). Anything that extends how names are translated to addresses should, imnho, be done through NSS. This is because *everything* supports NSS. For instance, NIS and NIS+ hosts are done through NSS, and this is supported by essentially everything, because it's the standard. All of the "enterprise" directory services I've come across use an NSS plugin (usually the "nis" one). It's just simply the right way to do it.
I wouldn't worry about resolver libraries that don't use glibc. They're typically DNS-specific, and are typically configured by their own files anyway. Dig, for instance, will use whatever server you tell it to, and ignore resolv.conf (though it uses it as a default). Same goes for the "host" tool -- they're used for querying specific DNS servers. However, those resolvers *also* ignore /etc/hosts, because that's referenced by the "files" NSS plugin. Any service that uses gethostbyname(3) is using glibc, and that's going to be everything except edge cases that are intentionally doing their own thing anyway. Things that try to emulate glibc behavior by first checking /etc/hosts and then /etc/resolv.conf are simply doing it wrong, and will miss (for instance) avahi, NIS, and any other directory service that may be installed.
I'm surprised at the idea that it will be less work to modify glibc. Even if it's technically easier to make a change in the glibc code than to create your own NSS plugin, you have a myriad of problems: NM functionality would now have a dependency on a nonstandard patch of glibc, the documentation for /etc/resolv.conf will be inconsistent for only this distribution, there could (will) be resistance by the glibc folks, who knows what you'll break when you alter how glibc behaves, etc.
However, with an NSS module, we have a huge number of advantages:
* It's the standard way of achieving this type of thing and is hence supported by most everything
* It's the standard way of achieving this type of thing so it's very well documented
* It's the standard way of achieving this type of thing so it's very modularized and isolated, and if NM stops working it will continue along the chain without screwing up plugins further up (unlike when dnsmasq dies with the proposed glibc change)
* It's the standard way of achieving this type of thing so the things that don't support it are, in general, doing it wrong and that's a bug on their end
* It's the standard way of achieving this type of thing so there's already a package (libnss-mdns) that adds a hosts NSS module, meaning both that we know it works and that it is "officially supported by ubuntu"
* It could be owned by the NM project instead of creating a dependency on a glibc patch that would not be taken up by distributions very quickly
* You could make it do other interesting things like have static /etc/hosts-like entries per connection.
You get the idea. If you want to see an example of an NSS hosts plugin packaged for ubuntu, that ...
todaioan (alan-ar06) wrote : | #130 |
<email address hidden>
Thomas Hood (jdthood) wrote : | #131 |
You may be right that developing a new "nm-dns" module would be easier than trying to enhance the existing dns module to support nonstandard ports.
But the more immediately relevant comparison is the comparison between the current solution and any solution involving a new or an enhanced NSS module. The current solution is to run nm-dnsmasq at 127.0.1.1:53. This solution has already been rolled out and seems to be working well. (To my own surprise I haven't seen any complaints related to the switch from 127.0.0.1 to 127.0.1.1, even though I have been following AskUbuntu and ubuntuforums.) Any alternative has to offer significant benefits if it's going to be considered for adoption, considering the amount of work and the risk involved. What benefits would the nm-dns module or the enhanced dns module give us relative to what we have now? One is: the ability to run nm-dnsmasq on another port, freeing up port 53 for BIND named listening on ALL:53. What else? Would the NSS-module approach make it easier to implement per-user caches, for example? (I see that Solaris provides per-user instances of nscd for this purpose.)
Robin, please submit a version of your comment #129 as a new bug report against network-manager, requesting that the connection to nm-dnsmasq be implemented by means of a new NSS module. Give your arguments in favor. Then we can continue the discussion in an open bug report rather than in this fix-released one.
Alkis Georgopoulos (alkisg) wrote : | #132 |
> To my own surprise I haven't seen any complaints related to the switch from 127.0.0.1 to 127.0.1.1, even though I have been following AskUbuntu and ubuntuforums.
It's possible that a large portion of the Ubuntu users who run dnsmasq as a DNS server only use LTS releases, so complaints might only show up after 2 years.
E.g. in 300 schools here we settled on disabling the nm-spawned dnsmasq from NetworkManager.
Btw please don't backport the current solution to Precise, the "bind-interfaces" part will break all those existing setups.
The nss-based solution does sound like it wouldn't cause any problems at all, though.
Thomas Hood (jdthood) wrote : | #133 |
> Btw please don't backport the current solution to Precise
In comment #110 MTL said that backporting the fix to Precise *is* planned.
Quantal includes dnsmasq 2.63 which has the new "bind-dynamic" option. In bind-dynamic mode dnsmasq works as it does in bind-interfaces mode but also updates its list of listen addresses whenever network interfaces are configured and deconfigured. It appears to work well. In bind-dynamic mode, as in bind-interfaces mode, standalone dnsmasq is compatible with NM-dnsmasq listening at 127.0.1.1. I would suggest therefore that if the switch from 127.0.0.1 to 127.0.1.1 for NM-dnsmasq is backported to Precise then dnsmasq 2.63 should simultaneously be backported to Precise and dnsmasq should be forced into bind-dynamic mode rather than into bind-interfaces mode.
Thomas Hood (jdthood) wrote : | #134 |
I wrote in comment #131:
> What benefits would the nm-dns module or the enhanced
> dns module give us relative to what we have now? One is:
> the ability to run nm-dnsmasq on another port, freeing up
> port 53 for BIND named listening on ALL:53. What else?
I just installed bind9 and was surprised to see that in its default configuration named behaves just like dnsmasq in bind-dynamic mode. That is, it listens on port 53 at all addresses assigned to interfaces. When interfaces are created or configured, named starts listening on those as well. With this behavior, it shouldn't often (ever?) be necessary to configure named to listen on the wildcard address.
Is there any nameserver out there that does still conflict with nm-dnsmasq listening at 127.0.1.1:53?
Thomas Hood (jdthood) wrote : | #135 |
The O'Reilly book _DNS and BIND_ says:
[QUOTE]
10.4.3.2 Interface interval
We've said already that BIND, by default, listens on all of a host's network interfaces. BIND 8 is actually smart enough to notice when a network interface on the host it's running on comes up or goes down. To do this, it periodically scans the host's network interfaces. This happens once each interface interval, which is 60 minutes by default. If you know the host your name server runs on has no dynamic network interfaces, you can disable scanning for new interfaces by setting the interface interval to zero to avoid unnecessary hourly overhead:
options {
    interface-interval 0;
};
On the other hand, if your host brings up or tears down network interfaces more often than every hour, you may want to reduce the interval.
[/QUOTE]
But when I tried it, named noticed right away that I had brought up an interface. Will investigate further.
Robin Battey (zanfur) wrote : | #136 |
In response to #131 and #134 by Thomas:
I would argue that "will it conflict with anything that exists?" is the wrong question here. Certainly it will conflict in the future, and removing the user's ability to run a DNS service on the wildcard address is suboptimal at best, even if they don't *need* to. To directly answer the question about something that conflicts: the internal resolver of the samba4 packages. They're beta right now, but the scheduled release date is December, and there's no parameter (yet) for altering the port or interfaces. This is actually the one that bit me originally.
To answer "what does it give us?", currently NM invokes a single dnsmasq instance that must be shared between all users. This isn't ideal, because NM connections can be per-user, and this could lead to information disclosure at worst and oddly-rearranged DNS resolve orders at best. With an NSS module, you could spin up one dnsmasq instance for the system on a possibly privileged port (but not 53) and one per user (above 1024), and link them together as forwarders so that only the user owning the connection will use the resolution they've specified in the GUI. It would require some tracking of which user's instance is on which port, auto-invoking them when necessary, and shutting them down when the user logs out, but would allow for much more flexible and clean separation of user settings.
For the record, I am happy to write the NSS plugin myself, but it would require some changes in NM core itself, so I would have to work with someone on the NM team to implement it. If you're interested, and know who that would be, please do let me know.
I will also create a new bug report as requested.
Thomas Hood (jdthood) wrote : | #137 |
> something that conflicts: the internal resolver of the samba4 packages
Please file another report against samba4 describing the conflict with nm-dnsmasq.
Robin Battey (zanfur) wrote : | #138 |
I would if I considered it a bug. (I didn't fully describe the current state of samba4, because I figured it was irrelevant: You can alter the interfaces it binds to, but not for *only* the dns resolver -- so currently, if you want samba4 listening on the wildcard address you'll need the dns resolver listening there too.) It would be a nice feature, sure. But nm-dnsmasq is the one breaking away from standards in ways that will break other packages, so I'm reporting the conflict here.
Btw, named immediately notices because of the /etc/network/
Thomas Hood (jdthood) wrote : | #139 |
If "libnss-nm-dns" would make it easier to introduce per-user caching and/or if it improved security then those would be important benefits.
Currently nm-dnsmasq has caching disabled because of concerns about cache poisoning and information leakage.
https:/
If there have already been discussions of per-user caching in Ubuntu then someone please give me the link.
The only approach that I have seen so far is per-user nscd in Solaris and (I now see) FreeBSD.
http://
http://
Thomas Hood (jdthood) wrote : | #140 |
> Btw, named immediately notices because of the
> /etc/network/
> "rndc reconfig" when an interface goes up or down.
Ah, yes. There is also a hook at /etc/ppp/
But named also notices immediately when I bring up an interface with NetworkManager. Any idea what the mechanism is there?
When I bring down an interface with NetworkManager, named does *not* notice this right away.
Thomas Hood (jdthood) wrote : | #141 |
Whoa. When an interface is brought up with NM the scripts in /etc/network/
Thomas Hood (jdthood) wrote : | #142 |
Aha. /etc/NetworkMan
Alkis Georgopoulos (alkisg) wrote : | #143 |
I'm still having problems with this on 14.04.
After the default installation, I installed dnsmasq and DNS stopped working until system restart.
Now it's only working for a few seconds after each network-manager restart!
If I comment out
#dns=dnsmasq
in NetworkManager.
For the 500+ schools that we're supporting here, we'll just continue commenting out #dns=dnsmasq because it doesn't cooperate with the regular dnsmasq installation,
but if you want me to provide more info to troubleshoot this issue, I'd be glad to.
I'm attaching the output of nm-tool. My effective dnsmasq.conf is:
$ egrep -rv '^#|^$' /etc/dnsmasq.*
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
/etc/dnsmasq.
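For reference, the filter in the egrep command above drops lines that start with "#" and lines that are completely empty. A rough Python equivalent for a single file's text is sketched below; the sample directives are illustrative only (the real output is cut off in this thread), and unlike `egrep -r` the sketch does not prefix each line with its file name.

```python
def effective_lines(config_text):
    """Mimic egrep -v '^#|^$': keep lines that neither start with '#' nor are empty.

    Lines containing only whitespace are kept, as egrep would keep them.
    """
    return [
        line
        for line in config_text.splitlines()
        if line != "" and not line.startswith("#")
    ]


# Illustrative sample, not the configuration from this comment.
sample = "# listen settings\n\nbind-dynamic\nport=53\n"
```

`effective_lines(sample)` returns only the two directive lines.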
Alkis Georgopoulos (alkisg) wrote : | #144 |
The fix for this issue caused another regression: dnsmasq now doesn't function correctly as a TFTP server either.
I just tried Trusty (dnsmasq 2.68-1), and network manager ships /etc/dnsmasq.
bind-interfaces
So now dnsmasq only binds 127.0.0.1 for its tftp service:
udp 0 0 127.0.0.1:69 0.0.0.0:* 954/dnsmasq
udp6 0 0 ::1:69 :::* 954/dnsmasq
...and of course that breaks everything. Removing that file makes tftp work again.
Mathieu, could you please package the modifications to /etc/NetworkMan
...so that people that want to use dnsmasq as a real server can just blacklist it without suffering on each new Ubuntu installation?
E.g. for the 500+ schools we maintain here, we could then just Conflict: network-
Thanks,
Alkis
Changed in network-manager (Ubuntu): | |
status: | Fix Released → Confirmed |
Alkis Georgopoulos (alkisg) wrote : | #145 |
Or better yet, ltsp-server-
Thomas Hood (jdthood) wrote : | #146 |
> I just tried Trusty (dnsmasq 2.68-1), and network manager ships /etc/dnsmasq.
>
> bind-interfaces
>
> So now dnsmasq only binds 127.0.0.1 for its tftp service:
>
> udp 0 0 127.0.0.1:69 0.0.0.0:* 954/dnsmasq
> udp6 0 0 ::1:69 :::* 954/dnsmasq
>
> ...and of course that breaks everything. Removing that file makes tftp work again.
Alkis, does it work properly if you change "bind-interfaces" to "bind-dynamic"?
Alkis Georgopoulos (alkisg) wrote : | #147 |
Thomas, yup, TFTP appears to be working fine with bind-dynamic.
I'll test if re-enabling "dns=dnsmasq" in /etc/NetworkMan
Thanks!
John Hupp (john.hupp) wrote : | #148 |
Through Raring and Saucy, my two modifications to the given LTSP-PNP setup have been:
In /etc/dnsmasq.
Edit /etc/dnsmasq.
And those two mods still work for me in Saucy, but I'm running into what seems to be an NBD-related kernel bug, which I'm trying to commit bisect on the upstream kernel. Clients fail to boot, generating "Error: socket failed: connection refused."
It's off-topic, but does this problem not appear in Trusty?
Mathieu Trudel-Lapierre (cyphermox) wrote : | #149 |
Now that we can use bind-dynamic, I have nothing against setting that value instead of bind-interfaces, if it indeed solves the latest issues that were reported.
However, I'd really appreciate it if separate bugs could be opened rather than reopening this bug; it would make each individual issue easier to see and fix.
Alkis Georgopoulos (alkisg) wrote : | #150 |
Mathieu, I reopened this bug because it was never resolved... not just for the TFTP issue.
Please see my #143 comment.
If you want more feedback tell me what to send, but DNS never worked properly for me when dnsmasq and nm-dnsmasq are both running.
Warwick Bruce Chapman (warwickchapman) wrote : | #151 |
What is the status of this as of 16.04?
Alkis Georgopoulos (alkisg) wrote : | #152 |
The network-manager package still ships /etc/dnsmasq.
with "bind-interfaces" in it
and that breaks the TFTP server of dnsmasq
and sometimes even the DNS server of dnsmasq.
"bind-dynamic" is a little better, but too unreliable to be used in production.
So this bug is still not resolved; after 150 messages it was just made a little worse.
One workaround is to undo the "solution" offered in this bug report:
1) In /etc/NetworkMan
2) And in /etc/dnsmasq.
A better solution would be for Mathieu to create a separate package for the nm-spawned dnsmasq, one that would conflict with the real dnsmasq server so that it would be automatically uninstalled when the sysadmin would install the real dnsmasq.
I can send a patch for that if it will be accepted.
Steve Langasek (vorlon) wrote : | #153 |
The Precise Pangolin has reached end of life, so this bug will not be fixed for that release.
Changed in dnsmasq (Ubuntu Precise): | |
status: | Triaged → Won't Fix |
Changed in network-manager (Ubuntu Precise): | |
status: | Triaged → Won't Fix |
Changed in djbdns (Ubuntu Precise): | |
status: | Confirmed → Won't Fix |
Well, that's already partly done. dnsmasq will fail to start when bind is running, as it should, based on whether port 53 is already in use.
As another option, you may also wish to switch dns=dnsmasq to dns=bind to use bind directly as a resolver. There are other reasons to have dnsmasq and/or bind installed, so even checking for existence isn't the right way to cover this.