sysctl settings for net.nf_conntrack_max are not applied at boot time

Bug #1922778 reported by Drew Freiberger
This bug affects 2 people
Affects                               Status     Importance  Assigned to  Milestone
OpenStack Neutron Open vSwitch Charm  Confirmed  Undecided   Unassigned
OpenStack Nova Compute Charm          Confirmed  Undecided   Unassigned
charm-ovn-chassis                     Confirmed  Undecided   Unassigned

Bug Description

In a Queens deployment on Bionic 18.04.3, we are receiving alerts from the nrpe charm's conntrack checks stating that the conntrack table is almost full on our hypervisor nodes running charm-nova-compute and charm-neutron-openvswitch.

Both of these charms have a default configuration that sets nf_conntrack_max to 1000000 or 2000000, as needed for hosting connection-heavy virtual workloads.

It appears that this is caused by https://bugs.launchpad.net/ubuntu/+source/procps/+bug/50093, in which some sysctl settings are ignored because procps loads the sysctl configuration before all kernel modules have been loaded.

We should solve this issue by addressing the upstream bug, by adding the proper modules for typical hypervisor tunables into the initramfs (which comments on lp#50093 note worked for nfsd tunables), or by having the services load their /etc/sysctl.d/ modules during system init.

Tags: sts
Revision history for this message
Drew Freiberger (afreiberger) wrote :

Some proof of observation of the issue:

$ cat /proc/sys/net/nf_conntrack_max
262144
$ cd /etc/sysctl.d
$ grep conntrack *
50-nova-compute.conf:net.nf_conntrack_max=1000000
50-nova-compute.conf:net.netfilter.nf_conntrack_buckets=204800
50-nova-compute.conf:net.netfilter.nf_conntrack_max=1000000
50-openvswitch.conf:net.nf_conntrack_max=2000000
50-openvswitch.conf:net.netfilter.nf_conntrack_buckets=204800
50-openvswitch.conf:net.netfilter.nf_conntrack_max=2000000
$ ls -l 50-nova-compute.conf 50-openvswitch.conf
-rw-r--r-- 1 root root 346 Dec 15 21:02 50-nova-compute.conf
-rw-r--r-- 1 root root 346 Sep 20 2020 50-openvswitch.conf
$ date
Tue Apr 6 17:33:42 UTC 2021
$ uptime
 17:33:43 up 24 days, 4:00, 3 users, load average: 4.83, 7.68, 8.78

It appears this has been fixed for charm-neutron-gateway per https://bugs.launchpad.net/charm-neutron-gateway/+bug/1885192, and the same fix should be applied to these two charms.
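The mismatch shown in the listing above (1000000 or 2000000 configured, 262144 live) can be detected programmatically. The following is a minimal sketch, not part of any charm: the function names are hypothetical, and the parser is simplified (it does not handle a leading "-" on a key or interface names that contain dots).

```python
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple


def parse_sysctl_conf(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines; a later entry overrides an earlier one."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and lines starting with '#' or ';' are comments.
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def proc_path(key: str, root: str = "/proc/sys") -> Path:
    """Map 'net.nf_conntrack_max' to '/proc/sys/net/nf_conntrack_max'."""
    return Path(root, *key.split("."))


def read_live_from_proc(key: str) -> Optional[str]:
    """Return the live value, or None if the entry does not exist
    (for example because the owning kernel module is not loaded)."""
    try:
        return proc_path(key).read_text().strip()
    except OSError:
        return None


def find_drift(
    settings: Dict[str, str],
    read_live: Callable[[str], Optional[str]] = read_live_from_proc,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {key: (configured, live)} for every key whose live value differs."""
    drift = {}
    for key, wanted in settings.items():
        live = read_live(key)
        if live != wanted:
            drift[key] = (wanted, live)
    return drift
```

On a hypervisor, one might call find_drift(parse_sysctl_conf(Path("/etc/sysctl.d/50-nova-compute.conf").read_text())) and alert on a non-empty result. A live value of None points at a missing module rather than an overridden setting.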

Revision history for this message
Drew Freiberger (afreiberger) wrote :

Subscribing field-high as this is affecting live clouds on Bionic running Neutron DVR without gateways, and also affects hypervisors.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

I tested this in isolation on a Bionic VM and here is what I see:

1) nf_conntrack_max is applied upon reboot if there is an nf_conntrack entry in /etc/modules and an entry for the setting in one of the files under /etc/sysctl.d/;
2) systemd-sysctl service is the one applying those settings. It is made to run after the systemd-modules-load unit (`After=systemd-modules-load.service`).

https://github.com/systemd/systemd/blob/v237/units/systemd-sysctl.service.in#L15 (upstream)
https://github.com/systemd/systemd/commit/0b73eab7a2185ae0377650e3fdb8208347a8a575 (original commit)
https://git.launchpad.net/ubuntu/+source/systemd/tree/units/systemd-sysctl.service.in?h=ubuntu/bionic-updates#n15 (bionic-updates)

3) Both systemd-modules-load and systemd-sysctl run as a part of the sysinit.target - so very early in the boot process.

https://www.freedesktop.org/software/systemd/man/bootup.html#System%20Manager%20Bootup
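The ordering described in point 2 can be checked against the text of a unit file, such as the output of `systemctl cat systemd-sysctl.service`. This is a rough sketch with hypothetical helper names. It reads only plain `Name=value` lines and ignores systemd features such as drop-in overrides and list resets via empty assignments.

```python
from typing import List


def unit_directive_values(unit_text: str, section: str, name: str) -> List[str]:
    """Collect the space-separated values of one directive in one section."""
    values: List[str] = []
    current = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # entering a new section such as [Unit]
            continue
        if current != section or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            values.extend(value.split())
    return values


def runs_after(unit_text: str, other_unit: str) -> bool:
    """True if the unit declares After=<other_unit> in its [Unit] section."""
    return other_unit in unit_directive_values(unit_text, "Unit", "After")
```

A true result from runs_after(text, "systemd-modules-load.service") only confirms the ordering. The sysctl entry still has to exist at that point, which requires the module to be listed for loading.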

Could you provide more information about the status of `systemd-modules-load` and `systemd-sysctl` units: i.e. when they ran and whether the systemd-sysctl failed? Maybe something else is overriding those settings instead?

➜ ~ lxc launch ubuntu:bionic ct-bionic --vm
# enable LXD agent ... https://discuss.linuxcontainers.org/t/running-virtual-machines-with-lxd-4-0/7519

➜ ~ lxc exec ct-bionic bash

root@ct-bionic:~# modprobe nf_conntrack

root@ct-bionic:~# sysctl net.nf_conntrack_max
net.nf_conntrack_max = 32768

root@ct-bionic:~# echo nf_conntrack >> /etc/modules
root@ct-bionic:~# echo 'net.nf_conntrack_max = 42424242' > /etc/sysctl.d/10-conntrack.conf
root@ct-bionic:~# sysctl -p /etc/sysctl.d/10-conntrack.conf
net.nf_conntrack_max = 42424242

root@ct-bionic:~# reboot
# exec again

root@ct-bionic:~# lsmod | grep conntrack
nf_conntrack 135168 0

root@ct-bionic:~# sysctl net.nf_conntrack_max
net.nf_conntrack_max = 42424242

root@ct-bionic:~# sudo systemctl list-dependencies
default.target
● ├─accounts-daemon.service
● ├─apport.service
● ├─display-manager.service
● ├─grub-common.service
● ├─systemd-update-utmp-runlevel.service
● ├─ureadahead.service
● └─multi-user.target
# ...
● ├─basic.target
# ...
● │ ├─sysinit.target
# ...
● │ │ ├─systemd-machine-id-commit.service
● │ │ ├─systemd-modules-load.service
● │ │ ├─systemd-random-seed.service
● │ │ ├─systemd-sysctl.service

root@ct-bionic:~# systemctl cat systemd-sysctl.service
[Unit]
Description=Apply Kernel Variables
Documentation=man:systemd-sysctl.service(8) man:sysctl.d(5)
DefaultDependencies=no
Conflicts=shutdown.target
After=systemd-modules-load.service # <----- this
Before=sysinit.target shutdown.target
ConditionPathIsReadWrite=/proc/sys/net/

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/lib/systemd/systemd-sysctl
TimeoutSec=90s

root@ct-bionic:~# systemctl status systemd-sysctl
● systemd-sysctl.service - Apply Kernel Variables
   Loaded: loaded (/lib/systemd/system/systemd-sysctl.service; static; vendor preset: enabled)
   Active: active (exited) since Mon 2021-05-24 09:53:27 UTC; 28min ago
     Docs: man:systemd-sysctl.service(8)
           man:sysctl.d(5)
  Process: 482 ExecStart=/lib/systemd/systemd-sysctl (code=exited, status=0/SUCCESS)
 Main PID: 482 (co...


Changed in charm-neutron-openvswitch:
status: New → Incomplete
Changed in charm-nova-compute:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack nova-compute charm because there has been no activity for 60 days.]

Changed in charm-nova-compute:
status: Incomplete → Expired
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack neutron-openvswitch charm because there has been no activity for 60 days.]

Changed in charm-neutron-openvswitch:
status: Incomplete → Expired
Revision history for this message
Trent Lloyd (lathiat) wrote :

Setting this bug back to Confirmed. This issue still exists on new deployments, e.g. focal-ussuri.

The sysctl is now applied by systemd-sysctl, but the same issue remains: the nf_conntrack module is not loaded automatically, so the setting is not applied. The following errors are logged on startup:

sysctl[1464]: Couldn't write '1000000' to 'net/nf_conntrack_max', ignoring: No such file or directory
sysctl[1464]: Couldn't write '204800' to 'net/netfilter/nf_conntrack_buckets', ignoring: No such file or directory
sysctl[1464]: Couldn't write '1000000' to 'net/netfilter/nf_conntrack_max', ignoring: No such file or directory

The solution is to add nf_conntrack to /etc/modules, similar to Bug #1885192 for charm-neutron-gateway.
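Adding the module to /etc/modules could be done idempotently, along the lines of the sketch below. It is an illustration only, not the change made for Bug #1885192: the function name is hypothetical, and the file path is passed in so the helper can be tried on a scratch file first.

```python
from pathlib import Path


def ensure_module_listed(modules_file: Path, module: str) -> bool:
    """Append `module` to the file unless it is already listed.

    Returns True if the file was modified, False if nothing had to change.
    """
    try:
        text = modules_file.read_text()
    except FileNotFoundError:
        text = ""
    listed = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # The first token is the module name; anything after it is ignored.
        listed.add(line.split()[0])
    if module in listed:
        return False
    if text and not text.endswith("\n"):
        text += "\n"  # avoid gluing the new entry onto the last line
    modules_file.write_text(text + module + "\n")
    return True
```

For example, ensure_module_listed(Path("/etc/modules"), "nf_conntrack") run as root. The entry only takes effect when systemd-modules-load next runs, so a modprobe and a re-run of sysctl would still be needed on a live system.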

The nf_conntrack_max sysctl is currently a default sysctl on the nova-compute charm - though arguably it's linked closer to neutron-openvswitch and is also likely required by the ovn-chassis charm and possibly some other charms.

$ cat proc/sys/net/netfilter/nf_conntrack_max
262144

$ grep nf_conntrack_max etc/sysctl.d -Ri
etc/sysctl.d/50-nova-compute.conf:net.nf_conntrack_max=1000000
etc/sysctl.d/50-nova-compute.conf:net.netfilter.nf_conntrack_max=1000000

Changed in charm-neutron-openvswitch:
status: Expired → Confirmed
Changed in charm-nova-compute:
status: Expired → Confirmed
tags: added: sts
Trent Lloyd (lathiat)
Changed in charm-ovn-chassis:
status: New → Confirmed
Revision history for this message
Marcin Wilk (wilkmarcin) wrote :

Adding a bit more detail about this bug on focal-ussuri.

The following log shows that on Bionic the system was able to load the nf_conntrack_ipv4 and nf_conntrack_ipv6 modules, whereas on Focal the module name is nf_conntrack. It shows the sequence of events before and after the series upgrade to Focal.

grep -e "-- Reboot --" -e nf_conntrack -e 'kernel: Linux version' journalctl_--no-pager
Jun 10 12:17:34 ubuntu kernel: Linux version 4.15.0-184-generic (buildd@lcy02-amd64-006) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #194-Ubuntu SMP Thu Jun 2 18:54:48 UTC 2022 (Ubuntu 4.15.0-184.194-generic 4.15.18)
Jun 10 12:27:09 testsystem kernel: nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
-- Reboot --
Jun 18 20:04:41 testsystem kernel: Linux version 4.15.0-187-generic (buildd@lcy02-amd64-046) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #198-Ubuntu SMP Tue Jun 14 03:23:51 UTC 2022 (Ubuntu 4.15.0-187.198-generic 4.15.18)
Jun 18 20:04:41 testsystem kernel: nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
Jun 18 20:04:41 testsystem systemd-modules-load[1318]: Inserted module 'nf_conntrack_ipv4' <------ successfully loaded on Bionic (kernel 4.15.0-187-generic)
Jun 18 20:04:41 testsystem systemd-modules-load[1318]: Inserted module 'nf_conntrack_ipv6'
-- Reboot --
Jun 18 20:26:47 testsystem kernel: Linux version 5.4.0-120-generic (buildd@lcy02-amd64-006) (gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)) #136-Ubuntu SMP Fri Jun 10 13:40:48 UTC 2022 (Ubuntu 5.4.0-120.136-generic 5.4.189)
Jun 18 20:26:47 testsystem systemd-modules-load[1286]: Failed to find module 'nf_conntrack_ipv4' <-------- failed to load on Focal (kernel 5.4.0-120-generic)
Jun 18 20:26:47 testsystem systemd-modules-load[1286]: Failed to find module 'nf_conntrack_ipv6'
Jun 18 20:26:47 testsystem systemd-sysctl[1299]: Couldn't write '1000000' to 'net/nf_conntrack_max', ignoring: No such file or directory
Jun 18 20:26:47 testsystem systemd-sysctl[1299]: Couldn't write '204800' to 'net/netfilter/nf_conntrack_buckets', ignoring: No such file or directory
Jun 18 20:26:47 testsystem systemd-sysctl[1299]: Couldn't write '1000000' to 'net/netfilter/nf_conntrack_max', ignoring: No such file or directory
Jun 20 11:40:26 testsystem sudo[2800143]: root : TTY=pts/11 ; PWD=/var/log ; USER=root ; COMMAND=/sbin/sysctl net.netfilter.nf_conntrack_max
Jun 20 11:40:26 testsystem sudo[2800145]: root : TTY=pts/11 ; PWD=/var/log ; USER=root ; COMMAND=/sbin/sysctl net.netfilter.nf_conntrack_max
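The "Couldn't write" lines in a journal dump like the one above can be extracted with a short script to list which settings were skipped at boot. This is a sketch with hypothetical names. It assumes the message format shown in this report, which may differ between systemd versions.

```python
import re
from typing import Iterable, List, NamedTuple


class FailedWrite(NamedTuple):
    key: str     # dotted sysctl key, e.g. net.nf_conntrack_max
    value: str   # value that could not be written
    reason: str  # error text reported by systemd-sysctl


_PATTERN = re.compile(
    r"Couldn't write '(?P<value>[^']*)' to '(?P<path>[^']*)', "
    r"ignoring: (?P<reason>.+)$"
)


def failed_sysctl_writes(lines: Iterable[str]) -> List[FailedWrite]:
    """Return one FailedWrite per matching journal line, in input order."""
    failures = []
    for line in lines:
        match = _PATTERN.search(line.rstrip())
        if match:
            key = match.group("path").replace("/", ".")
            failures.append(
                FailedWrite(key, match.group("value"), match.group("reason"))
            )
    return failures
```

A "No such file or directory" reason for a net.netfilter.* key suggests that the owning module was not loaded when systemd-sysctl ran.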

On Focal the nf_conntrack module was successfully loaded afterwards (after systemd-sysctl.service had run):
grep -Ei '^nf_conntrack ' lsmod
nf_conntrack 139264 10 xt_conntrack,nf_nat,nfnetlink_cttimeout,xt_nat,openvswitch,nf_conntrack_netlink,xt_connmark,xt_CT,nf_conncount,xt_REDIRECT

/etc/modules contains obsolete module names, hence those failed to load:
grep nf_conntrac etc/modules
nf_conntrack_ipv4
nf_conntrack_ipv6

In newer versions nf_conntrack_ipv4 and nf_conntrack_ipv6 were merged into nf_conntrack.
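One way to clean up such a file after a series upgrade is to map the obsolete names to nf_conntrack and drop the resulting duplicates. The sketch below is hypothetical and is not what the charms do. It assumes one bare module name per line and leaves comments and blank lines untouched.

```python
# Obsolete names taken from the /etc/modules listing in this report.
OBSOLETE_MODULES = {
    "nf_conntrack_ipv4": "nf_conntrack",
    "nf_conntrack_ipv6": "nf_conntrack",
}


def normalize_modules(text: str) -> str:
    """Rewrite module-list text, replacing obsolete names and de-duplicating."""
    out = []
    seen = set()
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            out.append(raw)  # keep comments and blank lines as they are
            continue
        name = OBSOLETE_MODULES.get(stripped, stripped)
        if name in seen:
            continue  # both old names map to one module; keep a single entry
        seen.add(name)
        out.append(name)
    return "\n".join(out) + ("\n" if out else "")
```

Applied to the file shown above, the two obsolete entries collapse into a single nf_conntrack line.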

This is related to bug [1]

[1] https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1851764

Revision history for this message
Felipe Reyes (freyes) wrote :

This line, https://opendev.org/openstack/charm-neutron-openvswitch/src/branch/master/hooks/neutron_ovs_utils.py#L295, loads the nf_conntrack* modules only when the firewall driver is openvswitch. Is this needed for iptables_hybrid as well?
