Felix crashes when kube-proxy is in IPVS mode

Bug #2020695 reported by George Kraft
Affects: Calico Charm
Status: Fix Released
Importance: High
Assigned to: Mateo Florido
Milestone: 1.28

Bug Description

The problem appears when kube-proxy is configured to use IPVS proxy mode with the following (soon-to-be-released) config:

juju config kubernetes-control-plane proxy-extra-config='{mode: ipvs, ipvs: {strictARP: true}}'
juju config kubernetes-worker proxy-extra-config='{mode: ipvs, ipvs: {strictARP: true}}'

Both journalctl and /var/log/calico/felix/current show Calico's Felix service crashing repeatedly:

2023-05-24 17:46:58.615 [ERROR][4004168] felix/ipsets.go 574: Bad return code from 'ipset list'. error=exit status 1 family="inet6" stderr="ipset v7.1: Kernel and userspace incompatible: settype hash:ip,port with revision 6 not supported by userspace.\n"
2023-05-24 17:46:58.615 [WARNING][4004168] felix/ipsets.go 322: Failed to resync with dataplane error=exit status 1 family="inet6"
2023-05-24 17:46:58.687 [INFO][4004168] felix/route_table.go 266: Calculated interface name regexp ifaceRegex="^ens192$" ipVersion=0x4 tableIndex=0
2023-05-24 17:46:58.688 [INFO][4004168] felix/vxlan_mgr.go 470: VXLAN tunnel device configured
2023-05-24 17:46:59.125 [ERROR][4004168] felix/ipsets.go 974: Failed to read IP sets error=exit status 1 family="inet"
2023-05-24 17:46:59.125 [PANIC][4004168] felix/ipsets.go 355: Failed to update IP sets after multiple retries. family="inet"
panic: (*logrus.Entry) 0xc000d3aaa0
goroutine 200 [running]:
github.com/sirupsen/logrus.Entry.log({0xc00007e1e0, 0xc000324600, {0x0, 0x0, 0x0}, 0x0, {0x0, 0x0}, 0x0}, 0x0, ...)
        /<email address hidden>/entry.go:128 +0x56c
github.com/sirupsen/logrus.(*Entry).Panic(0xc0009f00a0, {0xc000bb3b80, 0x1, 0x1})
        /<email address hidden>/entry.go:173 +0xfb
github.com/projectcalico/calico/felix/ipsets.(*IPSets).ApplyUpdates(0xc000317e00)
        /go/src/github.com/projectcalico/calico/felix/ipsets/ipsets.go:355 +0x737
github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply.func1({0x32438d0, 0xc000317e00})
        /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:1809 +0x3d
created by github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply
        /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:1808 +0xd88
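
To check whether a node is hitting the same failure, the Felix log can be scanned for the ipset error shown above. The following is a minimal sketch, not part of Calico or the charm: the function name and the regular expression are illustrative and assume the message format stays exactly as quoted in this report.

```python
import re

# Pattern for the ipset error quoted in the log above. The captured fields
# (family, ipset version, set type, revision) come from that one message;
# other Felix versions may word it differently (assumption).
INCOMPATIBLE_RE = re.compile(
    r'family="(?P<family>[^"]+)"\s+stderr="ipset v(?P<ipset>[\d.]+): '
    r'Kernel and userspace incompatible: settype (?P<settype>\S+) '
    r'with revision (?P<revision>\d+) not supported by userspace'
)


def find_ipset_incompatibilities(lines):
    """Return one dict per log line that reports the ipset mismatch."""
    hits = []
    for line in lines:
        match = INCOMPATIBLE_RE.search(line)
        if match:
            hits.append(match.groupdict())
    return hits


# Two lines copied from the log in this report; only the first should match.
sample = [
    r'''2023-05-24 17:46:58.615 [ERROR][4004168] felix/ipsets.go 574: Bad return code from 'ipset list'. error=exit status 1 family="inet6" stderr="ipset v7.1: Kernel and userspace incompatible: settype hash:ip,port with revision 6 not supported by userspace.\n"''',
    '2023-05-24 17:46:58.688 [INFO][4004168] felix/vxlan_mgr.go 470: VXLAN tunnel device configured',
]

for hit in find_ipset_incompatibilities(sample):
    print(hit)
```

On a real node the same function could be fed the lines of /var/log/calico/felix/current instead of the sample list.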

This is a known issue in Calico:
https://github.com/projectcalico/calico/issues/5011
https://github.com/projectcalico/calico/issues/5717

The issue is fixed for amd64 in Calico v3.22.1 and for arm64 in Calico v3.23.2 and v3.22.4. We can fix this in the Calico charm by updating Calico.
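
The fix versions above can be turned into a quick check for a given Calico version and architecture. This is a hedged sketch only: the helper names are hypothetical, and it assumes that a fix carries forward to later patch releases of the same minor line and to every newer minor line, which this report does not state explicitly.

```python
# Fix releases as stated in this report, keyed by architecture.
FIXED_IN = {
    "amd64": [(3, 22, 1)],
    "arm64": [(3, 22, 4), (3, 23, 2)],
}


def parse_version(text):
    """Turn 'v3.22.1' or '3.22.1' into (3, 22, 1).

    Pre-release suffixes such as '-rc1' are not handled in this sketch.
    """
    return tuple(int(part) for part in text.lstrip("v").split("."))


def has_ipset_fix(version, arch):
    """Return True if the version is expected to contain the fix.

    Assumption: within a listed minor line, any patch at or above the fix
    release is good; minor lines newer than the newest listed one are
    treated as fixed, and older ones as not fixed.
    """
    v = parse_version(version)
    fixes = FIXED_IN[arch]
    for fix in fixes:
        if v[:2] == fix[:2]:
            return v >= fix
    return v[:2] > max(fix[:2] for fix in fixes)


for version, arch in [("v3.22.0", "amd64"), ("v3.22.1", "amd64"),
                      ("v3.23.1", "arm64"), ("v3.23.2", "arm64")]:
    print(version, arch, has_ipset_fix(version, arch))
```

Keeping the fix releases in a small table per architecture makes it easy to extend if further backports are confirmed upstream.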

George Kraft (cynerva)
Changed in charm-calico:
importance: Undecided → High
status: New → Triaged
milestone: none → 1.28
Mateo Florido (mateoflorido) wrote :
Changed in charm-calico:
assignee: nobody → Mateo Florido (mateoflorido)
status: Triaged → In Progress
George Kraft (cynerva)
Changed in charm-calico:
status: In Progress → Fix Committed
Adam Dyess (addyess)
Changed in charm-calico:
status: Fix Committed → Fix Released