Control node crashes while adding a default gateway to a quantum router.
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Cisco Openstack | New | Undecided | Unassigned | | |
Bug Description
I installed the g.3 release using one control node and three compute nodes on UCS B200M3 blades.
I'm using two quantum routers. The control node crashed after setting a gateway on the second router.
quantum router-gateway-set router-pcrf 10.87.252.130
Setting the gateway on the first router had no issues. Both routers are using the same gateway. The crash happens every time I try to set the gateway on the second router. I rebooted the control node several times but it always crashed one minute after the login prompt appears. I have to reinstall OpenStack on the blade to recover from the crash.
root@m2c1q5-ctrl:~# dpkg -l | grep openvswitch
ii openvswitch-common 1.4.0-1ubuntu1.5 Open vSwitch common components
ii openvswitch-
ii openvswitch-switch 1.4.0-1ubuntu1.5 Open vSwitch switch implementations
ii quantum-
ii quantum-
root@m2c1q5-ctrl:~#
root@m2c1q5-ctrl:~# uname -a
Linux m2c1q5-ctrl 3.2.0-57-generic #87-Ubuntu SMP Tue Nov 12 21:35:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
root@m2c1q5-ctrl:~#
I installed two other g.3 test beds about four weeks ago and did not see this issue.
root@m2c1q5-ctrl:~# tail -f /var/log/syslog
Dec 9 17:09:23 m2c1q5-ctrl puppet-agent[1379]: (/Stage[
Dec 9 17:09:23 m2c1q5-ctrl puppet-agent[1379]: Finished catalog run in 139.84 seconds
Dec 9 17:17:01 m2c1q5-ctrl CRON[19722]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Dec 9 17:17:40 m2c1q5-ctrl ovs-vsctl: 00001|vsctl|
Dec 9 17:17:42 m2c1q5-ctrl kernel: <6[31.741 p_als C 0020 efle oeTa
Dec 9 17:17:42 m2c1q5-ctrl ovs-vsctl: 00001|vsctl|
Dec 9 17:17:43 m2c1q5-ctrl kernel: [ 3520.116842] general protection fault: 0000 [#1] SMP
Dec 9 17:17:43 m2c1q5-ctrl kernel: [ 3520.117055] CPU 3
Dec 9 17:17:43 m2c1q5-ctrl kernel: [ 3520.117132] Modules linked in: ip6table_filter ip6_tables ipt_REDIRECT veth openvswitch(O) xt_tcpudp iptable_filter iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables vesafb 8021q joydev usbhid hid wmi aps_dcspea_oeip
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.145 <> 50182]Pd 88,cm:osvwth ane:G O3205-eei 8-bnuCssIcUS-
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.193 D:00000000 S:00000004 D:c1400002
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.213 B:ff81e85c 0:ff81dbd0 0:ff81d336
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.245 1:00000000 182900)G:ff88c00(00 nG:00000000<> 50114]C: C2 000007e0C3 0007db00C4 00000460<> 50115]D0 00000000D1 00000000raif ff872b00 akff81ec40)<> 500145] 00000002ff81d339 ff872b00000vots
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.839 [ffffa176> oeeueatos0f/x9 oevwth
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.976 [ffff815b> _mlo+x5/x9
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.9ct_
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[.171 [ffff818f> nokpg+xa04
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.152 [ffff81c6> _ofut049050<> 5321] <ffff13e6]
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.289 [ffff8] <ffff1499]
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.466 [ffff81eb> ppl+xb030<> 5038ff1389]
Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.985 -[edtaea634fa6b --<0>[ 3520.405226] Kernel panic - not syncing: Fatal exception in interrupt
Dec 9 17:17:43 m2c1q5-ctrl kernel: [ 3520.410584] Pid: 28983, comm: ovs-vswitchd Tainted: G D O 3.2.0-57-generic #87-Ubuntu
Dec 9 17:17:43 m2c1q5-ctrl kernel: [ 3520.610456] [<ffffffff811be
Dec 9 17:17:43 m2c1q5-ctrl kernel: [ 3520.615579] [<ffffffff81532
root@m2c1q5-ctrl:~# cat /proc/kmsg
<4>[ 2986.500041] init: nova-novncproxy main process (1951) killed by TERM signal
48]ii:nv-bettr anpoes(205)triae ihsau
<4>[ 2988.212269] init: nova-consoleauth main process (2167) terminated with status 1
<4>[ 2989.115443] init: nova-scheduler main process (2390) terminated with status 1
<4>[ 2989.512955] init: nova-conductor main process (2499) terminated with status 1
<4>[ 2990.372186] init: keystone main process (2583) killed by TERM signal
<6>[ 3517.123716] device qr-5aacfb0a-d1 entered promiscuous mode
58998]i6tbe:
4[32.155 i:293 om v-sicdTitd ..-7gnrc#7Uut ico System n CBB0-3US-20M
4[32.103 I:0010:
<4>[ 3520.119413] RSP: 0018:ffff8817e2
<4>[ 3520.119620] RAX: ffff8817deb4d624 RBX: ffff88f5720RX 00000000<> 50190]RX 00000000RI 00000000RI ef5d0051<> 50108]RP ff872b88R8 ff87e460R9 ff87fb60<> 50106]R0 00000008R1: 0000000000000000 R12: ffff8817df3b3660
<4>[ 3520.120746] R13: ffff8f5272800 R14: ffff8817deba3400 R15: 0000000000000000
<4>[ 3520.121028] FS: 00007f52aa2ec700(00 Sff811f60000)
4[32.238 S 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 3520.121573] R:0000043c R:0001d760 R:0000000e
4[32.284 R:00000000 R:00000000 DR2: 0000000000000000
<4>[ 3520.122134] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[ 3520.122416] Process ovs-vswitchd (pid: 28983, thednoff81e840,ts ff871350
0[32.122764] Stack:
<4>[ 3520.128787] 0000000000000001 000000000000001c 0000000000000004 ffff882fd5511000
<4>[ 3520.141506] 0000000000000000 ffff882fd55110c8 0000000000000001 0000000000000000
<4>[ 352.541 00000000 ff87fb60ff81e804 0000000000296
<0>[ 3520.167480] Call Trace:
<4>[ 3520.173792] [<ffffffff8153b
<4>[ 3520.180124] [<ffffffffa0110
<4>[ 3520.211495] [<ffffffffa0107
<4> 50239] <ffffffa0108018>] ovs_packet_
4[32.374 [ffff83b6> l_as+x00e
4[32.412 [ffff8571ee5>] genl_rcv_
<4>[ 3520.249585] [<ffffffff81571
<4>[ 3520.255954] [<ffffffff81571
<4>[ 3520.262170] [<ffffffff81571
<4[32.629 [ffff8517> elink_unicast+
<4>[ 3520.274386] [<ffffffff8153c
<4>[ 3520.280318] [<ffffffff81571
<4>[ 3520.286396] [<ffffffff8152e
<4>[ 3520.292737] [<ffffffff8118d
<4>[ 3520.298756] [<ffffffff8111b
4[32.071 [ffff8107> e_gopudt_
4[320.228 [ffff85ce> eiyivc05/x0<> 50377] <ffff15307f6>] ___sys_
<4>[ 3520.333322 [ffff810b> adem_al+0x269/0x370
<4>[ 3520.33853] <ffff1618]
4[32.4794] [<ffffffff81532
<4>[ 3520.353918] [<ffffff8520> y_eds+x902
4[32.587 [ffffff81669b82>] system_
<0>[ 3520.363966] Code: 55 b0 8d 74 0e 14 0f b6 4d cb 89 b3 c0 00 00 00 8b b8 c4 00 00 00 0f b7 75 a0 48 03 b8 d8 00 00 00 88 50 01 88 48 08 66 89 70 06 <f6> 47 06 40 74 0a f6 40 7c 01 0f 84 d0 fd ff ff 31 d2 4c 89 ee
<> 50391]RP <ffff0078]
<4>[ 3520.385256] RSP <ffff8817e28b58
<4>[ 3520.420344] Call Trace:
<4>[ 3520.425213] [<ffffffff81648
<4>[ 3520.430181] [<ffffffff81662
<4>[ 3520.435150] [<ffffffff81017
<4>[ 3520.440085] [<ffffffff81662
<4>[ 3520.445113] [<ffffffff81661
<4>[ 3520.450148] [<ffffffffa010f
<4>[ 3520.455282] [<ffffffffa010f
<4>[ 3520.460375] [<ffffffff8153b
<4>[ 3520.465453] [<ffffffffa0110
<4>[ 3520.470577] [<ffffffffa0107
<4>[ 3520.475751] [<ffffffff81165
<4>[ 3520.480922] [<ffffffffa010b
<4>[ 3520.491354] [<ffffffffa0107
<4>[ 3520.502259] [<ffffffffa0108
<4>[ 3520.513779] [<ffffffff8132b
<4>[ 3520.519756] [<ffffffff81571
<4>[ 3520.525737] [<ffffffff81571
<4>[ 3520.531571] [<ffffffff81571
<4>[ 3520.537269] [<ffffffff81571
<4>[ 3520.542855] [<ffffffff81571
<4>[ 3520.548312] [<ffffffff8153c
<4>[ 3520.553686] [<ffffffff81571
<4>[ 3520.559061] [<ffffffff8152e
<4>[ 3520.564371] [<ffffffff8118d
<4>[ 3520.569528] [<ffffffff8111b
<4>[ 3520.574600] [<ffffffff81170
<4>[ 3520.579781] [<ffffffff81118
<4>[ 3520.584906] [<ffffffff8113c
<4>[ 3520.590036] [<ffffffff8153c
<4>[ 3520.595192] [<ffffffff81530
<4>[ 3520.600295] [<ffffffff81140
<4>[ 3520.605407] [<ffffffff81665
<4>[ 3520.620681] [<ffffffff81532
<4>[ 3520.625684] [<ffffffff81669
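For anyone triaging a similar log, here is a minimal Python sketch that pulls the fault lines and the offending process out of pasted kernel output. The function name `summarize_oops` and both regular expressions are illustrative assumptions, not part of any OpenStack or Open vSwitch tooling, and they only recognize the three message shapes that survive intact in the syslog excerpt above.

```python
import re

# Assumed message shapes, taken from the intact syslog lines in this report.
FAULT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(general protection fault|Kernel panic)\b(.*)"
)
COMM_RE = re.compile(r"Pid: (\d+), comm: (\S+)")


def summarize_oops(lines):
    """Return (events, procs) found in an iterable of kernel log lines.

    events: list of (uptime_seconds, kind, detail) in log order.
    procs:  sorted list of (pid, comm) pairs named in 'Pid: ..., comm: ...'.
    """
    events = []
    procs = set()
    for line in lines:
        fault = FAULT_RE.search(line)
        if fault:
            detail = fault.group(3).strip(" :-")
            events.append((float(fault.group(1)), fault.group(2), detail))
        comm = COMM_RE.search(line)
        if comm:
            procs.add((int(comm.group(1)), comm.group(2)))
    return events, sorted(procs)


# Three lines copied from the syslog excerpt in this report.
sample = [
    "Dec 9 17:17:43 m2c1q5-ctrl kernel: [ 3520.116842] "
    "general protection fault: 0000 [#1] SMP",
    "Dec 9 17:17:43 m2c1q5-ctrl kernel: 4[32.985 -[edtaea634fa6b --<0>"
    "[ 3520.405226] Kernel panic - not syncing: Fatal exception in interrupt",
    "Dec 9 17:17:43 m2c1q5-ctrl kernel: [ 3520.410584] Pid: 28983, "
    "comm: ovs-vswitchd Tainted: G D O 3.2.0-57-generic #87-Ubuntu",
]

events, procs = summarize_oops(sample)
for uptime, kind, detail in events:
    print(f"{uptime:.6f}s  {kind}: {detail}")
for pid, comm in procs:
    print(f"process: {comm} (pid {pid})")
```

On this report it shows the general protection fault and the panic both occurring in `ovs-vswitchd`, which is consistent with the `openvswitch(O)` module appearing in the "Modules linked in" line. Garbled lines are simply skipped, so the summary is only as complete as the paste.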
I've also hit this crash; the only difference is how I got there. I did a brand-new g.3 installation using COI on the same hardware that had previously been running g.3 without problems. After creating my networks, I started adding internal interfaces to my router. The first interface went fine. When I added the second interface, the controller crashed and would not recover without a power cycle. After the power cycle I get to the login prompt, but the controller immediately crashes again, just as Tim described.
Yesterday a colleague reached out to me with the same issue: adding an internal interface causes the controller to crash.