With continuous object add/delete, the control node service does not come up after a restart
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R2.20 | Won't Fix | Medium | Nischal Sheth |
R3.0 | Fix Committed | Medium | Nischal Sheth |
Trunk | Fix Committed | Medium | Nischal Sheth |
Bug Description
R2.20 Build 115 ubuntu 14.04 Juno multinode setup
I have a 3-controller-node environment where 30 VNs are added, 30 VMs are spawned on them, and then all are deleted, in a loop.
Another script creates a VN, a LIF and a VMI, and then deletes them.
In parallel, a third script restarts the 3 control nodes at random intervals.
Sometimes the control node does not come up after a restart. Its status stays at "initializing (IFMap Server End-Of-RIB not computed)".
Prakash will update the bug with his findings. A gcore of the control node from one of the nodes will be at http://
root@nodec2:
== Contrail Control ==
supervisor-control: active
contrail-control initializing (IFMap Server End-Of-RIB not computed)
contrail-
contrail-dns active
contrail-named active
-------
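To spot the stuck state from a script rather than by eye, the status text above can be parsed. This is a minimal sketch: the function name `control_state` is hypothetical, and it assumes the status text is supplied on stdin in the same format as the output shown above (for example piped from `contrail-status`, assuming that is the command that produced it).

```shell
#!/usr/bin/env bash
# control_state (hypothetical helper): read status text on stdin and print
# the state reported for the contrail-control service, e.g. "active" or
# "initializing". Assumes one "<service> <state> ..." entry per line.
control_state() {
    awk '$1 == "contrail-control" { print $2; exit }'
}
```

A restart loop could call this after each restart and log the time whenever the state is still `initializing` after some grace period.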
root@nodec1:~# cat bug-recreate.sh
device_
mac="00:
while :
do
neutron net-create bug-vn
neutron subnet-create bug-vn 100.1.1.0/24
vn_id=`neutron net-show bug-vn | grep " id " | awk '{ print $4}'`
python config-tor-intf.py "ge-0/0/0" $device_id "ge-0/0/0.0" 0 1 $vn_id "00:25:90:c3:09:6d"
sleep 20
python del-lifs-vmis.py
neutron net-delete bug-vn
sleep 20
done
root@nodec1:~# cat test1.sh
source /etc/contrail/
image_id=
while :
do
for i in {1..30};
do
neutron net-create bugvn$i
neutron subnet-create bugvn$i 100.$i.$i.0/24
nova boot --nic net-id=$vn_id --flavor 1 --image $image_id vm_$i
done
for i in {1..30};
do
nova delete vm_$i
done
for i in {1..30};
do
neutron net-delete bugvn$i
done
done
root@nodec1:~#
-------------------
bash-4.2$ cat restart_control1.sh
cmd="service contrail-control restart"
SSHOPT="-o StrictHostKeyCh
while :
do
sshpass -p c0ntrail123 ssh $SSHOPT root@nodec1 $cmd
# sleep $[ ( $RANDOM % 200 ) + 1 ]s
sshpass -p c0ntrail123 ssh $SSHOPT root@nodec2 $cmd
# sleep $[ ( $RANDOM % 200 ) + 1 ]s
sshpass -p c0ntrail123 ssh $SSHOPT root@nodec3 $cmd
sleep $[ ( $RANDOM % 200 ) + 1 ]s
done
bash-4.2$
EOR (End-Of-RIB) is assumed once there has been no activity for 10ms (the default EOR timeout). Since in this case the scripts are continuously modifying the system (POST/DELETE/PUT), that period of inactivity is never seen, so EOR is never declared as computed.
ifmap/client/ifmap_channel.cc: in IFMapChannel::ReadPollResponse(),
... rib_computed()) {
StartEndOfRibTimer();
..
if (!end_of_
// When the daemon is coming up, as long as we are receiving data,
// we have not received the entire db. Keep re-arming the EOR timer
// as long as we are receiving data.
}
...
...
Because of the config scripts, the IFMap server is continuously sending PollResponse messages, so the EOR timer keeps being re-armed and EOR is never computed.
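The effect of re-arming the timer on every poll response can be illustrated with a small simulation. This is only a sketch of the behaviour described above, not the control node code: the function name `eor_fire_time` is hypothetical, times are in milliseconds, and it assumes the timer is first armed at t=0 and that update arrival times are passed in ascending order.

```shell
#!/usr/bin/env bash
# eor_fire_time TIMEOUT_MS T1 T2 ...   (hypothetical helper)
# Simulates an inactivity timer that is re-armed each time an update
# arrives before it expires. Prints the time at which the timer fires.
eor_fire_time() {
    local timeout=$1; shift
    local deadline=$timeout            # assumption: timer armed at t=0
    local t
    for t in "$@"; do
        if (( t >= deadline )); then
            break                      # timer expired before this update
        fi
        deadline=$(( t + timeout ))    # update arrived in time: re-arm
    done
    echo "$deadline"
}
```

With a 10ms timeout, `eor_fire_time 10 5 30` fires at 15 because there is a gap after the update at t=5. With an update every 5ms, as in `eor_fire_time 10 $(seq 5 5 1000)`, the timer cannot fire until 10ms after the last update, so as long as the config scripts keep running the expiry keeps moving out.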