Control node assertion in RibOutUpdates::PeerDequeue in scaled setup

Bug #1386460 reported by Nischal Sheth
This bug affects 1 person
Affects              Status         Importance  Assigned to     Milestone
Juniper Openstack (status tracked in Trunk)
  R1.1               Fix Committed  High        Nischal Sheth
  R2.0               Fix Released   High        Nischal Sheth
  Trunk              Fix Released   High        Nischal Sheth

Bug Description

Release 1.10 build 44.

Happened in Harshad's scale setup, which has 1000 vRouters and 3 control nodes (CNs).
The problem seems to occur multiple times while the setup is initializing.

Backtrace:

(gdb) bt
#0 0x00007f92609a0425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f92609a3b8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f92609990ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007f9260999192 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000616723 in RibOutUpdates::PeerDequeue (this=0x7f9208004e10, queue_id=1, peer=<optimized out>, mready=..., blocked=0x7f925e89e9b0) at controller/src/bgp/bgp_ribout_updates.cc:251
#5 0x00000000006717ef in SchedulingGroup::UpdatePeerQueue (this=0x7f91d400ec60, peer=0x7f92200081e0, ps=0x7f91d401e2e0, queue_id=1) at controller/src/bgp/scheduling_group.cc:1069
#6 0x0000000000671ab3 in SchedulingGroup::UpdatePeer (this=0x7f91d400ec60, peer=0x7f92200081e0) at controller/src/bgp/scheduling_group.cc:1110
#7 0x00000000006761fd in SchedulingGroup::Worker::Run (this=0x7f925005d290) at controller/src/bgp/scheduling_group.cc:437
#8 0x00000000009fccc0 in TaskImpl::execute (this=0x7f9250067f40) at controller/src/base/task.cc:224
#9 0x00007f9261c02ece in ?? () from /usr/lib/libtbb_debug.so.2
#10 0x00007f9261bf9e0b in ?? () from /usr/lib/libtbb_debug.so.2
#11 0x00007f9261bf86f2 in ?? () from /usr/lib/libtbb_debug.so.2
#12 0x00007f9261bf33ce in ?? () from /usr/lib/libtbb_debug.so.2
#13 0x00007f9261bf3270 in ?? () from /usr/lib/libtbb_debug.so.2
#14 0x00007f926174ae9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#15 0x00007f9260a5dccd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x0000000000000000 in ?? ()

information type: Proprietary → Public
Nischal Sheth (nsheth)
Changed in juniperopenstack:
status: New → In Progress
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote: A change has been merged

Reviewed: https://review.opencontrail.org/5576
Committed: http://github.org/Juniper/contrail-controller/commit/e12740b76962930457bc55a948fc9af5de994a1a
Submitter: Zuul
Branch: R1.10

commit e12740b76962930457bc55a948fc9af5de994a1a
Author: Nischal Sheth <email address hidden>
Date: Thu Dec 11 13:58:44 2014 -0800

Fix corner case in SchedulingGroup::UpdatePeerQueue logic

An assertion fails if a peer gets blocked when dequeueing updates from
multiple RibOuts via SchedulingGroup::UpdatePeer.

Problem happens in the following situation:

- Peer was previously blocked and now has updates to send for 2 RibOuts.
- Updates for both RibOuts are for the same queue i.e. QBULK or QUPDATE.
- Peer shares a marker for the first RibOut with another peer, or the
peer's marker gets merged with the marker for another peer when sending
updates for the first RibOut (via RibOutUpdates::PeerDequeue).
- There are still more updates to be sent for the first RibOut i.e. the
processing in RibOutUpdates::PeerDequeue keeps going.
- Original peer gets send blocked, but we manage to dequeue all updates
for the first RibOut to the other peer with which the original peer's
marker got merged.
- RibOutUpdates::PeerDequeue returns true because of the previous point.

At this point, we continue and try to dequeue updates for the 2nd RibOut
because RibOutUpdates::PeerDequeue returned success. We hit an assertion
in RibOutUpdates::PeerDequeue when called for the 2nd RibOut because the
original peer is not in the send ready set anymore.

Fix is to stop processing RibOuts for the peer if it's send blocked when
RibOutUpdates::PeerDequeue returns. This ensures that we don't hit the
assertion since we don't try to process the 2nd RibOut. Updates for the
2nd RibOut will be sent to the other peer when its WorkPeer item gets
processed.

Change-Id: Ib1ef218ad9eecb1ca489b3045bdc3419e75caa21
Closes-Bug: 1386460

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote:

Reviewed: https://review.opencontrail.org/5574
Committed: http://github.org/Juniper/contrail-controller/commit/a3490d6cc1c1186f9b38d8213555670875ccebbc
Submitter: Zuul
Branch: master

commit a3490d6cc1c1186f9b38d8213555670875ccebbc
Author: Nischal Sheth <email address hidden>
Date: Thu Dec 11 13:58:44 2014 -0800

Fix corner case in SchedulingGroup::UpdatePeerQueue logic

An assertion fails if a peer gets blocked when dequeueing updates from
multiple RibOuts via SchedulingGroup::UpdatePeer.

Problem happens in the following situation:

- Peer was previously blocked and now has updates to send for 2 RibOuts.
- Updates for both RibOuts are for the same queue i.e. QBULK or QUPDATE.
- Peer shares a marker for the first RibOut with another peer, or the
peer's marker gets merged with the marker for another peer when sending
updates for the first RibOut (via RibOutUpdates::PeerDequeue).
- There are still more updates to be sent for the first RibOut i.e. the
processing in RibOutUpdates::PeerDequeue keeps going.
- Original peer gets send blocked, but we manage to dequeue all updates
for the first RibOut to the other peer with which the original peer's
marker got merged.
- RibOutUpdates::PeerDequeue returns true because of the previous point.

At this point, we continue and try to dequeue updates for the 2nd RibOut
because RibOutUpdates::PeerDequeue returned success. We hit an assertion
in RibOutUpdates::PeerDequeue when called for the 2nd RibOut because the
original peer is not in the send ready set anymore.

Fix is to stop processing RibOuts for the peer if it's send blocked when
RibOutUpdates::PeerDequeue returns. This ensures that we don't hit the
assertion since we don't try to process the 2nd RibOut. Updates for the
2nd RibOut will be sent to the other peer when its WorkPeer item gets
processed.

Change-Id: Ib1ef218ad9eecb1ca489b3045bdc3419e75caa21
Closes-Bug: 1386460

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote:

Reviewed: https://review.opencontrail.org/5575
Committed: http://github.org/Juniper/contrail-controller/commit/79965052c2309953b9a290739c078d35fc827e34
Submitter: Zuul
Branch: R2.0

commit 79965052c2309953b9a290739c078d35fc827e34
Author: Nischal Sheth <email address hidden>
Date: Thu Dec 11 13:58:44 2014 -0800

Fix corner case in SchedulingGroup::UpdatePeerQueue logic

An assertion fails if a peer gets blocked when dequeueing updates from
multiple RibOuts via SchedulingGroup::UpdatePeer.

Problem happens in the following situation:

- Peer was previously blocked and now has updates to send for 2 RibOuts.
- Updates for both RibOuts are for the same queue i.e. QBULK or QUPDATE.
- Peer shares a marker for the first RibOut with another peer, or the
peer's marker gets merged with the marker for another peer when sending
updates for the first RibOut (via RibOutUpdates::PeerDequeue).
- There are still more updates to be sent for the first RibOut i.e. the
processing in RibOutUpdates::PeerDequeue keeps going.
- Original peer gets send blocked, but we manage to dequeue all updates
for the first RibOut to the other peer with which the original peer's
marker got merged.
- RibOutUpdates::PeerDequeue returns true because of the previous point.

At this point, we continue and try to dequeue updates for the 2nd RibOut
because RibOutUpdates::PeerDequeue returned success. We hit an assertion
in RibOutUpdates::PeerDequeue when called for the 2nd RibOut because the
original peer is not in the send ready set anymore.

Fix is to stop processing RibOuts for the peer if it's send blocked when
RibOutUpdates::PeerDequeue returns. This ensures that we don't hit the
assertion since we don't try to process the 2nd RibOut. Updates for the
2nd RibOut will be sent to the other peer when its WorkPeer item gets
processed.

Change-Id: Ib1ef218ad9eecb1ca489b3045bdc3419e75caa21
Closes-Bug: 1386460
