Partial-Bug: #1528327
Fixed latency monitor code based on the Ceph 0.94.3 version.
Fixed issues in OSD throughput/IOPs calculation.
Updated code based on the latest Sandesh apis.
contrail-collector DB queue back pressure mechanism was not
working since the DB drop level is initialized to INVALID and
the water mark levels are also INVALID, and hence the defer/undefer
callbacks are not called.
Issue:
------
During a flow index change vrouter-agent triggers a delete
on index tree using new flow handle instead of currently
held flow_handle resulting in flow entry getting associated
to two slots in the flow index tree, which further on flow
entry delete due to aging or eviction never releases the
slot for old flow handle, causing failures for further
insertions in the flow index tree
Fix:
----
Avoid taking flow handle as argument to DeleteByIndex and
use the currently associated flow_handle to remove from tree
Adding assert in DeleteByIndex to catch delete failure
Avoid doing delete from index tree in code paths other than
flow entry index update or flow entry delete.
Add logic for KSync Sock User to Mock vrouter behavior
returning index for an entry if it is already allocated
instead of allocating a new one.
Calling get_routing_instances could trigger another read of the VN
if the VN has no routing instance. This is not only inefficient, but
could also cause exception if the VN has disappeared. We can avoid
this by calling getattr.
Sequence of events that causes the crash
1. Static route config deleted
2. Static Route manager triggers resolve_trigger_ to re-evaluate static
route config
3. Before the resolve trigger is invoked routing instance is deleted
Resolve trigger calls ProcessStaticRouteConfig to apply any pending static
route config. ProcessStaticRouteConfig accesses the NULL config pointer of
the routing instance
Fix:
1. Check whether the routing instance is deleted in ProcessStaticRouteConfig
2. Reset the resolve_trigger_ in StaticRouteMgr destructor
3. Add API to disable resolve_trigger_ and Add UT to test delayed processing
of resolve_trigger_
Issue:
------
uninitialized variable results in bad calculation of number
of MPLS labels for multicast
Fix:
----
initialize vrouter params to 0, handle case if
vrouter_max_labels is less than fixed unicast range.
Conflicts:
src/vnsw/agent/cmn/agent.cc
Closes-Bug: 1535735
Change-Id: Id4a7b74f12728e78dcb5b8b24d0848e560ff0138
(cherry picked from commit eb0488b9280871229122e71692c12fe92364fa8f)
Update task policy for contrail-dns.
When XMPP channel with agent closes, the records exported from there
are removed in the cleaner task context. Update the task policy to
ensure the config and bind tasks do not run in parallel with it.
This problem can happen when a number of vRouter agents restart in
quick succession and subscribe to a table and advertise routes into
the table.
Consider the following sequence of events on a route in foo.inet.0:
- Route is added with a path with attribute A and nexthop N1
- XMPP peer P1 subscribes to the table
- As a result a RouteUpdate is created in QBULK (the join queue). The
RouteUpdate has an UpdateInfo with a bitset P1 and a RibOutAttr with
attribute A and nexthop N1
- Another path for the same route is added with nexthop N2
- This new path is ecmp eligible i.e. has same local preference as the
best path
- XMPP peer P2 subscribes to the table
- As a result the existing RouteUpdate in QBULK gets a new UpdateInfo.
A new UpdateInfo is created because RibOutAttr for P2 has attribute A
and nexthops (N1, N2). Note that a different UpdateInfo is required
for different RibOutAttr. This UpdateInfo has a bitset P2.
Notice that we now have a RouteUpdate with 2 UpdateInfos that have the
same attribute A, but different RibOutAttrs (by virtue of having different
forwarding nexthops).
The UpdateQueue maintains a set (attr_set_ of type UpdatesByAttr) of
UpdateInfos keyed by BgpAttr and timestamp. The label/nexthops in the
RibOutAttr are not included in key to achieve optimal packing of bgp
updates by attribute. As a result, both the UpdateInfos are inserted
(or rather attempted to be inserted) into attr_set_ with the same key.
This causes a crash when we later try to erase both UpdateInfos from
the attr_set_ when doing export processing for the route.
Note that we run into this case only if join processing for P2 happens
before export processing for the route after the 2nd path got added.
If export processing happens before the join for P2, the RouteUpdate
would move from QBULK to QUPDATE and would have only 1 UpdateInfo with
attribute A and nexthop (N1, N2).
The fix consists of 2 parts:
1. Use the UpdateInfo pointer itself as the final tie-breaker in the
key for UpdateQueue::UpdatesByAttr to ensure we have no duplicates.
2. When traversing the UpdatesByAttr set to build update messages, fix
UpdateQueue::AttrNext to not return an UpdateInfo for same RouteUpdate
as the current UpdateInfo. Doing so invalidates the locking design in
RibOutUpdates and results in a deadlock.
Add unit tests to recreate the above scenario and verify the fix.
We are creating as many ksync sockets as number of TBB threads even
though we use only the first socket for all operations. Modified
code to create only one ksync socket.
Issue:
------
ToR agent does not program or expect QFX to have two logical
switches with the same VxLAN ID at any point in time. When this
is observed in certain negative test scenarios, the ToR agent
cannot recover the OVSDB database to a sane state.
Fix:
----
Allow creation of a stale entry with a duplicate VxLAN ID
even though it is not expected. Allowing the creation helps
to recover the OVSDB database to a sane state through deletion
of the entry on stale entry timeout.
Closes-Bug: 1535093
Change-Id: I32a4fbab665f433d6a5dae7eb185be8e50de53d0
(cherry picked from commit c1cfe79983da1b3391aef4205deb6e723be449c7)
Handle VMs with whitespace in the name
vrouter-port-control script fails when there is whitespace in the vm
name (in fact in any of the arguments).
This patch adds a regex based split to vrouter-port-control to fix that,
so that it passes arguments containing whitespace correctly.
* Add exclusion between flow table and flow stats collector
Reference for flow entry can be released by flow stats collector
resulting in flow being deleted from flow tree and parallel
modification of flow tree from flow table and flow stats collector
context. Fixing the same.
Closes-bug:#1535040
Reviewed: https://review.opencontrail.org/16686
Committed: http://github.org/Juniper/contrail-controller/commit/156ad0b760f9b532572116d813d7afa695555bea
Submitter: Zuul
Branch: R2.22.x
commit 156ad0b760f9b532572116d813d7afa695555bea
Author: Atul Moghe <email address hidden>
Date: Mon Dec 21 14:29:14 2015 -0800
Cherry pick controller commits from R2.20 to R2.22.x
updating version.info from 2.22 to 2.23 in 2.20 branch
Closes-Bug:#1528370
Change-Id: Ic649422979a926cc5f5b8457c01610b848dc206b
Storage stats daemon fix
Partial-Bug: #1528327
Fixed latency monitor code based on the Ceph 0.94.3 version.
Fixed issues in OSD throughput/IOPs calculation.
Updated code based on the latest Sandesh apis.
Change-Id: I12caf951f84c8b213b1b5ec01371bb68b4c48cb3
Fix contrail-collector back pressure mechanism
contrail-collector DB queue back pressure mechanism was not
working since the DB drop level is initialized to INVALID and
the water mark levels are also INVALID, and hence the defer/undefer
callbacks are not called.
Change-Id: Ib28141a69aeed3c4ad6f50abbaed2a285e3e7db2
Partial-Bug: #1528380
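For illustration only, here is a minimal Python sketch of watermark-driven back pressure. It is not the collector's code: the `WatermarkQueue` class, the `INVALID` sentinel value, and the `run` helper are hypothetical names chosen for this example. It shows why defer/undefer callbacks can never fire while the water marks are left at INVALID.

```python
INVALID = -1  # hypothetical sentinel standing in for the INVALID level


class WatermarkQueue:
    """Toy queue that fires callbacks when its depth crosses water marks."""

    def __init__(self, high=INVALID, low=INVALID, on_defer=None, on_undefer=None):
        self.high = high
        self.low = low
        self.on_defer = on_defer
        self.on_undefer = on_undefer
        self.depth = 0
        self.deferred = False

    def _check(self):
        # With INVALID water marks there is nothing to compare against,
        # so neither callback is ever invoked.
        if self.high == INVALID or self.low == INVALID:
            return
        if not self.deferred and self.depth >= self.high:
            self.deferred = True
            if self.on_defer:
                self.on_defer()
        elif self.deferred and self.depth <= self.low:
            self.deferred = False
            if self.on_undefer:
                self.on_undefer()

    def enqueue(self, n=1):
        self.depth += n
        self._check()

    def dequeue(self, n=1):
        self.depth = max(0, self.depth - n)
        self._check()


def run(high, low):
    """Fill and drain a queue, returning the callbacks that fired."""
    events = []
    q = WatermarkQueue(high, low,
                       on_defer=lambda: events.append("defer"),
                       on_undefer=lambda: events.append("undefer"))
    q.enqueue(10)
    q.dequeue(9)
    return events
```

Calling `run(INVALID, INVALID)` yields no events, while `run(8, 2)` records a defer followed by an undefer.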
Fix Agent crash for flow index tree management
Issue:
------
During a flow index change vrouter-agent triggers a delete
on index tree using new flow handle instead of currently
held flow_handle resulting in flow entry getting associated
to two slots in the flow index tree, which further on flow
entry delete due to aging or eviction never releases the
slot for old flow handle, causing failures for further
insertions in the flow index tree
Fix:
----
Avoid taking flow handle as argument to DeleteByIndex and
use the currently associated flow_handle to remove from tree
Adding assert in DeleteByIndex to catch delete failure
Avoid doing delete from index tree in code paths other than
flow entry index update or flow entry delete.
Add logic for KSync Sock User to Mock vrouter behavior
returning index for an entry if it is already allocated
instead of allocating a new one.
Closes-Bug: 1527425
Change-Id: I10e77fb59650acfdd924a5f1d35d6b8dea03a3f0
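The idea behind the DeleteByIndex change can be sketched in Python as follows. This is a toy model under stated assumptions, not the agent's C++ implementation: `FlowEntry`, `FlowIndexTree`, and their methods are hypothetical stand-ins. The point is that the delete takes no handle argument and always releases the slot the entry currently holds, so an index change cannot leave the entry in two slots.

```python
class FlowEntry:
    def __init__(self, name):
        self.name = name
        self.flow_handle = None  # slot currently held in the index tree


class FlowIndexTree:
    """Toy index tree: maps a flow handle (slot) to the entry that owns it."""

    def __init__(self):
        self.slots = {}

    def insert(self, entry, handle):
        assert handle not in self.slots, "slot already in use"
        self.slots[handle] = entry
        entry.flow_handle = handle

    def delete_by_index(self, entry):
        # No handle argument: release the slot the entry holds right now.
        if entry.flow_handle is None:
            return
        removed = self.slots.pop(entry.flow_handle, None)
        assert removed is entry, "delete from index tree failed"
        entry.flow_handle = None

    def update_index(self, entry, new_handle):
        self.delete_by_index(entry)     # frees the old slot first
        self.insert(entry, new_handle)  # then claims the new one


tree = FlowIndexTree()
flow = FlowEntry("f1")
tree.insert(flow, 10)
tree.update_index(flow, 20)   # index change: slot 10 is released
tree.delete_by_index(flow)    # aging or eviction: slot 20 is released
```

After this sequence the tree is empty, so later insertions into slot 10 or 20 succeed.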
Fix discovery dependency issue. Originally made in master branch
via https://review.opencontrail.org/#/c/15749
Change-Id: I5d874de3714074c66fa73bfd7c9119772dc681fd
Partial-Bug: #1530186
Avoid calling get_routing_instances on VN object
Calling get_routing_instances could trigger another read of the VN
if the VN has no routing instance. This is not only inefficient, but
could also cause exception if the VN has disappeared. We can avoid
this by calling getattr.
Change-Id: Ie5500585b9e6c578576276c2c04ec03f32c75112
Partial-Bug: 1528950
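A rough Python sketch of the difference between the two access styles is shown below. `FakeVirtualNetwork` and `GoneServer` are hypothetical stand-ins written for this example and are not the real API classes; the lazy re-read inside `get_routing_instances` is modeled on the behavior described above.

```python
class FakeVirtualNetwork:
    """Hypothetical stand-in for a VN object; not the real API class."""

    def __init__(self, server):
        self._server = server
        self.reads = 0

    def get_routing_instances(self):
        children = getattr(self, "routing_instances", None)
        if not children:
            # Lazy fallback: re-read the VN from the server.
            self.reads += 1
            children = self._server.read_routing_instances()  # may raise
            self.routing_instances = children
        return children


class GoneServer:
    """Simulates a server on which the VN has already been deleted."""

    def read_routing_instances(self):
        raise LookupError("virtual network no longer exists")


vn = FakeVirtualNetwork(GoneServer())

# getattr only inspects what is already cached on the object,
# so it neither triggers a read nor raises.
cached = getattr(vn, "routing_instances", None) or []

try:
    vn.get_routing_instances()
    raised = False
except LookupError:
    raised = True
```

Here `cached` is simply an empty list, whereas the accessor call performs one extra read and raises because the VN has disappeared.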
Fix Centos 65 agent compilation issues.
Closes-Bug: #1532159
Change-Id: Ia8b77619c80737000d5bd949534c9e0a16967359
Closes-Bug: #1524063, contrail-status is showing contrail-web-ui, even it is not configured, in case of SMLite
Change-Id: I55afc19140b1ce52b3b529a644124705de5ce6a8
Fix a corner case with routing instance delete
Sequence of events that causes the crash
1. Static route config deleted
2. Static Route manager triggers resolve_trigger_ to re-evaluate static
route config
3. Before the resolve trigger is invoked routing instance is deleted
Resolve trigger calls ProcessStaticRouteConfig to apply any pending static
route config. ProcessStaticRouteConfig accesses the NULL config pointer of
the routing instance
Fix:
1. Check whether the routing instance is deleted in ProcessStaticRouteConfig
2. Reset the resolve_trigger_ in StaticRouteMgr destructor
3. Add API to disable resolve_trigger_ and Add UT to test delayed processing
of resolve_trigger_
Change-Id: Icb1b9bad340ccefc9fbab75188034ade79a6193a
Closes-bug: #1533435
Fix scons failure due to pip upgrade
Closes-Bug: 1536541
Change-Id: I5138f6fb7d073beddafc9b3c686ef968f21e7e31
Fix uninitialized vrouter params
Issue:
------
uninitialized variable results in bad calculation of number
of MPLS labels for multicast
Fix:
----
initialize vrouter params to 0, handle case if
vrouter_max_labels is less than fixed unicast range.
Conflicts:
src/vnsw/agent/cmn/agent.cc
Closes-Bug: 1535735
Change-Id: Id4a7b74f12728e78dcb5b8b24d0848e560ff0138
(cherry picked from commit eb0488b9280871229122e71692c12fe92364fa8f)
Update task policy for contrail-dns.
When XMPP channel with agent closes, the records exported from there
are removed in the cleaner task context. Update the task policy to
ensure the config and bind tasks do not run in parallel with it.
Change-Id: Ic92f147060c39e452742aa67ace7e3a6f3ddc5a3
closes-bug: 1533811
Fix corner case in join processing
This problem can happen when a number of vRouter agents restart in
quick succession and subscribe to a table and advertise routes into
the table.
Consider the following sequence of events on a route in foo.inet.0:
- Route is added with a path with attribute A and nexthop N1
- XMPP peer P1 subscribes to the table
- As a result a RouteUpdate is created in QBULK (the join queue). The
RouteUpdate has an UpdateInfo with a bitset P1 and a RibOutAttr with
attribute A and nexthop N1
- Another path for the same route is added with nexthop N2
- This new path is ecmp eligible i.e. has same local preference as the
best path
- XMPP peer P2 subscribes to the table
- As a result the existing RouteUpdate in QBULK gets a new UpdateInfo.
A new UpdateInfo is created because RibOutAttr for P2 has attribute A
and nexthops (N1, N2). Note that a different UpdateInfo is required
for different RibOutAttr. This UpdateInfo has a bitset P2.
Notice that we now have a RouteUpdate with 2 UpdateInfos that have the
same attribute A, but different RibOutAttrs (by virtue of having different
forwarding nexthops).
The UpdateQueue maintains a set (attr_set_ of type UpdatesByAttr) of
UpdateInfos keyed by BgpAttr and timestamp. The label/nexthops in the
RibOutAttr are not included in key to achieve optimal packing of bgp
updates by attribute. As a result, both the UpdateInfos are inserted
(or rather attempted to be inserted) into attr_set_ with the same key.
This causes a crash when we later try to erase both UpdateInfos from
the attr_set_ when doing export processing for the route.
Note that we run into this case only if join processing for P2 happens
before export processing for the route after the 2nd path got added.
If export processing happens before the join for P2, the RouteUpdate
would move from QBULK to QUPDATE and would have only 1 UpdateInfo with
attribute A and nexthop (N1, N2).
The fix consists of 2 parts:
1. Use the UpdateInfo pointer itself as the final tie-breaker in the
key for UpdateQueue::UpdatesByAttr to ensure we have no duplicates.
2. When traversing the UpdatesByAttr set to build update messages, fix
UpdateQueue::AttrNext to not return an UpdateInfo for same RouteUpdate
as the current UpdateInfo. Doing so invalidates the locking design in
RibOutUpdates and results in a deadlock.
Add unit tests to recreate the above scenario and verify the fix.
Change-Id: I45ce1bbd72d8b6a163a5aa61358491cc1d3f6a93
Closes-Bug: 1536729
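The duplicate-key problem in the join processing commit above, and the tie-breaker from part 1 of its fix, can be illustrated with a small Python model. This is only a sketch under assumptions: `UniqueSet` is a hypothetical stand-in for an ordered set with unique keys, `UpdateInfo` here carries just the fields needed for the key, the two entries are assumed to share a timestamp, and Python's `id()` stands in for the pointer comparison.

```python
class UpdateInfo:
    def __init__(self, attr, timestamp, nexthops):
        self.attr = attr
        self.timestamp = timestamp
        self.nexthops = nexthops  # part of RibOutAttr, not part of the key


class UniqueSet:
    """Toy model of a set that rejects elements with an existing key."""

    def __init__(self, key_fn):
        self.key_fn = key_fn
        self.items = {}

    def insert(self, info):
        key = self.key_fn(info)
        if key in self.items:
            return False  # duplicate key: the element is not inserted
        self.items[key] = info
        return True

    def erase(self, info):
        # Succeeds only if this exact element is stored under its key.
        key = self.key_fn(info)
        if self.items.get(key) is info:
            del self.items[key]
            return True
        return False


def old_key(info):
    return (info.attr, info.timestamp)


def new_key(info):
    # Object identity as the final tie-breaker.
    return (info.attr, info.timestamp, id(info))


u1 = UpdateInfo("A", 100, ("N1",))
u2 = UpdateInfo("A", 100, ("N1", "N2"))

old_set = UniqueSet(old_key)
old_results = [old_set.insert(u1), old_set.insert(u2)]

new_set = UniqueSet(new_key)
new_results = [new_set.insert(u1), new_set.insert(u2)]
erased = [new_set.erase(u1), new_set.erase(u2)]
```

With the old key the second insert is rejected, so a later erase of that element has nothing to remove. With the tie-breaker both elements are stored and both can be erased.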
Create only 1 ksync-socket
We are creating as many ksync sockets as number of TBB threads even
though we use only the first socket for all operations. Modified
code to create only one ksync socket.
Change-Id: I2f1bf8558c219fc97402f8192c3d9d6cebacaf98
Fixes-Bug: #1533495
Fix ToR agent crash for duplicate VxLAN-ID
Issue:
------
ToR agent does not program or expect QFX to have two logical
switches with the same VxLAN ID at any point in time. When this
is observed in certain negative test scenarios, the ToR agent
cannot recover the OVSDB database to a sane state.
Fix:
----
Allow creation of a stale entry with a duplicate VxLAN ID
even though it is not expected. Allowing the creation helps
to recover the OVSDB database to a sane state through deletion
of the entry on stale entry timeout.
Closes-Bug: 1535093
Change-Id: I32a4fbab665f433d6a5dae7eb185be8e50de53d0
(cherry picked from commit c1cfe79983da1b3391aef4205deb6e723be449c7)
Handle VMs with whitespace in the name
vrouter-port-control script fails when there is whitespace in the vm
name (in fact in any of the arguments).
This patch adds a regex based split to vrouter-port-control to fix that,
so that it passes arguments containing whitespace correctly.
Change-Id: Ibf52dc23321d1c4c7f231cb5cd386afa495de0aa
Fixes-Bug: #1519768
Signed-off-by: hkumarmk <email address hidden>
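As a hedged illustration of a regex based split for the whitespace fix above, the Python sketch below splits an argument string only at whitespace that is followed by a `--option` token. The `split_args` helper, the exact pattern, and the sample option names are assumptions made for this example; the pattern used by the actual patch is not shown in this log.

```python
import re


def split_args(arg_string):
    """Split only on whitespace that is followed by a '--option' token."""
    return re.split(r"\s+(?=--)", arg_string.strip())


# Sample argument string with whitespace inside one value (illustrative).
raw = "--oper=add --vm_name=web server 1 --port_type=NovaVMPort"

naive = raw.split()      # breaks the VM name into three pieces
args = split_args(raw)   # keeps "web server 1" attached to --vm_name
```

A plain whitespace split produces five tokens here, while the lookahead-based split keeps each option and its value together as three tokens.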
* Add exclusion between flow table and flow stats collector
Reference for flow entry can be released by flow stats collector
resulting in flow being deleted from flow tree and parallel
modification of flow tree from flow table and flow stats collector
context. Fixing the same.
Closes-bug:#1535040
Change-Id: I8e7c18aaacbe1ed16639917dc51480af55b2da86