R1.10-build-2271: collector not responding

Bug #1348563 reported by Saravanan Musuvathi
This bug affects 2 people
Affects             Status         Importance   Assigned to           Milestone
Juniper Openstack   (status tracked in Trunk)
  R1.1              Fix Released   Critical     Sundaresan Rajangam
  Trunk             Fix Released   Critical     Sundaresan Rajangam

Bug Description

A gcore of the vizd process has been collected for further analysis.
The core is available on bhushana@mayamruga-E71 under ~/Documents/technical/bugs/

Changed in juniperopenstack:
importance: Undecided → Critical
tags: added: vizis
Raj Reddy (rajreddy)
Changed in juniperopenstack:
status: New → Incomplete
assignee: nobody → Saravanan Musuvathi (smusuvathi)
Raj Reddy (rajreddy)
Changed in juniperopenstack:
assignee: Saravanan Musuvathi (smusuvathi) → Sundaresan Rajangam (srajanga)
status: Incomplete → In Progress
Raj Reddy (rajreddy)
tags: added: blocker
Sundaresan Rajangam (srajanga) wrote :

The collector hangs because of a lock-order inversion between two tasks that run in parallel.

When a sandesh message is received from the generator, the SandeshStateMachine::WorkQueue task acquires cdbq_mutex_ [CdbIf::Db_AddColumn()] and then WorkQueue::mutex_ [WorkQueue<CdbIf::CdbIfColList>::MayBeStartRunner()].

When the connection to the Analytics database is lost and a TTransportException is raised in CdbIf::Db_BatchAddColumn(), the CdbIf::WorkQueue task acquires WorkQueue::mutex_ [WorkQueue<CdbIf::CdbIfColList>::RunnerDone()] and then cdbq_mutex_ [CdbIf::Db_Uninit()].

Because the two tasks take the same pair of locks in opposite order, each can end up holding one lock while waiting for the other, and the result is a deadlock.
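
The ordering problem described above can be illustrated with a small standalone sketch. This is not the collector code and not necessarily the fix that was released: db_mutex and queue_mutex are hypothetical stand-ins for cdbq_mutex_ and WorkQueue::mutex_, and the two functions only model the two call paths. In the real code the locks are taken in nested calls, so the sketch shows only the general principle: if both paths acquire the pair through std::lock (or always in one agreed order), neither path can hold one lock while waiting for the other.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two locks named in the analysis.
std::mutex db_mutex;     // plays the role of cdbq_mutex_
std::mutex queue_mutex;  // plays the role of WorkQueue::mutex_

// Counters touched only while both locks are held.
int columns_enqueued = 0;
int runner_resets = 0;

// Models the sandesh-receive path (add column, then maybe start the runner).
void AddColumnPath(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // std::lock acquires both mutexes using a deadlock-avoidance
        // algorithm, so the argument order does not matter.
        std::lock(db_mutex, queue_mutex);
        std::lock_guard<std::mutex> db_guard(db_mutex, std::adopt_lock);
        std::lock_guard<std::mutex> queue_guard(queue_mutex, std::adopt_lock);
        ++columns_enqueued;
    }
}

// Models the disconnect path (runner done, then uninitialize the database).
void RunnerDonePath(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // Arguments deliberately reversed relative to AddColumnPath; with
        // two plain lock() calls in this order the program could deadlock.
        std::lock(queue_mutex, db_mutex);
        std::lock_guard<std::mutex> queue_guard(queue_mutex, std::adopt_lock);
        std::lock_guard<std::mutex> db_guard(db_mutex, std::adopt_lock);
        ++runner_resets;
    }
}

// Runs both paths concurrently and returns the total number of
// critical sections that completed.
int RunBothPaths(int iterations) {
    assert(iterations >= 0);
    columns_enqueued = 0;
    runner_resets = 0;
    std::thread receive_task(AddColumnPath, iterations);
    std::thread disconnect_task(RunnerDonePath, iterations);
    receive_task.join();
    disconnect_task.join();
    return columns_enqueued + runner_resets;
}
```

Calling RunBothPaths(n) should return once both threads finish. Replacing the std::lock calls with two separate lock() calls in the orders shown would reproduce the kind of hang reported here, although only intermittently, since it depends on thread timing.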

Megh Bhatt (meghb)
information type: Proprietary → Public