R1.10-build-22-ubuntu-havana-ssh to vrouter stuck

Bug #1360350 reported by shajuvk
This bug affects 1 person
Affects              Status          Importance   Assigned to             Milestone
Juniper Openstack    (status tracked in Trunk)
  R1.1               Fix Released    Critical     Raja Sivaramakrishnan
  Trunk              Fix Committed   Critical     Raja Sivaramakrishnan

Bug Description

Hi Team,

We are seeing an issue with the R1.10 build-22 Ubuntu Havana multinode sanity run. One of the vrouter nodes, a6s19, cannot be reached over ssh. Can someone take a look at the setup?

The console error shows libvirtd blocked for more than 120 seconds (screenshot attached).
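For anyone triaging a similar hang, here is a minimal Python sketch that scans saved dmesg or console text for the kernel's hung-task warnings. It assumes the standard "INFO: task NAME:PID blocked for more than N seconds" message format. The sample line is illustrative only: its PID and timestamp are made up and are not taken from this setup.

```python
import re

# Kernel hung-task warning; the task name, PID and timeout vary per message.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(dmesg_text):
    """Return a (task_name, pid, seconds) tuple for each hung-task warning."""
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(dmesg_text)
    ]


# Illustrative input only (hypothetical PID and timestamp).
sample = (
    "[ 1234.567890] INFO: task libvirtd:2201 "
    "blocked for more than 120 seconds.\n"
)
print(find_hung_tasks(sample))
```

In practice the text would come from a saved console log or from the output of `dmesg` on the affected node.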

Setup nodes: a6s32, a6s33, a6s38, a6s19, a6s22

Vrouter nodes: a6s19 and a6s22
Console of a6s19: 10.84.6.69

Note: Aborted all other sanity runs on this setup to debug this issue. Will run the necessary sanity locally.

Thanks,
Shaju

From: Shaju V.K
Sent: Thursday, August 21, 2014 4:08 PM
To: Anand H Krishnan; Ashish Ranjan; Hari Prasad Killi; Rajagopalan Sivaramakrishnan
Cc: Divakar Dharanalakota; Praveen K V
Subject: RE: Ubuntu havana-2304 build- lost ssh connectivity from cfgm to vrouter

Hi Anand,

I couldn’t reproduce the ssh issue with the latest Ubuntu build 2305. However, the consoles of the vrouter nodes (a5s9 and a5s10) show some errors related to the ext4 file system. Could you please check the vrouter consoles: http://10.84.6.159/ and http://10.84.6.160/

Thanks,
Shaju

From: Anand H Krishnan
Sent: Thursday, August 21, 2014 10:27 AM
To: Ashish Ranjan; Hari Prasad Killi; Rajagopalan Sivaramakrishnan
Cc: Shaju V.K; Divakar Dharanalakota; Praveen K V
Subject: RE: Ubuntu havana-2304 build- lost ssh connectivity from cfgm to vrouter

The system has recovered now. The traces aren't really helpful, so it
would help if we could reproduce the issue. Shaju?

Thanks,
Anand
________________________________________
From: Ashish Ranjan
Sent: Thursday, August 21, 2014 10:27 PM
To: Hari Prasad Killi; Anand H Krishnan; Rajagopalan Sivaramakrishnan
Cc: Shaju V.K; Divakar Dharanalakota; Praveen K V
Subject: Re: Ubuntu havana-2304 build- lost ssh connectivity from cfgm to vrouter

Anand, Raja, we need to look at this at the highest priority.
This will gate the R1.10 release.

Shaju will help repro if needed.

Ashish

On Aug 21, 2014, at 12:49 AM, Ashish Ranjan <email address hidden> wrote:

BTW, this is a new kernel. We have not run sanity on this
3.11.0.22 kernel, so we need to look into this.

Ashish

On Aug 21, 2014, at 12:39 AM, Hari Prasad Killi <email address hidden> wrote:

Ssh asks for the password and then gets stuck. Dmesg shows the attached trace; quite a few entries like this are present. Is a kernel thread locked?

Regards,
Hari

From: Ashish Ranjan <email address hidden>
Date: Thursday, August 21, 2014 12:36 PM
To: Hari Prasad Killi <email address hidden>
Cc: "Shaju V.K" <email address hidden>
Subject: Re: Ubuntu havana-2304 build- lost ssh connectivity from cfgm to vrouter

See if we can try to debug this.
It's reproducible.

On Aug 20, 2014, at 6:54 PM, Shaju V.K <email address hidden> wrote:

Hi Team,

I am seeing an ssh connectivity issue in a multinode setup with Ubuntu Havana build 2304. Sanity ran 47 test cases: 43 passed, 1 failed, and 3 were skipped. While executing the 48th test case, it lost connectivity.

I am not able to ssh to the a5s9 compute node from the build machine or cfgm, although ping is working. Can someone take a look at the nodes below? The tested file and logs are on node a5s6.

shajuvk (shajuvk) wrote :
Changed in juniperopenstack:
milestone: r1.11 → r1.10-fcs
Changed in juniperopenstack:
assignee: nobody → Raja Sivaramakrishnan (raja-u)
Raja Sivaramakrishnan (raja-u) wrote :

This is caused by cgroup issues in the 3.11 kernel (where the cgroup code was rewritten). Several fixes went into 3.13, including:

author Tejun Heo <email address hidden> 2013-11-22 22:14:39 (GMT)
committer Tejun Heo <email address hidden> 2013-11-22 22:14:39 (GMT)
commit e5fca243abae1445afbfceebda5f08462ef869d3 (patch)
tree 4c5dc3301b6fe77fc70b4567c9f2c89c42a8d34c
parent 6ce4eac1f600b34f2f7f58f9cd8f0503d79e42ae (diff)
cgroup: use a dedicated workqueue for cgroup destruction

Move to the Ubuntu 3.13.0-34 kernel to get these fixes.
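A quick way to check whether a node already runs a kernel at or above the recommended release is sketched below in Python. It assumes Ubuntu-style release strings of the form `3.13.0-34-generic`, and it assumes that any release at or above 3.13.0-34 carries the cgroup fixes. The function names and the example release string for the older kernel are hypothetical.

```python
import re

# Minimum Ubuntu kernel release recommended in this bug: 3.13.0-34.
MIN_FIXED = (3, 13, 0, 34)


def parse_release(release):
    """Turn a release string such as '3.13.0-34-generic' into (3, 13, 0, 34)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(g) for g in m.groups())


def has_cgroup_fixes(release):
    """Assumption: releases at or above MIN_FIXED include the cgroup fixes."""
    return parse_release(release) >= MIN_FIXED


print(has_cgroup_fixes("3.13.0-34-generic"))
print(has_cgroup_fixes("3.11.0-22-generic"))  # assumed form of the older kernel
```

On a live node the release string could be taken from `platform.release()` in the standard library, which returns the same value as `uname -r`.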

Shweta Naik (stnaik) wrote :

Changed the Fab task to upgrade the kernel to version 3.13.0-34.

information type: Proprietary → Public
Shweta Naik (stnaik) wrote :

Commit IDs in R1.10:

adding third party packages:
https://github.com/Juniper/contrail-packaging/commit/1f9f1e902350c95f6bc5302285932a1683da2e8c

adding kernel version 3.13.0-34 to Makefile and making this version default:
https://github.com/Juniper/contrail-packaging/commit/8c92730c1583164260a292467283b8844f815d67

fab task to upgrade the kernel:
https://github.com/Juniper/contrail-fabric-utils/commit/908433d05383d4a7d9dc830d033707006fd5271a

Ubuntu 3.13.0-34 is based on Linux mainline 3.13.11.4.

Kernel version after upgrade:
root@a5s193:~# uname -a
Linux a5s193 3.13.0-34-generic #60~precise1-Ubuntu SMP Wed Aug 13 15:55:33 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Shweta Naik (stnaik) wrote :

Mainline kernel version:
root@a5s193:~# cat /proc/version_signature
Ubuntu 3.13.0-34.60~precise1-generic 3.13.11.4
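The signature line above can be split into its Ubuntu package version and mainline version with a few lines of Python. This is a minimal sketch that assumes the three-field layout shown in this comment (distribution, package version with flavour suffix, mainline version). The function and key names are hypothetical.

```python
def parse_version_signature(line):
    """Split '<distro> <package-version>-<flavour> <mainline>' into its parts."""
    distro, package, mainline = line.split()
    # The flavour (e.g. 'generic') is the text after the last hyphen.
    package_version, _, flavour = package.rpartition("-")
    return {
        "distro": distro,
        "package": package_version,
        "flavour": flavour,
        "mainline": mainline,
    }


sig = "Ubuntu 3.13.0-34.60~precise1-generic 3.13.11.4"
info = parse_version_signature(sig)
print(info["package"], "is based on mainline", info["mainline"])
```

On a node, the input line would be read from `/proc/version_signature`, which exists on Ubuntu kernels.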
