Stefan, what's the most important thing we could do to help you help us? I see a few threads of inquiry in the comments that trailed off:
- abstracting Salvatore's script for high thrash rates on VM creation/destruction and network namespace creation, to try reproducing the issue without running OpenStack
- ext4 vs ext3
- capturing crashdumps from a devstack gate run (or simulated gate run on a local machine)
- determining whether the failure is a regression from a previous version of the kernel (seems not)
- determining whether the failure occurs on the 3.11 kernel in addition to the 3.2 kernel (seems it does)
What looks most promising? This is a critical issue and is still producing intermittent failures, so it's worth a prod or two on the OpenStack side to get things rolling again.