Heka logstreamer checkpoints may be corrupted after the root partition is full

Bug #1563346 reported by Swann Croiset
This bug affects 1 person
Affects: StackLight
Status: New
Importance: Medium
Assigned to: LMA-Toolchain Fuel Plugins
Milestone: none

Bug Description

To avoid corruption of the cache files maintained by Heka, and to keep Heka from writing to the root partition (which could contribute to filling up the / partition), the lma-collector plugin must create a dedicated volume for /var/cache/lma_collector.

The size of this new volume must be calculated from the total buffer sizes, the logstreamer checkpoints and the sandbox preservations. Currently 4 GB would be enough for the worst-case situation.
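
The sizing rule above can be sketched as a small calculation. This is only an illustration: the function name, the output names, the byte figures and the 20% headroom are all hypothetical and are not taken from the lma-collector plugin.

```python
# Hypothetical sizing helper. Every figure used here is illustrative,
# not measured on a real deployment.
MIB = 1024 ** 2
GIB = 1024 ** 3


def required_volume_bytes(buffer_max_bytes, checkpoints_bytes,
                          preservation_bytes, headroom=0.2):
    """Return the worst-case cache size in bytes, plus a safety margin.

    buffer_max_bytes: mapping of output name -> maximum buffer size in bytes
    checkpoints_bytes: space reserved for logstreamer checkpoints
    preservation_bytes: space reserved for sandbox preservation files
    headroom: extra fraction added on top of the worst case (assumption)
    """
    if headroom < 0:
        raise ValueError("headroom must not be negative")
    worst_case = (sum(buffer_max_bytes.values())
                  + checkpoints_bytes
                  + preservation_bytes)
    return int(worst_case * (1 + headroom))


# Illustrative inputs only: three outputs with made-up buffer limits.
example_buffers = {"output_a": 1 * GIB, "output_b": 1 * GIB, "output_c": 512 * MIB}
example_size = required_volume_bytes(example_buffers, 10 * MIB, 100 * MIB)
```

With the real maximum buffer sizes plugged in, the result can be compared against the 4 GB figure mentioned above.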

> ls /var/cache/lma_collector/:
logstreamer/
output_queue/
sandbox_preservation/

For example, when the file system is full and the checkpoint file /var/cache/lma_collector/logstreamer/openstack.keystone gets corrupted, Heka fails at startup with the following error:

Initialization failed for 'keystone_7_0_logstreamer': invalid character 'e' after top-level value
2016/03/07 14:45:10 Error making runner for keystone_wsgi_logstreamer:
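
The error text above resembles what a JSON decoder reports when stray bytes follow a complete document. Assuming each checkpoint file is meant to hold exactly one JSON document (an assumption, not confirmed in this report), a minimal sketch for spotting damaged files before starting Heka could look like this. The function name is hypothetical.

```python
import json
from pathlib import Path


def find_corrupted_checkpoints(directory):
    """Return (file name, reason) pairs for files that are not valid JSON.

    Assumption: every regular file in `directory` should contain a single
    JSON document. Trailing garbage, truncation or an empty file all count
    as corruption here.
    """
    corrupted = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (ValueError, OSError) as exc:
            # json.JSONDecodeError and UnicodeDecodeError both derive
            # from ValueError; OSError covers unreadable files.
            corrupted.append((path.name, str(exc)))
    return corrupted
```

Calling find_corrupted_checkpoints("/var/cache/lma_collector/logstreamer") would list the files to inspect or remove. Whether deleting a checkpoint is safe depends on how much log re-reading is acceptable.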

The impact of corrupted buffers in output_queue/ is unknown, but it is surely harmful.

Tags: heka
Swann Croiset (swann-w)
description: updated
Simon Pasquier (simon-pasquier) wrote :

I'm not sure that it needs to be fixed for the 0.9.x releases provided that we fix the other bugs filling up the / partition.

Changed in lma-toolchain:
milestone: 0.9.0 → 1.0.0
no longer affects: lma-toolchain/0.9
Changed in lma-toolchain:
importance: Undecided → Medium
summary: - dedicate one volume for the lma-collector cache
+ dedicate one partition for the lma_collector cache directory
Éric Lemoine (elemoine) wrote : Re: dedicate one partition for the lma_collector cache directory

> To avoid corruption of the cache files maintained by Heka, and to keep Heka from writing to the root partition (which could contribute to filling up the / partition), the lma-collector plugin must create a dedicated volume for /var/cache/lma_collector.

How do you know that a dedicated partition would prevent file corruption when the partition is full?

That being said I agree that a dedicated partition for /var/cache/lma_collector makes sense and is an important thing to have. We cannot take the risk of filling up the root partition.

Simon Pasquier (simon-pasquier) wrote :

I think that Swann's idea is to avoid file corruption when something other than Heka fills up the root partition. Obviously, if the dedicated partition itself fills up, then we're in trouble again.
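
Since a dedicated partition can still fill up, its usage would need to be watched as well. A minimal sketch of such a check, using only the Python standard library; the function names and the 90% threshold are hypothetical:

```python
import shutil


def usage_ratio(path):
    """Return the used fraction (0.0 to 1.0) of the file system holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo file systems report a zero size; treat them as empty.
        return 0.0
    return usage.used / usage.total


def is_nearly_full(path, threshold=0.9):
    """Return True when usage is at or above `threshold` (an assumed limit)."""
    return usage_ratio(path) >= threshold
```

For example, is_nearly_full("/var/cache/lma_collector") could be polled periodically to raise an alert before checkpoint writes start failing.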

Swann Croiset (swann-w)
summary: - dedicate one partition for the lma_collector cache directory
+ Heka logstreamer checkpoints are may be corrupted after the root
+ partition is full
summary: - Heka logstreamer checkpoints are may be corrupted after the root
- partition is full
+ Heka logstreamer checkpoints may be corrupted after the root partition
+ is full
Changed in lma-toolchain:
milestone: 1.0.0 → 0.10.0
Changed in lma-toolchain:
milestone: 0.10.0 → none