nagios failure "check-graylog-health" fails with Indexer failures 69000
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Graylog Charm | New | Medium | Unassigned |
Bug Description
Running against the latest/stable graylog charm and using the graylog 3/stable channel for the snap, SQA had all the nagios units fail the "check-
We run elasticsearch with the settings `auto-create-index: .watches,
Elasticsearch is on a 12G mem, 2 CPU system with a 50G disk, and graylog is on an 8G mem, 2 CPU system with a 40G disk.
Nothing further stands out in the elasticsearch or graylog logs as to what the problem may be.
Outputs from nagios are:
- check: check-graylog-
  id: '124'
  results:
    check-output: 'CRITICAL: Indexer failures: 69000
      OK: Indexer cluster health: green; Journal uncommitted messages: 3; Outstanding
      notificat
    return-code: 0
  status: completed
  timing:
    completed: 2022-12-07 07:59:37 +0000 UTC
    enqueued: 2022-12-07 07:59:36 +0000 UTC
    started: 2022-12-07 07:59:37 +0000 UTC
  unit: nrpe/7
- check: check-graylog-
  id: '126'
  results:
    check-output: 'CRITICAL: Indexer failures: 69000
      OK: Indexer cluster health: green; Journal uncommitted messages: 0; Outstanding
      notificat
    return-code: 0
  status: completed
  timing:
    completed: 2022-12-07 07:59:37 +0000 UTC
    enqueued: 2022-12-07 07:59:36 +0000 UTC
    started: 2022-12-07 07:59:37 +0000 UTC
  unit: nrpe/6
- check: check-graylog-
  id: '121'
  results:
    check-output: 'CRITICAL: Indexer failures: 69000
      OK: Indexer cluster health: green; Journal uncommitted messages: 103; Outstanding
      notificat
    return-code: 0
  status: completed
  timing:
    completed: 2022-12-07 07:59:37 +0000 UTC
    enqueued: 2022-12-07 07:59:36 +0000 UTC
    started: 2022-12-07 07:59:37 +0000 UTC
  unit: nrpe/1
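
For anyone triaging similar runs, the check output pasted above can be split into its individual metrics so that the failing one is easy to pick out across units. This is only a minimal sketch: the function name `parse_check_output` is hypothetical, it is not part of the charm, and it assumes the output keeps the `SEVERITY: name: value; name: value` layout visible in the pasted results. Truncated items with no value (such as the cut-off `Outstanding` entry) are skipped rather than guessed.

```python
import re

# Nagios-style severity prefixes; assumed from the output pasted in this report.
SEVERITY_LINE = re.compile(r"^(OK|WARNING|CRITICAL|UNKNOWN):\s*(.*)$")


def parse_check_output(output):
    """Map each metric name to a (severity, value) tuple.

    Hypothetical helper for triage only; it does not reproduce the
    charm's own check logic.
    """
    metrics = {}
    for raw_line in output.splitlines():
        match = SEVERITY_LINE.match(raw_line.strip())
        if not match:
            continue  # not a severity line (e.g. a truncated fragment)
        severity, body = match.groups()
        for item in body.split(";"):
            name, sep, value = item.partition(":")
            if not sep or not value.strip():
                continue  # item was cut off before its value
            metrics[name.strip()] = (severity, value.strip())
    return metrics
```

Fed the first two lines of any of the results above, this would report `Indexer failures` under `CRITICAL` while the cluster health and journal metrics stay under `OK`, which matches the observation that only the indexer failure count trips the check.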
testrun can be found at:
https:/
with bundle at:
https:/
and crashdump at:
https:/
summary: |
- nagios failure "check-graylog-health" fails with too many uncommitted journal messages
+ nagios failure "check-graylog-health" fails with Indexer failures 69000 |
tags: | added: bseng-910 |
Changed in charm-graylog: | |
importance: | Undecided → Medium |
Changed in charm-graylog: | |
importance: | Medium → High |
Changed in charm-graylog: | |
importance: | High → Medium |
Hit the same issue during the ps6 deployment.