Make sure it's possible for Ops to restart fastcgi processes through NAGIOS
Bug #561894 reported by George
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Open Library | Fix Released | High | Ralf Muehlen |
Bug Description
This afternoon, 532 began seizing up, Edward was at the doctor's, and there wasn't a thing anyone else could do.
If we're going to get anything like 24-hour coverage when we go to WWW, Ops need to be able to restart things for us.
I've attached a screenshot of the current state of the NAGIOS checks; it shows what cannot be restarted yet.
Changed in openlibrary:
status: New → In Progress
Changed in openlibrary:
milestone: upstream-to-www → stability
Edward:
- Document the work search SOLR for the Ops team
- Add monitoring for both instances of the Upstream SOLRs
- Look to move the SOLR update process off Edward's dev box onto the SOLR production box (*07)