Age | Commit message | Author |
|
Change-Id: I0c6e27298c63bd37de1410985d054799818c22a4
|
|
|
|
check_mk operations can take a long time (such as when doing a
re-inventory using "check_mk -II") when multiple hosts are down. This
decreases the connect timeout to 5 seconds.
Change-Id: I1eac5f14bad2afc2ffc4cbf8c950c24b052a0d6e
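To illustrate why a shorter connect timeout helps when hosts are down, here is a minimal sketch of a reachability probe with a 5 second limit. It is not how check_mk implements its timeout; the function name `agent_reachable` and the constant are hypothetical, and 6556 is the standard check_mk agent port.

```python
import socket

# The 5 second value comes from the commit message; the name is hypothetical.
CONNECT_TIMEOUT = 5.0


def agent_reachable(host, port=6556, timeout=CONNECT_TIMEOUT):
    """Return True if a TCP connection to host:port opens within `timeout`."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With N unreachable hosts probed one after another, the worst-case wait is roughly N times the timeout, so lowering the timeout shortens a re-inventory proportionally.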
|
|
Sometimes a floating point exception or segfault of
a process results in systemd restarting it. We want
to recognize this from the syslog,
e.g.:
systemd[1]: pixelated-server.service: main process exited,
code=killed, status=8/FPE
systemd[1]: Unit pixelated-server.service entered failed state.
- Related: https://github.com/pixelated/pixelated-user-agent/issues/683
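The two sample syslog lines above could be recognized with patterns along these lines. This is only a sketch: the regular expressions are modelled on those two samples, the function name `classify` is hypothetical, and the actual logwatch rules added by this commit may look different.

```python
import re

# Patterns modelled on the two sample lines in the commit message. Real
# syslog lines carry a timestamp/host prefix, so we search, not match.
KILLED = re.compile(
    r"systemd\[\d+\]: (?P<unit>\S+): main process exited, "
    r"code=killed, status=(?P<status>\d+/\w+)"
)
FAILED = re.compile(
    r"systemd\[\d+\]: Unit (?P<unit>\S+) entered failed state\."
)


def classify(line):
    """Return (kind, unit, status) for a matching line, else None."""
    m = KILLED.search(line)
    if m:
        return ("killed", m.group("unit"), m.group("status"))
    m = FAILED.search(line)
    if m:
        return ("failed", m.group("unit"), None)
    return None
```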
|
|
Otherwise, the nagios config will get regenerated and nagios gets
reloaded before all checks are registered by a check_mk inventory.
- Related: #6873
|
|
After upgrading the platform, there might be old check_mk checks
registered on the monitor hosts. We now run a check_mk inventory
on every run that also purges old, non-existent checks.
- Resolves: #6873
|
|
|
|
It failed to calculate the sessions and tokens db names.
- Resolves: #7658
|
|
|
|
|
|
|
|
# Conflicts:
# puppet/modules/site_couchdb/manifests/plain.pp
|
|
This change will make sure that the user/group for leap-mx exist, and it
changes the mail location from /var/mail/vmail to the more helpful name
/var/mail/leap-mx.
This change requires:
https://github.com/leapcode/leap_mx/pull/78
and it would replace merge request:
https://github.com/leapcode/leap_mx/pull/65
and fix https://leap.se/code/issues/6936 and
https://leap.se/code/issues/7635
Change-Id: Idbe678dc999e394232c2eeef2b2018d39ab7cc3b
|
|
- Related: #6920
|
|
leap_cli integrates a check for running mx procs already,
which is also integrated into nagios (called "Mx/Are_MX_daemons_running")
|
|
|
|
Duplicate declaration:
File[/srv/leap/nagios/plugins/check_unix_open_fds.pl] is already
declared in file
/srv/leap/puppet/modules/site_check_mk/manifests/agent/couchdb/bigcouch.pp
at line 44; cannot redeclare at
/srv/leap/puppet/modules/site_check_mk/manifests/agent/couchdb.pp:23 on
node rewdevcouch1.rewire.org
|
|
When migrating from bigcouch to couchdb, we need to remove leftover nagios
tests for bigcouch.
- Added new classes: site_check_mk::agent::couchdb::bigcouch and
site_check_mk::agent::couchdb::master
- Tested: unstable.pixelated-project.org
- Resolves: https://github.com/pixelated/pixelated-platform/issues/126
|
|
Soledad now creates user-dbs, which was done by tapicero
in the past. We need to remove any leftovers from tapicero.
|
|
|
|
Change-Id: Ic9af9ef3602abbb51edf1c9d71d4d264b4ace714
|
|
The rationale here is:
- bigcouch and its included erlang version are incredibly noisy and spit out
warnings/error msgs all the time
- it uses the worst logging format I ever saw, multiple lines directly
to a file (couch 2.0 uses lager as logging backend which can log to
syslog)
- trying to sort out the false positives will take too much time,
and who knows which of them will be resolved in couch 1.6/2.0
Change-Id: Idbe6b37a19cd65ce31a50d4c28eedb4cf15ba3b5
|
|
Increase warning/critical thresholds for time between tapicero heartbeat
checks so it will emit fewer false positives
Change-Id: I0f97373d88658b7f17b2c4e8c1963198dc3f66ed
|
|
Change-Id: Ie7943c9a541c3cd2feac7686ed1092aadc5a7c7a
|
|
These are warnings that might have different origins, none of
which should alarm the admin:
- A bitmask client bug (user will poke the client devs if things
break, and they will go after it)
- A simple network failure, packets might get cut off
- A malicious user tries to tamper with TLS handshakes. This gets
more interesting, but still (like ssh bruteforce attacks) an admin
would not want to get annoyed by this by default, and they still
have the option to use log analysers of their choice if they want
to investigate this.
Change-Id: I23ca3b700e41f22f34ad3346ed4e647b86000bb2
|
|
Change-Id: If844b95c44e697f480df8ee2ae6607709b9942f7
|
|
Change-Id: I7b778e1e1af2784bd79840f20453ca8718927e25
|
|
Change-Id: I51ce8a9e8773d267c270a1725a497f9a43f2e9ff
Sidenote: $nagios_hosts was never used
|
|
Change-Id: I115ebdefd7365bf15a30c4a3ce7a4543ad757cec
|
|
Change-Id: Ibefc6ce08cf714cf79a460a8b6eb32e2851ce22c
|
|
condition with another tapicero instance #6534
Change-Id: Ie194a2983210601bd24aef5e74f8b7fa2b7c433f
|
|
|
|
their own files, fix mx logwatch path.
|
|
/var/log/tapicero, fix webapp logwatch location.
|
|
|
|
leap_couch_stats.sh is a local check_mk agent script
which provides per-db stats as well as global stats.
Change-Id: I1eba19a3a0210d3127acbad119dfd2918414ff4a
|
|
|
|
Change-Id: Ia5ac6f50e023d7d358d17c661b71c6a5880ec445
|
|
Change-Id: Id53d6432a58006653f4d9ddd6355ae505a5273eb
|
|
/etc/check_mk/mrpe.cfg (Bug #6788)
We used file_line before, but when some check parameters change, a
new line would be added, leaving the old line there, resulting in two
checks with the same name but with different parameters.
Augeas can handle this better, but it is important to use 'rm' to
remove all old lines with different parameters before adding the
new line.
Change-Id: Iad69dfd20f487a16d372a4f4a4bc53299f9e4a66
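The "remove all old lines, then add the new one" idea can be sketched outside of Augeas as follows. This is an illustration of the logic only, not the Puppet/Augeas resource used in the commit; `set_mrpe_check` and the example check names and plugin paths are hypothetical.

```python
def set_mrpe_check(lines, name, command):
    """Return mrpe.cfg lines containing exactly one entry for `name`.

    Every existing line whose first field equals `name` is dropped first
    (the 'rm' step), then the new line is appended. Appending without the
    removal would leave stale duplicates when parameters change.
    """
    kept = [l for l in lines if l.split(None, 1)[:1] != [name]]
    kept.append(f"{name} {command}")
    return kept
```

Applying the function twice with the same arguments yields the same result, which is the property that appending a line without the removal step lacks once the parameters have changed.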
|
|
minutes, until we are able to fix the issue with the test users creating db bloat.
|
|
systems by default (#6664)
Change-Id: Ic2d4416b7c55f00f01d4b2ade78339d653bc8993
|
|
Change-Id: I0149ac2e767531d9724b57b9e3bdae7943f954ff
|
|
Bug/6566
See merge request !19
|
|
Change-Id: I0d30afbcc6dcb90c6716f7c6bb0bca3e6ae0964a
|
|
Change-Id: I1d8cedfeb1153312c13f7f182c7ac3b031647dd4
|
|
Change-Id: I6d3fa5028ba6eaca7b21a7e850136ef980f6e782
|
|
In order to ensure tapicero is still working, we need to monitor
/var/log/syslog for the last tapicero log msg, which should not be older
than the last check_mk_agent run (every 2 mins atm).
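A heartbeat check of this kind boils down to comparing the age of the last log message against thresholds. The sketch below assumes the timestamp of the last tapicero line has already been parsed; the function name and the warn/crit values are hypothetical (only the 2 minute agent interval comes from the message above). Return values follow the usual Nagios convention of 0 = OK, 1 = WARNING, 2 = CRITICAL.

```python
from datetime import datetime, timedelta


def heartbeat_state(last_log_time, now,
                    warn=timedelta(minutes=2),
                    crit=timedelta(minutes=4)):
    """Map the age of the last log message to a Nagios-style state."""
    age = now - last_log_time
    if age > crit:
        return 2  # CRITICAL: no heartbeat for longer than `crit`
    if age > warn:
        return 1  # WARNING: heartbeat older than one agent interval
    return 0      # OK: a message arrived since the last agent run
```

For example, with `now` at 12:00:00, a last message at 11:59:00 is OK, one at 11:57:00 is a warning, and one at 11:50:00 is critical.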
|
|
Conflicts:
puppet/modules/site_check_mk/files/agent/logwatch/bigcouch.cfg
Change-Id: I1646e49ffa5437a861b402b755bc15943c42ec4f
|
|
Change-Id: I73defd7964501e4eabe7dd05c02887e7aeb2f063
|