From: Greg Sabino Mullane
The output can be changed by use of the --output option. The default output
is nagios, although this can be changed at the top of the script if you wish. The
-current option choices are "nagios", "mrtg", and "simple". To avoid having to
+current option choices are nagios, mrtg, and simple. To avoid having to
enter the output argument each time, the type of output is automatically set
if no --output argument is given, and if the current directory has one of the
output options in its name. For example, creating a directory named mrtg and
-populating it with symlinks via the --symlinks argument would ensure that
+populating it with symlinks via the --symlinks argument would ensure that
any actions run from that directory will always default to an output of "mrtg"
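The directory-name fallback described in this hunk can be pictured with a short Python sketch. It only illustrates the rule as the text states it and is not the script's Perl code; the function name `pick_output` and the plain substring match on the path are assumptions.

```python
import os

# Output types named in the documentation text.
OUTPUT_TYPES = ("nagios", "mrtg", "simple")

def pick_output(explicit=None, cwd=None, default="nagios"):
    """Pick the output type: an explicit --output value wins, then the directory name."""
    if explicit:
        return explicit
    cwd = os.getcwd() if cwd is None else cwd
    for kind in OUTPUT_TYPES:
        # Assumption: a simple substring match on the directory path.
        if kind in cwd:
            return kind
    return default
```

Under these assumptions, `pick_output(cwd="/opt/checks/mrtg")` yields `"mrtg"`, while passing an explicit value always overrides the directory.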
-The default output format is Nagios, which is a single line of information, along
+The default output format is for Nagios, which is a single line of information, along with four specific exit codes:
The output line is one of the words above, a colon, and then a short description of what was measured. Additional statistics information, as well as the total time the command
-took, can be output as well: see the documentation on the arguments --showperf,
---perflimit, and --showtime.
+took, can be output as well: see the documentation on the arguments
+--showperf,
+--perflimit, and
+--showtime.
-Some actions accept an optional --mrtg argument to further control the output.
+Some actions accept an optional --mrtg argument to further control the output.
See the documentation on each action for details on the exact MRTG output for each one.
@@ -190,7 +192,7 @@ Instead, one should use a .pgpass file.
-The database connection options can be grouped: --host=a,b --host=c --port=1234 --port=3344
+The database connection options can be grouped: --host=a,b --host=c --port=1234 --port=3344
would connect to a-1234, b-1234, and c-3344. Note that once set, an option carries over until it is changed again.
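A small Python sketch may help picture the grouping rule. It is an illustration of the behaviour described in the text, not the script's own logic; the function name `expand_targets` and the simplification of one port per group are assumptions.

```python
def expand_targets(hosts, ports):
    """Pair grouped --host values with --port values.

    hosts and ports hold the raw argument strings in the order given.
    Assumption: each --port argument carries a single port.
    """
    targets = []
    port = None
    for i, group in enumerate(hosts):
        if i < len(ports):
            port = ports[i]  # a newer --port replaces the carried-over value
        for host in group.split(","):
            targets.append((host, port))
    return targets

print(expand_targets(["a,b", "c"], ["1234", "3344"]))
# [('a', '1234'), ('b', '1234'), ('c', '3344')]
```

When fewer ports than host groups are given, the last port simply carries over, which mirrors the "carries over until it is changed again" remark.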
Examples:
@@ -273,9 +275,9 @@ VAL should be 0 or 1. The default is 1. Only takes effect if using Nagios output
Sets a limit as to how many items of interest are reported back when using the
-showperf option. This only has an effect for actions that return a large
+showperf option. This only has an effect for actions that return a large
number of items, such as table_size. The default is 0, or no limit. Be
-careful when using this with the --include or --exclude options, as
+careful when using this with the --include or --exclude options, as
those restrictions are done after the query has been run, and thus your limit may not include the items you want. Only takes effect if using Nagios output mode.
Determines if the time taken to run each query is shown in the output. VAL
-should be 0 or 1. The default is 1. No effect unless showperf is on.
+should be 0 or 1. The default is 1. No effect unless showperf is on.
Only takes effect if using Nagios output mode.
$PSQL
variable, also near the top of th
other systems are supported yet.
+Used only for the MRTG or simple output, for a few specific actions.
+@@ -337,7 +345,7 @@ if use the option --symlinks perl check_postgres.pl --symlinks
If the file name already exists, it will not be overwritten. If the file exists and is a symlink, you can force it to overwrite by using "--action=build_symlinks_force"
-Most actions take a --warning and an --critical option, indicating at what
+Most actions take a --warning and a --critical option, indicating at what
point we change from OK to WARNING, and what point we go to CRITICAL. Note that because criticals are always checked first, setting the warning equal to the critical is an effective way to turn warnings off and always give a critical.
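The "criticals are always checked first" remark can be sketched in a few lines of Python. This is a hedged illustration only: the function name `classify` and the use of a greater-than-or-equal comparison for every action are assumptions, not something this hunk states.

```python
def classify(value, warning=None, critical=None):
    """Return a Nagios-style status word, testing critical before warning."""
    if critical is not None and value >= critical:
        return "CRITICAL"
    if warning is not None and value >= warning:
        return "WARNING"
    return "OK"
```

With `warning` equal to `critical`, the first test always fires before the second, so the WARNING state can never be reached.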
@@ -348,15 +356,15 @@ critical is an effective way to turn warnings off and always give a critical.
Checks the current number of connections for one or more databases, and optionally compares it to the maximum allowed, which is determined by the
-Postgres configuration variable max_connections. The --warning and
---critical options can take one of three forms. First, a simple number can be
+Postgres configuration variable max_connections. The --warning and
+--critical options can take one of three forms. First, a simple number can be
given, which represents the number of connections at which the alert will be given. This choice does not use the max_connections setting. Second, the percentage of available connections can be given. Third, a negative number can be given which represents the number of connections left until max_connections
-is reached. The default values for --warning and --critical are '90%' and '95%'.
+is reached. The default values for --warning and --critical are '90%' and '95%'.
You can also filter the databases by use of the
---include and --exclude options. See the BASIC FILTERING section
+--include and --exclude options. See the BASIC FILTERING section
for more details.
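The three threshold forms for this action can be sketched as follows. The helper name `connection_limit` is hypothetical, and reading "percentage of available connections" as a percentage of max_connections is an assumption; the script's real parsing may differ.

```python
def connection_limit(threshold, max_connections):
    """Turn a --warning/--critical string into an absolute connection count."""
    text = threshold.strip()
    if text.endswith("%"):
        # Percentage form: assumed to be a share of max_connections.
        return max_connections * float(text[:-1]) / 100
    number = int(text)
    if number < 0:
        # Negative form: connections left before max_connections is reached.
        return max_connections + number
    # Plain number: used as-is, max_connections is not consulted.
    return number
```

For a server with max_connections of 100, the default '90%' and a value of '-10' both resolve to 90 connections under this reading.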
-The --warning and --critical options can be specified as sizes or percents.
+The --warning and --critical options can be specified as sizes or percents.
Valid size units are bytes, kilobytes, megabytes, gigabytes, terabytes, and exabytes. You can abbreviate all of those with the first letter. Items without units are assumed to be 'bytes'. The default values are '1 GB' and '5 GB'. The value
@@ -416,7 +424,7 @@ used by the table and index, and what we compute that it should be.
Note that this action has two hard-coded values to avoid false alarms on
smaller relations. Tables must have at least 10 pages, and indexes at least 15,
before they can be considered by this test. If you really want to adjust these
-values, you can look for the variables $MINPAGES and $MINIPAGES at the top of the
+values, you can look for the variables $MINPAGES and $MINIPAGES at the top of the
check_bloat subroutine.
Simply connects, issues a 'SELECT version()', and leaves.
-Takes no --warning or --critical options.
+Takes no --warning or --critical options.
For MRTG output, simply outputs a 1 (good connection) or a 0 (bad connection) on the first line.
@@ -506,7 +514,7 @@ bytes are assumed. The first column should be an integer representing the number
Normally, an alert is triggered if the values returned are greater than or equal to the critical or warning
-value. However, an option of --reverse will trigger the alert if the returned value is
+value. However, an option of --reverse will trigger the alert if the returned value is
lower than or equal to the critical or warning value.
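The effect of --reverse on the comparison can be shown with a tiny Python sketch; the function name `triggers` is hypothetical and the code only restates the rule given in the text.

```python
def triggers(value, threshold, reverse=False):
    """Return True when the returned value should raise the alert."""
    if reverse:
        # --reverse: alert on values lower than or equal to the threshold.
        return value <= threshold
    # Normal mode: alert on values greater than or equal to the threshold.
    return value >= threshold
```

Note that a value exactly equal to the threshold triggers the alert in both modes.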
Checks the size of all databases and complains when they are too big. There is no need to run this command more than once per database cluster. Databases can be filtered with
-the --include and --exclude options. See the BASIC FILTERING section
+the --include and --exclude options. See the BASIC FILTERING section
for more details. They can also be filtered by the owner of the database with the
---includeuser and --excludeuser options.
+--includeuser and --excludeuser options.
See the USER NAME FILTERING section for more details.
Checks on the available physical disk space used by Postgres. This action requires that you have the executable "/bin/df" available to report on disk sizes, and it also needs to be run as a superuser, so it can examine the data_directory
-setting inside of Postgres. The --warning and --critical options are
+setting inside of Postgres. The --warning and --critical options are
given in either sizes or percentages. If using sizes, the standard unit types are allowed: bytes, kilobytes, gigabytes, megabytes, gigabytes, terabytes, or exabytes. Each may be abbreviated to the first letter only; no units at all
@@ -647,22 +655,22 @@ and the name of the file system on the fourth line.
The actions table_size and index_size are simply variations of the relation_size action, which checks for a relation that has grown too big. Relations (in other words, tables and indexes) can be filtered with the
---include and --exclude options. See the BASIC FILTERING section
+--include and --exclude options. See the BASIC FILTERING section
for more details. Relations can also be filtered by the user that owns them,
-by using the --includeuser and --excludeuser options.
+by using the --includeuser and --excludeuser options.
See the USER NAME FILTERING section for more details.
-The values for the --warning and --critical options are file sizes, and
+The values for the --warning and --critical options are file sizes, and
may have units of bytes, kilobytes, megabytes, gigabytes, terabytes, or exabytes. Each can be abbreviated to the first letter. If no units are given, bytes are assumed. There are no default values: both the warning and the critical option must be given. The return text shows the size of the largest relation found.
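The size syntax described here (a number, an optional unit, abbreviation to the first letter, bytes when no unit is given) could be parsed along these lines. This is a sketch under stated assumptions: the name `size_to_bytes` is hypothetical, and the use of binary multiples (1 kilobyte = 1024 bytes) is an assumption, since the text only names the units.

```python
import re

# Assumption: binary multiples. Only the units named in the text are listed.
UNIT_BYTES = {
    "b": 1,
    "k": 1024,
    "m": 1024 ** 2,
    "g": 1024 ** 3,
    "t": 1024 ** 4,
    "e": 1024 ** 6,
}

def size_to_bytes(text):
    """Convert strings such as '1 GB', '2k' or '500' into a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*", text)
    if not match:
        raise ValueError(f"not a size: {text!r}")
    number, unit = match.groups()
    key = unit[:1].lower() or "b"  # only the first letter matters; default is bytes
    if key not in UNIT_BYTES:
        raise ValueError(f"unknown unit in {text!r}")
    return int(float(number) * UNIT_BYTES[key])
```

Because only the first letter of the unit is inspected, 'GB', 'g', and 'gigabytes' are all treated alike in this sketch.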
-If the --showperf option is enabled, all of the relations with their sizes
+If the --showperf option is enabled, all of the relations with their sizes
will be given. To prevent this, it is recommended that you set the
---perflimit option, which will cause the query to do a
+--perflimit option, which will cause the query to do a
ORDER BY size DESC LIMIT (perflimit).
-The units for --warning and --critical are specified as times.
+The units for --warning and --critical are specified as times.
Valid units are seconds, minutes, hours, and days; all can be abbreviated to the first letter. If no units are given, 'seconds' are assumed. The default values are '1 day' and '2 days'. Please note that there are cases
@@ -774,11 +782,11 @@ be provided via the <--mrtg> option.
Check the total number of locks on one or more databases. There is no need to run this more than once per database cluster. Databases can be filtered
-with the --include and --exclude options. See the BASIC FILTERING section
+with the --include and --exclude options. See the BASIC FILTERING section
for more details.
-The --warning and --critical options can be specified as simple numbers,
+The --warning and --critical options can be specified as simple numbers,
which represent the total number of locks, or they can be broken down by type of lock.
Valid lock names are 'total', 'waiting', or the name of a lock type used by Postgres.
These names are case-insensitive and do not need the "lock" part on the end,
@@ -812,7 +820,7 @@ checking, and ensures that the message shows up in the logs. It scans the
various log_* settings inside of Postgres to figure out where the logs should be.
If you are using syslog, it does a rough (but not foolproof) scan of
/etc/syslog.conf. Alternatively, you can provide the name of the logfile
-with the --logfile option. This is especially useful if the logs have a
+with the --logfile option. This is especially useful if the logs have a
custom rotation scheme driven be an external program. The --logfile option
supports the following escape characters: %Y %m %d %H, which represent
the current year, month, date, and hour respectively. An error is always
@@ -843,11 +851,11 @@ failure, the fourth line will provide more detail on the failure encountered.
Checks how long a specific query takes to run, by executing a "EXPLAIN ANALYZE"
-against it. The --warning and --critical options are the maximum amount of
+against it. The --warning and --critical options are the maximum amount of
time the query should take. Valid units are seconds, minutes, and hours; any can be abbreviated to the first letter. If no units are given, 'seconds' are assumed. Both the warning and the critical option must be given. The name of the view or
-function to be run must be passed in to the --queryname option. It must consist
+function to be run must be passed in to the --queryname option. It must consist
of a single word (or schema.word), with optional parens at the end.
Checks the length of running queries on one or more databases. There is no need to run this more than once on the same database cluster. Databases can be filtered
-by using the --include and --exclude options. See the BASIC FILTERING
+by using the --include and --exclude options. See the BASIC FILTERING
section for more details. You can also filter on the user running the
-query with the --includeuser and --excludeuser options.
+query with the --includeuser and --excludeuser options.
See the USER NAME FILTERING section for more details.
-The values for the --warning and --critical options are amounts of
+The values for the --warning and --critical options are amounts of
time, and default to '2 minutes' and '5 minutes' respectively. Valid units are 'seconds', 'minutes', 'hours', or 'days'. Each may be written singular or abbreviated to just the first letter. If no units are given, the unit is
@@ -912,7 +920,7 @@ line gives the name of the database.
Checks that master-slave replication is working to one or more slaves. The slaves are specified the same as the normal databases, except with the number 2 at the end of them, so "--port2" instead of "--port", etc.
-The values or the --warning and --critical options are units of time, and
+The values or the --warning and --critical options are units of time, and
at least one must be provided (no defaults). Valid units are 'seconds', 'minutes', 'hours', or 'days'. Each may be written singular or abbreviated to just the first letter. If no units are given, the units are assumed to be seconds.
@@ -964,13 +972,13 @@ a time, an error is thrown.
Checks the length of open transactions on one or more databases. There is no need to run this command more than once per database cluster. Databases can be filtered by use of the
---include and --exclude options. See the BASIC FILTERING section
+--include and --exclude options. See the BASIC FILTERING section
for more details. The owner of the transaction can also be filtered, by use of
-the --includeuser and --excludeuser options.
+the --includeuser and --excludeuser options.
See the USER NAME FILTERING section for more details.
-The values or the --warning and --critical options are units of time, and
+The values or the --warning and --critical options are units of time, and
must be provided (no default). Valid units are 'seconds', 'minutes', 'hours', or 'days'. Each may be written singular or abbreviated to just the first letter. If no units are given, the units are assumed to be seconds.
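The time syntax used by these options (seconds, minutes, hours, or days, singular or abbreviated to one letter, seconds by default) can be sketched in Python. The name `time_to_seconds` is hypothetical and this is an illustration of the rule in the text rather than the script's parser.

```python
import re

# Seconds per unit, keyed by the first letter of the unit name.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def time_to_seconds(text):
    """Convert strings such as '2 minutes', '5h' or '30' into seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]*)\s*", text)
    if not match:
        raise ValueError(f"not a time: {text!r}")
    number, unit = match.groups()
    key = unit[:1].lower() or "s"  # no unit given: assume seconds
    if key not in UNIT_SECONDS:
        raise ValueError(f"unknown unit in {text!r}")
    return int(number) * UNIT_SECONDS[key]
```

Under this sketch, '1 day', '1 days', and '1d' all resolve to 86400 seconds.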
@@ -1002,11 +1010,11 @@ first line. The fourth line gives the name of the database.
Checks the length of "idle in transaction" queries on one or more databases. There is no need to run this more than once on the same database cluster. Databases can be filtered
-by using the --include and --exclude options. See the BASIC FILTERING
+by using the --include and --exclude options. See the BASIC FILTERING
section below for more details.
-The --warning and --critical options are given as units of time, and both must
+The --warning and --critical options are given as units of time, and both must
be provided (there are no defaults). Valid units are 'seconds', 'minutes', 'hours', or 'days'. Each may be written singular or abbreviated to just the first letter. If no units are given, the unit are assumed to be seconds.
@@ -1046,7 +1054,7 @@ then symlinks will be overwritten. The option --symlinks is a shorter way of say
This is done by generating a checksum of a sorted list of setting names and their values. Note that different users in the same database may have different checksums, due to ALTER USER usage, and due to the fact that superusers see more
-settings than ordinary users. Either the --warning or the --critical option
+settings than ordinary users. Either the --warning or the --critical option
should be given, but not both. The value of each one is the checksum, a 32-character hexadecimal value. You can run with the special --critical=0 option
to find out an existing checksum.
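The idea of checksumming a sorted list of settings into a 32-character hexadecimal value can be sketched with Python's `hashlib`. The serialization below (one `name=value` line per setting) is an assumption made for illustration, so the resulting digests will not match those produced by the script; the function name `settings_checksum` is likewise hypothetical.

```python
import hashlib

def settings_checksum(settings):
    """Return an MD5 hex digest over the settings, sorted by name."""
    # Assumption: each setting is rendered as "name=value", one per line.
    lines = [f"{name}={value}" for name, value in sorted(settings.items())]
    return hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest()
```

Sorting before hashing is what makes the checksum independent of the order in which the settings are retrieved, so only a changed name or value alters the result.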
@@ -1074,11 +1082,11 @@ checksum must be provided as the --mrtg
argument. The fourth line a
current checksum.
check_postgres_timesync
)
+check_postgres_timesync
)
Compares the local system time with the time reported by one or more databases.
-The --warning and --critical options represent the number of seconds between
+The --warning and --critical options represent the number of seconds between
the two systems before an alert is given. If neither is specified, the default values are used, which are '2' and '5'. The warning value cannot be greater than the critical value. Due to the non-exact nature of this test, values of '0' or '1' are not recommended.
@@ -1102,7 +1110,7 @@ time and the database time. The fourth line returns the name of the database.
Checks how close to transaction wraparound one or more databases are getting.
-The --warning and --critical options indicate the number of transactions
+The --warning and --critical options indicate the number of transactions
left, and must be a positive integer. If either option is not given, the default values of 1.3 and 1.4 billion are used. There is no need to run this command more than once per database cluster. For a more detailed discussion of what this
@@ -1138,7 +1146,7 @@ while line 4 indicates which database it is.
off of your data_directory, sometimes as a symlink to another physical disk for performance reasons. This action must be run as a superuser, in order to access the contents of the pg_xlog directory. The minimum version to use this action is
-Postgres 8.1. The --warning and --critical options are simply the number of
+Postgres 8.1. The --warning and --critical options are simply the number of
files in the pg_xlog directory. What number to set this to will vary, but a general guideline is to put a number slightly higher than what is normally there, to catch problems early.
@@ -1164,7 +1172,7 @@ out of space, at which point Postgres will shut down.
Checks that the required version of Postgres is running. The
---warning and --critical options (only one is required) must be of
+--warning and --critical options (only one is required) must be of
the format X.Y or X.Y.Z where X is the major version number, Y is the minor version number, and Z is the revision.
-The options --include and --exclude can be combined to limit which
+The options --include and --exclude can be combined to limit which
things are checked, depending on the action. The name of the database can be filtered when using the following actions: backends, database_size, locks, query_time, txn_idle, and txn_time.
@@ -1202,7 +1210,7 @@ The name of a file system can be filtered when using the disk_space action.
If only an include option is given, then ONLY those entries that match will be checked. However, if given both exclude and include, the exclusion is done first, and the inclusion after, to reinstate things that may have been excluded. Both
---include and --exclude can be given multiple times,
+--include and --exclude can be given multiple times,
and/or as comma-separated lists. A leading tilde will match the following word as a regular expression.
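The include/exclude rules in this paragraph can be sketched as a small predicate. This is one reading of the prose, not the script's implementation: the names `term_matches` and `is_checked` are hypothetical, terms are assumed to be already split out of any comma-separated lists, non-tilde terms are assumed to require an exact match, and schema matching with a trailing period is not covered.

```python
import re

def term_matches(term, name):
    """A leading tilde marks a regular expression; otherwise compare exactly."""
    if term.startswith("~"):
        return re.search(term[1:], name) is not None
    return term == name

def is_checked(name, include=(), exclude=()):
    """Decide whether an item is checked under the include/exclude rules."""
    included = any(term_matches(t, name) for t in include)
    excluded = any(term_matches(t, name) for t in exclude)
    if include and not exclude:
        # Only includes given: ONLY matching entries are checked.
        return included
    # Exclusion first, then inclusion reinstates anything it matches.
    return included or not excluded
```

For example, with `exclude=["~^test"]` and `include=["test_db"]`, the item `test_db` is first excluded by the regular expression and then reinstated by the include.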
To match a schema, end the search term with a single period. Leading tildes can
@@ -1237,9 +1245,9 @@ or which are named 'sql_settings' or 'green'. Specifically check items with the
-The options --includeuser and --excludeuser can be used on some actions
+The options --includeuser and --excludeuser can be used on some actions
to only examine database objects owned by (or not owned by) one or more users.
-An --includeuser option always trumps an --excludeuser option. You can
+An --includeuser option always trumps an --excludeuser option. You can
give each option more than once for multiple users, or you can give a comma-separated list. The actions that currently use these options are:
To help in setting things up, this program can be run in a "test mode" by
-specifying the --test option. This will perform some basic tests to
+specifying the --test option. This will perform some basic tests to
make sure that the databases can be contacted, and that certain per-action prerequisites are met, such as whether the user is a superuser, if the version of Postgres is new enough, and if stats_row_level is enabled.
@@ -1310,7 +1318,7 @@ instead of symlinks, so that the plugin only gets compiled one time.
The settings_checksum action requires the Digest::MD5 module.
Some actions require access to external programs. If psql is not explicitly
-specified, the command which is used to find it. The program /bin/df
+specified, the command which is used to find it. The program /bin/df
is needed by the check_disk_space action.
@@ -1335,7 +1343,7 @@ feature requests, and commit notices, send email to HISTORY
Items not specifically attributed are by Greg Sabino Mullane.
Add support for MRTG and "simple" output options.