From: Greg Sabino Mullane Date: Sat, 5 Apr 2008 21:13:15 +0000 (-0400) Subject: Update documentation file. X-Git-Tag: 2.9.0~365 X-Git-Url: http://waps.l3s.uni-hannover.de/gitweb/?a=commitdiff_plain;h=94d5524f83c2d390f2ece40f984bd902d9cecb84;p=check_postgres.git Update documentation file. --- diff --git a/check_postgres.pl.html b/check_postgres.pl.html new file mode 100644 index 000000000..6b7cc9e1c --- /dev/null +++ b/check_postgres.pl.html @@ -0,0 +1,879 @@ + + + + +check_postgres.pl - Postgres monitoring script for Nagios + + + + + + + +
+

+ + + +
+
+ + +

+

+
+

NAME

+

check_postgres.pl - Postgres monitoring script for Nagios

+

+

+
+

VERSION

+

This document describes check_postgres.pl version 1.4.2

+

+

+
+

SYNOPSIS

+
+  ## Create all symlinks
+  check_postgres.pl --action=build_symlinks
+
+  ## Check connection to Postgres database 'pluto':
+  check_postgres.pl --action=connection --db=pluto
+
+  ## Same thing, but using the symlink
+  check_postgres_connection --db=pluto
+
+  ## Warn if > 100 locks, critical if > 200, or > 20 exclusive
+  check_postgres_locks --warning=100 --critical="total=200;exclusive=20"
+
+  ## There are many other actions and options, please keep reading.
+

+

+
+

WEBSITE

+

The latest news and documentation can always be found at:

+

http://bucardo.org/nagios_postgres/

+

+

+
+

DESCRIPTION

+

check_postgres.pl is a Perl script that runs many different tests against +one or more Postgres databases. It uses the psql program to gather the +information, and returns one of four exit codes used by Nagios, as well +as a short description of the results. The exit codes are:

+
    +
  0 (OK) + +
  1 (WARNING) + +
  2 (CRITICAL) + +
  3 (UNKNOWN) + +
+
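As a rough illustration of the four Nagios exit codes listed above, the following Python sketch pairs each status name with its code. The names `NagiosStatus` and `finish`, and the "STATUS: message" layout, are assumptions made for this example and are not taken from the script itself.

```python
from enum import IntEnum


class NagiosStatus(IntEnum):
    """The four exit codes that Nagios understands."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def finish(status, message):
    """Return the exit code and a one-line summary (hypothetical layout)."""
    return int(status), f"{status.name}: {message}"
```

A real plugin would print the summary and pass the code to `sys.exit()`.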

+

+
+

DATABASE CONNECTION OPTIONS

+

Almost all actions accept a common set of options, most dealing with connecting to the databases.

+
+
-H NAME or --host=NAME + +
+

Connect to the host indicated by NAME. Can be a comma-separated list of names. Multiple host arguments +are allowed. If no host is given, defaults to a local Unix socket.

+
+ +
-p PORT or --port=PORT + +
+

Connects using the specified PORT number. Can be a comma-separated list of port numbers, and multiple +port arguments are allowed. If no port number is given, we default to port 5432.

+
+ +
-db NAME or --dbname=NAME + +
+

Specifies which database to connect to. Can be a comma-separated list of names, and multiple dbname +arguments are allowed. If no dbname option is provided, defaults to 'postgres' if the psql +version is version 8 or greater, and 'template1' otherwise.

+
+ +
-u USERNAME or --dbuser=USERNAME + +
+

The name of the database user to connect as. Can be a comma-separated list of usernames, and multiple +dbuser arguments are allowed. If this is not provided, defaults to 'postgres'.

+
+ +
--dbpass=PASSWORD + +
+

Provides the password to connect to the database with. Use of this option is highly discouraged. +Instead, one should use a .pgpass file.

+
+ +
+

Connection options can be grouped: --host=a,b --host=c --port=1234 --port=3344 +would connect to a-1234, b-1234, and c-3344. Note that once set, an option +carries over until it is changed again.

+

Examples:

+
+  --host=a,b --port=5433 --db=c
+  Connects twice to port 5433, using database c, to hosts a and b
+  a-5433-c b-5433-c
+
+  --host=a,b --port=5433 --db=c,d
+  Connects four times: a-5433-c a-5433-d b-5433-c b-5433-d
+
+  --host=a,b --host=foo --port=1234 --port=5433 --db=e,f
+  Connects six times: a-1234-e a-1234-f b-1234-e b-1234-f foo-5433-e foo-5433-f
+
+  --host=a,b --host=x --port=5432,5433 --dbuser=alice --dbuser=bob -db=baz
+  Connects three times: a-5432-alice-baz b-5433-alice-baz x-5433-bob-baz
+
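The grouping rules can be sketched in Python as follows. This is only a model of the first three examples above, not the script's own code: each repeated option starts a new group, a value carries over when an option is not repeated, and within a group every combination of the comma-separated values is used. The function name `expand_connections` is hypothetical, and the host/port pairing shown in the last example is not modelled here.

```python
from itertools import product


def expand_connections(hosts, ports, dbs):
    """Expand grouped option values into 'host-port-db' strings.

    Each argument is a list with one entry per occurrence of the option,
    e.g. --host=a,b --host=foo becomes ["a,b", "foo"].
    """
    n_groups = max(len(hosts), len(ports), len(dbs))

    def pick(values, index):
        # Carry the last given value over to later groups.
        return values[min(index, len(values) - 1)].split(",")

    targets = []
    for i in range(n_groups):
        for host, port, db in product(pick(hosts, i), pick(ports, i), pick(dbs, i)):
            targets.append(f"{host}-{port}-{db}")
    return targets
```

For instance, `expand_connections(["a,b", "foo"], ["1234", "5433"], ["e,f"])` yields the six targets of the third example.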

+

+
+

OTHER OPTIONS

+

Other common options include:

+
+
PSQL=PATH + +
+

Tells the script where to find the psql program. Useful if you have more than one version of the psql executable +around, or if it is not in your path. Note that this option is in all uppercase. By default, this option is +not allowed. To enable it, you must change the $NO_PSQL_OPTION near the top of the script to 0. Avoid using +this option if you can, and instead hard-code your psql location into the $PSQL variable, also near the top +of the script.

+
+ +
-t VAL or --timeout=VAL + +
+

Sets the timeout in seconds after which the script will abort whatever it is doing and return an UNKNOWN +status. The timeout is per Postgres cluster, not for the entire script. The default value is 10; the units +are always in seconds.

+
+ +
-h or --help + +
+

Displays a help screen with a summary of all actions and options.

+
+ +
-V or --version + +
+

Shows the current version.

+
+ +
-v or --verbose + +
+

Sets the verbosity level. Can be given more than once to raise the level. Setting it to three or higher (in other words, issuing -v -v -v) turns on debugging information for this program, which is sent to stderr.

+
+ +
--test + +
+

Enables test mode. See the TEST MODE section below.

+
+ +
--showperf=VAL + +
+

Determines if we output performance data in standard Nagios format (at end of string, after a pipe symbol, using +name=value). VAL should be 0 or 1. The default is 1.

+
+ +
--perflimit=i + +
+

Sets a limit as to how many items of interest are reported back when using the showperf option. This only has an effect for actions that return a large number of items, such as table_size. The default is 0, or no limit. Be careful when using this with --include or --exclude, as those restrictions are done after the query has been run, and thus your limit may not include the items you want.

+
+ +
--showtime=VAL + +
+

Determines if the time taken to run each query is shown in the output. VAL should be 0 or 1. The default is 1. +No effect unless showperf is on.

+
+ +
--action=NAME + +
+

States what action we are running as. Required unless using a symlinked file, in which case the name of the file +is used to figure out the action.

+
+ +
+
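The --showperf description says that performance data follows a pipe symbol as name=value pairs. A consumer of that output might split it as in the Python sketch below. The function name `split_perfdata` and the sample line in the usage note are made up for illustration; the exact text the script prints is not shown in this document.

```python
def split_perfdata(line):
    """Split plugin output into its message and its name=value perf items."""
    text, _, perf = line.partition("|")
    metrics = {}
    for token in perf.split():
        name, sep, value = token.partition("=")
        if sep:  # ignore anything that is not name=value
            metrics[name] = value
    return text.strip(), metrics
```

Calling it on a hypothetical line such as `"OK: fine | time=0.01 pluto=3"` gives the message `"OK: fine"` and a dictionary with the keys `time` and `pluto`.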

+

+
+

ACTIONS

+

The script runs one or more actions. This can either be done with the --action +flag, or by using a symlink to the main file that contains the name of the action +inside of it. For example, to run the action "timesync", you may either issue:

+
+  check_postgres.pl --action=timesync
+

or use a program named:

+
+  check_postgres_timesync
+

All the symlinks are created for you if you use the action "build_symlinks":

+
+  perl check_postgres.pl --action="build_symlinks"
+

If the file name already exists, it will not be overwritten. If the file exists and is a symlink, you can force it to be overwritten by using "build_symlinks_force".
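How an action name might be recovered from a symlink name can be sketched in Python as below. This assumes only the naming pattern shown above (a "check_postgres_" prefix followed by the action); the function name `action_from_program_name` is hypothetical and the script's real logic may differ.

```python
import os

PREFIX = "check_postgres_"


def action_from_program_name(path):
    """Return the action embedded in a symlink name, or None if there is none."""
    name = os.path.basename(path)
    if name.startswith(PREFIX) and len(name) > len(PREFIX):
        return name[len(PREFIX):]
    return None
```

When this returns None, as it does for the main file name check_postgres.pl, the --action option would be required instead.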

+

Most actions take a --warning and a --critical option, indicating at what point we change from OK to WARNING and then to CRITICAL. Note that because criticals are always checked first, setting the warning equal to the critical is an effective way to turn warnings off and always give a critical.
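The effect of checking criticals first can be seen in a small Python sketch. The function name `classify` and the use of "greater than or equal" as the comparison are assumptions for this example; individual actions may compare differently.

```python
def classify(value, warning, critical):
    """Return a status name, testing the critical threshold before the warning."""
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"
```

With warning and critical both set to 30, a value of 30 is reported as CRITICAL and the WARNING branch is never reached.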

+

The currently supported actions are:

+
+
backends (symlink: check_postgres_backends) + +
+

Checks the current number of connections for one or more databases, and optionally compares it to the maximum allowed, which is determined by the 'max_connections' setting. The warning and critical options can take one of three forms. First, a simple number can be given, which represents the number of connections at which the alert will be given. This choice does not use the max_connections setting. Second, the percentage of available connections can be given. Third, a negative number can be given which represents the number of connections left until max_connections is reached. The default values for warning and critical are '90%' and '95%'. This action also supports the use of the include and exclude options to filter out specific databases: see the INCLUSION AND EXCLUSION section below for more detail.

+
+
+

Example 1: Give a warning when the number of connections on host quirm reaches 120, and a critical if it reaches 150.
+  check_postgres_backends --host=quirm --warning=120 --critical=150

+
+
+

Example 2: Give a critical when we reach 75% of our max_connections setting on hosts lancre or lancre2. + check_postgres_backends --warning='75%' --critical='75%' --host=lancre,lancre2

+
+
+

Example 3: Give a warning when there are only 10 more connection slots left on host plasmid, and a critical +when we have only 5 left. + check_postgres_backends --warning=-10 --critical=-5 --host=plasmid

+
+
+

Example 4: Check all databases except those with "test" in their name, but allow ones that are named "pg_greatest". Connect to port 5432 on the first two hosts, and to port 5433 on the third one. We want to always throw a critical when we reach 30 or more connections.

+
+
+
+ check_postgres_backends --dbhost=hong,kong --dbhost=fooey --dbport=5432 --dbport=5433 --warning=30 --critical=30 --exclude="~test" --include="pg_greatest,~prod"
+
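The three threshold forms accepted by the backends action (plain number, percentage, negative number) can be reduced to one absolute connection count, as in this Python sketch. The function names are hypothetical, the percentage is taken as a share of max_connections as in Example 2, and the "greater than or equal" comparison is an assumption.

```python
def backend_limit(threshold, max_connections):
    """Translate a --warning/--critical value into an absolute connection count."""
    text = str(threshold).strip()
    if text.endswith("%"):
        # Percentage of max_connections, e.g. '75%'
        return max_connections * float(text[:-1]) / 100
    number = int(text)
    if number < 0:
        # Negative: this many slots left before max_connections is reached
        return max_connections + number
    # Plain number: used as is, max_connections is ignored
    return number


def backends_alert(current, threshold, max_connections):
    """True when the current connection count has reached the limit."""
    return current >= backend_limit(threshold, max_connections)
```

With max_connections at 100, a threshold of -5 therefore alerts once 95 connections are in use.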
+ +
bloat (symlink: check_postgres_bloat) + +
+

Checks the amount of bloat in tables and indexes. This action requires that stats collection be enabled on the target databases, and that ANALYZE is run frequently as well. The --include and --exclude options can be used to filter which tables to look at: see the INCLUSION AND EXCLUSION section below for more details. The --warning and --critical options must be specified as sizes. Valid units are bytes, kilobytes, megabytes, gigabytes, terabytes, and exabytes. You can abbreviate all of those with the first letter. Items without units are assumed to be 'bytes'. The default values are '1 GB' and '5 GB'. The number represents the number of "wasted bytes", or the difference between what is actually used by the table and index, and what we compute it should be.

+
+
+

Note that this action has two hard-coded values to avoid false alarms on smaller relations. Tables must have at +least 10 pages, and indexes at least 15, before they can be considered by this test. If you really want to adjust +these values, you can look for the variables $MINPAGES and $MINIPAGES at the top of the check_bloat subroutine.

+
+
+

Please note that the values computed by this action are not precise, and should be used as a guideline only. Great +effort was made to estimate the correct size of a table, but in the end it is only an estimate. The correct index size is +much more of a guess than the correct table size, but both should give a rough idea of how bloated they are.

+
+
+

Example 1: Warn if any table on port 5432 is over 100 MB bloated, and critical if over 200 MB
+  check_postgres_bloat --port=5432 --warning='100 M' --critical='200 M'

+
+
+

Example 2: Give a critical if table 'orders' on host 'sami' has more than 10 megs of bloat + check_postgres_bloat --host=sami --include=orders --critical='10 MB'

+
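Several actions accept sizes such as '100 M', '10 MB', or a bare number of bytes. A minimal Python sketch of such a parser follows. The function name `parse_size` is hypothetical, and treating each unit as a power of 1024 is an assumption, since the text above does not say whether the script uses 1000 or 1024.

```python
import re

# Power of 1024 for each unit's first letter (assumed binary multiples).
UNIT_POWER = {"b": 0, "k": 1, "m": 2, "g": 3, "t": 4, "e": 6}


def parse_size(text):
    """Convert a size such as '100 M', '1.1 t' or '500' into bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*", text)
    if not match:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    letter = unit[:1].lower() or "b"  # no unit means bytes
    if letter not in UNIT_POWER:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(float(number) * 1024 ** UNIT_POWER[letter])
```

Only the first letter of the unit is inspected, which is why '10 MB', '10 M', and '10 megabytes' all parse to the same value.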
+ +
connection (symlink: check_postgres_connection) + +
+

Simply connects, issues a 'SELECT version()', and leaves. +Takes no --warning or --critical options.

+
+ +
database_size (symlink: check_postgres_database_size) + +
+

Checks the size of all databases and complains when they are too big. It makes no sense to run this more than once per cluster. Databases can be filtered with the --include and --exclude options: see the INCLUSION AND EXCLUSION section below for more detail. The warning and critical can be specified as bytes, kilobytes, megabytes, gigabytes, terabytes, or exabytes. Each may be abbreviated to the first letter as well. If no unit is given, the unit is assumed to be bytes. There are no defaults for this action: the warning and critical must be specified. The warning cannot be greater than the critical. The output returns all databases sorted by size, largest first, with both bytes and a "pretty" form returned.

+
+
+

Example 1: Warn if any database on host flagg is over 1 TB in size, and critical if over 1.1 TB. + check_postgres_database_size --host=flagg --warning='1 TB' --critical='1.1 t'

+
+
+

Example 2: Give a critical if the database template1 on port 5432 is over 10 MB. + check_postgres_database_size --port=5432 --include=template1 --warning='10MB' --critical='10MB'

+
+ +
disk_space (symlink: check_postgres_disk_space) + +
+

Checks on the available physical disk space used by Postgres. This action requires that you have the executable "/bin/df" available to report on disk sizes, and it requires that it be run as a superuser, so it can examine the 'data_directory' setting inside of Postgres. The --warning and --critical options are given in either sizes or percentages. If using sizes, the standard unit types are allowed: bytes, kilobytes, megabytes, gigabytes, terabytes, or exabytes. Each may be abbreviated to the first letter only; no units at all indicates 'bytes'. The default values are '90%' and '95%'.

+
+
+

This command checks the following things to determine all of the different physical disks being used by Postgres.

+
+
+
data_directory + +
+

The disk that the main data directory is on.

+
+ +
log directory + +
+

The disk that the log files are on.

+
+ +
WAL file directory + +
+

The disk that the write-ahead logs are on (e.g. symlinked pg_xlog)

+
+ +
tablespaces + +
+

Each tablespace that is on a separate disk

+
+ +
+

The output shows the total size used and available on each disk, as well as the percentage, ordered by highest to lowest +percentage used. Each item above maps to a file system: these can be included or excluded: see the INCLUDE section below +for more information on the --include and --exclude options.

+

Example 1: Make sure that no file system is over 90% for the database on port 5432.
+  check_postgres_disk_space --port=5432 --warning='90%' --critical='90%'

+

Example 2: Check that all file systems starting with /dev/sda are smaller than 10 GB and 11 GB (warning and critical) + check_postgres_disk_space --port=5432 --warning='10 GB' --critical='11 GB' --include=~^/dev/sda

+
index_size (symlink: check_postgres_index_size) + +
table_size (symlink: check_postgres_table_size) + +
relation_size (symlink: check_postgres_relation_size) + +
+

The actions table_size and index_size are simply variations of the relation_size action, which checks for a relation that has grown too big. Relations (in other words, tables and indexes) can be filtered with the --include and --exclude options: see the INCLUSION AND EXCLUSION section below for more detail. The warning and critical are given in file sizes, and can have units of bytes, kilobytes, megabytes, gigabytes, terabytes, or exabytes. Each can be abbreviated to the first letter only. If no units are given, bytes is assumed. There are no default values: both warning and critical must be given. The return text shows the size of the largest relation found.

+
+
+

If the showperf option is enabled, all of the relations with their sizes will be given. To prevent this, it is recommended that you set the perflimit, which will cause the query to do an ORDER BY size DESC LIMIT (perflimit).

+
+
+

Example 1: Give a critical if any table is larger than 600MB on host burrick. + check_postgres_table_size --critical='600 MB' --warning='600 MB' --host=burrick

+
+
+

Example 2: Warn if the table products is over 4 GB in size, and give a critical at 4.5 GB. + check_postgres_table_size --host=burrick --warning='4 GB' --critical='4.5 GB' --include=products

+
+ +
last_analyze (symlink: check_postgres_last_analyze) + +
last_vacuum (symlink: check_postgres_last_vacuum) + +
+

Checks how long it has been since vacuum (or analyze) was last run on each table in one or more databases. This requires that stats_row_level is enabled, and the target database must be version 8.2 or higher. Tables can be excluded and included: see the INCLUSION AND EXCLUSION section below for details. The units for --warning and --critical are times. Valid units are seconds, minutes, hours, and days; all can be abbreviated to the first letter. If no units are given, 'seconds' is assumed. The default values are '1 day' and '2 days'. Please note that there are cases in which this field does not get automatically populated. If certain tables are giving you problems, make sure that they have dead rows to vacuum, or just exclude them from the test.

+
+
+

Example 1: Warn if any table has not been vacuumed in 3 days, and give a critical at a week, for host wormwood
+  check_postgres_last_vacuum --host=wormwood --warning='3d' --critical='7d'

+
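Time thresholds such as '3d', '2 days', or a bare number of seconds could be normalised as in the Python sketch below. The function name `parse_time` is hypothetical; the unit table follows the description above (seconds, minutes, hours, and days, identified by their first letter, with seconds as the default).

```python
import re

SECONDS_PER = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_time(text):
    """Convert a time such as '3d', '2 days' or '90' into seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]*)\s*", text)
    if not match:
        raise ValueError(f"invalid time: {text!r}")
    number, unit = match.groups()
    letter = unit[:1].lower() or "s"  # no unit means seconds
    if letter not in SECONDS_PER:
        raise ValueError(f"unknown unit in time: {text!r}")
    return int(number) * SECONDS_PER[letter]
```

The same parsing would cover actions such as query_time and txn_idle, which describe the same set of units.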
+ +
listener (symlink: check_postgres_listener) + +
+

Confirm that someone is listening for one or more specific strings. Only one of warning or critical is needed. The format +is a simple string representing the LISTEN target, or a tilde character followed by a string for a regular expression +check.

+
+
+

Example 1: Give a warning if nobody is listening for the string bucardo_mcp_ping on ports 5555 and 5556 + check_postgres_listener --port=5555,5556 --warning=bucardo_mcp_ping

+
+
+

Example 2: Give a critical if there are no active LISTEN requests matching 'grimm' on database oskar + check_postgres_listener --db oskar --critical=~grimm

+
+ +
locks (symlink: check_postgres_locks) + +
+

Check the total number of locks on one or more databases. Makes no sense to run this more than once per cluster. +Databases can be filtered with the --include and --exclude options: See the INCLUDE section below for more detail. +The warning and critical can be specified as simple numbers, which represent the total number of locks, or they can +be broken down by type of lock. Valid lock names are "total", "waiting", or a type of lock used by Postgres. +These names are case-insensitive and do not need the "lock" part on the end, so 'exclusive' will match +'ExclusiveLock'. The format is name=number, with different items separated by semicolons.

+
+
+

Example 1: Warn if the number of locks is 100 or more, and critical if 200 or more, on host garrett + check_postgres_locks --host=garrett --warning=100 --critical=200

+
+
+

Example 2: On the host artemus, warn if 200 or more locks exist, and give a critical if over 250 total locks exist, +or if over 20 exclusive locks exist, or if over 5 connections are waiting for a lock. + check_postgres_locks --host=artemus --warning=200 --critical="total=250;waiting=5;exclusive=20"

+
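The name=number format used by the locks action can be parsed as in this Python sketch. The function name `parse_lock_thresholds` is hypothetical; the handling of case and of the optional "lock" suffix follows the description above.

```python
def parse_lock_thresholds(text):
    """Parse '100' or 'total=250;waiting=5;exclusive=20' into a dict of limits."""
    text = text.strip()
    if text.isdigit():
        # A simple number is the limit on the total number of locks.
        return {"total": int(text)}
    limits = {}
    for item in text.split(";"):
        name, _, number = item.partition("=")
        name = name.strip().lower()
        if name.endswith("lock"):
            name = name[:-4]  # 'ExclusiveLock' and 'exclusive' are the same
        limits[name] = int(number)
    return limits
```

A malformed item such as "total" without "=number" raises a ValueError in this sketch.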
+ +
logfile (symlink: check_postgres_logfile) + +
+

Ensures that the logfile is in the expected location and is being logged to. This action issues a command that throws an error on each database it is checking, and ensures that the message shows up in the logs. It scans the various log_* settings inside of Postgres to figure out where the logs should be. If you are using syslog, it does a rough but not foolproof scan of /etc/syslog.conf. Alternatively, you can provide the name of the logfile with the --logfile option. This is especially useful if the logs have a custom rotation scheme driven by an external program. The --logfile option supports the following escape characters: %Y %m %d %H, which represent the current year, month, date, and hour respectively. An error is always reported as critical unless the warning option has been passed in as a non-zero value. Other than that specific usage, the --warning and --critical options should not be used.

+
+
+

Example 1: On port 5432, ensure the logfile is being written to the file /home/greg/pg8.2.log + check_postgres_logfile --port=5432 --logfile=/home/greg/pg8.2.log

+
+
+

Example 2: Same as above, but raise a warning, not a critical + check_postgres_logfile --port=5432 --logfile=/home/greg/pg8.2.log -w 1

+
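The four --logfile escapes correspond to the usual strftime fields, so their expansion can be sketched in Python as below. The function name `expand_logfile` and the path in the usage note are hypothetical, and only the four escapes named above are replaced.

```python
from datetime import datetime


def expand_logfile(pattern, now=None):
    """Replace %Y, %m, %d and %H in a logfile name with the given time's values."""
    now = now or datetime.now()
    for escape in ("%Y", "%m", "%d", "%H"):
        pattern = pattern.replace(escape, now.strftime(escape))
    return pattern
```

For a timestamp of 5 April 2008, 21:13, the pattern `/var/log/pg/postgres-%Y-%m-%d_%H.log` becomes `/var/log/pg/postgres-2008-04-05_21.log`.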
+ +
query_runtime (symlink: check_postgres_query_runtime) + +
+

Checks how long a specific query takes to run, by executing an "EXPLAIN ANALYZE" against it. The --warning and --critical options are the maximum amount of time the query should take. Valid units are seconds, minutes, and hours; any can be abbreviated to the first letter. If no units are given, 'seconds' is assumed. Both warning and critical must be given. The name of the view or function to be run must be passed in to the --queryname option. It must consist of a single word (or schema.word format), with optional parens at the end.

+
+
+

Example 1: Give a critical if the function named "speedtest" fails to run in 10 seconds or less. + check_postgres_query_runtime --queryname='speedtest()' --critical=10 --warning=10

+
+ +
query_time (symlink: check_postgres_query_time) + +
+

Checks the length of running queries on one or more databases. It makes no sense to run this more than once +on the same cluster (all databases are returned no matter where you connect from). Databases can be included or +excluded with the --include and --exclude option: see the INCLUDE section below for more details. The warning and +critical options are an amount of time, and default to '2 minutes' and '5 minutes'. Valid units are 'seconds', 'minutes', +'hours', or 'days'. Each may be written singular or abbreviated to just the first letter. If no units are given, +the unit is assumed to be seconds.

+
+
+

Example 1: Give a warning if any query has been running longer than 3 minutes, and a critical if longer than 5 minutes. + check_postgres_query_time --port=5432 --warning='3 minutes' --critical='5 minutes'

+
+
+

Example 2: Using default values (2 and 5 minutes), check all databases except those starting with 'template'. + check_postgres_query_time --port=5432 --exclude=~^template

+
+ +
txn_time (symlink: check_postgres_txn_time) + +
+

Checks the length of open transactions on one or more databases. It makes no sense to run this more than once +on the same cluster (all databases are returned no matter where you connect from). Databases can be included or +excluded with the --include and --exclude option: see the INCLUDE section below for more details. The warning and +critical options are an amount of time, and must be provided (no default). Valid units are 'seconds', 'minutes', +'hours', or 'days'. Each may be written singular or abbreviated to just the first letter. If no units are given, +the unit is assumed to be seconds. Requires Postgres 8.3 or better.

+
+
+

Example 1: Give a critical if any transaction has been open for more than 10 minutes: + check_postgres_txn_time --port=5432 --critical='10 minutes'

+
+ +
txn_idle (symlink: check_postgres_txn_idle) + +
+

Checks the length of "idle in transaction" queries on one or more databases. It makes no sense to run this more than once +on the same cluster (all databases are returned no matter where you connect from). Databases can be included or +excluded with the --include and --exclude option: see the INCLUDE section below for more details. The warning and +critical options are an amount of time, and must be provided (no default). Valid units are 'seconds', 'minutes', +'hours', or 'days'. Each may be written singular or abbreviated to just the first letter. If no units are given, +the unit is assumed to be seconds. Requires Postgres 8.3 or better.

+
+
+

Example 1: Give a warning if any connection has been idle in transaction for more than 15 seconds: + check_postgres_txn_idle --port=5432 --warning='15 seconds'

+
+ +
rebuild_symlinks + +
rebuild_symlinks_force + +
+

This action requires no other arguments, and does not connect to any databases, but simply creates symlinks for each action, in the form "check_postgres_<action_name>". If the file already exists, it will not be overwritten. If the action is rebuild_symlinks_force, then symlinks will be overwritten.

+
+ +
settings_checksum (symlink: check_postgres_settings_checksum) + +
+

Checks that all the Postgres settings are the same as the last time you checked. This is done by generating a checksum of a sorted list of setting names and their values. Note that different users in the same database may have different checksums, due to ALTER USER usage, and due to the fact that superusers see more settings than ordinary users. Either the --warning or the --critical option should be given, but not both. The value of each one is the checksum, a 32-character hexadecimal value. You can run with the special --critical=0 option to find out an existing checksum.

+
+
+

This action requires the Digest::MD5 module.

+
+
+

Example 1: Find the initial checksum for the database on port 5555 using the default user (usually postgres) + check_postgres_settings_checksum --port=5555 --critical=0

+
+
+

Example 2: Make sure no settings have changed and warn if so, using the checksum from above. + check_postgres_settings_checksum --port=5555 --warning=cd2f3b5e129dc2b4f5c0f6d8d2e64231

+
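The idea of a checksum over a sorted list of setting names and values can be sketched in Python with MD5, which produces the 32-character hexadecimal form described above. The function name `settings_checksum` and the "name=value" line layout are assumptions, so the digests will not match those produced by the script.

```python
import hashlib


def settings_checksum(settings):
    """Return an MD5 hex digest over the sorted setting names and values."""
    lines = [f"{name}={value}" for name, value in sorted(settings.items())]
    return hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest()
```

Sorting first means the digest depends only on the settings themselves, not on the order in which they were read.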
+ +
timesync (symlink: check_postgres_timesync) + +
+

Compares the local system time with the time reported by one or more databases. The warning and critical options represent +the number of seconds at which the warning or critical should be given. If neither is specified, the default values +are used, which are '2' and '5'. The warning cannot be greater than the critical. Due to the non-exact nature of this +test, a value of '0' or '1' is not recommended.

+
+
+

The string returned shows the time difference as well as the time on each side written out.

+
+
+

Example 1: Check that databases on hosts ankh, morpork, and klatch are no more than 3 seconds off from the local time:
+  check_postgres_timesync --host=ankh,morpork,klatch --critical=3

+
+ +
txn_wraparound (symlink: check_postgres_txn_wraparound) + +
+

Checks how close to transaction wraparound one or more databases are getting. The warning and critical indicate +the number of transactions left and must be a positive integer. If either is not given, the default values of +1.3 and 1.4 billion are used. It makes no sense to run this check more than once on a single cluster. For a more +detailed discussion of what this number represents and what to do about it, please visit the page +http://www.postgresql.org/docs/current/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

+
+
+

The warning and critical values can have underscores in the number for legibility, as Perl does.

+
+
+

Example 1: Check the default values for the localhost database + check_postgres_txn_wraparound --host=localhost

+
+
+

Example 2: Check port 6000 and give a critical at 1.7 billion transactions left:
+  check_postgres_txn_wraparound --port=6000 --critical=1_700_000_000

+
+ +
wal_files (symlink: check_postgres_wal_files) + +
+

Checks how many WAL files exist in the pg_xlog directory, which is found off of your data directory, sometimes as a symlink to another disk for performance reasons. This must be run as a superuser, in order to access the contents of the pg_xlog directory. The minimum version to use this action is 8.1. The warning and critical are simply the number of files in the pg_xlog directory. What number to set this to will vary, but a general guideline is to put a number slightly higher than what is normally there, to catch problems early.

+
+
+

Normally, WAL files are closed and then re-used, but a long-running open transaction, or a faulty +log shipping method, may cause Postgres to create too many files. Ultimately, this will cause the +disk they are on to run out of space, at which point Postgres will shut down.

+
+
+

Example 1: Check that the number of WAL files is 20 or less on host "pluto"
+  check_postgres_wal_files --host=pluto --critical=20

+
+ +
version (symlink: check_postgres_version) + +
+

Checks that the required version of Postgres is running. The --warning and --critical arguments (only one is required) +must be of the format X.Y or X.Y.Z where X is the major version number, Y is the minor version number, and Z is the +revision.

+
+
+

Example 1: Give a warning if the database on port 5678 is not version 8.4.10: + check_postgres_version --port=5678 -w=8.4.10

+
+
+

Example 2: Give a critical if any database on hosts valley, grain, or sunshine is not 8.3:
+  check_postgres_version -H valley,grain,sunshine --critical=8.3

+
+ +
+

+

+
+

INCLUSION AND EXCLUSION

+

The options --include and --exclude can be combined to limit which things are checked, depending on the action. The name of the database can be filtered when using the following actions: backends, database_size, last_vacuum, last_analyze, locks, and query_time. The name of a relation can be filtered when using the following actions: bloat, index_size, table_size, and relation_size. The name of a setting can be filtered when using the settings_checksum action. The name of a file system can be filtered when using the disk_space action.

+

If only an include option is given, then ONLY those entries that match will be checked. However, if given +both exclude and include, the exclusion is done first, and the inclusion second to reinstate things that +may have been excluded. Both --include and --exclude can be given multiple times, or as comma-separated lists. +A leading tilde will match the following word as a regular expression.

+

Examples:

+
+ --include=pg_class
+ Only checks items named pg_class
+
+ --include=~pg_
+ Only checks items containing the letters 'pg_'
+
+ --include=~^pg_
+ Only checks items beginning with 'pg_'
+
+ --exclude=test
+ Exclude the item named 'test'
+
+ --exclude=~test
+ Exclude all items containing the letters 'test'
+
+ --exclude=~ace --include=faceoff
+ Exclude all items containing the letters 'ace', but allow the item 'faceoff'
+
+ --exclude=~^pg_,~slon,sql_settings --exclude=green --include=~prod,pg_relname
+ Exclude all items which start with the letters 'pg_', which contain the letters 'slon', or which are named 
+ 'sql_settings' or 'green'. Specifically check items with the letters 'prod' in their names, and always 
+ check the item named 'pg_relname'.
+
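The filtering rules described in this section can be modelled in Python as follows. This is a sketch under the stated rules only (exclusion first, inclusion reinstates, include alone restricts to matches, a leading tilde marks a regular expression); the function names are hypothetical and the script's real matching code may differ in detail. Comma-separated option values are assumed to have been split into lists already.

```python
import re


def matches(name, patterns):
    """True if name equals a plain pattern or matches a '~regex' pattern."""
    for pattern in patterns:
        if pattern.startswith("~"):
            if re.search(pattern[1:], name):
                return True
        elif name == pattern:
            return True
    return False


def select(names, include=(), exclude=()):
    """Apply --exclude first, then let --include reinstate excluded names."""
    kept = []
    for name in names:
        included = matches(name, include)
        if exclude:
            if matches(name, exclude) and not included:
                continue
        elif include and not included:
            # Only --include was given: keep nothing but the matches.
            continue
        kept.append(name)
    return kept
```

For example, `select(["faceoff", "place", "green"], include=["faceoff"], exclude=["~ace"])` drops "place" but keeps "faceoff" and "green".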

+

+
+

TEST MODE

+

To help in setting things up, this program can be run in a "test mode" by specifying the --test option. This will +perform some basic tests to make sure that the databases can be contacted, and that certain per-action prerequisites +are met. Currently, we check that the user is a superuser if required by that action, and that the version of Postgres +is new enough for those actions that depend on a specific version.

+

+

+
+

DEPENDENCIES

+
+
Access to a working version of psql + +
Some very standard Perl modules: + +
+
Getopt::Long + +
File::Basename + +
File::Temp + +
Time::HiRes (if $opt{showtime} is set to true, which is the default) + +
+
+

The 'settings_checksum' action requires the Digest::MD5 module.

+

Some actions require access to external programs. If psql is not explicitly specified, the command 'which' is used to find it. The program "/bin/df" is needed by the 'disk_space' action.

+

+

+
+

DEVELOPMENT

+

Development happens using the git system. You can clone the latest version by doing: + git-clone http://bucardo.org/nagios_postgres.git

+

+

+
+

HISTORY

+

Items not specifically attributed are by Greg Sabino Mullane.

+
+
Version 1.4.2 + +
+

Fix bug preventing --dbpass argument from working (Robert Treat)

+
+ +
Version 1.4.1 + +
+

Minor documentation fixes.

+
+ +
Version 1.4.0 + +
+

Have check_wal_files use pg_ls_dir (idea by Robert Treat)

+
+
+

For last_vacuum and last_analyze, respect autovacuum effects, add separate +autovacuum checks (ideas by Robert Treat)

+
+ +
Version 1.3.1 + +
+

Have txn_idle use query_start, not xact_start

+
+ +
Version 1.3.0 + +
+

Add in txn_idle and txn_time actions.

+
+ +
Version 1.2.0 + +
+

Add the check_wal_files method, which counts the number of WAL files +in your pg_xlog directory.

+
+
+

Fix some typos in the docs.

+
+
+

Explicitly allow -v as an argument.

+
+
+

Allow for a null syslog_facility in check_logfile

+
+ +
Version 1.1.2 + +
+

Fix error preventing --action=rebuild_symlinks from working.

+
+ +
Version 1.1.1 + +
+

Switch vacuum and analyze date output to use 'DD', not 'D'. (Glyn Astill)

+
+ +
Version 1.1.0 + +
+

Fixes, enhancements, and performance tracking, December 2007

+
+
+

Add performance data tracking via --showperf and --perflimit

+
+
+

Lots of refactoring and cleanup of how actions handle arguments.

+
+
+

Do basic checks to figure out syslog file for 'logfile' action.

+
+
+

Allow for exact matching of beta versions with 'version' action.

+
+
+

Redo the default arguments to only populate when neither 'warning' nor 'critical' is provided.

+
+
+

Allow just warning OR critical to be given for the 'timesync' action.

+
+
+

Remove 'redirect_stderr' requirement from 'logfile' due to 8.3 changes.

+
+
+

Actions 'last_vacuum' and 'last_analyze' are 8.2 only (Robert Treat)

+
+ +
Version 1.0.16 + +
+

First public release, December 2007

+
+ +
+

+

+
+

BUGS AND LIMITATIONS

+

The index bloat size optimization is still very rough.

+

Some actions may not work on older versions of Postgres (before 8.0).

+

Please report any problems to greg@endpoint.com.

+

+

+
+

AUTHOR

+

Greg Sabino Mullane <greg@endpoint.com>

+

+

+
+

LICENSE AND COPYRIGHT

+

Copyright (c) 2007-2008 Greg Sabino Mullane <greg@endpoint.com>.

+

Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met:

+
+  1. Redistributions of source code must retain the above copyright notice, 
+     this list of conditions and the following disclaimer.
+  2. Redistributions in binary form must reproduce the above copyright notice, 
+     this list of conditions and the following disclaimer in the documentation 
+     and/or other materials provided with the distribution.
+

THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED +WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO +EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT +OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING +IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY +OF SUCH DAMAGE.

+ + + +