Documentation
Snapshots-Check
Checks the disk space occupied by volume snapshots.
Usage
$ check_netapp.pl Snapshots -H <hostname|IP> -u <user%pass> [-n <volname>] [ -b vol | reserve ] [-w ] [-c ] [ ... ]
Description
This plugin checks the disk space used by snapshots. Depending on --base, the total cumulated size of all snapshots is compared either to the size of the snap-reserve or to the size of the volume. The 'size of the volume' is defined as volume-size plus the snap-reserve. The user is warned if the snapshots occupy more than a certain part of the snap-reserve or of the total size of the volume.

If the metric is switched to absolute and dynamic thresholds are used, these variables are defined as follows:
MAX: volume-size plus the snap-reserve
VOL_SIZE: volume-size w/o any snap-reserve
(If the metric is relative, these variables are all 100, but you won't need them in this case.)

Skipping Volumes: If the option --instance|n is not specified, all volumes are checked. A volume is skipped if at least one of the following is true:
- The size is smaller than or equal to zero (volume restricted or offline)
- 'snapshot-blocks-reserved' is not defined
- The volume is excluded by means of --exclude|X
Use -v to see which volumes are skipped.

Due to the complexity of this plugin, a lot of requests are sent to the filer (depending on the number of snapshots and volumes). Therefore it may be advisable to have a timeout greater than 30 seconds. Consider using -v for debugging if you are not satisfied with the result.
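The effect of --base on the relative result can be sketched in a few lines of shell. This is only an illustration of the arithmetic described above, not the plugin's code: the function name snap_usage_pct, its argument order, and the sample sizes are assumptions made for this example.

```shell
# Hypothetical helper, not part of check_netapp: relative snapshot usage in percent.
# Arguments: total snapshot size, volume-size (w/o snap-reserve), snap-reserve, base (vol|reserve).
# All sizes must use the same unit; integer arithmetic rounds down.
snap_usage_pct() {
    snap_total=$1; vol_size=$2; reserve=$3; base=$4
    case "$base" in
        vol)     denom=$((vol_size + reserve)) ;;  # 'size of the volume' = volume-size plus snap-reserve
        reserve) denom=$reserve ;;
        *)       echo "unknown base: $base" >&2; return 3 ;;
    esac
    if [ "$denom" -le 0 ]; then
        # e.g. --base=reserve on a volume without snap-reserve
        echo "no size to compare against" >&2
        return 3
    fi
    echo $((snap_total * 100 / denom))
}

snap_usage_pct 30 80 20 vol      # 30 of (80 + 20) -> prints 30
snap_usage_pct 30 80 20 reserve  # 30 of 20        -> prints 150
```

The same snapshots therefore look harmless against the whole volume and already exceed the snap-reserve, which is why thresholds should be chosen together with the base.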
Simple Examples
Note: The '-u' switch is missing in all the following examples, so the filer-credentials are taken from the default-file: '/usr/local/nagios/etc/netapp_credentials'
$ check_netapp Snapshots -H filer
Check all volumes and alert if the default values for the warning or critical thresholds are exceeded. The default-values are documented in the section for the '--warning'-switch.
$ check_netapp Snapshots -H filer --base=vol
Same as above (--base is set to vol by default): the snapshot size is calculated relative to the volume size.
$ check_netapp Snapshots -H filer --base=reserve
Same as above, but the snapshot size is calculated relative to the snap-reserve (and not the volume size). Always results in a critical exit if at least one volume has no snap-reserve. (See the examples below and --check_only for a solution.)
$ check_netapp Snapshots -H filer -w 15 -c 30
Warn if snapshots use more than 15% of the volume's size, send a critical alert if they occupy more than 30%.
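How the two thresholds in the example above map to a check result can be sketched as follows. The exit codes 0, 1 and 2 are the usual Nagios plugin conventions for OK, WARNING and CRITICAL; the function classify_usage is hypothetical and covers only simple "more than" thresholds, not the full range syntax the plugin may accept.

```shell
# Hypothetical helper: map a usage value to a Nagios-style status and exit code.
# Arguments: measured value, warning threshold, critical threshold (integers).
classify_usage() {
    value=$1; warn=$2; crit=$3
    if [ "$value" -gt "$crit" ]; then
        echo CRITICAL; return 2
    elif [ "$value" -gt "$warn" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}

# Thresholds as in '-w 15 -c 30'; '|| true' keeps the non-zero status from aborting a script.
classify_usage 12 15 30 || true   # prints OK
classify_usage 20 15 30 || true   # prints WARNING
classify_usage 45 15 30 || true   # prints CRITICAL
```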
$ check_netapp Snapshots -H filer -w 50 -c 80 --base=reserve
Warn if snapshots use more than 50% of the volume's snap-reserve, send a critical alert if they occupy more than 80%.
$ check_netapp Snapshots -H filer -n vol0 -w 65 -c 80
Check only vol0 and warn if more than 65% of the volume's size is used for snapshots, critical if more than 80%.
$ check_netapp Snapshots -H filer --metric=absolute -w 50 -c 80 --factor=Gi
Warn if the snapshots of a volume occupy more than 50 gibibytes (GiB), critical if more than 80 GiB.
Advanced Examples
Note: Host- and credential-arguments are omitted. So you have to fill in -H
$ check_netapp Snapshots -H filer --metric=absolute -w =MAX*60/100 -c =MAX*95/100
Returns absolute values and warns if snapshots occupy more than 60% of the volume (volume-size plus snap-reserve), critical alert if more than 95%.
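To see what the dynamic thresholds =MAX*60/100 and =MAX*95/100 amount to, the expressions can be evaluated by hand for one volume. The sizes below are invented for illustration, and the plugin evaluates these expressions itself per volume; this sketch only mirrors the arithmetic using the definition of MAX from the description.

```shell
# Hypothetical sizes for one volume, all in the same unit; not taken from a real filer.
VOL_SIZE=400                  # volume-size w/o any snap-reserve
RESERVE=100                   # snap-reserve
MAX=$((VOL_SIZE + RESERVE))   # MAX: volume-size plus the snap-reserve

warn=$((MAX * 60 / 100))      # corresponds to -w =MAX*60/100
crit=$((MAX * 95 / 100))      # corresponds to -c =MAX*95/100
echo "warn above ${warn}, critical above ${crit}"   # prints: warn above 300, critical above 475
```

Because MAX differs per volume, one command line yields a different absolute threshold for every volume checked.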
$ check_netapp Snapshots -H filer --base=reserve --check_only=with_reserve
Check all volumes that have a snap-reserve. Calculate the relative usage of snapshots based on the size of the snap-reserve.
$ check_netapp Snapshots -H filer --base=vol --check_only=no_reserve
Check all volumes which do not have a snap-reserve. Calculate the relative usage of snapshots based on the size of the volume.
$ check_netapp Snapshots -H filer --metric=number --younger_than=24h --older_than=7h --critical=1:
Check all volumes for yesterday's snapshot-copy. Critical if not a single snapshot from yesterday is found. This example check is scheduled to run daily at 7am. In this case, 'yesterday' is defined as the last 24 hours excluding the most recent 7 hours, that is, from 7am of the previous day until midnight.
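The age window formed by --younger_than=24h and --older_than=7h can be sketched like this. The function in_window is hypothetical, ages are given in whole hours, and whether the plugin treats the window boundaries as inclusive is not stated here, so strict comparisons are an assumption.

```shell
# Hypothetical helper: succeeds if a snapshot of the given age lies inside the window.
# Arguments: snapshot age, younger_than limit, older_than limit (all in hours).
in_window() {
    age_h=$1; younger_than_h=$2; older_than_h=$3
    [ "$age_h" -lt "$younger_than_h" ] && [ "$age_h" -gt "$older_than_h" ]
}

# The check runs at 07:00. A snapshot taken at 20:00 the previous day is 11 hours old.
in_window 11 24 7 && echo "11h: counted"
# A snapshot taken at 02:00 today is only 5 hours old, so it does not count as yesterday's.
in_window 5 24 7 || echo "5h: ignored"
# A snapshot taken 30 hours ago is older than the window.
in_window 30 24 7 || echo "30h: ignored"
```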
$ check_netapp Snapshots -H filer --metric=number --older_than=14d --critical=1
Check all volumes for snapshots older than 2 weeks. Critical alert if one or more outdated snapshots are found.
$ check_netapp Snapshots -H filer --metric=number --older_than=1d --name_matches=snmv --critical=1
Check all volumes for left-over snmv-snapshots (older than 1 day and the snapshot-name contains 'snmv').
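The combination of an age filter and a name filter in the example above amounts to counting the snapshots that pass both. A minimal sketch, assuming a made-up input format of one "name age-in-days" pair per line; the function count_leftovers and the snapshot names are hypothetical, and the plugin obtains this data from the filer itself.

```shell
# Hypothetical helper: count snapshots whose name contains a pattern
# and whose age in days exceeds a minimum. Reads "<name> <age>" lines from stdin.
count_leftovers() {
    pattern=$1; min_age_d=$2; count=0
    while read -r name age_d; do
        case "$name" in
            *"$pattern"*)
                # name matches; now apply the age filter
                if [ "$age_d" -gt "$min_age_d" ]; then
                    count=$((count + 1))
                fi
                ;;
        esac
    done
    echo "$count"
}

# Only snmv_src_vol1 matches the name and is older than 1 day, so this prints 1.
count_leftovers snmv 1 <<'EOF'
nightly.0 0
snmv_src_vol1 3
snmv_src_vol2 0
hourly.5 2
EOF
```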
$ check_netapp Snapshots -H filer --metric=relative --older_than=7d -w 10 -c 50
Check all volumes for snapshots older than 1 week. Warns if these older snapshots occupy more than 10% of the volume's space, critical if more than 50%.
$ check_netapp Snapshots -H filer --metric=relative --older_than=7d -w 10 -c 50 --base=reserve
Check all volumes for snapshots older than 1 week. Warns if these older snapshots occupy more than 10% of the volume's snap-reserve, critical if more than 50%.