Lazy Delegate - Python Solar Node Monitoring with Discord Notification Facility

Solar Delegate Node monitoring script that runs periodic health checks and reports the results via Discord. The following are monitored:

Host status

  • last boot time and pending restart
  • cpu load
  • memory usage
  • swap usage
  • disk usage
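As an illustration of the host checks listed above, here is a minimal sketch using only the Python standard library on Linux. It is not the project's actual probe code: the function names, the use of /proc/meminfo, and the /var/run/reboot-required marker (a Debian/Ubuntu convention) are assumptions made for the example.

```python
import os
import shutil


def read_meminfo(path="/proc/meminfo"):
    """Parse a /proc/meminfo style file (Linux) into a dict of integer values."""
    info = {}
    with open(path) as fh:
        for line in fh:
            key, _, rest = line.partition(":")
            fields = rest.split()
            if fields:
                info[key.strip()] = int(fields[0])
    return info


def used_percent(total, free):
    """Percentage of `total` in use; 0.0 when total is 0 (e.g. no swap configured)."""
    return 0.0 if total <= 0 else round(100.0 * (total - free) / total, 1)


def host_status(disk_path="/"):
    """Collect a snapshot of the host metrics (hypothetical helper)."""
    mem = read_meminfo()
    disk = shutil.disk_usage(disk_path)
    return {
        "load_1m": os.getloadavg()[0],
        "memory_pct": used_percent(mem["MemTotal"], mem.get("MemAvailable", mem["MemFree"])),
        "swap_pct": used_percent(mem.get("SwapTotal", 0), mem.get("SwapFree", 0)),
        "disk_pct": used_percent(disk.total, disk.free),
        # ASSUMPTION: Debian/Ubuntu create this file when a reboot is pending
        "restart_pending": os.path.exists("/var/run/reboot-required"),
    }
```

Calling `host_status()` returns a plain dict, which keeps the threshold checks separate from the data collection.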

Node processes

Network status

  • Relay sync status and lag
  • Forger missed blocks
  • Forger rank
  • Delegate voters

The side color of the Discord embed indicates the alert status, and the probes causing an alert are displayed in bold, underlined code style.
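The following sketch shows one way such a message could be built for a Discord webhook. It assumes the standard webhook JSON body with an `embeds` list and an integer `color` field; the color values, the embed title, and the function names are hypothetical and not taken from the project.

```python
import json
import urllib.request

COLOR_OK = 0x2ECC71     # green (hypothetical choice)
COLOR_ALERT = 0xE74C3C  # red (hypothetical choice)


def highlight(text):
    """Discord markdown for underline + bold + inline code."""
    return f"__**`{text}`**__"


def build_payload(probes):
    """probes maps a probe name to a (value, in_alert) tuple."""
    alerting = any(in_alert for _, in_alert in probes.values())
    lines = [
        f"{name}: {highlight(value) if in_alert else value}"
        for name, (value, in_alert) in probes.items()
    ]
    return {
        "embeds": [{
            "title": "Node status",  # hypothetical title
            "description": "\n".join(lines),
            "color": COLOR_ALERT if alerting else COLOR_OK,
        }]
    }


def post_payload(hook_url, payload):
    """POST the payload to the webhook URL (the DISCORD_HOOK value)."""
    req = urllib.request.Request(
        hook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "User-Agent": "lazy-delegate-sketch"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

For example, `build_payload({"cpu": ("12%", False), "rank": ("60", True)})` yields a red embed in which only the rank value is highlighted.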


The project and most of the probes were inspired by Solar Delegate @mtaylan's Solar Node Monitoring scripts.


Requires

  • Python3
  • Python virtual environment
  • Process Manager 2 (pm2)
  • Webhook url associated with a Discord server & channel

Update instructions for 0.55b

pm2 stop lazy-delegate
cd ~/lazy-delegate/ && git pull
. .venv/bin/activate
pip3 install -r requirements.txt && deactivate

add the following in ~/lazy-delegate/src/config/config

DEBUG=0
DISCORD_USER='<@your_userid_not_bots>'

start and check logs

pm2 start lazy-delegate && pm2 logs lazy-delegate

Installation

Replace SUDO_USER with the name of a user that has sudo rights (i.e. a member of the sudo group) and run the command below

cd && bash <(curl -s https://raw.githubusercontent.com/osrn/lazy-delegate/main/install.sh) SUDO_USER

move on to the configuration

Creating a Discord channel webhook is well documented elsewhere and is not covered here.


Configuration

copy the sample config provided

cd ~/lazy-delegate && cp src/config/config.sample src/config/config


config options

PM2='path-to-pm2-executable'

Path to pm2 executable


CHK_FORGER=1

Enable(1)/disable(0) monitoring PM2 Forger process. Relay process is always checked.


CHK_TBW=1

Enable(1)/disable(0) monitoring PM2 TBW-tbw and TBW-pay processes


CHK_POOL=1

Enable(1)/disable(0) monitoring PM2 TBW-pool process


NODE_IP=xx.xx.xx.xx

IP address of the forger node to be monitored - as registered in PEER LIST


DELEGATE_NAME='xxxx'

Registered delegate name for the forger node


RANKLIMIT=52

Alert will be produced when rank > RANKLIMIT


LOCAL_API='http://127.0.0.1:6003/api'

API used for local node queries. Defaults to the local node, but it can be set to any relay node with a public API.


NET_API='https://sxp.mainnet.sh/api'

Best pointed at the public API for the network, although any relay node with a public API, or even localhost, will work. Remember to update this when switching to Mainnet.
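To illustrate why two API endpoints are configured, the sketch below compares the block height reported by LOCAL_API with the one reported by NET_API to estimate relay lag. The `/blockchain` endpoint and the response shape `{"data": {"block": {"height": N}}}` are assumptions based on ARK-style APIs and should be verified against the Solar Core version in use; the function names are hypothetical.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def height_from(body):
    # ASSUMPTION: response shaped like {"data": {"block": {"height": N}}}
    return int(body["data"]["block"]["height"])


def sync_lag(local_height, net_height):
    """Number of blocks the local relay is behind the network; never negative."""
    return max(0, net_height - local_height)


def relay_lag(local_api, net_api, endpoint="/blockchain"):
    """Compare the local node against the network API (endpoint is an assumption)."""
    local = height_from(fetch_json(local_api + endpoint))
    net = height_from(fetch_json(net_api + endpoint))
    return sync_lag(local, net)
```

If both settings point at the same node, the lag computed this way is always 0, which is why a separate network API is the better choice for NET_API.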


PRERELEASE=0

Set to 0 for Mainnet and 1 for Testnet


PROBE_CYCLE=120

Probe execution (health check) interval in seconds. Note that a value < 60 may run into GitHub API rate limiting, which returns a 403 Forbidden response.


DEBUG=0

Set to 1 for verbose logging


HEARTBEAT_CYCLE=3600

Interval in seconds between heartbeat messages sent to Discord.
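The relationship between PROBE_CYCLE and HEARTBEAT_CYCLE can be sketched as two independent timers driven by one loop. This is only an illustration under assumed names (`Cycle`, `run`, `should_stop`); the actual scheduling in the app may differ.

```python
import time


class Cycle:
    """Fires at most once per `interval` seconds."""

    def __init__(self, interval, now=None):
        self.interval = interval
        self.last = time.monotonic() if now is None else now

    def due(self, now):
        """Return True and restart the timer once the interval has elapsed."""
        if now - self.last >= self.interval:
            self.last = now
            return True
        return False


def run(probe, heartbeat, probe_cycle=120, heartbeat_cycle=3600,
        should_stop=lambda: False):
    """Call probe() and heartbeat() on their own schedules until asked to stop."""
    probes = Cycle(probe_cycle)
    beats = Cycle(heartbeat_cycle)
    while not should_stop():
        now = time.monotonic()
        if probes.due(now):
            probe()
        if beats.due(now):
            heartbeat()
        time.sleep(1)
```

In this sketch the first probe runs one full interval after start; a real implementation might prefer to probe once immediately.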


DISCORD_HOOK='https://discord.com/api/webhooks/xxxxx/yyyyyyyyyy'

Webhook URL of the Discord channel that receives the notifications


DISCORD_USER='your_userid_not_bots'

User ID of the Discord user to notify with an @mention when an alert is active. The user is not mentioned when there is no alert or when the alert has cleared.
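For readers who want to inspect the config programmatically, the options above follow a simple KEY=value layout. The sketch below is a standalone parser for that layout and is not the loader the app itself uses; `parse_config` and `as_int` are hypothetical helper names.

```python
def parse_config(text):
    """Parse KEY=value lines; skips blanks and # comments, strips matching quotes."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        config[key.strip()] = value
    return config


def as_int(config, key, default):
    """Read an integer option, falling back to `default` if missing or invalid."""
    try:
        return int(config[key])
    except (KeyError, ValueError):
        return default
```

With this, `as_int(parse_config(text), "RANKLIMIT", 52)` mirrors the documented default of 52 when the option is absent.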


Run

start the app and monitor logs

cd ~/lazy-delegate && pm2 start package.json && pm2 logs lazy-delegate

to start the app at boot with pm2

cd && pm2 save

to start pm2 at boot;

Option 1) Have sudo privileges? Run pm2 startup and follow the instructions

Option 2) No sudo privileges like solar? (crontab -l; echo "@reboot /bin/bash -lc \"source /home/solar/.solarrc; pm2 resurrect\"") | sort -u - | crontab -


Maintenance

to stop|start|restart the process on-the-fly

pm2 stop|start|restart lazy-delegate

Whenever the config file changes, the app needs to be restarted

pm2 restart lazy-delegate

to remove the process for whatever reason:

pm2 stop lazy-delegate
pm2 delete lazy-delegate
# optionally, remove logs
rm ~/.pm2/logs/lazy-delegate*

Inside the mind

The node is probed periodically for health checks. Any issue raised or cleared during the rest period (the time since the previous probe) is reported to Discord immediately. Each issue is reported only once, when it is first detected.

The Probe class is responsible for keeping track of the values and for governing the alarm raising and clearing logic.
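The report-once behaviour can be pictured as edge-triggered alerting: a notification is produced only when a probe changes state. The class below is a minimal sketch of that idea and does not reproduce the project's actual Probe class; its constructor arguments and return values are assumptions.

```python
class Probe:
    """Tracks the latest value and reports only alert state changes."""

    def __init__(self, name, is_alert):
        self.name = name
        self.is_alert = is_alert  # predicate: value -> bool
        self.value = None
        self.alerting = False

    def update(self, value):
        """Record a value; return 'raised', 'cleared', or None if nothing changed."""
        self.value = value
        now_alerting = bool(self.is_alert(value))
        if now_alerting and not self.alerting:
            event = "raised"
        elif self.alerting and not now_alerting:
            event = "cleared"
        else:
            event = None
        self.alerting = now_alerting
        return event


# Usage: alert when rank > RANKLIMIT (52 is the documented default)
rank = Probe("rank", lambda r: r > 52)
```

Feeding `rank.update(...)` the values 40, 60, 61, 50 in turn produces `None`, `'raised'`, `None`, `'cleared'`, so a persisting alert is not repeated.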

A heartbeat status report is sent at regular intervals. A missing report indicates a problem with the host, the node, or the lazy-delegate app itself.

Change Log

v0.62b

Solar Core 3.3.0-next.3 API compatibility

  • Adaptation for Solar Core 3.3.0-next.3 API change for use of block id in delegate attribute

v0.61b

fix: testnet release version

  • added config option PRERELEASE to specify which core version to check against; release(0) and prerelease(1) branches

v0.6b

script start/stop handler

  • added handler for SIGINT and SIGTERM for cleanup and service status notification
  • added rank alert limit. To set, add RANKLIMIT=xx in the config. Default value is 52.
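A SIGINT/SIGTERM handler of the kind mentioned above could look roughly like the sketch below, which only records the shutdown request so the main loop can clean up and send a final status notification. The `Shutdown` class is a hypothetical name, not the project's code.

```python
import signal


class Shutdown:
    """Records a termination request so the main loop can exit cleanly."""

    def __init__(self):
        self.requested = False
        self.signal_name = None

    def handle(self, signum, frame):
        # Keep the handler small: set flags here, do the cleanup in the main loop
        self.requested = True
        self.signal_name = signal.Signals(signum).name

    def install(self):
        """Register the handler for SIGINT (Ctrl+C) and SIGTERM (e.g. pm2 stop)."""
        for sig in (signal.SIGINT, signal.SIGTERM):
            signal.signal(sig, self.handle)
```

The main loop would call `install()` once at start and check `requested` on each iteration before sending the service status notification.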

v0.56b

notification and alert improvements

  • notification when rank changes
  • notification when voter count changes
  • voter count change is not an alert reason in heartbeat anymore
  • notification now includes a footer with timestamp

v0.55b

better notification for alert conditions

  • mention user in alert condition message to receive notification
  • better visibility for probes causing alert condition in heartbeat
  • turn on/off verbose logging via config. colored output for debug messages

v0.54b

  • fix: error in last block produced check before epoch

v0.53b

  • fix: rank, voters and missed blocks should not be reported if CHK_FORGER=0
  • minor doc changes

v0.52b

  • PM2 executable path is now read from the config (solar core 3.2.0-next2 does not export alias to user's shell)
  • TBW-pool process probe now can be enabled/disabled independent of TBW-tbw & TBW-pay probes
  • fixed forger process alert condition test

v0.51b

install.sh

  • non-sudo user friendly installation for required apt packages
  • rewrites CPATH to prevent python package compilation errors (CPATH is restored back afterwards)
  • Added missing python3-pip APT package to the installation
  • Stop jobs before complete reinstall

documentation

  • how to start pm2 at boot

v0.5b

  • Values for probes with an active alert are now shown as codeblock in heartbeat status message
  • An info message will be sent to the discord channel if delegate gained any voters during the rest period

TODO

  • Add probe for Lazy Delegate version
