Enable IPMI Watchdog
A watchdog is a very useful tool. When your server goes down or freezes, the watchdog can automatically restart the machine.
First, enable the kernel modules by adding them to the module load list at “/etc/modules-load.d/modules.conf”.
    # Run these as root: the >> redirection needs write access to the file.
    echo "ipmi_devintf" >> /etc/modules-load.d/modules.conf
    echo "ipmi_si" >> /etc/modules-load.d/modules.conf
    echo "ipmi_msghandler" >> /etc/modules-load.d/modules.conf
    echo "ipmi_watchdog" >> /etc/modules-load.d/modules.conf
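Running the echo commands twice leaves duplicate entries in the file. A minimal sketch of a repeatable variant follows; the helper name add_modules and the scratch file are illustrative assumptions, not part of any IPMI package.

```shell
#!/bin/sh
# Sketch: append each module name to a modules-load.d style file only if
# it is not already listed. add_modules is a hypothetical helper name.
add_modules() {
    conf_file="$1"
    shift
    touch "$conf_file"
    for module in "$@"; do
        # -q: quiet, -x: match the whole line, -F: treat the name as a fixed string
        if ! grep -qxF "$module" "$conf_file"; then
            printf '%s\n' "$module" >> "$conf_file"
        fi
    done
}

# Demonstration on a scratch file. On a real system the target would be
# /etc/modules-load.d/modules.conf and the script would run as root.
demo_modules="$(mktemp)"
add_modules "$demo_modules" ipmi_devintf ipmi_si ipmi_msghandler ipmi_watchdog
# A second run adds nothing because every name is already present.
add_modules "$demo_modules" ipmi_devintf ipmi_si ipmi_msghandler ipmi_watchdog
cat "$demo_modules"
```

The whole-line match (-x) keeps a name such as ipmi_si from being treated as present just because a longer name contains it.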
Second, install the IPMI tools.
    sudo apt install openipmi
    sudo apt install ipmitool
    sudo apt install watchdog
Third, set the following parameters in “/etc/default/openipmi”, adding them if they are missing.
    sudo vim /etc/default/openipmi
    # Modify the following parameters (the names are case-sensitive)
    IPMI_WATCHDOG=yes
    IPMI_WATCHDOG_OPTIONS="timeout=240 action=reset"
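If you would rather script this edit than open an editor, something along these lines may work. It is only a sketch: set_param is a hypothetical helper, and the initial file contents below are sample values made up for the demonstration, not necessarily what your package ships.

```shell
#!/bin/sh
# Sketch: set KEY=VALUE in a shell-style defaults file, replacing any
# existing active assignment of KEY. set_param is a hypothetical helper.
set_param() {
    file="$1"
    key="$2"
    value="$3"
    tmp="$(mktemp)"
    # Keep every line except current assignments of this key.
    # grep exits non-zero when nothing is left to print, hence "|| true".
    grep -v -E "^[[:space:]]*${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through redirection so the target keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a scratch file with assumed sample contents. On a real
# system the target would be /etc/default/openipmi, edited as root.
demo_defaults="$(mktemp)"
printf '%s\n' 'IPMI_WATCHDOG=no' 'IPMI_WATCHDOG_OPTIONS="timeout=60"' > "$demo_defaults"
set_param "$demo_defaults" IPMI_WATCHDOG yes
set_param "$demo_defaults" IPMI_WATCHDOG_OPTIONS '"timeout=240 action=reset"'
cat "$demo_defaults"
```

Because the pattern ends with "=", setting IPMI_WATCHDOG does not touch the IPMI_WATCHDOG_OPTIONS line.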
Fourth, modify the watchdog configuration at “/etc/watchdog.conf”.
    sudo vim /etc/watchdog.conf
    # Modify the following parameters
    watchdog-device = /dev/watchdog
    watchdog-timeout = 240
    interval = 10
With these settings the watchdog times out after 240 seconds. The watchdog daemon resets the timer every “interval” seconds (here, every 10 seconds), so the machine is reset only if the timer goes unrefreshed for the full 240 seconds.
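To see how much headroom these two values leave, the arithmetic can be sketched in a few lines of shell. The helper conf_value and the scratch copy of the configuration are assumptions made for illustration; the numbers are the ones used above.

```shell
#!/bin/sh
# Sketch: read the numeric value of a "key = value" line from a
# watchdog.conf style file. conf_value is a hypothetical helper.
conf_value() {
    # $1 = file, $2 = key; prints the last matching numeric value
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$1" | tail -n 1
}

# Scratch copy of the settings from this guide.
demo_wdconf="$(mktemp)"
cat > "$demo_wdconf" <<'EOF'
watchdog-device = /dev/watchdog
watchdog-timeout = 240
interval = 10
EOF

wd_timeout="$(conf_value "$demo_wdconf" watchdog-timeout)"
wd_interval="$(conf_value "$demo_wdconf" interval)"
# Number of refresh intervals that fit into one timeout window.
wd_kicks=$((wd_timeout / wd_interval))
echo "timeout=${wd_timeout}s interval=${wd_interval}s intervals_per_window=${wd_kicks}"
# prints: timeout=240s interval=10s intervals_per_window=24
```

Roughly speaking, the daemon would have to miss about 24 refreshes in a row before the hardware resets the machine, so a short stall does not trigger a reboot.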
Finally, if you want the watchdog daemon to start at boot, modify the WantedBy parameter in “/lib/systemd/system/watchdog.service”.
    sudo vim /lib/systemd/system/watchdog.service
    # Modify the following parameter
    [Install]
    WantedBy=multi-user.target
Then reload systemd so that it picks up the modified unit file.
    sudo systemctl daemon-reload
Last but not least, to check the watchdog status, use either of the following commands.
    sudo ipmitool mc watchdog get   # via the IPMI management tool
    sudo bmc-watchdog -g            # via BMC control