Using a Shelly Plug as a Watchdog

I’m going to show you how to use a Shelly plug as a watchdog to reboot a misbehaving device.

One of my servers, a cheap N100 machine, sometimes stops responding until I reboot it. Because it is a basic PC used as a server, there is no way to issue a reboot remotely.

After the last crash, which made some of my websites and other services unavailable during the holiday season, I decided to use a Shelly plug as a watchdog to reboot the server automatically.

For the plug I chose a Shelly Plug S. It is cheap, easy to use, and supports scripting.

Setup

  • Configure the Shelly plug
  • Set the plug's output to be always on
  • Set the server to power on automatically when power is applied
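If you would rather apply the "always on" setting from a script than from the web interface, a minimal sketch could look like the following. It assumes that "always on" corresponds to the switch's power-on default (`initial_state: "on"` in `Switch.SetConfig`). The helper name `setPowerOnDefault` is made up for illustration.

```javascript
// Hypothetical helper: ask the plug to switch its output on whenever it
// powers up. `shelly` is the global Shelly object inside a Shelly script.
function setPowerOnDefault(shelly) {
  shelly.call(
    "Switch.SetConfig",
    // Assumption: "always on" means the power-on default of the relay.
    { id: 0, config: { initial_state: "on" } },
    function (result, errorCode, errorMessage) {
      if (errorCode) {
        print("Switch.SetConfig failed: " + errorMessage);
      } else {
        print("Output will be on after power-up.");
      }
    }
  );
}

// Only runs on the device, where the Shelly global exists.
if (typeof Shelly !== "undefined") {
  setPowerOnDefault(Shelly);
}
```

Passing the Shelly object as a parameter keeps the helper easy to try out with a stub before running it on the plug.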

Shelly script

This script power-cycles the server when the watchdog expires, that is, after 10 minutes without a ping.

The watchdog is only armed after the server has started and sent its first ping. An exponential backoff also prevents the server from being rebooted too often.
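To make the backoff concrete, here is a small stand-alone sketch in plain JavaScript (not Shelly-specific) that lists the successive watchdog timeouts. It uses the same values as the script's CONFIG: a 10-minute initial timeout, a multiplier of 2, and a 1-hour cap. The function name `backoffSchedule` is made up for illustration.

```javascript
// Values mirror the CONFIG block of the Shelly script.
const initialTimeoutMs = 600 * 1000; // 10 minutes
const maxBackoffMs = 3600 * 1000;    // 1 hour
const multiplier = 2;

// Returns the watchdog timeouts (in seconds) that apply before each of
// the first `count` consecutive reboots, assuming no ping arrives.
function backoffSchedule(count) {
  const schedule = [];
  let timeout = initialTimeoutMs;
  for (let i = 0; i < count; i++) {
    schedule.push(timeout / 1000);
    // Double the timeout, but never exceed the cap.
    timeout = Math.min(timeout * multiplier, maxBackoffMs);
  }
  return schedule;
}

console.log(backoffSchedule(5)); // [ 600, 1200, 2400, 3600, 3600 ]
```

So a server that never comes back is power-cycled after 10, 20, and 40 minutes, then once per hour, until a ping resets the timeout to 10 minutes.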

// Partially inspired by https://gist.github.com/Eetezadi/c539d15fa4648856c2732a71ef0a4e25
let CONFIG = {
  toggleTime: 10,                      // seconds to keep relay off
  maxBackoffTime: 3600 * 1000,         // max backoff (1 hour)
  initialWatchdogTimeout: 600 * 1000,  // Watchdog timeout (10 minutes)
  backoffMultiplier: 2,                // backoff multiplier
};

// Timer handle for the watchdog
let watchdogTimer = null;
let currentWatchdogTime = CONFIG.initialWatchdogTimeout;

// Function called when watchdog expires
function onWatchdogExpired() {
    print("Watchdog expired! Taking action...");
    
    // Turn off the switch and toggle it back after toggleTime seconds
    Shelly.call("Switch.Set", { id: 0, on: false, toggle_after: CONFIG.toggleTime });

    // Increase the timeout by the backoff multiplier
    currentWatchdogTime *= CONFIG.backoffMultiplier;
    // Cap the timeout
    if (currentWatchdogTime > CONFIG.maxBackoffTime) {
        currentWatchdogTime = CONFIG.maxBackoffTime;
    }
    startWatchdog();
}

// Function to start the watchdog timer
function startWatchdog() {
    if (watchdogTimer !== null) {
        Timer.clear(watchdogTimer);
    }
    watchdogTimer = Timer.set(currentWatchdogTime, false, onWatchdogExpired);
    print("Watchdog started. Timeout in " + currentWatchdogTime / 1000 + " seconds.");
}

// Function to reset the watchdog timer to initial state
function resetWatchdog() {
    if (watchdogTimer !== null) {
        Timer.clear(watchdogTimer);
    }
    currentWatchdogTime = CONFIG.initialWatchdogTimeout;
    startWatchdog();
}

// Register the HTTP endpoint
HTTPServer.registerEndpoint("watchdog", function(request, response) {
    resetWatchdog();
    
    response.code = 200;
    response.headers = [["Content-Type", "application/json"]];
    response.body = JSON.stringify({
        status: "ok",
        message: "Watchdog reset",
        timeout_ms: currentWatchdogTime
    });
    response.send();
});

// The watchdog is intentionally not started here: it is armed by the
// first ping from the server. Uncomment to arm it at script start.
// startWatchdog();

print("Watchdog script started.");

// Print the reset endpoint URL on startup
Shelly.call("Wifi.GetStatus", null, function(result) {
    if (result && result.sta_ip) {
        let deviceIP = result.sta_ip;
        print("Reset endpoint: http://" + deviceIP + "/script/" + Script.id + "/watchdog");
    } else {
        print("Reset endpoint: http://<shelly-ip>/script/" + Script.id + "/watchdog");
    }
});

Linux server script

The script that pings the watchdog is a simple bash script that uses curl and is scheduled by systemd (cron works too). Save it as /usr/local/bin/shelly-watchdog.sh:

#!/bin/bash

# Configuration
SHELLY_IP="192.168.1.100"  # Change to your Shelly's IP
SCRIPT_ID="1"               # Change to your script ID
TIMEOUT=10                  # curl timeout in seconds

# Build URL
URL="http://${SHELLY_IP}/script/${SCRIPT_ID}/watchdog"

# Make the request
response=$(curl -s -w "\n%{http_code}" --max-time "$TIMEOUT" "$URL" 2>&1)
http_code=$(echo "$response" | tail -n1)
body=$(echo "$response" | sed '$d')

# Log result
if [ "$http_code" = "200" ]; then
    echo "$(date -Iseconds) - OK: $body"
else
    echo "$(date -Iseconds) - FAILED: HTTP $http_code - $body" >&2
    exit 1
fi

Make it executable:

sudo chmod +x /usr/local/bin/shelly-watchdog.sh

Then create the systemd service:

sudo nano /etc/systemd/system/shelly-watchdog.service

Add the following content:

[Unit]
Description=Shelly Watchdog Ping
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/shelly-watchdog.sh

# Optional: run as non-root user
# User=nobody
# Group=nogroup

Then create the systemd timer:

sudo nano /etc/systemd/system/shelly-watchdog.timer

Add the following content:

[Unit]
Description=Shelly Watchdog Timer
Requires=shelly-watchdog.service

[Timer]
# First run 1 minute after boot, then every 5 minutes
OnBootSec=1min
OnUnitActiveSec=5min

# Optional: add randomized delay to prevent exact timing
# RandomizedDelaySec=30

[Install]
WantedBy=timers.target

Enable and start the timer:

# Reload systemd
sudo systemctl daemon-reload

# Enable timer to start on boot
sudo systemctl enable shelly-watchdog.timer

# Start timer now
sudo systemctl start shelly-watchdog.timer

# Check timer status and next run
sudo systemctl status shelly-watchdog.timer
sudo systemctl list-timers | grep shelly

# Check the logs
journalctl -u shelly-watchdog.service

# Check last run status
sudo systemctl status shelly-watchdog.service

The ping interval should be at most half of the watchdog timeout. This is why the timer runs every 5 minutes for a 10-minute timeout. Shorter intervals work too and leave more margin if a ping is missed.
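To see how much margin a given ping interval leaves, here is a small sketch in plain JavaScript. It assumes pings are sent at perfectly regular intervals, that the interval is shorter than the timeout, and that a ping arriving at the same instant the watchdog expires is too late. The function name `missedPingsTolerated` is made up for illustration.

```javascript
// Returns how many consecutive pings may fail after a successful one
// before the watchdog fires (idealized timing, see assumptions above).
function missedPingsTolerated(timeoutSec, intervalSec) {
  // Count the pings scheduled strictly before the watchdog expires.
  let pingsBeforeExpiry = 0;
  for (let t = intervalSec; t < timeoutSec; t += intervalSec) {
    pingsBeforeExpiry++;
  }
  // All but the last of those may fail; the last one must succeed.
  return Math.max(pingsBeforeExpiry - 1, 0);
}

console.log(missedPingsTolerated(600, 300)); // 0
console.log(missedPingsTolerated(600, 240)); // 1
console.log(missedPingsTolerated(600, 120)); // 3
```

Under these assumptions, a 5-minute interval with a 10-minute timeout is the limit case: the ping after a missed one arrives just as the watchdog expires. A slightly shorter interval leaves room for one missed ping.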