[SOLVED] Help needed to use DD-WRT as a failover router

vizi0n
DD-WRT Novice


Joined: 15 Jul 2022
Posts: 7

Posted: Fri Jul 15, 2022 4:09    Post subject: [SOLVED] Help needed to use DD-WRT as a failover router
Hello fellow DD-WRT people !

I am thinking of setting up an old DD-WRT router I have lying around (a good old TP-Link TL-WDR3600 running DD-WRT v3.0-r49492 std (07/14/22)) as a failover/backup router + WIFI AP for when the power goes out and/or when I lose my pfSense or LAN switch for whatever reason. I already have a UPS set up for the LAN switch and the pfSense box, but the server running pfSense drains it in about 15 minutes, and I am looking for something just to keep WIFI devices up during power outages.

Here is my setup:
- Modem -> current router (pfSense in a computer) -> LAN switch -> WIFI AP, PC, etc

I would like to add a second link from the modem to my old DD-WRT router (WAN port) to act as a failover router, and a link between the DD-WRT router (LAN port) to the LAN switch.

The DD-WRT would have a reserved LAN IP on the same subnet as the regular LAN. I would like it to monitor connectivity to the pfSense gateway (ping every X seconds), and as long as the pfSense is up, the DD-WRT's DHCP server and WIFI would be disabled. If the pfSense box stops responding (because of a power outage, maintenance, or anything else), the DD-WRT router would turn on its WIFI and DHCP server. The DHCP server would give out IPs in a reserved range not included in the pfSense DHCP server range, and provide a different gateway (the DD-WRT's LAN IP address instead of the pfSense IP) so that traffic can still go out and the switch is as transparent as possible. Once connectivity to the pfSense box is re-established, WIFI and the DHCP server would be disabled again.

I'm pretty sure there must be some kind of shell scripting / cron job combo to do this, but I am unsure about the right commands. Do you guys have any idea on how to achieve this ?

Something along the lines of

Code:

ping 192.168.1.1
if ping = true {
  verify if dhcp is running {
    true = kill dhcp
  }
  verify if wifi is running {
    true = kill wifi
  }
}
if ping = false {
  verify if dhcp is running {
    false {
      start dhcp
      wait 5 seconds
    }
  }
  verify if wifi is running {
    false = start wifi
  }
}
end


Thanks for your help !
viz


Last edited by vizi0n on Mon Jul 18, 2022 0:23; edited 2 times in total
kernel-panic69
DD-WRT Guru


Joined: 08 May 2018
Posts: 14217
Location: Texas, USA

Posted: Fri Jul 15, 2022 4:21
Please consider upgrading to the current release. We generally don't focus on anything until you've upgraded since everything in the router database is outdated and not generally supported in the forum.

https://download1.dd-wrt.com/dd-wrtv2/downloads/betas/2022/07-14-2022-r49492/tplink_tl-wdr3600v1/

Please consider reading the applicable announcements and stickies starting with the Forum Rules and Guidelines English.

vizi0n
DD-WRT Novice


Joined: 15 Jul 2022
Posts: 7

Posted: Fri Jul 15, 2022 4:34
kernel-panic69 wrote:
Please consider upgrading to the current release. We generally don't focus on anything until you've upgraded since everything in the router database is outdated and not generally supported in the forum.

https://download1.dd-wrt.com/dd-wrtv2/downloads/betas/2022/07-14-2022-r49492/tplink_tl-wdr3600v1/

Please consider reading the applicable announcements and stickies starting with the Forum Rules and Guidelines English.

Thanks for the heads up. First thing I did when I dug out the router from the storage bin was to upgrade the 2016 firmware to the latest one on the database. It is now upgraded to DD-WRT v3.0-r49492 std (07/14/22)
eibgrad
DD-WRT Guru


Joined: 18 Sep 2010
Posts: 9157

Posted: Fri Jul 15, 2022 5:44
For failover purposes, I'd be more inclined to have the primary router manage this process entirely. Let me explain.

If we're talking about a power failure, it seems to me we can safely assume NOTHING is going to work. IOW, this isn't typically a situation where some things work and others don't. Either everything's up and running, or nothing is, at least locally.

Given the above, what I would do is simply establish another gateway on the existing network for failover purposes. The primary router would use its own WAN until such time it became unresponsive, then change the default gateway to the other router. Similarly, it would monitor the WAN for recovery and change the default gateway back to itself.

All in all, NOT very complicated. Certainly no more complicated than what happens when you configure a change in the default gateway on the primary router w/ the OpenVPN client. It just happens that the change in the default gateway is a virtual network interface on the same device. But the logic is exactly the same in terms of monitoring.
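
A minimal sketch of that idea, assuming a Linux-based primary router with iproute2 (so not pfSense as-is). A small helper picks which gateway should be the default, based on the packet loss measured through the router's own WAN. The function name `pick_gateway` and all addresses are illustrative placeholders, not anything taken from a real setup.

```shell
#!/bin/sh
# pick_gateway LOSS OWN_GW BACKUP_GW   (hypothetical helper)
# Prints BACKUP_GW when LOSS (integer percent) is 100 or more, otherwise OWN_GW.
pick_gateway() {
    if [ "$1" -ge 100 ]; then
        echo "$3"
    else
        echo "$2"
    fi
}

# Possible use from a periodic job (addresses are placeholders):
#   loss=...   # packet-loss percentage measured through the router's own WAN
#   ip route replace default via "$(pick_gateway "$loss" 203.0.113.1 192.168.1.254)"
```

Keeping the decision in a pure function makes it easy to check without touching the routing table; the `ip route replace` line is where the actual change would happen.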

What the OP is describing is something vastly more complicated, because it does more than just change the default gateway. It changes the DHCP server, wifi APs, perhaps DNS, etc. IOW, it's a wholesale change in what is effectively the primary router, including all its services. That just seems to overly complicate matters. And as I said before, I don't see where you're going to have the primary router w/o power, and the failover router w/ power anyway in order to justify this configuration.

Now I realize there's another problem here, and that is pfSense. I don't use it myself, but I *assume* scripting the solution I described above on that platform is NOT going to be as easy as it is with DD-WRT, FreshTomato, etc. It makes me wonder if perhaps this is why the OP chose the configuration that he did, where most (if not all) of this process is driven by the DD-WRT (failover) router. IOW, the DD-WRT router is forced to do the monitoring and "pick up the slack" as necessary for the unavailable primary router.

vizi0n
DD-WRT Novice


Joined: 15 Jul 2022
Posts: 7

Posted: Fri Jul 15, 2022 5:59
The ONU and the TP-Link are on a 12v UPS connected to a car battery. If power fails, the ONU and the TP-Link will remain powered on for hours, even maybe a day due to the very low load they create and the huge capacity of the battery. The failover is for electrical outage, not network outage. The detection of the power outage is actually done by probing a 24/7 device on the network (the regular router) with ping.

The purpose of this is to keep WIFI services functional so the wife and kids can use their tablet and I can still work with my laptop. I can't rely on the switch to have hot/standby routes as the switch and server hosting the router are on the same UPS, and they will power down at the same time. Nothing else from the network will go through them, because nothing else will have power. That is why I need to find the commands to start/stop the wifi and DHCP.

I think that you are seeing this as much more complex than it actually is. In bash scripting this would be relatively easy with the right syntax, which I don't have in this case.

I could just leave the DD-WRT with its own SSID and connect to it "if needed", but I would prefer the other approach to lower the chances of interference and of devices sticking to the wrong SSID.
vizi0n
DD-WRT Novice


Joined: 15 Jul 2022
Posts: 7

Posted: Fri Jul 15, 2022 6:12
So far I've managed to shut down the WIFI using "ifconfig wlan0 down" and "ifconfig wlan1 down". I'll be able to check whether the interfaces are up or down with grep.
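
One way to do that grep check, as a sketch: `ifconfig` without `-a` lists only interfaces that are up, so counting the wlan entries in its output tells how many radios are active. The helper name `count_wlan_up` is made up for illustration; it reads `ifconfig` output on stdin so it can be tried against sample text.

```shell
#!/bin/sh
# count_wlan_up (hypothetical helper): reads `ifconfig` output on stdin and
# prints how many interface stanzas start with wlan<digit>.
count_wlan_up() {
    grep -c '^wlan[0-9]'
}

# Possible use on the router:
#   ifconfig | count_wlan_up
```

Anchoring the pattern at the start of the line avoids counting other lines that merely contain the string "wlan".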
vizi0n
DD-WRT Novice


Joined: 15 Jul 2022
Posts: 7

Posted: Fri Jul 15, 2022 7:11
Well I finally got it to run.

I have created a small script called "monitor.sh" in the /tmp/root folder

Code:

#!/bin/sh

pingResult=$(ping -c 5 10.40.0.1 | grep "packet loss" | cut -d" " -f 7 | sed 's/[^0-9]//g')
# pingResult returns percentage of packetloss
# If ping printed no summary line at all, treat it as 100% loss instead of an empty value
pingResult=${pingResult:-100}

dhcpRunning=$(grep -c "dhcp" /tmp/dnsmasq.conf)
# dhcpRunning returns 1 if not running, returns 6 if running (counts depend on this router's config)

wlanRunning=$(ifconfig | grep -c "wlan")
# wlanRunning returns 0 if WIFI disabled, returns 2 if WIFI enabled (wlan0 wlan1)

if [ "$pingResult" -eq 100 ]
then

        echo "Main router DOWN"

        # If 100% packetloss enable DHCP server and enable WLAN interfaces
        if [ $dhcpRunning -eq 1 ]
        then
                stopservice dnsmasq
                cp /tmp/root/dnsmasq.conf.dns-dhcp /tmp/dnsmasq.conf
                startservice dnsmasq
                echo "DHCP server is now ENABLED"
        fi
        if [ $wlanRunning -lt 2 ]
        then
                ifconfig wlan0 up
                ifconfig wlan1 up
                echo "WIFI is now ENABLED"
        fi
else

        echo "Main router UP"

        # If less than 100% packetloss, make sure DHCP server and WLAN interfaces are disabled
        if [ $dhcpRunning -gt 1 ]
        then
                stopservice dnsmasq
                cp /tmp/root/dnsmasq.conf.dns-only /tmp/dnsmasq.conf
                startservice dnsmasq
                echo "DHCP server is now DISABLED"
        fi
        if [ $wlanRunning -gt 0 ]
        then
                ifconfig wlan0 down
                ifconfig wlan1 down
                echo "WIFI is now DISABLED"
        fi
fi


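I also keep two copies of /tmp/dnsmasq.conf in root's home folder (/tmp/root in the script), one with DHCP enabled (dnsmasq.conf.dns-dhcp) and one without (dnsmasq.conf.dns-only). The script copies the appropriate one over the config file that dnsmasq is launched with.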
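
The packet-loss parsing in the script above depends on the percentage being the 7th space-separated field of ping's summary line, which holds for BusyBox ping but not for every ping implementation. A sketch of a less position-dependent variant follows; the helper name `parse_loss` is an assumption for illustration. It reads ping output on stdin, prints the integer part of the loss percentage, and falls back to 100 when no summary line is found.

```shell
#!/bin/sh
# parse_loss (hypothetical helper): reads ping output on stdin and prints the
# integer packet-loss percentage; prints 100 if no "packet loss" summary exists.
parse_loss() {
    loss=$(sed -n 's/.*[ ,]\([0-9][0-9]*\)\(\.[0-9]*\)\{0,1\}% packet loss.*/\1/p')
    echo "${loss:-100}"
}

# Possible use on the router:
#   pingResult=$(ping -c 5 10.40.0.1 2>/dev/null | parse_loss)
```

Defaulting to 100 means a ping that fails outright is treated the same as a host that does not answer.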

Now I just need to figure out the cron job to run this script every 15 seconds and I'm all set. As long as the files don't disappear after a reboot!
egc
DD-WRT Guru


Joined: 18 Mar 2014
Posts: 12881
Location: Netherlands

Posted: Fri Jul 15, 2022 7:49
You should be able to use:
stopservice dnsmasq
startservice dnsmasq

To check a connection, there are plenty of examples; search for "watchdog".

vizi0n
DD-WRT Novice


Joined: 15 Jul 2022
Posts: 7

Posted: Fri Jul 15, 2022 7:54
egc wrote:
You should be able to use:
stopservice dnsmasq
startservice dnsmasq

I've got it quite functional. I just need to set the storage and crontab. Seems like this router does not have enough memory for a jffs storage, so I'll be getting a USB stick
vizi0n
DD-WRT Novice


Joined: 15 Jul 2022
Posts: 7

Posted: Sun Jul 17, 2022 7:39
So, I have finally completed my watchdog/failover DD-WRT router. I had to connect a USB flash drive so the router keeps the scripts over reboots because this router does not support a built-in JFFS2 partition. The files are all located in the /opt directory.

My ONU (modem) has 2 ethernet ports and is just a layer 2 device.

- ONU Port 1 is connected to my main router
- ONU Port 2 is connected to my backup (DD-WRT router)
- Both routers are connected to the same LAN switch and are on the same subnet. Main router has .1 IP and DD-WRT has .254
- DHCP provides the corresponding gateway (main router provides .1 as gateway, DD-WRT provides .254 as gateway).
- Each DHCP has its own range of IPs that do not conflict with each other (main router .100-.149 and DD-WRT .200-.249)
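
A quick sanity check for the two DHCP pools, as a sketch: two inclusive ranges overlap only if each one starts at or before the other ends. The helper name `ranges_overlap` is made up here, and it compares last octets only, which assumes both pools sit in the same /24.

```shell
#!/bin/sh
# ranges_overlap A_START A_END B_START B_END   (hypothetical helper)
# Exits 0 if the two inclusive ranges share at least one value, 1 otherwise.
ranges_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Possible use with the pools listed above (.100-.149 and .200-.249):
#   if ranges_overlap 100 149 200 249; then echo "DHCP pools overlap"; fi
```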

The transition is smooth and should happen within 20 seconds of failure of the main router and/or main WIFI access points.


It essentially works like this :

Step 1
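A wrapper script lets cron run the verification more than once per minute: as written, it launches the monitoring script three times per cron run, at roughly 0, 25, and 50 seconds. This script is located in /opt/monitor_cronjob.sh and it calls the actual monitoring script.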

Content of monitor_cronjob.sh
Code:
#!/bin/sh
for i in 0 1; do /opt/monitor_lan.sh & sleep 25; done; /opt/monitor_lan.sh
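
The same sub-minute trick can be written in a more general form. This is only a sketch, and `run_repeated` is a hypothetical helper name. Unlike the one-liner above, it runs the command in the foreground each time, so a slow run delays the next one instead of overlapping with it.

```shell
#!/bin/sh
# run_repeated COUNT INTERVAL CMD [ARGS...]   (hypothetical helper)
# Runs CMD COUNT times, sleeping INTERVAL seconds between runs.
run_repeated() {
    count=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        "$@"
        # No need to sleep after the final run
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Possible use from the once-a-minute cron job:
#   run_repeated 3 25 /opt/monitor_lan.sh
```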


Step 2
What the /opt/monitor_lan.sh script essentially does is the following:

- If the primary router is unreachable, DD-WRT stops dnsmasq and restarts it with the config file that includes the DHCP settings, which enables the DHCP server
- If the primary router is reachable, DD-WRT stops dnsmasq and restarts it with the config file without the DHCP settings, which keeps only the DNS server feature

- If both WIFI access points are unreachable, DD-WRT starts both WIFI interfaces
- If at least one WIFI access point is reachable, DD-WRT shuts down both WIFI interfaces
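
The four rules above boil down to two small decisions. The sketch below writes them as pure functions that only print what should happen, so they can be checked without touching dnsmasq or the radios. The names `dhcp_action` and `wifi_action` and the start/stop/none words are assumptions for illustration, not part of the script that follows.

```shell
#!/bin/sh
# dhcp_action MAIN_LOSS DHCP_STATE   (DHCP_STATE is "ENABLED" or "DISABLED")
# Prints "start", "stop" or "none".
dhcp_action() {
    if [ "$1" -eq 100 ]; then
        if [ "$2" = "DISABLED" ]; then echo start; else echo none; fi
    else
        if [ "$2" = "ENABLED" ]; then echo stop; else echo none; fi
    fi
}

# wifi_action AP1_LOSS AP2_LOSS WIFI_STATE   (WIFI_STATE is "ENABLED" or "DISABLED")
# WIFI is started only when both access points show 100% packet loss.
wifi_action() {
    if [ "$1" -eq 100 ] && [ "$2" -eq 100 ]; then
        if [ "$3" = "DISABLED" ]; then echo start; else echo none; fi
    else
        if [ "$3" = "ENABLED" ]; then echo stop; else echo none; fi
    fi
}
```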

Please bear with me, I am not a programmer and I know there is room for optimization and combining stuff. I just don't know enough to do that.

Content of monitor_lan.sh
Code:
#!/bin/sh

RED="\e[31m"
GREEN="\e[32m"
YELLOW="\e[33m"
ENDCOLOR="\e[0m"

date

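function pingHost {
        # Prints the packet-loss percentage; falls back to 100 if ping prints no summary line
        loss=$(ping -c 5 -W 1 $1 | grep "packet loss" | cut -d" " -f 7 | sed 's/[^0-9]//g')
        echo "${loss:-100}"
}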

function verifyDHCP {
   echo -e -n "${YELLOW}DD-WRT DHCP${ENDCOLOR} : "
   dhcpRunning=$(echo $dnsmasq | grep -c "dhcp")
   # dhcpRunning returns 0 if not running, returns 1 if running
   if [ $dhcpRunning -eq 0 ]
   then
         dhcpRunning="DISABLED"
         echo -e "[${RED} $dhcpRunning ${ENDCOLOR}]"
   else
         dhcpRunning="ENABLED"
         echo -e "[${GREEN} $dhcpRunning ${ENDCOLOR}]"
   fi
}

function verifyWIFI {
   echo -e -n "${YELLOW}DD-WRT WIFI${ENDCOLOR} :"
   if [ $wlanRunning -gt 0 ]
   then
         wlanRunning="ENABLED"
         echo -e " [${GREEN} $wlanRunning ${ENDCOLOR}]"
   else
         wlanRunning="DISABLED"
         echo -e " [${RED} $wlanRunning ${ENDCOLOR}]"
   fi
}

function killDNSMASQ {
        dnsmasqPID=$(ps | grep dnsmasq | grep -v grep | grep "conf" | sed 's/^ *//g' | cut -d" " -f1)
        kill $dnsmasqPID &
}


### VERIFY IF DNSMASQ IS RUNNING, AND STOCK OR CUSTOM CONFIG

dnsmasq=$(ps | grep dnsmasq | grep -v grep)
# grep -c needs a pattern; the result is 1 if a dnsmasq process was found, 0 otherwise
dnsmasqRunning=$(echo $dnsmasq | grep -c "dnsmasq")
if [ "$dnsmasqRunning" -eq 1 ]
then
        stockDnsmasq=$(echo $dnsmasq | grep -c "/tmp/dnsmasq.conf")
        if [ $stockDnsmasq -eq 1 ]
        then
                stopservice dnsmasq
        fi
else
        dnsmasq --conf-file=/opt/dnsmasq.conf.dns-only
fi

### VERIFY MAIN ROUTER STATUS

echo -e "==============================================="
echo -e "=== ${YELLOW}VERIFYING${ENDCOLOR} : MAIN ROUTER STATUS ============"
echo -e "==============================================="
echo -e -n "${YELLOW}PINGING${ENDCOLOR} : 10.40.0.1"
ping_mainrouter=$(pingHost 10.40.0.1)
echo " // Packetloss : $ping_mainrouter"
# $ping_mainrouter returns percentage of packetloss for main router

if [ $ping_mainrouter -eq 100 ]
then
        echo -e "${YELLOW}MAIN ROUTER${ENDCOLOR} : [ ${RED}DOWN${ENDCOLOR} ]"

        verifyDHCP
        if [ $dhcpRunning == "DISABLED" ]
        then
                echo -e "${YELLOW}STARTING${ENDCOLOR} : DD-WRT DHCP SERVER"
                killDNSMASQ
                dnsmasq --conf-file=/opt/dnsmasq.conf.dns-dhcp
                echo -e "${YELLOW}DD-WRT DHCP SERVER${ENDCOLOR} : [ ${GREEN}ENABLED${ENDCOLOR} ]"
        fi
else
        echo -e "${YELLOW}MAIN ROUTER${ENDCOLOR} : [ ${GREEN}UP${ENDCOLOR} ]"

        verifyDHCP
        if [ $dhcpRunning == "ENABLED" ]
        then
                echo -e "${YELLOW}STOPPING${ENDCOLOR} : DD-WRT DHCP SERVER"
                killDNSMASQ
                dnsmasq --conf-file=/opt/dnsmasq.conf.dns-only
                echo -e "${YELLOW}DD-WRT DHCP SERVER${ENDCOLOR} : [ ${GREEN}DISABLED${ENDCOLOR} ]"
        fi
fi
echo -e "==============================================="

### VERIFY WIFI STATUS IN DD-WRT

echo -e "==============================================="
echo -e "=== ${YELLOW}VERIFYING${ENDCOLOR} : WIFI STATUS ==================="
echo -e "==============================================="

### VERIFY IF LAN ACCESS POINTS ARE RESPONDING TO PING

echo -e -n "${YELLOW}PINGING${ENDCOLOR} : 10.40.0.8 (AP #1)"
ping_ap1=$(pingHost 10.40.0.8)
echo " // Packetloss : $ping_ap1"
echo -e -n "${YELLOW}PINGING${ENDCOLOR} : 10.40.0.9 (AP #2)"
ping_ap2=$(pingHost 10.40.0.9)
echo " // Packetloss : $ping_ap2"

### VERIFY IF DD-WRT WIFI IS ACTIVE

wlanRunning=$(ifconfig | grep -c "wlan")
# wlanRunning returns 0 if WIFI is disabled, returns 1 or 2 if WIFI is enabled (interfaces wlan0 wlan1)

if [ $ping_ap1 -eq 100 ] && [ $ping_ap2 -eq 100 ]
then
        echo -e "${YELLOW}LAN Access Points${ENDCOLOR} : [ ${RED}DOWN${ENDCOLOR} ]"
        verifyWIFI
        if [ $wlanRunning == "DISABLED" ]
        then
                echo -e "${YELLOW}ENABLING${ENDCOLOR} : DD-WRT WIFI"
                ifconfig wlan0 up
                ifconfig wlan1 up
                echo -e "${YELLOW}DD-WRT WIFI${ENDCOLOR} : [ ${GREEN}ENABLED${ENDCOLOR} ]"
        fi
else
        echo -e "${YELLOW}LAN Access Points${ENDCOLOR} : [ ${GREEN}UP${ENDCOLOR} ]"

        verifyWIFI
        if [ $wlanRunning == "ENABLED" ]
        then
                echo -e "${YELLOW}DISABLING${ENDCOLOR} : DD-WRT WIFI"
                ifconfig wlan0 down
                ifconfig wlan1 down
                echo -e "${YELLOW}DD-WRT WIFI${ENDCOLOR} : [ ${RED}DISABLED${ENDCOLOR} ]"
        fi
fi
echo -e "==============================================="


Step 3
I also needed to keep a copy of the dnsmasq config with and without DHCP enabled. These files are also located in /opt but keep in mind that your config in these files will vary.

Content of dnsmasq.conf.dns-only
Code:
interface=br0
resolv-file=/tmp/resolv.dnsmasq
bogus-priv
conf-file=/etc/rfc6761.conf
clear-on-reload
stop-dns-rebind
dhcp-option=252,"\n"
cache-size=1500


Content of dnsmasq.conf.dns-dhcp
Code:
interface=br0
resolv-file=/tmp/resolv.dnsmasq
dhcp-leasefile=/tmp/dnsmasq.leases
dhcp-lease-max=50
dhcp-option=br0,3,10.40.0.254
dhcp-authoritative
dhcp-option=6,8.8.8.8,8.8.4.4
dhcp-range=br0,10.40.0.200,10.40.0.249,255.255.255.0,5m
bogus-priv
conf-file=/etc/rfc6761.conf
clear-on-reload
stop-dns-rebind
dhcp-option=252,"\n"
cache-size=1500


The DHCP server lease time is very short (5 minutes), and the range differs from the one on the primary router. The lease is this short to allow a smooth switchover back to the regular router once it is back online.
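
The two config files in Step 3 differ only in their DHCP lines, so the DNS-only file could be derived from the DHCP one rather than maintained by hand. A sketch, assuming awk is available on the router; the helper name `strip_dhcp` is made up. It drops every line starting with `dhcp-` except the `dhcp-option=252` line, which appears in both files above.

```shell
#!/bin/sh
# strip_dhcp (hypothetical helper): reads a dnsmasq config on stdin and prints
# it without the dhcp-* lines, keeping dhcp-option=252 as in the files above.
strip_dhcp() {
    awk '!/^dhcp-/ || /^dhcp-option=252,/'
}

# Possible use:
#   strip_dhcp < /opt/dnsmasq.conf.dns-dhcp > /opt/dnsmasq.conf.dns-only
```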

Step 4
A cronjob (added in the Administration page of DD-WRT) runs every minute

Code:
* * * * * root /opt/monitor_cronjob.sh


And that's it ! It just works :)

I thought I'd share this in case someone else wants to do something similar in the future. It's always nice to find some info in the forums !