djtms / wanfailoverscript Goto Github PK
View Code? Open in Web Editor NEWAutomatically exported from code.google.com/p/wanfailoverscript
Automatically exported from code.google.com/p/wanfailoverscript
Design WFS checks if your WAN connection is still up by sending ICMP messages to multiple target hosts. It automatically switches between primary and secondary WAN link by changing the default gateway. Installation untar the tar file run the install.sh script. edit the configuration /etc/wfs/wfs.conf to your liking add target hosts used for testing in /etc/wfs/targets.txt start wfs through "/etc/init.d/wfs start" For email notifications (Ubuntu) apt-get install mailutils postfix (installs mail and a local MTA) configure postfix as an internet gateway do not forward an inbound mail to this instance, it should be behind a firewall and for outbound notification traffic only. Assumption I asume that your host has access to two different routers that act as the primary and backup gateway for the primary WAN connection and backup WAN connection. It may be that your host has two interfaces, with two IP-addresses for each WAN connection, but this is not required. How WFS tests for link availability WFS uses ICMP messages to test if the primary WAN link is still up. There is a risk that the target host goes off-line while the WAN link is perfectly fine. This would cause an unnecessary fail-over event. To prevent unnecessary failover, at least two hosts should be used for availability testing. WFS can use an arbitrary number of target hosts for availability testing. WFS tests each host in sequence. Thus, first host 1 is tested, then host 2, until the last host, after which WFS starts again with the first. It is strongly advised to use more than two target hosts. WFS sends an ICMP message each specified interval, and this could be considered abusive by the target hosts. To increase the delay between tests, add more hosts to the 'targets' file /etc/wfs/targets.txt and/or increase the test delay between hosts. Your default gateway address on your ISP's network is a good host to have in your targets.txt file as this confines some of the ICMP traffic to your subnet at your ISP as well. It is also a great test of whether you can reach the Internet or not!. How failover works WFS changes the default gateway and thus the default route to which all traffic is sent. It uses the native 'route' command found on most Linux distributions. When the primary WAN link is restored When executed, WFS adds static routes for all target hosts that are used for availability monitoring. These routes specifically use the router/gateway for the primary WAN interface, even if the default gateway is switched to the failover connection. This allows WFS to monitor if the primary WAN interface is still down or back up again. If the latter is true, then WFS will switch the default route back to the primary interface. Example: The primary WAN gateway is 10.0.0.1 in this example. server:~# route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 74.125.77.104 10.0.0.1 255.255.255.255 UGH 0 0 0 eth1 209.85.229.104 10.0.0.1 255.255.255.255 UGH 0 0 0 eth1 219.25.29.14 10.0.0.1 255.255.255.255 UGH 0 0 0 eth1 192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1 0.0.0.0 10.0.0.1 0.0.0.0 UG 0 0 0 eth1 When a failover to a backup WAN link occurs (192.168.0.1) it would look like this: server:~# route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 74.125.77.104 10.0.0.1 255.255.255.255 UGH 0 0 0 eth1 209.85.229.104 10.0.0.1 255.255.255.255 UGH 0 0 0 eth1 219.25.29.14 10.0.0.1 255.255.255.255 UGH 0 0 0 eth1 192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1 0.0.0.0 192.168.0.1 0.0.0.0 UG 0 0 0 eth0 Configuration options To configure WFS to your needs, edit the WFS script itself or edit /etc/wfs/wfs.conf. The first thing you should do is to edit the file containing IP addresses of hosts that will be used for testing. Contents of /etc/wfs/targets.txt: 74.125.77.104 209.85.229.104 219.25.29.14 These are the hosts that are used to test if the primary WAN interface is available. Please note that this is an example. PRIMARY_GW=10.0.0.1 SECONDARY_GW=192.168.0.1 Here, the primary and secondary (failover) WAN gateway adress is configured. When a failure is detected, WFS will switch from the primary gateway to the secondary gateway. INTERVAL=20 TEST_COUNT=2 WFS will test the availability of the primary WAN link each 'INTERVAL' seconds, in this example thus 20 seconds. WFS will test by sending 2 ICMP messages. THRESHOLD=3 To prevent a premature failover, such as when connectivity is only lost for a couple of seconds, or when a single test target goes down, a threshold must be reached before a failover is performed. In this example, the failover test must fail three times before a failover is committed. Based on the interval used in the previous example, a failover will take about a minute after the connection has gone bad. By changing the values, you can switch faster or slower. COOLDOWNDELAY=120 To prevent fast switching between primary and secondary WAN link due to a flaky connection, a cool down period is used. Only after this period WFS will switch back to the primary interface again. From 2.03 onwards COOLDOWNDELAY is deprecated and replaced with: COOLDOWNDELAY01=3600 COOLDOWNDELAY02=600 COOLDOWNDELAY01 is the cool down interval after a failover event. All testing will pause and connectivity remain through the secondary WAN until this counter has expired. Value is in seconds. COOLDOWNDELAY02 is the cool down interval on failback event. All testing will pause and connectivity remain through the primary WAN until this counter has expired. Value is in seconds. Sometimes you may want to have a long cooldown period after a failover event to keep services stable while you troubleshoot the primary wan connection. Whereas after the connection fails back to the primary you would want a shorter period before connection monitoring starts again MAIL_TARGET="" Assuming that your backup WAN link allows you to send email messages, you can specify an email address. This address will receive a message when a failover occurs or when the primary link is restored. Note this email functionality requires the command line mail command. This can be installed by apt-get install mailutils on Ubuntu. An MTA such as postfix is also required. The email will contain a copy of the routing table as it is after the failover or failback event. This allows you to verify that the route has changed and is correct. PRIMARY_CMD="" SECONDARY_CMD="" These two variables can be used to execute a custom command after a failover or restore is performed. When switching to the secondary interface, the secondary_cmd command is executed. When switching back to the primary interface, primary_cmd is executed. MAX_LATENCY=2 The maximum allowed latency on an ICMP ping request before the request is considered to have failed. This allows us to failover in the even of a severely degraded connection which is still live but running very slowly. Logging Log messages are sent to /var/log/daemon or /var/log/messages. Logging can be more verbose by configuring the 'DEBUG' option within the configuration file. Example of debug output: May 29 21:47:08 server WFS: INFO ------------------------------ May 29 21:47:08 server WFS: INFO WAN Failover Script 2.0 May 29 21:47:08 server WFS: INFO ------------------------------ May 29 21:47:08 server WFS: INFO Primary gateway: 10.0.0.1 May 29 21:47:08 server WFS: INFO Secondary gateway: 10.0.0.1 May 29 21:47:08 server WFS: INFO Threshold before failover: 3 May 29 21:47:08 server WFS: INFO Number of target hosts: 4 May 29 21:47:08 server WFS: INFO Tests per host: 2 May 29 21:47:08 server WFS: INFO ------------------------------ May 29 21:47:08 server WFS: INFO Starting monitoring of WAN link. May 29 21:47:09 server WFS: INFO WAN Link: PRIMARY May 29 21:47:30 server WFS: INFO Host 74.125.77.104 UNREACHABLE May 29 21:47:30 server WFS: INFO Failed targets is 1, threshold is 3. May 29 21:47:30 server WFS: INFO WAN Link: PRIMARY May 29 21:47:36 server WFS: INFO Host 69.147.125.65 UNREACHABLE May 29 21:47:36 server WFS: INFO Failed targets is 2, threshold is 3. May 29 21:47:36 server WFS: INFO WAN Link: PRIMARY May 29 21:47:42 server WFS: INFO Host 98.137.149.56 UNREACHABLE May 29 21:47:42 server WFS: INFO Failed targets is 3, threshold is 3. May 29 21:47:42 server WFS: INFO Primary WAN link failed. Switched to secondary link. May 29 21:48:18 server WFS: INFO Host 209.85.227.105 UNREACHABLE May 29 21:48:18 server WFS: INFO Failed targets is 3, threshold is 3. May 29 21:48:24 server WFS: INFO Host 74.125.77.104 UNREACHABLE May 29 21:48:24 server WFS: INFO Failed targets is 3, threshold is 3. May 29 21:48:30 server WFS: INFO Host 69.147.125.65 UNREACHABLE May 29 21:48:30 server WFS: INFO Failed targets is 3, threshold is 3. May 29 21:48:36 server WFS: INFO Host 98.137.149.56 UNREACHABLE May 29 21:48:36 server WFS: INFO Failed targets is 3, threshold is 3. May 29 21:48:42 server WFS: INFO Host 209.85.227.105 UNREACHABLE May 29 21:48:42 server WFS: INFO Failed targets is 3, threshold is 3. May 29 21:48:48 server WFS: INFO Host 74.125.77.104 UNREACHABLE May 29 21:48:48 server WFS: INFO Failed targets is 3, threshold is 3. May 29 21:48:54 server WFS: INFO Failed targets is 2, threshold is 3. May 29 21:49:01 server WFS: INFO Failed targets is 1, threshold is 3. May 29 21:49:07 server WFS: INFO Primary WAN link OK. Switched back to primary link. May 29 21:49:43 server WFS: INFO WAN Link: PRIMARY May 29 21:50:04 server WFS: INFO WAN Link: PRIMARY May 29 21:50:25 server WFS: INFO WAN Link: PRIMARY
I've tested your fine script and found two major issues with it
1. It will just "bounce" between gateways for me. As I see it, script will test
for availability with icmp but as it fails on primary gw targets are once again
reachable from secondary and the script once again switch to primary!? Are you
sure you can test without -I ping command option and get desired behavior?
2. Default value of MAX_LATENCY is total overkill, ping will just stall if
default gw isn't available on primary interface which is most probable cause of
failure.
Best regards,
Nikola
Original issue reported on code.google.com by [email protected]
on 29 Nov 2012 at 9:50
Wan Fail over script works fine for me when i make DSL connection as Primary
and 3g connection as Secondary. And default gateway in routing table updates
correctly based on the link failure of this connection. But after interchanging
modes (primary and secondary) , it doesn't work for me. The scenario is,
I made ppp0 connetion as primary and DSL connection as secondary . In this case
wfs script running fine upto failover occurs. Thats if i disconnect 3g dongle
for the failover to occur, all routing entries associated with ppp0 interface
disappear. And the wfs doesn't find the host gateway for the checking of
primary connection. So whats the solution for me ?
Thanks in Advance,
Sanju
Original issue reported on code.google.com by [email protected]
on 2 Apr 2014 at 5:54
Hello
I found small problem in info log which can be confusing.
line 154 in wfs - version 2.04 (and 2.03 too)
log INFO "Max latency in ms: $MAX_LATENCY"
but correct is
log INFO "Max latency in seconds: $MAX_LATENCY"
because paramter of ping -W is in seconds
Thanks for your work
Original issue reported on code.google.com by [email protected]
on 29 Sep 2013 at 8:48
Is there any chance to remove the dependency to bash?
I'm asking because I'd like to run the script on OpenWRT and on my router
(Edimax 6200n) isn't enough space availabe to install bash. There is only "ash"
and "sh" available.
Original issue reported on code.google.com by [email protected]
on 2 Feb 2014 at 9:43
Out put of route -n
X.X.X.X 0.0.0.0 255.255.255.255 UH 0 0 0 ppp0
172.16.16.0 0.0.0.0 255.255.255.252 U 0 0 0 eth2
10.10.1.0 0.0.0.0 255.255.255.248 U 0 0 0 eth0
192.168.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth2
0.0.0.0 0.0.0.0 0.0.0.0 U 0 0 0 ppp0
Original issue reported on code.google.com by [email protected]
on 5 Feb 2012 at 10:07
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.