
I have:

  • Raspberry Pi 3B+
  • 1st internet connection on eth0 through the built-in adapter
  • 2nd internet connection on eth1 through a USB dongle

I followed the official manual and the AP is running just fine.

What I'm trying to do is route the traffic through eth1 when no internet connection is available on eth0. It works, but with very high latency and dropped packets.

Case 1:

  • eth0 has internet

  • eth1 has internet

Result: everything works smoothly.

    # route
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    default         192.168.1.254   0.0.0.0         UG    202    0        0 eth0
    default         192.168.8.1     0.0.0.0         UG    205    0        0 eth1
    192.168.1.0     0.0.0.0         255.255.255.0   U     202    0        0 eth0
    192.168.8.0     0.0.0.0         255.255.255.0   U     205    0        0 eth1
    192.168.253.0   0.0.0.0         255.255.255.0   U     303    0        0 wlan0

Case 2:

  • eth0 has no internet anymore

  • eth1 has internet

Result: high latency, dropped packets.

    # route
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    default         192.168.8.1     0.0.0.0         UG    205    0        0 eth1
    192.168.1.0     0.0.0.0         255.255.255.0   U     202    0        0 eth0
    192.168.8.0     0.0.0.0         255.255.255.0   U     205    0        0 eth1
    192.168.253.0   0.0.0.0         255.255.255.0   U     303    0        0 wlan0

    # cat /etc/dnsmasq.conf
    resolv-file=/etc/resolv.dnsmasq.conf
    interface=wlan0
    server=8.8.8.8
    server=8.8.4.4
    dhcp-range=192.168.253.2,192.168.253.254,255.255.255.0,12h
    dhcp-authoritative

    # iptables-save
    # Generated by iptables-save v1.6.0 on Sun May 5 18:44:06 2019
    *nat
    :PREROUTING ACCEPT [2637:573309]
    :INPUT ACCEPT [605:71308]
    :OUTPUT ACCEPT [658:46686]
    :POSTROUTING ACCEPT [10:1489]
    -A POSTROUTING -o eth1 -j MASQUERADE
    -A POSTROUTING -o eth0 -j MASQUERADE
    COMMIT
    # Completed on Sun May 5 18:44:06 2019
    # Generated by iptables-save v1.6.0 on Sun May 5 18:44:06 2019
    *filter
    :INPUT ACCEPT [1667:192581]
    :FORWARD ACCEPT [24823:15540031]
    :OUTPUT ACCEPT [1590:161791]
    -A FORWARD -i eth1 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
    -A FORWARD -i eth0 -o eth1 -j ACCEPT
    COMMIT
    # Completed on Sun May 5 18:44:06 2019

    # ifconfig
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 192.168.1.103  netmask 255.255.255.0  broadcast 192.168.1.255
            inet6 zzzzzzzzzzzzzzz  prefixlen 64  scopeid 0x20<link>
            ether xx:xx:xx:xx:xx:xx  txqueuelen 1000  (Ethernet)
            RX packets 133500  bytes 132927619 (126.7 MiB)
            RX errors 0  dropped 12  overruns 0  frame 0
            TX packets 97296  bytes 63923420 (60.9 MiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

    eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 192.168.8.100  netmask 255.255.255.0  broadcast 192.168.8.255
            inet6 vvvvvvvvvvvvvvvv  prefixlen 64  scopeid 0x20<link>
            ether yy:yy:yy:yy:yy:yy  txqueuelen 1000  (Ethernet)
            RX packets 23672  bytes 11549930 (11.0 MiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 19765  bytes 10665918 (10.1 MiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

    lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
            inet 127.0.0.1  netmask 255.0.0.0
            inet6 ::1  prefixlen 128  scopeid 0x10<host>
            loop  txqueuelen 1000  (Local Loopback)
            RX packets 253  bytes 30503 (29.7 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 253  bytes 30503 (29.7 KiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

    wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 192.168.253.1  netmask 255.255.255.0  broadcast 192.168.253.255
            inet6 mmmmmmmmmmmmmmmmmmmm  prefixlen 64  scopeid 0x20<link>
            ether aa:aa:aa:aa:aa:aa  txqueuelen 1000  (Ethernet)
            RX packets 155008  bytes 100683215 (96.0 MiB)
            RX errors 0  dropped 8  overruns 0  frame 0
            TX packets 193745  bytes 187522062 (178.8 MiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

When the USB dongle is plugged into a PC I get a smooth internet connection, so the dongle itself is not the problem.

Could anyone please help me to figure out what's going on and how to fix that? Thanks in advance.


[UPDATE]

Update with information from the comments on the answer: Meanwhile, I have run into another problem after configuring bonding for the interfaces. eth1 is the USB modem, which usb_modeswitch turns into an Ethernet interface. When this interface is not bonded, it comes up with an IP address as usual. When it is bonded, eth1 stays down, although eth0 is up. I believe the problem comes from the USB dongle router.

This issue definitely appears right after I disable DHCP on eth1:

    echo "denyinterfaces eth0 eth1" >> /etc/dhcpcd.conf

  • You have 2 default routes - which won’t work. You have to decide exactly what you are trying to do. Commented May 5, 2019 at 22:45
  • Milliways, you are right! With Ingo's help I'm now researching the bonding mechanism. Commented May 8, 2019 at 12:09

2 Answers


What you want to achieve is a typical failover scenario. You cannot simply bring up two connections and hope that the second one takes over cleanly when the first one fails. Having two connections, each with its own IP address, is not a problem in itself. The kernel always uses only one interface as its default route to the internet: the one with the lowest metric. In Case 1 it uses eth0 with metric 202 (lower than 205 for eth1), via gateway 192.168.1.254 and with source IP address 192.168.1.103.

If eth0 fails, the kernel has no problem switching dynamically to eth1, the next available default route. It then uses gateway 192.168.8.1 and the new source IP address 192.168.8.100.

And that is the problem. Any stateful TCP connection established with source IP 192.168.1.103 to any destination will break. These are mostly ssh sessions, authenticated logins and perhaps database connections: anything that keeps connection state.
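To see which default route is preferred at a given moment, the "lowest metric wins" rule described above can be sketched as a small filter over `ip route` output. This is only an illustration: the function name `preferred_default_route` is made up here, and it assumes the usual `default via GATEWAY dev IFACE ... metric N` line format.

```shell
#!/bin/bash
# Print "<gateway> <interface> <metric>" for the default route with the
# lowest metric. Reads `ip route`-style text on stdin.
# A default route without an explicit metric is treated as metric 0.
preferred_default_route() {
    awk '
        $1 == "default" {
            gw = ""; dev = ""; metric = 0
            for (i = 2; i < NF; i++) {
                if ($i == "via")    gw = $(i + 1)
                if ($i == "dev")    dev = $(i + 1)
                if ($i == "metric") metric = $(i + 1)
            }
            if (best == "" || metric + 0 < best_metric) {
                best = gw " " dev " " metric
                best_metric = metric + 0
            }
        }
        END { if (best != "") print best }
    '
}
```

On the Pi it could be called as `ip route show default | preferred_default_route`.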

This problem is solved by bonding. It defines an intermediate interface bond0 that keeps its IP address; only the underlying slave interfaces eth0 and eth1 switch the physical connection. For how it works in principle, have a look at Howto migrate from networking to systemd-networkd with dynamic failover. You may be able to implement it with classic networking, or you can switch to systemd-networkd and also configure the access point with it, using Setting up a Raspberry Pi as an access point - the easy way.
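For a rough idea of what an active-backup bond looks like at the command level, here is a minimal sketch using iproute2 directly instead of systemd-networkd. It is not taken from the linked how-tos; the helper names `run` and `setup_active_backup_bond` and the `DRY_RUN` switch are made up for illustration, and the real commands need root.

```shell
#!/bin/bash
# Sketch only: build an active-backup bond with iproute2.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: setup_active_backup_bond BOND PRIMARY BACKUP
setup_active_backup_bond() {
    local bond=$1 primary=$2 backup=$3 slave
    # mode active-backup: only one slave carries traffic at a time.
    # miimon 100: check the link state of the slaves every 100 ms.
    run ip link add "$bond" type bond mode active-backup miimon 100
    for slave in "$primary" "$backup"; do
        # Take the interface down before enslaving it; the bonding driver
        # has traditionally refused to enslave a running interface.
        run ip link set "$slave" down
        run ip link set "$slave" master "$bond"
    done
    # Prefer the primary slave whenever its link is up.
    run ip link set "$bond" type bond primary "$primary"
    run ip link set "$bond" up
}
```

`DRY_RUN=1 setup_active_backup_bond bond0 eth0 eth1` only prints the seven commands. Two caveats: the IP configuration (DHCP) then belongs on bond0 rather than on eth0 or eth1, and miimon only notices a lost link, not a link that is up but has no internet behind it.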

  • That's a very good answer! Hats off. :) Commented May 6, 2019 at 0:16
  • Thank you very much for pointing me in the right direction! Finally I know what this technique is called: bonding ))) Meanwhile, I have another problem after configuring bonding for the interfaces. eth1 is the USB modem, which usb_modeswitch turns into an Ethernet interface. When this interface is not bonded, it comes up with an IP address as usual. When it is bonded, eth1 stays down, although eth0 is up. I believe the problem comes from the USB dongle router. Where do you think I should dig to solve this issue? Commented May 6, 2019 at 16:20
  • This issue definitely appears right after I disable DHCP on eth1: echo "denyinterfaces eth0 eth1" >> /etc/dhcpcd.conf Commented May 6, 2019 at 16:30
  • @MaximIlin Sorry, but I can't help with classic networking or with dhcpcd. I haven't used them for years. In general it should be possible to bond any interface, including the one from your modem dongle. I don't know how it gets its IP address; the bond0 interface should get the IP address instead. Commented May 6, 2019 at 17:09

I finally got this failover working!

As @Milliways said, the main problem came from having multiple default gateways.

At @Ingo's suggestion I tried to implement bonding; however, as I was told here, this technique was not possible in my case.

So I wrote several bash scripts to implement the failover myself.


Task

  • 2 internet providers: one through the built-in Ethernet connection (eth0), the other through a USB modem dongle (eth1)

  • eth0 is the first-priority channel, while USB modem is an emergency channel

  • when eth0 is available, we pass the whole traffic through it

  • when eth0 is unavailable, we switch to the USB modem and carry on

  • as soon as eth0 is back online, we switch back.
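The priority rule in this list boils down to "use the first interface, in priority order, that currently works". A minimal sketch of just that decision, separate from the routing changes, could look like this; `pick_uplink` and the probe argument are hypothetical names for illustration, not part of the scripts further down.

```shell
#!/bin/bash
# Print the first interface, in priority order, for which the probe succeeds.
# Usage: pick_uplink PROBE_COMMAND IFACE...
# PROBE_COMMAND is any command or function that takes an interface name
# and returns 0 when that uplink is usable.
pick_uplink() {
    local probe=$1 iface
    shift
    for iface in "$@"; do
        if "$probe" "$iface"; then
            echo "$iface"
            return 0
        fi
    done
    # No interface passed the probe.
    return 1
}
```

Called as `pick_uplink my_probe eth0 eth1`, it returns eth0 whenever eth0 passes the probe and falls back to eth1 otherwise, which also covers switching back once eth0 recovers. The probe could, for example, wrap the same `ping -q -c 1 -W 1 -I` call that the scripts below use.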


    # definitions
    INTERFACE_FROM="wlan0"
    ACCESSPOINT_IP="192.168.253"
    INTERFACE_TO_0="eth0"
    INTERFACE_TO_1="eth1"

    mkdir -p /root/failover
    cp "DIR_WITH_SCRIPTS (see below)" /root/failover
    chmod 0755 -R --quiet /root/failover

    sed -i "s/##INTERFACES##/$INTERFACE_TO_0 $INTERFACE_TO_1/g" /root/failover/modify_routes
    sed -i "s/##INTERFACE_FROM##/$INTERFACE_FROM/g" /root/failover/modify_routes
    sed -i "s/##INTERFACES##/$INTERFACE_TO_0 $INTERFACE_TO_1/g" /root/failover/checker
    sed -i "s/##INTERFACE_FROM##/$INTERFACE_FROM/g" /root/failover/checker

    cat > /etc/network/interfaces.d/10_$INTERFACE_TO_0 << EOF
    auto $INTERFACE_TO_0
    allow-hotplug $INTERFACE_TO_0
    iface $INTERFACE_TO_0 inet dhcp
    dns-nameservers 8.8.8.8 8.8.4.4
    # With logging to custom log file
    # post-up /root/failover/save_interface_data $INTERFACE_TO_0 >> /root/failover/log 2>&1
    # Logging by means of Linux
    post-up /root/failover/save_interface_data $INTERFACE_TO_0
    metric 20
    EOF

    cat > /etc/network/interfaces.d/20_$INTERFACE_TO_1 << EOF
    auto $INTERFACE_TO_1
    allow-hotplug $INTERFACE_TO_1
    iface $INTERFACE_TO_1 inet dhcp
    dns-nameservers 8.8.8.8 8.8.4.4
    pre-up sleep 20
    # With logging to custom log file
    # post-up /root/failover/save_interface_data $INTERFACE_TO_1 >> /root/failover/log 2>&1
    # Logging by means of Linux
    post-up /root/failover/save_interface_data $INTERFACE_TO_1
    metric 40
    EOF

    # With logging to custom log file
    # for i in $( seq 0 5 55 ); do (crontab -l ; echo "* * * * * sleep $i; /root/failover/checker >> /root/failover/log 2>&1") | crontab -; done

    # Logging to /dev/null
    for i in $( seq 0 5 55 ); do (crontab -l ; echo "* * * * * sleep $i; /root/failover/checker > /dev/null 2>&1") | crontab -; done
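A note on the cron loop at the end of the setup: cron cannot schedule anything more often than once a minute, so the loop installs twelve entries that each start every minute and then sleep 0, 5, ... 55 seconds. A small sketch that only generates those lines (the function name `sub_minute_cron_lines` is made up for illustration) makes it easier to check what ends up in the crontab:

```shell
#!/bin/bash
# Print one crontab line per offset so that COMMAND starts every STEP seconds.
# Usage: sub_minute_cron_lines STEP COMMAND
# STEP is assumed to divide 60 evenly (5, 10, 15, ...).
sub_minute_cron_lines() {
    local step=$1 command=$2 offset
    for offset in $(seq 0 "$step" $((60 - step))); do
        echo "* * * * * sleep $offset; $command"
    done
}
```

For example, `(crontab -l 2>/dev/null; sub_minute_cron_lines 5 '/root/failover/checker > /dev/null 2>&1') | crontab -` installs all twelve lines in one go. Either way, running the setup a second time appends the same entries again, so it is worth checking `crontab -l` for duplicates.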

'save_interface_data' file

    #!/bin/bash
    set -e

    echo -e "Save_interface_data: Start [$(date '+%Y-%m-%d %H:%M:%S')]"

    INTERFACE_NAME=$1
    INTERFACES_DATA_DIR='/root/failover/interfaces_data'
    mkdir -p $INTERFACES_DATA_DIR

    # Empty the state file so that the checker re-evaluates the active interface
    echo "Save_interface_data: Truncate $INTERFACES_DATA_DIR/current_active_interface"
    > $INTERFACES_DATA_DIR/current_active_interface

    INTERFACE_IP=''
    INTERFACE_GATEWAY=''

    echo "Save_interface_data: Truncate $INTERFACES_DATA_DIR/$INTERFACE_NAME"
    > $INTERFACES_DATA_DIR/$INTERFACE_NAME

    # Wait for IP
    echo "Save_interface_data: [$INTERFACE_NAME] Wait 20 seconds for the IP address obtained"
    SUCCESS="false"
    end=$((SECONDS+20))
    while [ $SECONDS -lt $end ]; do
        printf "."
        # Note the space in "$( (": it keeps the shell from reading "$((" as arithmetic
        INTERFACE_IP=$( (ip address show dev $INTERFACE_NAME 2>/dev/null || echo "") | (grep -Eo 'inet (addr:)?([0-9]*\.){3}[0-9]*' || echo "") | (grep -Eo '([0-9]*\.){3}[0-9]*' || echo "") | grep -v '127.0.0.0')
        if [[ ! -z "$INTERFACE_IP" ]]; then
            SUCCESS="true"
            break
        fi
        sleep 1
    done
    echo ""

    if [ "$SUCCESS" = "true" ]; then
        echo "Save_interface_data: [$INTERFACE_NAME] Ok"
        echo "INTERFACE_IP=$INTERFACE_IP" >> $INTERFACES_DATA_DIR/$INTERFACE_NAME
    else
        echo -e "Save_interface_data: [$INTERFACE_NAME] Can not get the IP address within 20 seconds.\nAborting."
        exit 0
    fi

    # Wait for Gateway
    echo "Save_interface_data: $INTERFACE_NAME: Wait 20 seconds for the Gateway address obtained"
    SUCCESS="false"
    end=$((SECONDS+20))
    while [ $SECONDS -lt $end ]; do
        printf "."
        INTERFACE_GATEWAY=$(ip route | (grep -Eo "default(.*?)dev(.*?)$INTERFACE_NAME" 2>/dev/null || echo "") | (grep -Eo '([0-9]*\.){3}[0-9]*' || echo "") | head -1)
        if [[ ! -z "$INTERFACE_GATEWAY" ]]; then
            SUCCESS="true"
            break
        fi
        sleep 1
    done
    echo ""

    if [ "$SUCCESS" = "true" ]; then
        echo "Save_interface_data: [$INTERFACE_NAME] Ok"
        echo "INTERFACE_GATEWAY=$INTERFACE_GATEWAY" >> $INTERFACES_DATA_DIR/$INTERFACE_NAME
    else
        echo -e "Save_interface_data: [$INTERFACE_NAME] Can not get the Gateway address within 20 seconds.\nAborting."
        exit 0
    fi

    echo "Save_interface_data: End"
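The grep chains that extract the address and the gateway can be hard to follow. As a possible alternative, not what the script above does, the one-line output of `ip -o -4 addr show` and the output of `ip route show default` can be parsed by field position. The function names `first_ipv4` and `default_gateway` are made up for this sketch.

```shell
#!/bin/bash
# Extract the first IPv4 address from `ip -o -4 addr show dev IFACE` output
# read on stdin. In the one-line (-o) format the third field is "inet" and
# the fourth is ADDRESS/PREFIX.
first_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# Extract the gateway from the first "default via GATEWAY ..." line of
# `ip route` output read on stdin.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}
```

Usage would be along the lines of `ip -o -4 addr show dev eth0 | first_ipv4` and `ip route show default dev eth0 | default_gateway`; both print nothing when no address or gateway is present yet, so they fit the same wait loop.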

'modify_routes' file

    #!/bin/bash
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    set -e

    echo -e "Modify_routes: Start [$(date '+%Y-%m-%d %H:%M:%S')]"

    THIS_SCRIPT=`realpath $0`

    # Refuse to run twice in parallel
    if [ -f "$THIS_SCRIPT-RUNNING" ]; then
        echo -e "Modify_routes: script already running.\nAborting."
        exit 0
    fi
    echo "Modify_routes: Mark script as running"
    touch "$THIS_SCRIPT-RUNNING"

    WATCH_INTERFACES=( ##INTERFACES## )
    # WATCH_INTERFACES=( eth0 eth1 )
    INTERFACE_FROM="##INTERFACE_FROM##"
    # INTERFACE_FROM="wlan0"
    INTERFACES_DATA_DIR='/root/failover/interfaces_data'
    # DONT_SAVE_current_active_interface=$1

    (
        # Reset current_active_interface
        if [ ! -f "$INTERFACES_DATA_DIR/current_active_interface" ]; then
            echo "Modify_routes: Touch $INTERFACES_DATA_DIR/current_active_interface"
            touch $INTERFACES_DATA_DIR/current_active_interface
        else
            echo "Modify_routes: Truncate $INTERFACES_DATA_DIR/current_active_interface"
            truncate -s 0 $INTERFACES_DATA_DIR/current_active_interface
        fi

        echo "Modify_routes: Remove all 'default' gateways"
        while [[ ! -z $(ip route | (grep -Eo "default" 2>/dev/null || echo "")) ]]; do
            route delete default
        done

        FOUND="false"
        echo "Modify_routes: Find first available interface and set new default gateway"
        for INTERFACE_NAME in "${WATCH_INTERFACES[@]}"
        do
            echo "Modify_routes: Checking $INTERFACE_NAME"
            # Drop any old NAT rule for this interface; ignore the error if there is none
            iptables -t nat -D POSTROUTING -o $INTERFACE_NAME -j MASQUERADE 1>/dev/null 2>&1 || echo '' > /dev/null
            if [ "$FOUND" = "false" ]; then
                echo "Modify_routes: Nothing found so far, so move on!"
                echo "Modify_routes: Source $INTERFACES_DATA_DIR/$INTERFACE_NAME"
                [ -f "$INTERFACES_DATA_DIR/$INTERFACE_NAME" ] && source "$INTERFACES_DATA_DIR/$INTERFACE_NAME"
                # Variables coming from "$INTERFACES_DATA_DIR/$INTERFACE_NAME" file
                # $INTERFACE_IP
                # $INTERFACE_GATEWAY
                echo "Modify_routes: Pinging $INTERFACE_GATEWAY through $INTERFACE_NAME"
                if ping -q -c 1 -W 1 -I $INTERFACE_NAME $INTERFACE_GATEWAY >/dev/null 2>&1; then
                    echo "Modify_routes: Good!"
                    echo "Modify_routes: Add default gateway $INTERFACE_GATEWAY"
                    route add default gw "$INTERFACE_GATEWAY"
                    echo "Modify_routes: Mark as found"
                    FOUND="true"
                    iptables -t nat -A POSTROUTING -o $INTERFACE_NAME -j MASQUERADE
                    echo "Modify_routes: Set new active interface $INTERFACE_NAME"
                    echo "$INTERFACE_NAME" > $INTERFACES_DATA_DIR/current_active_interface
                    # fi
                fi
            else
                echo "Modify_routes: Skipping"
            fi
        done
    ) || echo -e "Modify_routes: Error occurred while running $THIS_SCRIPT.\nAborting."

    echo "Modify_routes: Remove running flag"
    rm -f "$THIS_SCRIPT-RUNNING"

    echo "Modify_routes: End"
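One weak spot of the `-RUNNING` flag file: it is checked and created in two separate steps, and if the script is killed before it reaches the final `rm -f`, the flag stays behind and every later run exits immediately. A common alternative, shown here only as a sketch with made-up helper names and an example lock path, is to use `mkdir` as the lock and release it from a trap:

```shell
#!/bin/bash
# Take a lock by creating a directory. mkdir either creates the directory
# or fails, so two callers can never both succeed.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# Release the lock again; errors are ignored if it is already gone.
release_lock() {
    rmdir "$1" 2>/dev/null
}

# Typical use near the top of a script (LOCK_DIR is an example path):
#   LOCK_DIR=/root/failover/modify_routes.lock
#   acquire_lock "$LOCK_DIR" || { echo "already running"; exit 0; }
#   trap 'release_lock "$LOCK_DIR"' EXIT
```

The `trap ... EXIT` line releases the lock on normal exit and on most failures. A `kill -9` or a power loss can still leave the directory behind, so placing it on a tmpfs such as /run is a reasonable precaution.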

'checker' file

This script is executed by cron every 5 seconds:

    #!/bin/bash
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    set -e

    echo -e "Checker: Start [$(date '+%Y-%m-%d %H:%M:%S')]"

    WATCH_INTERFACES=( ##INTERFACES## )
    # WATCH_INTERFACES=( eth0 eth1 )
    INTERFACE_FROM="##INTERFACE_FROM##"
    # INTERFACE_FROM="wlan0"
    INTERFACES_DATA_DIR='/root/failover/interfaces_data'
    ACTIVE_INTERFACE=$(cat $INTERFACES_DATA_DIR/current_active_interface || echo "")

    # Remove all defaults except one
    remove_all_default_gateways_except_one() {
        local GW=$1
        echo "Checker: Remove all defaults except $GW"
        for tmp_GW in $(ip route | (grep -Eo "^default(.*?)" || echo "") | (grep -Eo '([0-9]*\.){3}[0-9]*' || echo "") | grep -v '127.0.0.0' )
        do
            echo "Checker: Found gateway $tmp_GW"
            if [ "$tmp_GW" != "$GW" ]; then
                echo "Checker: This gateway ($tmp_GW) is not $GW"
                route delete default gw "$tmp_GW" 2>/dev/null || echo "Checker: Remove error (unknown gateway [$tmp_GW])"
                echo "Checker: Gateway ($tmp_GW) removed"
            else
                echo "Checker: Skipping"
            fi
        done
    }

    # If empty $ACTIVE_INTERFACE
    if [[ -z "$ACTIVE_INTERFACE" ]]; then
        echo "Checker: Active interface is empty"
        /root/failover/modify_routes
        echo "Checker: End"
        exit 0
    fi

    echo "Checker: Try to get back to 1st-priority interface"
    if [ "$ACTIVE_INTERFACE" != "${WATCH_INTERFACES[0]}" ]; then
        echo "Checker: Active interface is not the 1-st priority one (${WATCH_INTERFACES[0]})"
        echo "Checker: Source $INTERFACES_DATA_DIR/${WATCH_INTERFACES[0]}"
        [ -f "$INTERFACES_DATA_DIR/${WATCH_INTERFACES[0]}" ] && source "$INTERFACES_DATA_DIR/${WATCH_INTERFACES[0]}"
        echo "Checker: Pinging $INTERFACE_GATEWAY through ${WATCH_INTERFACES[0]}"
        if ping -q -c 1 -W 1 -I "${WATCH_INTERFACES[0]}" $INTERFACE_GATEWAY >/dev/null 2>&1; then
            echo "Checker: Good!"
            /root/failover/modify_routes
            echo "Checker: End"
            exit 0
        else
            echo "Checker: Bad"
        fi
    else
        echo "Checker: Active interface is already the 1-st priority one (${WATCH_INTERFACES[0]})"
    fi

    echo "Checker: Common case for currently active interface"
    echo "Checker: Source $INTERFACES_DATA_DIR/$ACTIVE_INTERFACE"
    [ -f "$INTERFACES_DATA_DIR/$ACTIVE_INTERFACE" ] && source "$INTERFACES_DATA_DIR/$ACTIVE_INTERFACE"
    echo "Checker: Pinging $INTERFACE_GATEWAY through $ACTIVE_INTERFACE"
    if ping -q -c 1 -W 1 -I $ACTIVE_INTERFACE $INTERFACE_GATEWAY >/dev/null 2>&1; then
        echo "Checker: Good!"
        remove_all_default_gateways_except_one $INTERFACE_GATEWAY
        echo "Checker: End"
        exit 0
    else
        echo "Checker: Bad"
        /root/failover/modify_routes
        echo "Checker: End"
        exit 0
    fi

Cheers!

  • What an effort! Something like a Rube Goldberg machine. You were told wrongly that you cannot use bonding for your problem. It is a classic use case for bonding Mode 1 (active-backup). I have shown you an example. Commented May 12, 2019 at 21:06
  • "It is a classical use case for bonding" - I'm of the same opinion! However, I could not get it working. Unfortunately I don't have enough background in Linux administration. Commented May 13, 2019 at 9:09
  • You could use my example. Then I can help you. Commented May 13, 2019 at 9:15
