bert hubert <ahu@ds9a.nl>
Welcome! This page reflects some experiments I did that show promise in providing loadbalancing which can be very interesting in some situations.
Update! Horms has implemented this idea & more in Active-Active.
This is most useful for services which are CPU bound and not network bound.
Doing so is expensive and often not needed. It is however a very good way of scaling to enormous bandwidths - because of the tricks these solutions employ, they are able to do gigabits of traffic.
We want to be able to provide loadbalancing for hosts that do not saturate their ethernet, but do need more CPU or IO horsepower than a single box can provide.
Even if you are confident that you are savvy enough to fool around, only use what we describe here if your service is CPU or IO bound, and if you are not saturating your network. If you are saturating your network, doing loadbalancing like this will only hurt performance!
We'll assume that you have four servers, 192.168.0.10 to 192.168.0.13, and
that the service you want to provide will live on the virtual IP address
192.168.0.2. We also assume that your subnet is 192.168.0.0/24
(192.168.0.0-192.168.0.255), and that your default gateway is 192.168.0.1,
which need not be a Linux machine. Furthermore, you are using a hub and not
a switch.
In ASCII art:

    [Client]
        |
   [Internet] - 192.168.0.1 --[HUB]---+---------+-----+-----+
                default               |         |     |     |
                gateway               |         |     |     |
                                192.168.0.10   11    12    13

Ok - now a customer on the internet wants to access your webserver on 192.168.0.10, and a SYN packet (which starts a TCP/IP session) arrives at your default gateway, which then needs to access a host that feels responsible for 192.168.0.10.
In order to find the right host, the router sends out an Address Resolution Protocol (ARP) 'who-has 192.168.0.10? tell 192.168.0.1'-query. Normally then one of your servers responds with its MAC address '00:10:D7:01:20:11 has 192.168.0.10'. Your router then uses this information to route the SYN packet to the proper MAC address, which is then accepted by your webserver 192.168.0.10.
It is vital that you understand this before proceeding! The MAC address can be likened to the address of your building, '12 Router Avenue'. The destination IP address is like the name of your company. The router is the mailperson that stands in your street and shouts 'Where do I deliver mail for Evil Linux Routing Tricks INC?'. Your receptionist would then shout back 'Give it to the people over at 12 Router Avenue', which would prompt the mailperson to deliver mail at that building.
Router -> mailperson
Destination IP address -> company name
MAC Address (also Hardware Address, Ethernet Address) -> house number + street
ARP query -> mailperson shouting 'Where do I deliver..'
ARP response -> receptionist that replies 'Over at 12 Router Avenue'
    # ip link set eth0 down
    # ip link set eth0 address 1:0:0:0:0:0
    # ip link set eth0 up
    # ip route add default via 192.168.0.1
    # ip addr add dev eth0 192.168.0.2
FIXME: There are MAC addresses reserved for stunts like these, but I haven't yet looked them up - please let me know.
The first three commands are self-explanatory. The fourth is needed to reestablish the default route that went down together with the interface. The last command then adds 192.168.0.2 to the list of addresses the host feels responsible for.
If you execute this remotely, make sure you do so from a script, as you might lose contact after 'ip link set eth0 down'! You might even wish to use 'nohup' to make sure your script survives. If you haven't yet tried the wonderful 'ip' tool, please install iproute2 - it is far superior to ifconfig and friends for configuring the kernel.
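The advice about running the commands from a script can be sketched as follows. This is only an illustration: the helper name emit_vip_script and the variable names are made up here, and the values are the example ones from the text. The function only prints the script, it changes nothing on the host.

```shell
#!/bin/sh
# Example values from the text; adjust to your own network.
IFACE=eth0
SHARED_MAC=1:0:0:0:0:0
GATEWAY=192.168.0.1
VIP=192.168.0.2

# emit_vip_script (hypothetical helper): print the five commands as one
# script, so that they can run to completion without the login session.
emit_vip_script() {
    cat <<EOF
#!/bin/sh
ip link set $IFACE down
ip link set $IFACE address $SHARED_MAC
ip link set $IFACE up
ip route add default via $GATEWAY
ip addr add dev $IFACE $VIP
EOF
}

# Only prints the script text.
emit_vip_script
```

To use it for real, one could redirect the output to a file and start that file with nohup, for example 'emit_vip_script > /root/set-vip.sh; nohup sh /root/set-vip.sh &' (the path is only an example).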
The new picture:

    [Client]
        |
   [Internet] - 192.168.0.1 --[HUB]---+---------+-----+-----+
                default               |         |     |     |
                gateway               |         |     |     |
                                192.168.0.10   11    12    13
                   additional:  192.168.0.2     2     2     2
                                all have same MAC address

What then happens is that the SYN packet for 192.168.0.2 comes along, the router does an ARP query to get the MAC address, and gets 4 identical responses. This in itself is not a problem - it would be neater if only one machine responded, but hey.
Now comes the problem. The SYN packet gets transmitted over the network, and again all four machines respond with a SYN|ACK! The router doesn't care about this, it is an IP device and has no clue what a SYN|ACK packet is. So it sends all four packets back to the client that initiated the connection.
But the client now does get confused and swiftly drops the connection. Four almost, but not quite, identical SYN|ACK packets is too much to deal with for a simple client.
The solution is simple: for each SYN packet, only one host should respond. Now the problem is how to achieve that.
First let's do this for two hosts. We want all even source IP addresses to go to 192.168.0.10, all odd ones to 192.168.0.11. We do so with the following iptables commands:
    [192.168.0.10]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.0/0.0.0.1 -j DROP
    [192.168.0.11]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.1/0.0.0.1 -j DROP
    [192.168.0.12]# iptables -A INPUT -d 192.168.0.2 -j DROP
    [192.168.0.13]# iptables -A INPUT -d 192.168.0.2 -j DROP

The IP addresses between brackets denote the hosts on which the commands need to be executed. We expressed the 'even/odd' constraint by using the rather unconventional 0.0.0.1 netmask. Such a non-contiguous mask has no equivalent in /-notation, so it has to be written out as a dotted quad.
Basically we say 'drop all traffic to 192.168.0.2 unless the source IP address is even' (or odd, in the case of 192.168.0.11). More explicitly: on 192.168.0.10, 'drop all traffic to 192.168.0.2 if the last bit of the source address is not 0', and on 192.168.0.11, 'drop it if that bit is not 1'.
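To see what the 0.0.0.1 mask does, here is a small sketch that redoes the comparison in shell arithmetic. The helper names ip_to_int, src_matches and which_host are invented for this illustration; the test itself, (source AND mask) equals (address AND mask), is how a '-s address/mask' match works.

```shell
#!/bin/sh
# ip_to_int (hypothetical helper): dotted quad -> 32-bit integer.
# Runs in a subshell so the changed IFS does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# src_matches SRC NET MASK: succeeds when (SRC & MASK) == (NET & MASK).
src_matches() {
    src=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( src & mask )) -eq $(( net & mask )) ]
}

# which_host SRC: for the two-host example, print the host that does
# NOT drop packets from SRC.
which_host() {
    if src_matches "$1" 0.0.0.0 0.0.0.1; then
        echo 192.168.0.10    # last bit 0: even source address
    else
        echo 192.168.0.11    # last bit 1: odd source address
    fi
}

which_host 10.1.2.4    # prints 192.168.0.10
which_host 10.1.2.7    # prints 192.168.0.11
```

Only the last bit of the source address takes part in the decision, so the first three octets of the client address do not matter at all.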
Well, we're nearly there :-) If you now connect from the outside world to 192.168.0.2, depending on the even/oddness of your source IP address, you'll get connected to either 192.168.0.10 or to 192.168.0.11!
With all four hosts taking part, we match on the last two bits instead:

    [192.168.0.10]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.0/0.0.0.3 -j DROP
    [192.168.0.11]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.1/0.0.0.3 -j DROP
    [192.168.0.12]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.2/0.0.0.3 -j DROP
    [192.168.0.13]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.3/0.0.0.3 -j DROP

This reads like 'drop all traffic to 192.168.0.2 *unless* the last 2 bits of the source IP address are {00,01,10,11}'. If you have 8 hosts this starts to look something like this:
    [192.168.0.10]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.0/0.0.0.7 -j DROP
    [192.168.0.11]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.1/0.0.0.7 -j DROP
    [192.168.0.12]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.2/0.0.0.7 -j DROP
    (...)
    [192.168.0.17]# iptables -A INPUT -d 192.168.0.2 \! -s 0.0.0.7/0.0.0.7 -j DROP

If your number of servers is not a power of 2, things get lots more interesting! See also the 'Where to go from here' chapter.
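The rule sets for 2, 4 and 8 hosts follow one pattern: host number i keeps the sources whose low bits equal i, and the mask is the number of hosts minus one. A rough sketch that prints the rules for any power of two up to 256 could look like this. The helper emit_rules is hypothetical, it only prints the commands and does not run iptables, and host 0 stands for 192.168.0.10, host 1 for 192.168.0.11, and so on.

```shell
#!/bin/sh
# emit_rules VIP COUNT (hypothetical helper): print one rule per host.
# COUNT must be a power of two and at most 256, since only the last
# octet of the mask is used.
emit_rules() {
    vip=$1
    count=$2
    # A power of two has exactly one bit set, so count & (count - 1) is 0.
    if [ "$count" -lt 1 ] || [ "$count" -gt 256 ] ||
       [ $(( count & (count - 1) )) -ne 0 ]; then
        echo "count must be a power of two between 1 and 256" >&2
        return 1
    fi
    mask=$(( count - 1 ))
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s\n' "host $i: iptables -A INPUT -d $vip \\! -s 0.0.0.$i/0.0.0.$mask -j DROP"
        i=$(( i + 1 ))
    done
}

emit_rules 192.168.0.2 4
```

Called with a count of 3, the function refuses to print anything, which is exactly the case the text calls 'lots more interesting'.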
This prevents the servers from routing stuff back to the network and enables them to receive TCP and UDP traffic meant for them. All machines receive ICMP traffic for the virtual IP address, but iptables stateful filtering makes sure that the kernel stack only sees relevant ICMP messages.
We also make sure that traffic to the non-virtual IP address *is* accepted properly. The line by line summary:
# iptables -A INPUT -p icmp --icmp-type echo-request -j ACCEPT -d 192.168.0.2 -j ACCEPT
Such a tool would also calculate and insert the right iptables rules automatically.
Solutions might be to get netfilter in a position where it can change source MAC addresses on outgoing packets. This should also happen on ARP queries and replies. As far as I know this is a hot item currently.
Another solution would be to teach Linux that a card can have two addresses, a 'listen address' and a 'send address'.
I will be discussing this with the relevant people. If you feel that you are one of those people, please contact me.