7. A DNS load-balanced HA cluster with Bind9 and BalanceNG

Abstract

This example shows how to set up a dual-node, load-balanced and highly available DNS cluster with Bind9 and BalanceNG in a few easy steps. This is exactly the way we operate our own primary DNS server in our DMZ.

Step 1: Preparing the Nodes

We picked two 1U boxes with two Intel-based 1GBit interfaces already on board, which is just fine for this setup. Instead of a conventional hard disk we installed an 8GByte Transcend Flash Disk SSD module, which plugs directly into the IDE connector on the motherboard. The benefits of a flash solid state disk are a substantially better MTBF and reduced power consumption. 8GBytes of SSD space is more than enough.

The OS in use for this setup is Ubuntu Server LTS.

Step 2: Physical Network Configuration

Both nodes are connected to the 1GBit core switch in the DMZ. The eth0 interface is configured as usual, while the eth1 interface is used exclusively by BalanceNG in DSR (Direct Server Return) mode.

The network setup overview looks like this:

[Figure: Dual node Bind9/BalanceNG cluster in DSR mode]

Step 3: Linux Network Configuration

We call one node ns0a and the other ns0b. There is no dedicated VRRP mastership: the first node to boot becomes master.

The DNS cluster virtual IP address will be 10.235.210.1, which requires a loopback alias to be established. Node ns0a gets the Linux OS address 10.235.210.43 and node ns0b the address 10.235.210.44 (both used for ordinary Linux administration with SSH).

/etc/network/interfaces of ns0a looks like this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0 lo:0

iface eth0 inet static
 address 10.235.210.43
 netmask 255.255.255.0
 gateway 10.235.210.254

iface lo:0 inet static
 address 10.235.210.1
 netmask 255.255.255.255

and /etc/network/interfaces of ns0b looks like this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0 lo:0

iface eth0 inet static
 address 10.235.210.44
 netmask 255.255.255.0
 gateway 10.235.210.254

iface lo:0 inet static
 address 10.235.210.1
 netmask 255.255.255.255
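
A reboot activates this configuration. To bring the loopback alias up immediately instead, the standard ifup command that processes /etc/network/interfaces can be used on both nodes:

# ifup lo:0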

The following lines need to be added to /etc/sysctl.conf on both nodes in order to prevent “ARP flux” problems, as also described in our FAQ and knowledge base:

net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.all.arp_announce=2
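
These settings take effect at boot time. To apply them right away without a reboot, /etc/sysctl.conf can be reloaded with:

# sysctl -p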

Step 4: Installation of required Packages

The following Ubuntu/Debian packages need to be installed on both nodes as the next step (a sample installation is sketched after the list):

  • bind9
  • mon
  • libnet-dns-perl
  • BalanceNG (64Bit .deb package as available for download)
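
A minimal installation sketch; the BalanceNG package file name below is only an example, so substitute the name of the package you actually downloaded:

# apt-get install bind9 mon libnet-dns-perl
# dpkg -i balanceng_2.084_amd64.deb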

Step 5: Licensing

The nodeid of the BalanceNG host can be retrieved as follows:

# bng -N
11:22:33:44:55:66

The license is activated by the “license” configuration command which we insert into the file /etc/bng.global. This makes the license active for all instances of BalanceNG on the particular node (or Linux box).

After key generation we insert into /etc/bng.global on ns0a this line:

license NS0ATEST 17d17854ad3d234d1e8603629f0ae5ae

and on ns0b this line:

license NS0BTEST 124f4ebf63e3f5dab69fb813ef3d0216

Licensing can then be verified as follows:

# bng control
BalanceNG: connected to PID 14598
bng# show license
 status: valid full license
 serial: NS0ATEST
 nodeid: 11:22:33:44:55:66
 type "show version" for version and Copyright information
bng#

Step 6: Bind9 Preparation

This step is done as usual; we just have to make sure that Bind9 is listening on the following addresses:

  • The loopback address 127.0.0.1,
  • the virtual loopback alias address 10.235.210.1,
  • and on the eth0 native address (10.235.210.43 on ns0a and 10.235.210.44 on ns0b).

We just used the following line in the options section of /etc/bind/named.conf to establish this:

listen-on {any;};
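
Alternatively, listening can be restricted to exactly the required addresses. A sketch for ns0a (on ns0b, use 10.235.210.44 instead of 10.235.210.43):

listen-on { 127.0.0.1; 10.235.210.1; 10.235.210.43; };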

After configuration we established an SSH key trust relationship between the nodes and wrote a small script that keeps the zone files in /etc/bind on both nodes in sync.
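
Such a script might look like the following minimal sketch; the script name, the use of rsync and the rndc reload are our assumptions here, so adapt it to your own setup:

#!/bin/sh
# sync-zones.sh -- push the local zone files to the peer node
# assumes the SSH key trust relationship mentioned above is in place
PEER=10.235.210.44               # use 10.235.210.43 when running on ns0b
rsync -av --delete /etc/bind/ root@$PEER:/etc/bind/
ssh root@$PEER "rndc reload"     # make the peer's Bind9 pick up the changes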

The remaining part of the Bind9 configuration is “as usual” and not in the scope of this example. The “BIND 9 Administrator Reference Manual” is a very helpful resource for this task.

Step 7: BalanceNG agent configuration and script

For this setup we decided to make use of the health check script capabilities of the BalanceNG agent “bngagent”. For this purpose the file /etc/rc.local on both nodes looks like this:

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

# start the BalanceNG agent on port 439, answering with the script's output
bngagent -c"/etc/bind/agentcheck.sh" 439
exit 0

And the /etc/bind/agentcheck.sh script looks like this (using the dns.monitor of the “mon” monitoring daemon):

#!/bin/sh
# query Bind9 via the loopback alias; dns.monitor exits 0 on success
/usr/lib/mon/mon.d/dns.monitor -zone balanceng.net -master 10.235.210.1 10.235.210.1
if [ "$?" = "0" ]
then
 echo 1   # report "healthy" to bngagent
else
 echo 0   # report "down" to bngagent
fi
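
Both /etc/rc.local and the check script need to be executable. The check can also be exercised by hand and should print “1” as long as Bind9 answers on the virtual address:

# chmod 755 /etc/rc.local /etc/bind/agentcheck.sh
# /etc/bind/agentcheck.sh
1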

In this case the dns.monitor checks the presence of Bind9 by contacting the DNS service on the loopback alias address (which is identical to the BalanceNG virtual server address of “server 1”).

Step 8: The BalanceNG Configuration Files

BalanceNG configuration for NS0A

//        configuration taken ...
//        BalanceNG ...
hostname  NS0A
set       localdsr 1
interface eth1
vrrp      {
          vrid 7
          priority 200
          network 1
}
network   1 {
          name "local network"
          addr 10.235.210.0
          mask 255.255.255.0
          real 10.235.210.45
          interface eth1
}
register  network 1
enable    network 1
server    1 {
          ipaddr 10.235.210.1
          port 53
          method session
          targets 1,2
}
register  server 1
enable    server 1
target    1 {
          ipaddr 10.235.210.43
          port 53
          ping 5,12
          agent 439,5,13
          tcpopen 53,5,12
          dsr enable
}
target    2 {
          ipaddr 10.235.210.44
          port 53
          ping 5,12
          agent 439,5,13
          tcpopen 53,5,12
          dsr enable
}
register  targets 1,2
enable    targets 1,2
//        end of configuration
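
Apart from the hostname and the real eth1 address (10.235.210.46 instead of 10.235.210.45), the NS0B configuration below is identical. In particular, both nodes deliberately carry the same VRRP priority of 200, matching the “first node booted becomes master” behaviour described in Step 3.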

BalanceNG configuration for NS0B

//        configuration taken Thu May 1 20:49:00 2008
//        BalanceNG 2.084 (created 2008/05/01)
hostname  NS0B
set       localdsr 1
interface eth1
vrrp      {
          vrid 7
          priority 200
          network 1
}
network   1 {
          name "local network"
          addr 10.235.210.0
          mask 255.255.255.0
          real 10.235.210.46
          interface eth1
}
register  network 1
enable    network 1
server    1 {
          ipaddr 10.235.210.1
          port 53
          method session
          targets 1,2
}
register  server 1
enable    server 1
target    1 {
          ipaddr 10.235.210.43
          port 53
          ping 5,12
          agent 439,5,13
          tcpopen 53,5,12
          dsr enable
}
target    2 {
          ipaddr 10.235.210.44
          port 53
          ping 5,12
          agent 439,5,13
          tcpopen 53,5,12
          dsr enable
}
register  targets 1,2
enable    targets 1,2
//        end of configuration

Step 9: Testing

At the very end you should be able to see BalanceNG sessions being created, and name resolution should work like a charm. A typical CLI dialog could look like this (on the current VRRP master NS0A):

root@ns0a:~# bng control
BalanceNG: connected to PID 10668
NS0A# show vrrp
 state MASTER
 vrid 7
 priority 200
 ipaddr0 10.235.210.1
NS0A# show sessions
 71 sessions
 hash     ip-address      port  srv tgt age  stout
 -------- --------------- ----- --- --- ---- -------
 5604999  10.85.134.135   any     1   1   18 600
 5604997  10.85.134.133   any     1   2   18 600
 80901    10.1.60.5       any     1   1   20 600
 15546371 10.237.56.3     any     1   2   21 600
 13763506 10.210.3.178    any     1   2   44 600
 1679371  10.25.160.11    any     1   2   46 600
 15369391 10.234.132.175  any     1   2   58 600
 5915664  10.90.68.16     any     1   1  102 600
 6888621  10.105.28.173   any     1   1  107 600
 15571300 10.237.153.100  any     1   2  109 600
 2328074  10.35.134.10    any     1   1  113 600
 ... remaining sessions not shown
NS0A#

On the current backup (NS0B in this example) you can verify the VRRP state as follows:

root@ns0b:~# bng control
BalanceNG: connected to PID 10359
NS0B# show vrrp
 state BACKUP
 vrid 7
 priority 200
 ipaddr0 10.235.210.1
NS0B#

Because DNS sessions are very short-lived, it is not necessary to synchronise the session table between the nodes.
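
Finally, name resolution through the virtual address can be verified from any client on the network, for example with dig and the zone already used by the health check:

# dig @10.235.210.1 balanceng.net SOA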