Migrating a live server to another host with no downtime

I have had a 1U server co-located for some time now at iWeb Technologies' datacenter in Montreal. So far I've had no issues and it has done a wonderful job hosting websites and a few other VMs, but because of my concern for its aging hardware I wanted to migrate away before disaster struck. Modern VPS offerings are a steal in terms of the performance they offer for the price, and Linode's 4096 plan caught my eye at a nice sweet spot. Backed by powerful CPUs and SSD storage, their VPS is blazingly fast; the only downside is that I would lose some RAM and HDD-backed storage compared to my 1U server. The bandwidth provided with the Linode was also a nice bump up from my previous 10Mbps, 500GB/mo traffic limit.

When CentOS 7 was released I took the opportunity to immediately start modernizing my CentOS 5 configuration and testing it against the new release. I wanted to ensure full continuity for client-facing services - other than a nice speed boost, clients should need to take no manual action to reconfigure their devices or domains. I also wanted to ensure zero downtime. While the DNS A records were being migrated, I didn't want emails coming in to the wrong server (or clients checking stale inboxes until they started seeing the new mailserver IP). I could easily configure Postfix on the CentOS 5 server to relay all incoming mail to the IP of the CentOS 7 one to avoid any loss of email, but there was still the issue that some end users might connect to the old server and get served their old IMAP inbox for some time.

So first things first: after developing a prototype VM that offered the same service set, I went about buying a small Linode for a month to test the configuration with some of my existing user data from my CentOS 5 server. MySQL was sufficiently easy to migrate over, and Dovecot was able to preserve all UIDs, so my inbox continued to sync seamlessly.
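The Postfix relay mentioned above doesn't need anything fancy. A minimal sketch of the idea - example.com and 203.0.113.10 are placeholders, not my actual domain or the new server's IP:

```
# /etc/postfix/main.cf on the old server: stop delivering locally and
# hand everything for the hosted domain to the new box instead
mydestination =
relay_domains = example.com
transport_maps = hash:/etc/postfix/transport

# /etc/postfix/transport (run postmap /etc/postfix/transport after editing)
example.com    smtp:[203.0.113.10]
```

The square brackets in the transport entry tell Postfix to skip the MX lookup and deliver straight to that address.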
Apache complained a bit when importing my virtual host configurations due to the new 2.4 syntax, but nothing a few sed commands couldn't fix.

With full continuity out of the way, I had to develop a strategy to handle zero downtime. With some foresight and DNS TTL adjustments, we can get near-zero downtime, assuming all resolvers comply with your TTL. Simply set your TTL to 300 (5 minutes) a day or so before the migration occurs; as your old TTL expires, resolvers will see the new TTL and will not cache the IP for as long. Even with a short TTL, that's still up to 5 minutes of downtime, and clients often do bad things... The IP might still be cached (e.g. at the ISP, router, OS, or browser) for longer. Ultimately, I'm the one that ends up looking bad in that scenario, even though I have done what I can on the server side and have no ability to fix the broken clients.

To work around this, I discovered socat, an incredibly handy tool that can make magic happen: it routes data between sockets, network connections, files, pipes, you name it. Installing it is as easy as yum install socat. A quick script later and we can forward all connections from the old host to the new host:

# IP address of the new server (placeholder value - substitute your own)
NEWIP=203.0.113.10

# Stop services on this host
for SERVICE in dovecot postfix httpd mysqld; do
  /sbin/service $SERVICE stop
done

# Some cleanup
rm /var/lib/mysql/mysql.sock

# Map the new server's MySQL to localhost:3307
# Assumes capability for password-less (e.g. pubkey) login
ssh $NEWIP -L 3307:localhost:3306 &
socat unix-listen:/var/lib/mysql/mysql.sock,fork,reuseaddr,unlink-early,unlink-close,user=mysql,group=mysql,mode=777 TCP:localhost:3307 &

# Map ports from each service to the new host
for PORT in 110 995 143 993 25 465 587 80 3306; do
  echo "Starting socat on port $PORT..."
  socat TCP-LISTEN:$PORT,fork TCP:${NEWIP}:${PORT} &
  sleep 1
done
And just like that, every connection made to the old server is immediately forwarded to the new one. This includes the MySQL socket (which is automatically used instead of a TCP connection when a host of 'localhost' is passed to MySQL). Note how we establish an SSH tunnel mapping localhost:3306 on the new server to port 3307 on the old one, instead of simply forwarding the connection and socket to the new server - this is done so that users who are permitted to connect on 'localhost' only can still connect (simply forwarding the connection would deny them access, since it would then appear to come from an unauthorized remote host).

Update: a friend has pointed out this video to me - if you thought a zero-downtime migration was impressive, these guys move a live server 7km through public transport without losing power or network!

Avoiding kernel crashes when under DDoS attacks with CentOS 5

iWeb Technologies has recently been the victim of several [1 2 3 4] distributed denial of service (DDoS) attacks over the past three or so weeks, and it's become a rather irritating issue for iWeb's customers. Not all of their servers or their customers' co-located servers are affected in each attack, but often during the attacks there is a general slowdown on their network, and a minority of the servers (both hosted and co-located) experience some packet loss.

The server hosting this website was, unfortunately, one of the servers that went completely offline during the DDoS attack on October 14th. iWeb has been a great host so far and their staff looked into the issue immediately after I submitted a support ticket, so within 30 minutes I had a KVM/IP attached to my server.

After an hour or so iWeb had the DDoS attacks under control, but my server's kernel would panic within roughly 10 minutes of the network coming up. The time between each kernel panic was inconsistent, but it would crash every time the network came up, given a bit of time. I found this message repeated in my server logs every time before a hang:

ipt_hook: happy cracking.

That led me to this post dating back to November 2003 on the Linux Kernel Mailing List (LKML) about the same message. I jumped on irc.freenode.net and joined #netfilter to ask what they thought about it. While I waited for a response from #netfilter, I looked into installing kexec in order to get a backtrace, since the kernel oops messages were not being logged (this tutorial proved especially handy).
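For anyone reproducing this, getting a crash-capture environment going on CentOS 5 boils down to a few steps. This is only a rough sketch - the crashkernel reservation size is an example value, and the tutorial linked above covers the details:

```
# Install the kexec/kdump tooling
yum install kexec-tools

# Reserve memory for the capture kernel: append to the kernel line
# in /boot/grub/grub.conf, e.g.
#   kernel /vmlinuz-2.6.18-... ro root=... crashkernel=128M@16M

# Enable the service, then reboot so the reservation takes effect
chkconfig kdump on
```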

The backtrace revealed that the problem was indeed in the netfilter/iptables kernel modules, but seemed to be triggered by QEMU-KVM:

Process qemu-kvm (pid: 3911, threadinfo ffff8101fc74c000, task ffff8101fe4ba100)
Stack:  0001d7000001d600 0001d9000001d800 0001db000001da00 0001dd000001dc00
0001df000001de00 0001e1000001e000 0001e3000001e200 0001e5000001e400
0001e7000001e600 0001e9000001e800 ffff81020001ea00 ffff8101fe5d8bc0
Call Trace:
<IRQ>  [<ffffffff80236459>] dev_hard_start_xmit+0x1b7/0x28a
[<ffffffff88665bab>] :ip_tables:ipt_do_table+0x295/0x2fa
[<ffffffff886c7b2c>] :bridge:br_nf_post_routing+0x17c/0x197
[<ffffffff80034077>] nf_iterate+0x41/0x7d
[<ffffffff886c7816>] :bridge:br_nf_local_out_finish+0x0/0x9b
[<ffffffff800565d5>] nf_hook_slow+0x58/0xbc
[<ffffffff886c7816>] :bridge:br_nf_local_out_finish+0x0/0x9b
[<ffffffff8000f470>] __alloc_pages+0x78/0x308
[<ffffffff886c85f9>] :bridge:br_nf_local_out+0x23f/0x25e
[<ffffffff80034077>] nf_iterate+0x41/0x7d
[<ffffffff886c3192>] :bridge:br_forward_finish+0x0/0x51
[<ffffffff800565d5>] nf_hook_slow+0x58/0xbc
[<ffffffff886c3192>] :bridge:br_forward_finish+0x0/0x51
[<ffffffff8025042d>] rt_intern_hash+0x474/0x4a0
[<ffffffff886c3367>] :bridge:__br_deliver+0xb4/0xfc
[<ffffffff886c2294>] :bridge:br_dev_xmit+0xc7/0xdb
[<ffffffff80236459>] dev_hard_start_xmit+0x1b7/0x28a
[<ffffffff8002f76f>] dev_queue_xmit+0x1f3/0x2a3
[<ffffffff80031e19>] ip_output+0x2ae/0x2dd
[<ffffffff8025359a>] ip_forward+0x24f/0x2bd
[<ffffffff8003587a>] ip_rcv+0x539/0x57c
[<ffffffff80020c21>] netif_receive_skb+0x470/0x49f
[<ffffffff886c3ef9>] :bridge:br_handle_frame_finish+0x1bc/0x1d3
[<ffffffff886c801b>] :bridge:br_nf_pre_routing_finish+0x2e9/0x2f8
[<ffffffff886c7d32>] :bridge:br_nf_pre_routing_finish+0x0/0x2f8
[<ffffffff800565d5>] nf_hook_slow+0x58/0xbc
[<ffffffff886c7d32>] :bridge:br_nf_pre_routing_finish+0x0/0x2f8
[<ffffffff886c8c18>] :bridge:br_nf_pre_routing+0x600/0x61c
[<ffffffff80034077>] nf_iterate+0x41/0x7d
[<ffffffff886c3d3d>] :bridge:br_handle_frame_finish+0x0/0x1d3
[<ffffffff800565d5>] nf_hook_slow+0x58/0xbc
[<ffffffff886c3d3d>] :bridge:br_handle_frame_finish+0x0/0x1d3
[<ffffffff886c407e>] :bridge:br_handle_frame+0x16e/0x1a4
[<ffffffff800a4e3f>] ktime_get_ts+0x1a/0x4e
[<ffffffff80020b34>] netif_receive_skb+0x383/0x49f
[<ffffffff8003055c>] process_backlog+0x89/0xe7
[<ffffffff8000ca51>] net_rx_action+0xac/0x1b1
[<ffffffff80012562>] __do_softirq+0x89/0x133
[<ffffffff8005e2fc>] call_softirq+0x1c/0x28
<EOI>  [<ffffffff8006d636>] do_softirq+0x2c/0x7d
[<ffffffff8004de57>] netif_rx_ni+0x19/0x1d
[<ffffffff887b951d>] :tun:tun_chr_writev+0x3b4/0x402
[<ffffffff887b956b>] :tun:tun_chr_write+0x0/0x1f
[<ffffffff800e3307>] do_readv_writev+0x172/0x291
[<ffffffff887b956b>] :tun:tun_chr_write+0x0/0x1f
[<ffffffff80041ef3>] do_ioctl+0x21/0x6b
[<ffffffff8002fff5>] vfs_ioctl+0x457/0x4b9
[<ffffffff800b9c60>] audit_syscall_entry+0x1a8/0x1d3
[<ffffffff800e34b0>] sys_writev+0x45/0x93
[<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: c3 41 56 41 55 41 54 55 48 89 fd 53 8b 87 88 00 00 00 89 c2
RIP  [<ffffffff80268427>] icmp_send+0x5bf/0x5c0
RSP <ffff810107bb7830>

This was interesting, as I do have several KVM virtual machines running on the server, some with bridged networking and others with shared networking. The #netfilter guys confirmed that the happy cracking message was due to the attempted creation of malformed packets by root. My guess was that some of the packets from the DDoS attacks were hitting my server, and the bridge was faithfully attempting to forward those invalid packets to one of the virtual machines' network interfaces, causing problems in icmp_send().

The LKML message hinted the REJECT policy could be at fault, so I opened up /etc/sysconfig/iptables and switched to a DROP policy:

:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -m physdev  --physdev-is-bridged -j ACCEPT
# ... a bunch of forwarding rules for the shared network VMs
-A FORWARD -o virbr0 -j DROP # <-- THIS ONE
-A FORWARD -i virbr0 -j DROP # <-- THIS ONE
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
# ... a bunch of ACCEPT rules for the server
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j DROP # <-- THIS ONE

And that was all it took! I did not experience a hang after that. According to the guys in #netfilter, newer kernels reassemble packets from scratch when forwarding them, so if a machine is sent an invalid packet this problem is averted. However, CentOS 5 uses an older kernel and I guess this hasn't been backported.

TL;DR: If you're using CentOS 5 (or any other distro with an older kernel), use a DROP policy in your iptables configuration instead of REJECT!
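The mechanical edit itself is trivial. Here's a sketch of the kind of sed pass involved, shown against an inline sample rather than the real file, since the exact REJECT rule text varies between setups:

```shell
# Rewrite a trailing REJECT rule into a DROP. In practice you would run
# sed -i on /etc/sysconfig/iptables, eyeball the result, and then
# restart iptables to load it.
printf -- '-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited\n' |
  sed 's/-j REJECT.*/-j DROP/'
# prints: -A RH-Firewall-1-INPUT -j DROP
```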


An update on the CentOS 5 server setup series

I posted back in May about the start of my CentOS 5 Server Setup series and I wanted to give a quick update on it. Since writing the first guide, I have made a few changes and additions to the original getting started and mail server guides, in addition to posting several new guides:

Enjoy! If you have any questions or comments I would be happy to hear some feedback. You can reach me at s.adam@diffingo.com.


First part of CentOS 5 server setup howto series now available

After much research, experimentation, testing and tweaking I'm happy to announce that I have completed the first part of my CentOS 5 server setup howto series!

As of today, you'll notice a new CentOS 5 Howtos link on the site, where I have listed the first two parts of the howto series: the getting started howto, which will help you set up a basic system environment, and more importantly, the mail server howto, which documents how to set up a secure mail server offering POP3/IMAP/SMTP with virtual users stored in a MySQL database.

I'm very happy with this setup because it uses virtual users that can be mapped to system users and also keeps the software set relatively small; Dovecot is used for SASL authentication (both for POP3/IMAP and SMTP) and as Postfix's local delivery agent, so with only two servers we've got it all covered (of course, technically it's three servers with an extra transport if you take amavisd and response-lmtpd into account).
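As a rough illustration of how little glue this takes, the Postfix side amounts to something like the following (the parameter names are real Postfix options; the values shown are the typical CentOS 5 defaults, not necessarily the howto's exact configuration):

```
# main.cf: let Dovecot handle SMTP AUTH...
smtpd_sasl_type = dovecot
smtpd_sasl_path = private/auth

# ...and hand local delivery to Dovecot's LDA
mailbox_command = /usr/libexec/dovecot/deliver
```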

The virtual user database is currently only used in this tutorial for the mail server, but I have plans to introduce (with an upgrade path) a new database structure that will unify several authentication data pools and make managing clients on a shared hosting server easier... But I'll talk more about that once I've finished posting my other guides. I plan on adding guides for other services such as DNS & web, although I cannot promise when those will be finished. The mail server tutorial alone is 16 printed pages (!), so it does take me quite some time to ensure that each tutorial is well documented and that the configurations listed work properly.

I still have to add some notes here and there about the implementation, but the core material is there. Enjoy!


Cryptic MySQL error

Today, while testing one of my PHP applications, I received this error when attempting to connect to a MySQL database:

Warning:  mysql_connect() [function.mysql-connect]: OK packet 6 bytes shorter
than expected in index.php on line 29

Warning:  mysql_connect() [function.mysql-connect]: mysqlnd cannot connect to
MySQL 4.1+ using old authentication in index.php on line 29

The script giving the error was running on OS X 10.6.4 with the stock PHP 5.3.1. After doing a bit of searching and reading the MySQL documentation on the old password format, I was a bit confused because I ran this on the server:
[user@host ~]# rpm -q mysql mysql-server

Both the server and client should support the new authentication version, which was introduced all the way back in MySQL 4.1. So why wouldn't it connect?

It turns out that CentOS 5 disables the new password hashes by default, in favour of remaining compatible with 3.x (and earlier) MySQL clients. All you have to do is edit /etc/my.cnf and comment out the old_passwords=1 line. After restarting the server, you should notice that running SELECT PASSWORD('foobar'); at a MySQL prompt returns a 41-character hash instead of the old-style 16-character hash. Reset the user passwords to start using the new hashes and you'll be good to go.
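In other words, the change is just this (a sketch; the verification step shows what to expect before and after):

```
# /etc/my.cnf
[mysqld]
# Commented out to enable the 4.1+ password hashes:
#old_passwords=1

# Verify after restarting mysqld:
#   mysql> SELECT PASSWORD('foobar');
# Old format: 16-character hash; new format: 41 characters, starting with '*'
```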