CentOS

Avoiding kernel crashes when under DDoS attacks with CentOS 5

iWeb Technologies has been the victim of several [1 2 3 4] distributed denial of service (DDoS) attacks over the past three or so weeks, and it has become a rather irritating issue for iWeb's customers. Not all of their servers or their customers' co-located servers are affected in each attack, but during the attacks there is often a general slowdown on their network and a minority of the servers (both hosted and co-located) experience some packet loss.

The server hosting this website was, unfortunately, one of the servers that went completely offline during the DDoS attack on October 14th. iWeb has been a great host so far and their staff looked into the issue immediately after I submitted a support ticket, so within 30 minutes I had a KVM/IP attached to my server.

After an hour or so iWeb had the DDoS attacks under control, but my server's kernel would panic within roughly 10 minutes of the network going up. The time between kernel panics was inconsistent, but the server would crash every time the network went up, given a bit of time. I found this message repeated in my server logs every time before a hang:

ipt_hook: happy cracking.

That led me to this post dating back to November 2003 on the Linux Kernel Mailing List (LKML) about the same message. I jumped on irc.freenode.net and joined #netfilter to ask what they thought about the message. While I waited for a response from #netfilter, I looked into installing kexec in order to get a backtrace, since the kernel oops messages were not being logged (this tutorial proved especially handy).

The backtrace revealed that the problem was indeed in the netfilter/iptables kernel modules, but seemed to be triggered by QEMU-KVM:

Process qemu-kvm (pid: 3911, threadinfo ffff8101fc74c000, task ffff8101fe4ba100)
Stack:  0001d7000001d600 0001d9000001d800 0001db000001da00 0001dd000001dc00
0001df000001de00 0001e1000001e000 0001e3000001e200 0001e5000001e400
0001e7000001e600 0001e9000001e800 ffff81020001ea00 ffff8101fe5d8bc0
Call Trace:
<IRQ>  [<ffffffff80236459>] dev_hard_start_xmit+0x1b7/0x28a
[<ffffffff88665bab>] :ip_tables:ipt_do_table+0x295/0x2fa
[<ffffffff886c7b2c>] :bridge:br_nf_post_routing+0x17c/0x197
[<ffffffff80034077>] nf_iterate+0x41/0x7d
[<ffffffff886c7816>] :bridge:br_nf_local_out_finish+0x0/0x9b
[<ffffffff800565d5>] nf_hook_slow+0x58/0xbc
[<ffffffff886c7816>] :bridge:br_nf_local_out_finish+0x0/0x9b
[<ffffffff8000f470>] __alloc_pages+0x78/0x308
[<ffffffff886c85f9>] :bridge:br_nf_local_out+0x23f/0x25e
[<ffffffff80034077>] nf_iterate+0x41/0x7d
[<ffffffff886c3192>] :bridge:br_forward_finish+0x0/0x51
[<ffffffff800565d5>] nf_hook_slow+0x58/0xbc
[<ffffffff886c3192>] :bridge:br_forward_finish+0x0/0x51
[<ffffffff8025042d>] rt_intern_hash+0x474/0x4a0
[<ffffffff886c3367>] :bridge:__br_deliver+0xb4/0xfc
[<ffffffff886c2294>] :bridge:br_dev_xmit+0xc7/0xdb
[<ffffffff80236459>] dev_hard_start_xmit+0x1b7/0x28a
[<ffffffff8002f76f>] dev_queue_xmit+0x1f3/0x2a3
[<ffffffff80031e19>] ip_output+0x2ae/0x2dd
[<ffffffff8025359a>] ip_forward+0x24f/0x2bd
[<ffffffff8003587a>] ip_rcv+0x539/0x57c
[<ffffffff80020c21>] netif_receive_skb+0x470/0x49f
[<ffffffff886c3ef9>] :bridge:br_handle_frame_finish+0x1bc/0x1d3
[<ffffffff886c801b>] :bridge:br_nf_pre_routing_finish+0x2e9/0x2f8
[<ffffffff886c7d32>] :bridge:br_nf_pre_routing_finish+0x0/0x2f8
[<ffffffff800565d5>] nf_hook_slow+0x58/0xbc
[<ffffffff886c7d32>] :bridge:br_nf_pre_routing_finish+0x0/0x2f8
[<ffffffff886c8c18>] :bridge:br_nf_pre_routing+0x600/0x61c
[<ffffffff80034077>] nf_iterate+0x41/0x7d
[<ffffffff886c3d3d>] :bridge:br_handle_frame_finish+0x0/0x1d3
[<ffffffff800565d5>] nf_hook_slow+0x58/0xbc
[<ffffffff886c3d3d>] :bridge:br_handle_frame_finish+0x0/0x1d3
[<ffffffff886c407e>] :bridge:br_handle_frame+0x16e/0x1a4
[<ffffffff800a4e3f>] ktime_get_ts+0x1a/0x4e
[<ffffffff80020b34>] netif_receive_skb+0x383/0x49f
[<ffffffff8003055c>] process_backlog+0x89/0xe7
[<ffffffff8000ca51>] net_rx_action+0xac/0x1b1
[<ffffffff80012562>] __do_softirq+0x89/0x133
[<ffffffff8005e2fc>] call_softirq+0x1c/0x28
<EOI>  [<ffffffff8006d636>] do_softirq+0x2c/0x7d
[<ffffffff8004de57>] netif_rx_ni+0x19/0x1d
[<ffffffff887b951d>] :tun:tun_chr_writev+0x3b4/0x402
[<ffffffff887b956b>] :tun:tun_chr_write+0x0/0x1f
[<ffffffff800e3307>] do_readv_writev+0x172/0x291
[<ffffffff887b956b>] :tun:tun_chr_write+0x0/0x1f
[<ffffffff80041ef3>] do_ioctl+0x21/0x6b
[<ffffffff8002fff5>] vfs_ioctl+0x457/0x4b9
[<ffffffff800b9c60>] audit_syscall_entry+0x1a8/0x1d3
[<ffffffff800e34b0>] sys_writev+0x45/0x93
[<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: c3 41 56 41 55 41 54 55 48 89 fd 53 8b 87 88 00 00 00 89 c2
RIP  [<ffffffff80268427>] icmp_send+0x5bf/0x5c0
RSP <ffff810107bb7830>

This was interesting, as I do have several KVM virtual machines running on the server, some with bridged networking and others with shared networking. The #netfilter guys confirmed that the happy cracking message was due to the attempted creation of malformed packets by root. My guess was that some of the packets from the DDoS attacks were hitting my server, so the bridge was faithfully attempting to forward those invalid packets to one of the virtual machines' network interfaces, causing problems in icmp_send().
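To see at a glance which kernel modules a trace like the one above passes through, the module-tagged frames can be picked out with sed. This is only a rough sketch: the function name modules_in_trace is made up for illustration, and it assumes the ":module:symbol" frame format that this kernel prints.

```shell
#!/bin/sh
# Hypothetical helper: list the kernel modules that appear in an oops
# call trace read from stdin. It assumes frames from modules look like
#   [<ffffffff88665bab>] :ip_tables:ipt_do_table+0x295/0x2fa
# as in the trace above; frames from the core kernel carry no ":module:"
# tag and are skipped.
modules_in_trace() {
    sed -n 's/^.*] :\([a-z0-9_]*\):.*$/\1/p' | sort -u
}
```

Fed the call trace above, it prints bridge, ip_tables and tun, which matches the reading that the packet came in through a tun device, crossed the bridge and ended up in the iptables code.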

The LKML message hinted that the REJECT target could be at fault, so I opened up /etc/sysconfig/iptables and changed the REJECT rules (marked below) to DROP:

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -m physdev  --physdev-is-bridged -j ACCEPT
# ... a bunch of forwarding rules for the shared network VMs
-A FORWARD -o virbr0 -j DROP # <-- THIS ONE
-A FORWARD -i virbr0 -j DROP # <-- THIS ONE
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
# ... a bunch of ACCEPT rules for the server
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j DROP # <-- THIS ONE
COMMIT

And that was all it took! I did not experience a hang after that. According to the guys in #netfilter, newer kernels reassemble packets from scratch when forwarding them, so if a machine is sent an invalid packet this problem is averted. However, CentOS 5 uses an older kernel and I guess this hasn't been backported.
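The REJECT-to-DROP change described above can also be scripted. What follows is a minimal sketch rather than the exact edit I made by hand: the function name reject_to_drop is made up for illustration, and it assumes the rules file uses the usual "-j REJECT --reject-with ..." form.

```shell
#!/bin/sh
# Hypothetical helper: rewrite every REJECT target in iptables-save
# style rules (read from stdin) to DROP, discarding the optional
# "--reject-with <type>" argument, which DROP does not accept.
reject_to_drop() {
    sed 's/-j REJECT\( --reject-with [^ ]*\)\{0,1\}/-j DROP/'
}
```

One way to use it is to write the result to a scratch file (for example, reject_to_drop < /etc/sysconfig/iptables > /tmp/iptables.new), compare it against the original, copy it into place and then run service iptables restart.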

TL;DR: If you are using CentOS 5 (or any other distro with an older kernel), use a DROP policy in your iptables configuration instead of REJECT!


An update on the CentOS 5 server setup series

I posted back in May about the start of my CentOS 5 Server Setup series and I wanted to give a quick update on it. Since writing the first guide, I have made a few changes and additions to the original getting started and mail server guides, in addition to posting several new guides:

Enjoy! If you have any questions or comments I would be happy to hear some feedback. You can reach me at s.adam@diffingo.com.


First part of CentOS 5 server setup howto series now available

After much research, experimentation, testing and tweaking I'm happy to announce that I have completed the first part of my CentOS 5 server setup howto series!

As of today, you'll notice a new CentOS 5 Howtos link, where I have listed the first two parts of the howto series: the getting started howto, which will help you set up a basic system environment, and more importantly the mail server howto, which documents how to set up a secure mail server offering POP3/IMAP/SMTP with virtual users stored in a MySQL database.

I'm very happy with this setup because it uses virtual users that can be mapped to system users and also keeps the software set relatively small. Dovecot is used for SASL authentication (both for POP3/IMAP and SMTP) and as Postfix's local delivery agent, so with only 2 servers we've got it all covered (of course, technically it's 3 servers with an extra transport if you take amavisd and response-lmtpd into account).

The virtual user database is currently only used in this tutorial for the mail server, but I have plans to introduce (with an upgrade path) a new database structure that will unify several authentication data pools and make managing clients for a shared hosting server easier... But I'll talk more about that later once I've finished posting my other guides. I plan on adding ones for other services such as DNS & Web, although I cannot promise when those will be finished. The mail server tutorial alone is 16 printed pages (!) so it does take me quite some time to ensure that the tutorial is well documented and that the configurations listed work properly.

I still have to add some notes here and there about the implementation, but the core material is there. Enjoy!


Cryptic MySQL error

Today as I was attempting to test one of my PHP applications, I received this error after attempting to connect to a MySQL database:

Warning:  mysql_connect() [function.mysql-connect]: OK packet 6 bytes shorter
than expected in index.php on line 29

Warning:  mysql_connect() [function.mysql-connect]: mysqlnd cannot connect to
MySQL 4.1+ using old authentication in index.php on line 29

The script giving the error was running on OS X 10.6.4 with the stock PHP 5.3.1. After doing a bit of searching and reading the MySQL documentation on the old password format, I was a bit confused because I ran this on the server:

[user@host ~]# rpm -q mysql mysql-server
mysql-5.0.77-4.el5_5.3
mysql-server-5.0.77-4.el5_5.3

Both the server and client should support the new authentication version, which was introduced all the way back in MySQL 4.1. So why wouldn't it connect?

It turns out that CentOS 5 disables the new password hashes by default in favour of remaining compatible with 3.x (and earlier) MySQL clients. All you have to do is edit /etc/my.cnf and comment out the old_passwords=1 line. After restarting the server, you should notice that running SELECT PASSWORD('foobar'); in a MySQL prompt returns 41-character hashes, not the old-style 16-character hashes. Reset the user passwords to start using the new hashes and you'll be good to go.
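The my.cnf edit described above can be sketched as a small sed filter. The function name disable_old_passwords is made up for illustration, and the sketch assumes the setting appears on a line of its own as old_passwords=1 (optionally with spaces around the equals sign).

```shell
#!/bin/sh
# Hypothetical helper: comment out an active "old_passwords=1" line in
# my.cnf text read from stdin. Lines that are already commented out, and
# all other settings, pass through unchanged.
disable_old_passwords() {
    sed 's/^\([[:space:]]*old_passwords[[:space:]]*=[[:space:]]*1[[:space:]]*\)$/#\1/'
}
```

As with any config edit, it is safer to write the output to a scratch file (for example, disable_old_passwords < /etc/my.cnf > /tmp/my.cnf.new), review it, copy it into place and then run service mysqld restart. After that, each account still has to be given a new hash, for instance with SET PASSWORD FOR 'someuser'@'localhost' = PASSWORD('new password'); where the account name and password are placeholders.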
