Policy-based routing on Linux to forward packets from a subnet or process through a VPN

In my last post, I covered how to route packets from a specific VLAN through a VPN on the USG. Here, I will show how to use policy-based routing on Linux to route packets from specific processes or subnets through a VPN connection on a Linux host in your LAN instead. You could then point to this host as the next-hop for a VLAN on your USG to achieve the same effect as in my last post.

Note that this post assumes modern tooling, including firewalld and NetworkManager, and that subnet is your LAN. This post will send packets coming from to VPN, but you could customize that as you see fit (e.g. send only specific hosts from your normal LAN subnet instead).

VPN network interface setup

First, let's create a VPN firewalld zone so we can easily apply firewall rules just to the VPN connection:

firewall-cmd --permanent --new-zone=VPN
firewall-cmd --reload

Next, create the VPN interface with NetworkManager:


# Setup VPN connection with NetworkManager
dnf install -y NetworkManager-openvpn
nmcli c add type vpn ifname vpn con-name vpn vpn-type openvpn
nmcli c mod vpn "VPN"
nmcli c mod vpn connection.autoconnect "yes"
nmcli c mod vpn ipv4.method "auto"
nmcli c mod vpn ipv6.method "auto"

# Ensure it is never set as default route, nor listen to its DNS settings
# (doing so would push the VPN DNS for all lookups)
nmcli c mod vpn ipv4.never-default "yes"
nmcli c mod vpn ipv4.ignore-auto-dns on
nmcli c mod vpn ipv6.never-default "yes"
nmcli c mod vpn ipv6.ignore-auto-dns on

# Set configuration options
nmcli c mod vpn "comp-lzo = adaptive, ca = /etc/openvpn/keys/vpn-ca.crt, password-flags = 0, connection-type = password, remote = remote.vpnhost.tld, username = $VPN_USER, reneg-seconds = 0"

# Configure VPN secrets for passwordless start
cat << EOF >> /etc/NetworkManager/system-connections/vpn

systemctl restart NetworkManager

Configure routing table and policy-based routing

Normally, a host has a single routing table and therefore only one default gateway. Static routes can be configured for specific next-hops, but that only tells the system how to route based on a packet's destination address, and we want to route based on a packet's source address. For this, we need multiple routing tables (one for normal traffic, another for VPN traffic) and Policy Based Routing (PBR) to define rules on how to select the right one.

First, let's create a second routing table for VPN connections:

cat << EOF >> /etc/iproute2/rt_tables
100 vpn
EOF
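
Appending to rt_tables a second time would define the table twice, so it can help to make this step safe to re-run. The following is a minimal sketch, not part of the original setup: the helper names table_id and ensure_table are hypothetical, and it assumes rt_tables uses the usual "ID NAME" layout with # comments.

```shell
# Print the numeric ID of a named routing table from an rt_tables-style file.
# Usage: table_id NAME [FILE]
table_id() {
  local name="$1" file="${2:-/etc/iproute2/rt_tables}"
  # Skip comment lines; match the table name in the second column
  awk -v name="$name" '!/^[[:space:]]*#/ && $2 == name { print $1; exit }' "$file"
}

# Append the table only if it is not already defined (safe to re-run).
# Usage: ensure_table ID NAME [FILE]
ensure_table() {
  local id="$1" name="$2" file="${3:-/etc/iproute2/rt_tables}"
  if [ -z "$(table_id "$name" "$file")" ]; then
    echo "$id $name" >> "$file"
  fi
}
```

With these defined, running ensure_table 100 vpn as root would have the same effect as the append above, but only the first time it is called.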

Next, set up an IP rule to select between routing tables for incoming packets based on their source address:

# Replace this with your LAN interface

# Route incoming packets on VPN subnet towards VPN interface
cat << EOF >> /etc/sysconfig/network-scripts/rule-$IFACE
from table vpn
EOF

Now that we can properly select which routing table to use, we need to configure routes on the vpn routing table:

cat << EOF > /etc/sysconfig/network-scripts/route-$IFACE
# Always allow LAN connectivity
dev $IFACE scope link metric 98 table vpn
dev $IFACE scope link metric 99 table vpn

# Blackhole by default to avoid privacy leaks if VPN disconnects
blackhole metric 100 table vpn
EOF

You'll note that nowhere do we actually define the default gateway - because we can't yet. VPN connections often dynamically allocate IPs, so we'll need to configure the default route for the VPN table to match that particular IP each time we start the VPN connection (we'll do so with a smaller metric figure than the blackhole above of 100, thereby avoiding the blackhole rule).

So, we will configure NetworkManager to trigger a script upon bringing up the VPN interface:

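cat << EOF > /etc/NetworkManager/dispatcher.d/90-vpn
#!/bin/bash
VPN_UUID="\$(nmcli con show vpn | grep uuid | tr -s ' ' | cut -d' ' -f2)"
INTERFACE="\$1"; ACTION="\$2"  # arguments supplied by the NetworkManager dispatcher

if [ "\$CONNECTION_UUID" == "\$VPN_UUID" ];then
  /usr/local/bin/configure_vpn_routes "\$INTERFACE" "\$ACTION"
fi
EOF
chmod 755 /etc/NetworkManager/dispatcher.d/90-vpn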

In that script, we will read the IP address of the VPN interface and install it as the default route. When the VPN is deactivated, we'll do the opposite and clean up the route we added:

cat << EOF > /usr/local/bin/configure_vpn_routes
#!/bin/bash
# Configures a secondary routing table for use with VPN interface

# Arguments passed in by the dispatcher script
interface="\$1"
action="\$2"
# Routing table defined in /etc/iproute2/rt_tables
vpn_table="vpn"
zone="\$(nmcli -t --fields c show vpn | cut -d':' -f2)"

clear_vpn_routes() {
  table="\$1"
  /sbin/ip route show via 192.168/16 table \$table | while read route;do
    /sbin/ip route delete \$route table \$table
  done
}

clear_vpn_rules() {
  keep=\$(ip rule show from 192.168/16)
  /sbin/ip rule show from 192.168/16 | while read line;do
    rule="\$(echo \$line | cut -d':' -f2-)"
    (echo "\$keep" | grep -q "\$rule") && continue
    /sbin/ip rule delete \$rule
  done
}

if [ "\$action" = "vpn-up" ];then
  ip="\$(/sbin/ip route get oif \$interface | head -n 1 | cut -d' ' -f5)"

  # Modify default route
  clear_vpn_routes \$vpn_table
  /sbin/ip route add default via \$ip dev \$interface table \$vpn_table

elif [ "\$action" = "vpn-down" ];then
  # Remove VPN routes
  clear_vpn_routes \$vpn_table
fi
EOF
chmod 755 /usr/local/bin/configure_vpn_routes

Bring up the VPN interface:

nmcli c up vpn
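
Once the connection is up, it is worth confirming that the dispatcher really installed a default route in the vpn table, since otherwise traffic only hits the blackhole. Here is a small sketch for that check; the function name vpn_default_route is hypothetical, and it assumes the usual `ip route show` line format of "default via ADDRESS dev INTERFACE".

```shell
# Read `ip route show table vpn` output on stdin. Print the gateway and
# device of the default route; exit non-zero if there is no default route.
vpn_default_route() {
  awk '$1 == "default" { print $3 " dev " $5; found = 1; exit }
       END { exit !found }'
}

# Example (run on the VPN host):
#   ip route show table vpn | vpn_default_route || echo "VPN default route missing"
```

The blackhole entry starts with the word "blackhole" rather than "default", so it is not mistaken for a working route.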

That's all, enjoy!

Sending all packets from a user through the VPN

I find this technique particularly versatile as one can also easily force all traffic from a particular user through the VPN tunnel:

# Replace this with your LAN interface

# Send any marked packets using VPN routing table
cat << EOF >> /etc/sysconfig/network-scripts/rule-$IFACE
fwmark 0x50 table vpn
EOF

# Mark all packets originating from processes owned by this user
firewall-cmd --permanent --direct --add-rule ipv4 mangle OUTPUT 0 -m owner --uid-owner LINUXUSER -j MARK --set-mark 0x50
# Enable masquerade on the VPN zone (enables IP forwarding between interfaces)
firewall-cmd --permanent --add-masquerade --zone=VPN

firewall-cmd --reload

Note that 0x50 is arbitrary; as long as the mark in the IP rule and the mark in the firewall rule match, you're fine.
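
Tools do not always print marks in the same notation (0x50 in hexadecimal is 80 in decimal), so comparing them as numbers avoids a false mismatch. A tiny sketch, with hypothetical helper names same_mark and mark_hex, relying only on shell arithmetic:

```shell
# Succeed when two firewall marks are numerically equal, whatever the notation.
same_mark() {
  [ "$(( $1 ))" -eq "$(( $2 ))" ]
}

# Print a mark in the hexadecimal form used in the rule file.
mark_hex() {
  printf '0x%x\n' "$(( $1 ))"
}
```

For example, same_mark 0x50 80 succeeds, while same_mark 0x50 0x51 fails.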

Migrating a live server to another host with no downtime

I have had a 1U server co-located for some time now at iWeb Technologies' datacenter in Montreal. So far I've had no issues and it did a wonderful job hosting websites & a few other VMs, but because of my concern for its aging hardware I wanted to migrate away before disaster struck.

Modern VPS offerings are a steal in terms of the performance they offer for the price, and Linode's 4096 plan caught my eye at a nice sweet spot. Backed by powerful CPUs and SSD storage, their VPS is blazingly fast and the only downside is I would lose some RAM and HDD-backed storage compared to my 1U server. The bandwidth provided with the Linode was also a nice bump up from my previous 10Mbps, 500GB/mo traffic limit.

When CentOS 7 was released I took the opportunity to immediately start modernizing my CentOS 5 configuration and test its configuration. I wanted to ensure full continuity for client-facing services - other than a nice speed boost, I wanted clients to take no manual action on their end to reconfigure their devices or domains.

I also wanted to ensure zero downtime. As the DNS A records are being migrated, I didn't want emails coming in to the wrong server (or clients checking stale inboxes until they started seeing the new mailserver IP). I can easily configure Postfix to relay all incoming mail on the CentOS 5 server to the IP of the CentOS 7 one to avoid any loss of emails, but there's still the issue that some end users might connect to the old server and get served their old IMAP inbox for some time.

So first things first, after developing a prototype VM that offered the same service set I went about buying a small Linode for a month to test the configuration with some of my existing user data from my CentOS 5 server. MySQL was sufficiently easy to migrate over and Dovecot was able to preserve all UUIDs, so my inbox continued to sync seamlessly. Apache complained a bit when importing my virtual host configurations due to the new 2.4 syntax, but nothing a few sed commands couldn't fix. So with full continuity out of the way, I had to develop a strategy to handle zero downtime.

With some foresight and DNS TTL adjustments, we can get near zero downtime assuming all resolvers comply with your TTL. Simply set your TTL to 300 (5 minutes) a day or so before the migration occurs and as your old TTL expires, resolvers will see the new TTL and will not cache the IP for as long. Even with a short TTL, that's still up to 5 minutes of downtime and clients often do bad things... The IP might still be cached (e.g. at the ISP, router, OS, or browser) for longer. Ultimately, I'm the one that ends up looking bad in that scenario even though I have done what I can on the server side and have no ability to fix the broken clients.
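
Before starting the migration, it is easy to confirm that the lowered TTL is actually being served. The sketch below parses the answer section printed by `dig +noall +answer`; the function names answer_ttl and ttl_lowered are hypothetical, and example.com stands in for your own domain.

```shell
# Read `dig +noall +answer` output on stdin and print the TTL of the first A record.
answer_ttl() {
  awk '$3 == "IN" && $4 == "A" { print $2; exit }'
}

# Succeed once the advertised TTL is at or below the target (default 300 seconds).
ttl_lowered() {
  local ttl
  ttl="$(answer_ttl)"
  [ -n "$ttl" ] && [ "$ttl" -le "${1:-300}" ]
}

# Example:
#   dig +noall +answer example.com A | ttl_lowered 300 && echo "TTL is low enough"
```

Keep in mind that a caching resolver reports the remaining TTL of its cached copy, so querying one of your authoritative name servers directly gives the configured value.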

To work around this, I discovered an incredibly handy tool socat that can make magic happen. socat routes data between sockets, network connections, files, pipes, you name it. Installing it is as easy as: yum install socat

A quick script later and we can forward all connections from the old host to the new host:


# Stop services on this host
for SERVICE in dovecot postfix httpd mysqld;do
  /sbin/service $SERVICE stop
done

# Some cleanup
rm /var/lib/mysql/mysql.sock

# Map the new server's MySQL to localhost:3307
# Assumes capability for password-less (e.g. pubkey) login
ssh $NEWIP -L 3307:localhost:3306 &
socat unix-listen:/var/lib/mysql/mysql.sock,fork,reuseaddr,unlink-early,unlink-close,user=mysql,group=mysql,mode=777 TCP:localhost:3307 &

# Map ports from each service to the new host
for PORT in 110 995 143 993 25 465 587 80 3306;do
  echo "Starting socat on port $PORT..."
  socat TCP-LISTEN:$PORT,fork TCP:${NEWIP}:${PORT} &
  sleep 1
done

And just like that, every connection made to the old server is immediately forwarded to the new one. This includes the MySQL socket (which is automatically used instead of a TCP connection when a host of 'localhost' is passed to MySQL).

Note how we establish an SSH tunnel mapping a connection to localhost:3306 on the new server to port 3307 on the old one instead of simply forwarding the connection and socket to the new server - this is done so that if you have users who are permitted on 'localhost' only, they can still connect (forwarding the connection will deny access due to a connection from an unauthorized remote host).
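
One practical caveat with the forwarding script: socat can only bind a port once the original service has released it, otherwise it fails with an "address already in use" error. The following sketch checks for leftover listeners by parsing `ss -ltn` output; the function name port_in_use is hypothetical, and it assumes ss is available on the old host (it ships with iproute).

```shell
# Read `ss -ltn` output on stdin; succeed if something listens on the given port.
port_in_use() {
  awk -v suffix=":$1" '
    $1 == "LISTEN" {
      addr = $4   # local address column, e.g. 0.0.0.0:80 or [::]:80
      if (substr(addr, length(addr) - length(suffix) + 1) == suffix) found = 1
    }
    END { exit !found }'
}

# Example, run after stopping the services and before starting socat:
#   for PORT in 110 995 143 993 25 465 587 80 3306; do
#     if ss -ltn | port_in_use "$PORT"; then echo "port $PORT is still in use"; fi
#   done
```

Matching on the ":PORT" suffix keeps port 80 from being confused with 8080.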

Update: a friend has pointed out this video to me, if you thought 0 downtime was bad enough... These guys move a live server 7km through public transport without losing power or network!

Building a home media server with ZFS and a gaming virtual machine

Work has kept me busy lately so it's been a while since my last post... I have been doing lots of research and collecting lots of information over the holiday break and I'm happy to say that in the coming days I will be posting a new server setup guide, this time for a server that is capable of running redundant storage (ZFS RAIDZ2), sharing home media (Plex Media Server, SMB, AFP) as well as a full Windows 7 gaming rig simultaneously!

Windows runs in a virtual machine and is assigned its own real graphics card from the host's hardware using the brand-new VFIO PCI passthrough technique with the VGA quirks enabled. This does require a motherboard and CPU with support for IOMMU, more commonly known as VT-d or AMD-Vi.

How to Convert a GPT disk layout to a MS-DOS/MBR layout without data loss (and Gigabyte Hybrid EFI)

If you're coming here from Google searching for how to convert a GPT disk layout to MS-DOS/MBR and don't want to read through my (probably boring) story, skip ahead to "The actual GPT to MBR conversion" below ;)

Adventures with Hybrid EFI

My gaming PC has been long overdue for a reformat. I naively allocated only 30GB to the Windows partition (and the other 120GB to 3 flavours of Linux) thinking I wouldn't use Windows for much other than Starcraft 2, but a few months back I had the urge to play Battlefield 2 again. Ever since installing and fully patching it, disk space has been running pretty tight. I had to disable sleep, hibernation as well as system restore and still only had 4GB of free space, so my filesystem became fragmented easily. With the release of Windows 8 Consumer Preview (download it free here), I figured it was a good time to reformat my disk and reinstall all my OSs from scratch.

I figured while I'm at it, I would make all of the big changes at once and enabled EFI booting on my Gigabyte GA-Z68A-D3H board. Little did I know that when the BIOS says "EFI," it really means Gigabyte's "Hybrid EFI" implementation and not UEFI (although in retrospect, the fact that I made the change in the BIOS should have been enough of a hint, right?). With Hybrid EFI enabled, Windows 7 and Windows 8CP installed perfectly and even created a nice GPT disk layout, so I reinstalled my games and activated Windows 7. Then I rebooted to play around in Windows 8CP for a bit (I do not like it, btw).

I then tried installing Fedora 16. To my surprise EFI booting failed every time, despite all of the Fedora 16 installation media being EFI-capable. When attempting to boot from my Fedora 16 Live (x86_64) USB key I would just get a black screen with "........." printed one dot at a time and then it would proceed to fall back to the next boot device (Windows boot manager on the hard disk). Upon re-examining my BIOS settings, I was disappointed to find that the setting was actually called "CD/DVD EFI Boot Option", indicating that perhaps USB EFI booting was not supported. Fair enough, I burnt the same F16 image I was using on the USB key to a CD and tried again. The same "........." text appeared.

It was then as I went back to boot Windows 7 that I discovered my attempts to set it as the default OS from Windows 8CP removed my capability of booting Windows 7 somehow. At this point it was 2AM and I was fed up with this stupid Hybrid EFI. I looked for a way to revert to a good old MS-DOS/MBR partition layout. After some Googling I stumbled across Rod Smith's website. He has extensive documentation on EFI booting, including with Gigabyte's implementation of Hybrid EFI. He says that it shares a large amount of code with EFI DUET (tianocore) and although it does work natively with Windows 7, it is not a full UEFI implementation. That would explain the problems I was having with Fedora, then.

The actual GPT to MBR conversion

Through Rod Smith's guidance and a few dirty tricks, I was successfully able to convert my GPT disk - without data loss or deleting any partitions - and then boot Windows 7 in legacy/MBR mode. In order to do this you'll need your Windows installation media at hand as well as a copy of the Fedora 16 Live media. If you don't have a copy of Fedora 16 Live handy, you can download the Live media ISO (64-bit) from a local mirror here. See the Fedora 16 Installation Guide for details on burning this image to a CD or on creating a bootable USB key.

Keep in mind that at this point I only had 3 partitions and a bunch of unpartitioned space on the disk, so conversion was a rather straightforward process (all GPT partitions mapped directly to primary partitions). Although it is theoretically possible to convert GPT disks with more than 4 partitions by defining which ones are to be logical partitions after conversion, I have not tested this.
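
If you are unsure whether your disk falls into the simple case, you can count its partitions from the Fedora Live session once gdisk is installed. This is a rough sketch that parses the table printed by `sgdisk -p` (sgdisk is part of the gdisk package); the function name count_gpt_partitions is hypothetical, and it assumes each partition row starts with three numeric columns (number, start sector, end sector).

```shell
# Read `sgdisk -p` output on stdin and print the number of partition rows.
count_gpt_partitions() {
  awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { n++ }
       END { print n + 0 }'
}

# Example (as root):
#   sgdisk -p /dev/sda | count_gpt_partitions
```

A result of 4 or less means every partition can map directly to a primary MBR partition, as it did in my case.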

  1. Boot your Fedora 16 Live media and wait for your session to start. If you're having troubles booting, press Tab at the boot loader screen and try booting with the nomodeset parameter added.
  2. Depending on your graphics card, you'll either be presented with the new Gnome 3 Shell or with the traditional interface. Start a terminal session by putting your mouse in the top right corner of the screen and typing "terminal" in the search (Gnome Shell) or by selecting Applications > System Tools > Terminal (traditional interface)
  3. Install gdisk:
    su -
    yum -y install gdisk

    This may take a few moments.

  4. Make a backup of your current GPT scheme (sgdisk is installed as part of the gdisk package):
    sgdisk -b sda-preconvert.gpt /dev/sda
  5. Now we will attempt to convert your GPT disk layout to MS-DOS/MBR. Start gdisk:
    gdisk /dev/sda

    You should be prompted with:

    Command (? for help):
  6. Press r to start recovery/transformation.
  7. Press g to convert GPT to MBR.
  8. Press p to preview the converted MBR partition table.
  9. Make any modification necessary to the partition layout. See Rod Smith's Converting to or from GPT page for more details on this.
  10. When you're happy with the MS-DOS/MBR layout, press w to write changes to the disk.
  11. Shut down Fedora 16 and boot from the Windows 7 installation media.
  12. Enter your language & keyboard layout and then select the option to repair your computer in the bottom left corner.
  13. From the available options, select Startup Repair. Windows will ask for a reboot.
  14. Follow the previous three steps again to boot the Windows 7 installation and run startup repair
  15. Once again, boot the Windows 7 installation media but this time opt to open a command prompt instead of choosing startup repair. Type:
    bootrec /scanos
    bootrec /rebuildbcd
    bootrec /fixmbr
    bootrec /fixboot
  16. Close the command prompt and run Startup Repair one last time.

That's it! You should now have a bootable installation of Windows 7 on an MBR partition layout.


Installing the Darwin Calendar Server 2.4 on Fedora 13 or Fedora 14

As I mentioned in my last post, I've been playing with the Darwin Calendar Server (DCS) on Linux... Today I was able to re-test my setup notes to see if they worked properly, so below I've written a tutorial on how to get your own DCS server going on Fedora 13 or 14.

Installing Dependencies

Since we will be installing CalendarServer directly from the 2.4 branch subversion repository, the first thing to do is to install subversion and the dependencies for DCS:

su -
# Required to check out the source code from the repository
yum install subversion
# Dependencies
yum install patch memcached krb5-devel python-zope-interface PyXML pyOpenSSL python-kerberos
# Requirements for compiling xattr
yum install python-setuptools gcc gcc-c++ python-devel

Enable extended file attributes (xattrs)

DCS requires user extended file attributes so the user_xattr mount option must be enabled for the partition on which CalendarServer will be storing its documents and data (in this case, /srv). If you have not already enabled this option (it is disabled by default), edit /etc/fstab and add the user_xattr mount option after defaults, for example:

/dev/mapper/VolGroup-lv_root /                       ext4    defaults,user_xattr        1 1
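
To double-check the edit before remounting, a quick lookup of the options column in /etc/fstab can help. This is only a sketch: the function name fstab_has_option is hypothetical, and it assumes the standard six-column fstab layout.

```shell
# Succeed if the fstab entry for MOUNTPOINT lists OPTION in its options column.
# Usage: fstab_has_option MOUNTPOINT OPTION [FILE]
fstab_has_option() {
  local mountpoint="$1" option="$2" file="${3:-/etc/fstab}"
  awk -v mp="$mountpoint" -v opt="$option" '
    !/^[[:space:]]*#/ && $2 == mp {
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
    }
    END { exit !found }' "$file"
}

# Example:
#   fstab_has_option / user_xattr && echo "user_xattr is set for /"
```

The new option only takes effect after the filesystem is remounted, for example with mount -o remount / or a reboot.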

Grab DCS from SVN and run auto-setup

Once these packages have been installed and extended file attributes have been enabled, we will begin setting up the CalendarServer as your regular, non-root user.

# Directory to hold CalendarServer checkout and its dependencies
mkdir CalendarServer
cd CalendarServer
# Checkout the code from the repo
svn checkout CalendarServer-2.4
cd CalendarServer-2.4
# Start auto-setup
./run -s

Auto-setup will now attempt to grab any missing dependencies for CalendarServer and will unpack and patch them accordingly. You may find that the download for PyDirector stalls - if so, hit Ctrl+C to abort setup and download it manually:

pushd ..
tar xfz pydirector-1.0.0.tar.gz
popd
# Resume unpacking
./run -s

Prepare for installation

Since DCS bundles a modified version of Twisted as well as a few other projects (such as pydirector), we will now prepare an installation root folder to avoid conflicts with system libraries (i.e., Twisted if it has been installed from the Fedora repos). This code will be run as root.

su -
# setup data & document roots
mkdir -p /srv/CalendarServer/{Data,Documents}
chown -R daemon:daemon /srv/CalendarServer/
# setup installation root
mkdir -p /opt/CalendarServer/etc/caldavd
mkdir -p /opt/CalendarServer/var/run/caldavd
mkdir -p /opt/CalendarServer/var/log/caldavd

Install DCS and configure the server instance

The last step is to install DCS from the Subversion checkout we made earlier into the installation root. Replace /home/regularuser with the actual path to the home directory of your regular user.

# install DCS to installation root
cd /home/regularuser/CalendarServer/CalendarServer-2.4
./run -i /opt/CalendarServer
rm -rf /opt/CalendarServer/usr/caldavd/caldavd.plist
# copy sample configuration files
cp conf/servertoserver-test.xml /opt/CalendarServer/etc/caldavd/servertoserver.xml
cp conf/auth/accounts.xml /opt/CalendarServer/etc/caldavd/accounts.xml
cp conf/caldavd-test.plist /opt/CalendarServer/etc/caldavd/caldavd.plist
cp conf/sudoers.plist /opt/CalendarServer/etc/caldavd/sudoers.plist
# change permissions; passwords are stored plaintext!
chmod 600 /opt/CalendarServer/etc/caldavd/*

I have reported bugs #390 and #391 about problems with the setup script on 64-bit machines as well as a problem if a custom destination installation directory is used (which we did). This bit of code works around both of the bugs:

# 64-bit fix - see
sitelib="$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())')"
sitearch="$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib(1))')"
if [ "$sitelib" != "$sitearch" ];then
  mv /opt/CalendarServer"${sitelib}"/twisted/plugins/* /opt/CalendarServer"${sitearch}"/twisted/plugins
  # PYTHONPATH fix for 64-bit - see
  sed -i.orig 's|PYTHONPATH="'"${sitelib}"'|DESTDIR=/opt/CalendarServer\nPYTHONPATH="${DESTDIR}'"${sitelib}"':${DESTDIR}'"${sitearch}"':|' /opt/CalendarServer/usr/bin/caldavd
else
  # PYTHONPATH fix for 32-bit - see
  sed -i.orig 's|PYTHONPATH="'"${sitelib}"'|DESTDIR=/opt/CalendarServer\nPYTHONPATH="${DESTDIR}'"${sitelib}"':|' /opt/CalendarServer/usr/bin/caldavd
fi

If you would like your server to use SSL (highly recommended), you will need to generate a certificate. If you have a certificate and key ready to install, place it in /opt/CalendarServer/etc/tls. If not, you can easily generate a free self-signed one:

# Generate SSL keys
mkdir /opt/CalendarServer/etc/tls
openssl req -new -newkey rsa:1024 -days 365 -nodes -x509 -keyout -out

Now, edit /opt/CalendarServer/etc/caldavd/caldavd.plist in your favorite editor and configure the server as follows:

    <!-- Network host name [empty = system host name] -->
    <string></string> <!-- The hostname clients use when connecting -->

# Data roots
    <!-- Data root -->
    <!-- Document root -->

# Test accounts configuration
    <!-- XML File Directory Service -->

# Sudoers configuration
    <!-- Principals that can pose as other principals -->

# Delete this section
<!-- Wikiserver authentication (Mac OS X) -->

# logging

    <!-- Apache-style access log -->

    <!-- Server activity log -->

    <!-- Log levels -->
    <string>info</string> <!-- debug, info, warn, error -->
# a bit further down…
    <!-- Global server stats -->
# <snip>
    <!-- Server statistics file -->
    <!-- Server process ID file -->

# SSL 
    <!-- Public key -->
    <!-- Private key -->

# Privilege drop
        Process management

# iSchedule server-to-server settings
      <!-- iSchedule protocol options -->

# Communication socket
    <!-- A unix socket used for communication between the child and master processes.
         An empty value tells the server to use a tcp socket instead. -->

# Twisted

# Load balancer
        Python Director





Try starting the server!

/opt/CalendarServer/usr/bin/caldavd -T /opt/CalendarServer/usr/bin/twistd -f /opt/CalendarServer/etc/caldavd/caldavd.plist -X

If all goes well, press Ctrl+C to kill the process and then daemonize it:

/opt/CalendarServer/usr/bin/caldavd -T /opt/CalendarServer/usr/bin/twistd -f /opt/CalendarServer/etc/caldavd/caldavd.plist