Explaining My Configs: nftables

Explaining the configuration files I use in detail, this time: nftables


This post is part of my Explaining My Configs series where I explain the configuration files (and options) I use in detail.

This post could either be read as a whole, or as a reference (click on a line to jump to its explanation).

This post covers nftables, the next-generation packet filtering subsystem of the Linux kernel. It's meant to replace iptables (and its siblings ip6tables, arptables and ebtables) on top of the kernel's netfilter framework, and has many advantages. Not long ago I decided to decipher my iptables rules and migrate to nftables. This configuration is the result of that effort. The resulting nftables rules are more readable, more maintainable and less redundant than the previous IPv4 and IPv6 iptables equivalents, and if only because of that, I feel the migration was worth it.

I implemented a rather basic firewall. I use it to protect my servers, and I think it suffices. If you feel otherwise, please let me know. I hope to make these posts live examples of my configurations and will adjust them as I encounter new scenarios that I need to protect against.

Keep in mind, since this is a firewall configuration, in this post I assume some basic understanding of networking and firewalls.

Edit: this config has evolved since it was first published. While investigating nftables-related issues I also came across new resources and got useful suggestions from people on the nftables IRC channel, namely arturo and evilman.

What is this config for?

Unlike many other configurations, a firewall usually has many goals and touches many areas. For example, my firewall is configured to make my server more secure by filtering some kinds of traffic, but it also implements port forwarding (see my previous post about VPN port forwarding) and a NAT gateway for my VPN server. Therefore the "why" of this configuration is less clear-cut, but I hope my use cases become clearer as I go through the config.

The config file

Click on a line to jump to its explanation.

 

Reviewing the config

flush ruleset

Clears the previous ruleset. This flushes out all the tables, chains and rules so we can start clean. This is not done automatically so without this, previously added rules would still be in effect.
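To illustrate where this line sits, here is a minimal sketch of a complete ruleset file, not my actual config. The interpreter path /usr/sbin/nft and the single permissive chain are assumptions for the example. When a file like this is loaded with nft -f, the flush and the new rules are applied as one atomic transaction, so there is no window in which the machine runs without rules.

```nftables
#!/usr/sbin/nft -f
# Assumption: nft lives at /usr/sbin/nft; adjust the path for your distribution.

# Remove every table, chain, rule and set that is currently loaded.
flush ruleset

# Illustrative placeholder table; a real config would define its rules here.
table inet filter {
    chain input {
        type filter hook input priority 0; policy accept;
    }
}
```

Be careful when trying this on a live machine: loading it really does wipe the existing ruleset.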

table inet filter {

This defines a table. A table is a container for chains (and sets). This line defines a table with the family inet and the name filter. inet is a combined family that covers both IPv4 and IPv6. I wanted a shared config between the two, so I chose this one; alternatively, I could have restricted the table to one protocol by using ip or ip6.
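As a small sketch of what the inet family buys you, the hypothetical table below (the name both_demo and the documentation address ranges are made up for the example) mixes IPv4-only, IPv6-only and protocol-agnostic rules in one chain. The counter statements only count packets and change nothing.

```nftables
table inet both_demo {
    chain input {
        type filter hook input priority 0; policy accept;

        # The "ip" prefix restricts this rule to IPv4 packets.
        ip saddr 192.0.2.0/24 counter

        # The "ip6" prefix restricts this rule to IPv6 packets.
        ip6 saddr 2001:db8::/32 counter

        # No address family prefix: matches both IPv4 and IPv6.
        tcp dport 22 counter
    }
}
```

With separate ip and ip6 tables, the last rule would have to be written twice.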

    set tcp_accepted {
        type inet_service; flags interval;
        elements = {
            http, https,
            ssh,
            xmpp-client, xmpp-server,
        }
    }
    set udp_accepted {
        type inet_service; flags interval;
        elements = {
            openvpn,
            60000-61000, # mosh
        }
    }

This creates two sets of type inet_service (port number or range). The flags interval directive enables ranges (like the mosh one). After that, you just set elements to add members to the set. Having a set is a clean and efficient way to later reference all of these values in the rules. We can use port numbers, service names and port ranges.

Sets provide a performant way to use generic sets of data, that can be dynamically manipulated, so they are very suitable for tasks like IP blocking. More about sets on the nftables wiki page.

These two are the lists of allowed ports that will be used later.
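Since sets are a good fit for IP blocking, here is a minimal sketch of that idea. It is not part of my config: the table name set_demo, the set name blocklist and the addresses (taken from the documentation ranges) are all made up for the example.

```nftables
table inet set_demo {
    # A named set of IPv4 addresses; "flags interval" allows prefixes and ranges.
    set blocklist {
        type ipv4_addr; flags interval;
        elements = {
            203.0.113.0/24,
            198.51.100.7,
        }
    }

    chain input {
        type filter hook input priority 0; policy accept;

        # Drop anything whose source address is in the set.
        ip saddr @blocklist drop
    }
}

# Sets can be changed later without touching the rules that reference them.
add element inet set_demo blocklist { 192.0.2.15 }
```

The same add element command works from the shell when prefixed with nft, which is what makes sets convenient for scripts that block addresses on the fly.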

    chain base_checks {

This directive creates a chain called base_checks. A chain is a container for rules.

This chain does a few basic checks I wanted to reuse without having to repeat myself. Unlike, for example, the input chain, this is a regular chain that is not attached to any hook, so it doesn't receive packets on its own. It has to be called explicitly with either jump or goto.

        # allow established/related connections
        ct state {established, related} accept

This is our first rule, and it includes a lot of new syntax to review. Let's first start with what it does. This rule is here to allow already established or related connections through. If the connection has already been established, it probably means it was already allowed by us earlier and we can just continue allowing it.

ct is used to tap into the connection tracking entry associated with the packet. In this case, we are accessing the state of the connection, and checking if it's in the set {established, related}. If it is in it, accept the packet, otherwise, continue to the next rule.

        # early drop of invalid connections
        ct state invalid drop

This is similar to the previous line, but this time, instead of checking if the state is within a set, we only check if the connection state is invalid, and if so, we drop the packet. That is, we just ignore it as if it never came in.

    }

This is a curly bracket. You know what it does. I had to put it here because I promised to explain every line.

    chain input {
        type filter hook input priority 0; policy drop;

Like with the base checks chain, this defines a chain with the name input. The name doesn't matter, but I chose input to stick with the already familiar iptables convention.

Unlike the base checks chain, in this one we tell nftables what kind of packets we would like to accept and what we would like to do with them by default.

With the type statement, we tell nftables our chain will be of type filter (filtering packets), and that it will attach to the input hook (packets addressed to this machine). We also set a priority of zero. The priority orders chains that are attached to the same hook, with lower values running first. Since this is the only chain on the input hook, the exact value doesn't matter much here.

The last thing that we do in this line is declare the default policy for this chain. That means all packets that are not handled by any rule are dropped by default. In this chain specifically, the last rule rejects all remaining packets, so the policy would never be used, but I like having a strict policy set for safety.
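To make the role of the priority more concrete, here is a minimal sketch with two base chains on the same hook. The table and chain names are hypothetical and the rules only count packets.

```nftables
table inet prio_demo {
    chain first {
        # Lower priority value, so this chain is evaluated first.
        type filter hook input priority -10; policy accept;
        counter comment "runs first"
    }

    chain second {
        # Higher priority value, so this chain is evaluated afterwards.
        type filter hook input priority 10; policy accept;
        counter comment "runs second"
    }
}
```

Note that an accept verdict in one base chain does not stop later chains on the same hook from seeing the packet, while a drop is final.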

        jump base_checks

There are two ways to move the flow of rule processing to another chain: jump and goto. The only difference between them shows when the target chain doesn't reach a verdict for the packet (e.g. accept). In that case, jump returns to the position after the calling rule and continues processing, while goto never returns to the calling chain, so here the base chain's default policy would apply.

I use jump here because I want to continue processing after the base checks.
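The difference is easier to see in a small sketch. The table and chain names below are hypothetical, and the policy is accept so that loading it for a test doesn't lock anyone out.

```nftables
table inet jump_demo {
    # A regular chain: no hook, only reached via jump or goto.
    chain checks {
        ct state invalid drop
    }

    chain input {
        type filter hook input priority 0; policy accept;

        # With jump, packets that "checks" did not drop come back here
        # and continue with the next rule.
        jump checks
        counter comment "reached after the jump returns"

        # With "goto checks" instead, the counter rule above would never
        # be evaluated for those packets; the policy (accept) would apply.
    }
}
```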

        # allow from loopback
        iifname lo accept

Packets from the loopback interface are generally safe, so just accept everything coming from there.

        # allow icmp
        ip protocol icmp icmp type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable } accept
        ip6 nexthdr icmpv6 icmpv6 type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable, packet-too-big, nd-router-advert, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld-listener-query } accept

These two commands let some ICMP packets go through. This list may not be complete, but it has served me well so far.

As you may remember, our table family is inet, which means IPv4 or IPv6. However, ICMP is different from ICMPv6. This means that we have to do our checks with version-specific directives. The ip and ip6 directives do that: they first make sure the packet carries the right protocol, and then we match the version-specific message types.

If everything matches, we accept, otherwise, we move on.
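One caveat worth sketching: ip6 nexthdr only looks at the next-header field of the IPv6 header itself, so it does not match when extension headers sit in front of the ICMPv6 payload. The nft documentation suggests meta l4proto for that case. The fragment below is an alternative spelling under that assumption, in a hypothetical table with a shortened type list, not the rules I actually run.

```nftables
table inet icmp_demo {
    chain input {
        type filter hook input priority 0; policy accept;

        # meta l4proto matches the real transport protocol,
        # regardless of IPv6 extension headers.
        meta l4proto icmp icmp type { echo-request, echo-reply } accept
        meta l4proto ipv6-icmp icmpv6 type { echo-request, echo-reply, packet-too-big } accept
    }
}
```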

        # allow ports
        tcp dport @tcp_accepted accept
        udp dport @udp_accepted accept

These two rules are in charge of accepting incoming connections. One starts with tcp and one starts with udp to restrict based on the protocol. We then match the dport (destination port) against the sets we defined earlier to check if we would like to allow it.

        # everything else
        reject with icmpx type port-unreachable

Here we reject everything else with a port-unreachable ICMP message.
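icmpx is the family-independent form: in an inet table it is translated to the matching ICMP or ICMPv6 message depending on the packet. As a hedged sketch of a common variation (not what my config does), TCP connections can be answered with a reset instead, so clients see "connection refused" right away. The table name is hypothetical.

```nftables
table inet reject_demo {
    chain input {
        type filter hook input priority 0; policy drop;

        iifname "lo" accept
        ct state { established, related } accept

        # Variation: answer TCP with a reset ...
        meta l4proto tcp reject with tcp reset

        # ... and everything else with an ICMP/ICMPv6 port-unreachable.
        reject with icmpx type port-unreachable
    }
}
```

Since this chain rejects all new incoming connections, check it with nft -c -f first rather than loading it on a remote machine.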

    chain forward {
        type filter hook forward priority 0; policy drop;

This is very similar to the input chain. However, this time we will be filtering packets with the forward hook, that is, packets that are going to be forwarded by our firewall to another machine. This is only useful if your firewall is meant to forward packets, for example when it's used as a gateway.

Please take a look at the packet forwarding extra note. It contains more actions needed for this to work.

        # Allow coming out of the vpn
        ip saddr 192.168.87.0/24 iifname tun0 accept

Here we allow packets to be forwarded from the VPN to the rest of the network. My VPN device is called tun0 and 192.168.87.0/24 is my VPN's subnet.

First of all, because I'm again dealing with IPv4-specific information (the subnet) in an inet table, I have to prefix the directive with ip. Then I check that the source IP is in the VPN range and that the packet came in from the VPN interface. If so, I accept it for forwarding.

        # Allow connecting to home_srv.
        ip daddr home_srv ct status dnat accept

Here I allow forwarding all the traffic directed to my home server. The hostname is resolved through DNS once, at the time the rules are loaded, and the resulting address is what ends up in the rule. Make sure to hardcode this hostname in your /etc/hosts or have another way to ensure that the DNS answer can't be manipulated by an attacker.

ct status dnat makes sure we only allow packets that have had dnat done on them. We use that because we want to only forward packets that have been NATed by us.
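One way to sidestep name resolution entirely is an nftables variable. This is a minimal sketch, assuming home_srv is reachable at 192.168.87.10 (the address used later in this post); the table name is hypothetical and the fragment is meant to be merged into a real config rather than loaded on its own.

```nftables
# Assumption: the home server's VPN address.
define home_srv = 192.168.87.10

table inet forward_demo {
    chain forward {
        type filter hook forward priority 0; policy drop;

        # Same rule as above, but with a fixed address instead of a hostname,
        # so nothing depends on DNS when the rules are loaded.
        ip daddr $home_srv ct status dnat accept
    }
}
```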

    chain output {
        type filter hook output priority 0; policy accept;

I think this behaviour is already the default, but I include it here for completeness. This chain accepts all outgoing packets.

table ip nat {

This table will take care of all of the NAT. Since this is an IPv4 NAT, this table's family is set to ip.

    chain prerouting {
        type nat hook prerouting priority 0;

Yet another chain, this time called prerouting. Unlike the chains before, this chain is of type nat and not filter. This type means we will be changing packets instead of deciding their fate like we did before. Also, this time we are using the prerouting hook. All packets entering the system pass through this hook before the routing decision is made, so before the input and forward hooks. We use it because we would like to modify the packets first, and only then pass them on to the rest of the rules for processing.

        tcp dport 2222 dnat home_srv # ssh
        udp dport 61001-62000 dnat home_srv # mosh

These two lines take care of port forwarding. Like in the filter rules, we check if the packet is of a specific protocol and destination port, but this time, instead of accepting, dropping or rejecting, we dnat (destination nat), where we change the destination address from the server's address to the home server's one (see comment about name resolution) so the packet is forwarded there.
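dnat can also rewrite the destination port along with the address. The sketch below is a variation on the rule above, not my actual setup: it assumes home_srv is 192.168.87.10 and that its SSH daemon listens on the standard port 22. It also uses the "dnat to" spelling and the conventional destination NAT priority of -100; the table name is hypothetical.

```nftables
# Assumption: the home server's VPN address.
define home_srv = 192.168.87.10

table ip dnat_demo {
    chain prerouting {
        type nat hook prerouting priority -100;

        # Connections to port 2222 on this machine are redirected
        # to port 22 on the home server.
        tcp dport 2222 dnat to $home_srv:22
    }
}
```

The forward chain still has to accept these packets, which is what the ct status dnat rule earlier takes care of.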

    chain postrouting {
        type nat hook postrouting priority 100;

This chain is very similar to the prerouting chain, but it hooks on postrouting instead. This is the hook on the other end of prerouting: it processes all packets that leave the system, after all the routing decisions have been made.

        oifname {ens3, tun0} masquerade

Before I explain this line, let me explain what it solves. Computers behind a NAT, for example home_srv, are not aware of the NAT or of their internet-facing IP address, so when they send packets, the source IP is their own private IP. For example, when home_srv sends a packet to 8.8.8.8, the source address will be 192.168.87.10 and the destination will be 8.8.8.8. The main problem with that is that when 8.8.8.8 replies, it will reply to 192.168.87.10, which won't be routed back to home_srv because it's a private address.

To solve this problem you would want to use something called source NAT. In the previous section we modified the destination address (destination NAT). In this one, we want to change the source address to our external, internet-routable IP address, in my case 149.154.152.35.

This means we could have just used oifname ens3 snat 149.154.152.35 to make it work. However, sometimes computers have multiple interfaces, or changing IP addresses, so this can become really annoying to maintain.

This is what masquerade is for. It automatically rewrites the source IP of forwarded packets to the one of the output interface.

Note: I expected just having masquerade here to work, just like it used to with iptables. However it broke connections through the lo interface. I had to add the oifname condition to filter lo out. I started a discussion about it on the nftables mailing list.
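To put the two options side by side, here is a minimal sketch in a hypothetical table. The static variant uses the address and interface mentioned above with the "snat to" spelling; the commented-out line is an alternative way to keep lo out of masquerading that I have not tested on my own setup.

```nftables
table ip snat_demo {
    chain postrouting {
        type nat hook postrouting priority 100;

        # Static source NAT: fine while the external address never changes.
        oifname "ens3" snat to 149.154.152.35

        # Dynamic alternative: use whatever address the output interface has,
        # for every interface except loopback.
        # oifname != "lo" masquerade
    }
}
```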

Extra notes

Rule debugging

nftables supports tracing which lets you see all the rules a packet has been evaluated against and the resulting decision.

Unfortunately I only managed to get it to work on my laptop, and not on my server. I'm still investigating.

Edit: corrected a mistake and note that I now got it to work on one of my machines.
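For reference, this is roughly what enabling tracing looks like. It is a minimal sketch: the table and chain names are hypothetical, and matching on SSH traffic is just an example filter. The chain runs at a priority lower than the other chains so that the trace flag is set before they see the packet.

```nftables
table inet trace_demo {
    chain trace_ssh {
        # Early priority so the flag is set before the regular chains run.
        type filter hook prerouting priority -350; policy accept;

        # Mark matching packets for tracing; nothing else is changed.
        tcp dport 22 meta nftrace set 1
    }
}
```

With this loaded, running nft monitor trace in a terminal prints each rule a marked packet passes through, along with the final verdict. Delete the table again once you are done.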

Packet forwarding

Don't forget to set the following kernel parameters (using sysctl) to enable packet forwarding in the kernel (if you need it). Also, don't forget to make these changes permanent.

net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1

Useful reference

I recently found a useful reference: link.

Please let me know if you spotted any mistakes or have any suggestions, and follow me on Twitter or RSS for updates.
