Kmod-nft-offload

While hardware offloading is highly beneficial, it is not a silver bullet and introduces certain trade-offs.

Using hardware offload with nftables is not automatic. It requires a specific set of conditions to be met.

Offloading frees up processor cycles. This leaves more headroom for heavy services like WireGuard VPNs, SQM bufferbloat control, network storage (NAS), or ad-blocking utilities.

Traditionally, every packet entering your router is processed by the CPU. The CPU inspects the packet, checks it against firewall rules (nftables), determines its destination (routing), and modifies its headers (NAT). This "software path" consumes CPU cycles for every single packet.

In a standard software-based firewall, every packet that passes through the network interface must be examined by the CPU. The CPU looks at the packet headers, compares them against the firewall rules, and decides whether to accept or drop the packet. On high-speed networks (1 Gbps, 10 Gbps, or higher), this consumes significant CPU resources and can create a bottleneck.


If you are configuring a modern router (like the NanoPi R2S or similar Rockchip-based devices), you might encounter errors if you try to manually install legacy packages like kmod-nft-nat6.

First, you need to ensure the module is installed. If you're building your own OpenWrt image, you can include the package kmod-nft-offload in your .config file by enabling the following flags:
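As a rough sketch of this step, assuming a standard OpenWrt buildroot where package selections are stored as CONFIG_PACKAGE_<name>=y lines in .config, the helper below appends the symbol only if it is not already present. The function name enable_package is hypothetical and used here for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: enable a package in an OpenWrt buildroot .config.
# Usage: enable_package <path-to-.config> <package-name>
enable_package() {
    config_file=$1
    symbol="CONFIG_PACKAGE_$2=y"
    # Append the symbol only when the exact line is not already present.
    if ! grep -qxF "$symbol" "$config_file" 2>/dev/null; then
        printf '%s\n' "$symbol" >> "$config_file"
    fi
}

# Example (run from the buildroot, then let the build system resolve
# dependencies):
#   enable_package .config kmod-nft-offload
#   make defconfig
```

On a router that is already running, installing the package with the system package manager (for example opkg update followed by opkg install kmod-nft-offload on releases that ship opkg) is the usual alternative to rebuilding the image.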

To appreciate the elegance of kmod-nft-offload, it's necessary to understand the underlying mechanisms of the Linux kernel's networking stack.

A value of "on" indicates that the driver and hardware are capable of TC offload.
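The value in question is presumably the hw-tc-offload line printed by ethtool -k; that is an assumption, since the command itself is not shown here. Under that assumption, a minimal sketch of extracting the state could look like this (tc_offload_state is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k <dev>` output on stdin and print
# the hw-tc-offload state ("on" or "off"); prints "unknown" if the
# feature line is absent.
tc_offload_state() {
    awk -F': *' '
        $1 ~ /^[[:space:]]*hw-tc-offload$/ {
            # Keep only the first word, dropping suffixes such as "[fixed]".
            split($2, v, " ")
            print v[1]
            found = 1
        }
        END { if (!found) print "unknown" }
    '
}

# Example on a router (eth0 is a placeholder interface name):
#   ethtool -k eth0 | tc_offload_state
```

A result of "off" followed by "[fixed]" in the raw ethtool output generally means the driver does not allow the feature to be switched on.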

# Add this to /etc/nftables.d/abc.nft (any file name ending in .nft)
flowtable pft {
    hook ingress priority filter
    devices = { lan1, lan2, lan3, lan4, lan5 }
    flags offload
    counter
}
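Because interface names differ between boards, it can help to generate the flowtable definition rather than type it by hand. The following is a minimal sketch, not a definitive implementation; render_flowtable is a hypothetical helper, and the device names passed to it are placeholders you would replace with your own.

```shell
#!/bin/sh
# Hypothetical helper: print an nftables flowtable definition with
# hardware offload enabled for the given devices.
# Usage: render_flowtable <name> <dev> [<dev> ...]
render_flowtable() {
    name=$1
    shift
    # Join the remaining arguments as "dev1, dev2, ...".
    devices=$(printf '%s, ' "$@")
    devices=${devices%, }
    printf 'flowtable %s {\n' "$name"
    printf '    hook ingress priority filter\n'
    printf '    devices = { %s }\n' "$devices"
    printf '    flags offload\n'
    printf '    counter\n'
    printf '}\n'
}

# Example (writes a snippet for two LAN ports):
#   render_flowtable pft lan1 lan2 > /etc/nftables.d/abc.nft
```

Checking the generated file with nft -c -f before reloading the firewall is a reasonable precaution, since a device that does not exist or does not support offload can cause the ruleset to be rejected.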

To understand the benefit of kmod-nft-offload, you must understand how traffic is usually handled:

When a connection (like a video stream or a large download) is established, most packets in that stream are predictable. Rather than checking every single packet against every firewall rule, the module "offloads" these established flows to a specialized flow table.
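One way to see whether flows are actually being offloaded is to look at the connection tracking table, where the kernel marks offloaded entries with [OFFLOAD] (software flow table) or [HW_OFFLOAD] (hardware). The sketch below counts both; count_offloaded is a hypothetical helper, and it assumes the conntrack listing is available on your system, either through /proc/net/nf_conntrack or the conntrack -L command.

```shell
#!/bin/sh
# Hypothetical helper: read conntrack entries on stdin and count how many
# carry the hardware ([HW_OFFLOAD]) or software ([OFFLOAD]) offload marker.
count_offloaded() {
    awk '
        /\[HW_OFFLOAD\]/ { hw++; next }
        /\[OFFLOAD\]/    { sw++ }
        END { printf "hw=%d sw=%d\n", hw, sw }
    '
}

# Example on a router:
#   count_offloaded < /proc/net/nf_conntrack
```

If the hardware count stays at zero while a large transfer is running, the flows are most likely still being handled in software.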

Offloading significantly reduces CPU load by bypassing the L3 network stack for packets that belong to established connections.

In supported setups, offloading can significantly increase throughput (e.g., jumping from ~260 Mbps to ~680 Mbps in certain speed tests) by bypassing intensive CPU-bound processing for established connections.

By shifting the packet processing load to specialized hardware components (which are faster at packet switching than a general-purpose CPU), you can achieve near-line-rate speeds on high-bandwidth connections.

By following the installation and configuration steps outlined in this guide, you are now equipped to harness the power of hardware offloading in your own network.

nft add table ip filter
nft add flowtable ip filter f { hook ingress priority filter + 1 \; devices = { lan0, lan1, lan2, lan3, eth1 } \; counter \; flags offload \; }
nft add chain ip filter forward { type filter hook forward priority filter \; policy accept \; }
nft add rule ip filter forward ip protocol { tcp, udp } flow add @f