Description
Is your feature request related to a problem? Please describe.
In high-end deployments moving gigabits of GTPU data, it is really important to balance the received GTPU traffic across the available set of CPUs, so that different tunnels can be decapsulated/forwarded in parallel.
Some NICs support configuring the RSS hash to be based on the GTPU TEID, done via ethtool, e.g. "ethtool -N <dev> rx-flow-hash gtpu4 ...". See:
https://git.kernel.org/pub/scm/network/ethtool/ethtool.git/commit/?id=4eef0687909a176472c111b5be7e85e97190ee88
This is the case, for instance, for the Intel ice driver (Intel(R) Ethernet Controller 800 series):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a6d63bbf2c52d0a9d1550cd9a5ba58ea6371991b
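For a driver that implements the gtpu4 flow type, the configuration referenced by the commits above looks roughly like the sketch below. The interface name eth0 and the exact field letters are illustrative assumptions; check `ethtool -h` and your driver's documentation for the fields it actually supports.

```shell
# Illustrative only: requires a recent ethtool and a driver that
# implements the gtpu4 flow type (e.g. Intel ice).
# Hash fields are assumptions: 's'/'d' = src/dst IP, plus the
# GTP-U TEID field added by the ethtool commit above.
ethtool -N eth0 rx-flow-hash gtpu4 sde

# Verify the configured hash fields:
ethtool -n eth0 rx-flow-hash gtpu4
```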
However, other high-end NICs (e.g. Mellanox) don't support an RSS hash based on the GTPU TEID.
As a result, since GTPU traffic uses a fixed UDP port (2152), distribution happens only based on src-ipaddr + dst-ipaddr.
And since the dst-ipaddr in the case of eupf is usually fixed, in general all traffic coming from the same peer (e.g. eNodeB, SGW-U) will end up in the same receive queue, i.e. handled by the same CPU.
The same applies if the user has multiple CPUs but a lower-end NIC that supports only a small number of rx queues / interrupts, so that only a subset (e.g. 1) of the available CPUs ends up handling packets.
Describe the solution you'd like
These sorts of limitations can be overcome by implementing RSS at the XDP level, using an eBPF CPUMAP to redistribute load from an N-CPU subset to an M-CPU set of handlers. This is well explained here:
https://developers.redhat.com/blog/2021/05/13/receive-side-scaling-rss-with-ebpf-and-cpumap#redirecting_into_a_cpumap
So an idea would be to allow configuring eupf with N CPUs doing the RSS processing and another M CPUs doing the actual GTPU handling (matching, forwarding, decaps, etc.).