zebra: add dplane helpers provide interface speed #19412
A couple of questions:
Speed is always included in Grout’s interface notifications and is updated when the interface goes up or down. In practice we receive all attributes needed for DPLANE_OP_INTF_INSTALL/DELETE, including speed. Netlink is different: it does not carry speed in RTM_NEWLINK notifications, so FRR must call ethtool/ioctl (and often retry) to obtain a stable value. Both of your proposals align with the kernel model where RTM_NEWLINK lacks speed. Your first option also removes the ioctl from the zebra main thread by doing it on the dplane thread, which is a better design overall. This patch is the minimal change needed to consume Grout-provided speed, which is why it is a draft. Anyway, introducing a dedicated API like DPLANE_OP_INTF_SPEED_UPDATE would require:
For Grout, this would mean sending DPLANE_OP_INTF_UPDATE (to set the link up, for example) plus DPLANE_OP_INTF_UPDATE_SPEED (to set the link speed). It’s not optimal, but acceptable if we want a unified API used by both the kernel and Grout.
I'm not trying to get into a "linux vs grout" struggle. I just want the dplane api to be something reasonably neutral, so if it's going to change, I'd rather it change in ways that could be somewhat general. Assuming that there's never a speed available at creation seems limiting, as you say. But assuming that there's never a need to query is also limiting. I don't think you have to force "grout" to change to accommodate that - I'd just like there to be a path to moving the existing work into the dataplane - and making it available to plugins to decide what to do (if anything).
This code change WILL break bonds/LAGs under Linux. It is common for a bond/LAG to be slow coming up while FRR is already started, and the interface speed will change. This is the specific scenario I put this code in for:
a) System start
With this code change you are proposing, Zebra will no longer be able to handle this. What you need to do is either have a
donaldsharp
left a comment
I'm NAK'ing this because in this current format, this will fundamentally break bonds being able to have the correct speed under linux. Which is no bueno :(
First, this PR is a draft; it is not intended to be merged. Last point: I think it would be better to run the ioctl on the zebra dplane thread rather than the main thread, unless I’m missing something? We could add DPLANE_OP_INTF_UPDATE_SPEED to obtain the speed from the dplane and update it on the main thread (as mjstapp suggested). That should handle bond interfaces properly. How the dplane obtains link speed (polling via ioctl or by notification) is an implementation detail that should remain in the dplane backend.
004beab to
032e64e
Compare
Still a work in progress: I'm just back from holiday and haven't had time to test it yet.
mjstapp
left a comment
hmm, no, I think maybe some of the discussion wasn't clear?
the way the dplane works is by sending events down from zebra towards the dataplane(s), and sending results and other events "up" towards zebra from the dataplane(s).
it'd be fine to include a field for interface speed, and apply that if it's present
the interface update timer could emit a dplane event for processing. the kernel dplane would make the appropriate system call. some other dataplane plugin might just ignore the event. if the current notion of the speed were in the event, and if that was unchanged, the kernel plugin could also avoid making any change.
a dplane event with a speed update would be processed in the zebra main pthread context, which is the only pthread allowed to touch the zebra structs.
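The event flow described above can be sketched roughly as follows. This is a minimal illustration, not FRR code: every type and function name (`ev_speed_event`, `ev_kernel_provider`, `ev_other_provider`, `ev_main_apply`, `ev_fake_read_10g`) is hypothetical, and the "unchanged speed" check is placed on the main-pthread side purely as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are not FRR's real types. */
enum ev_status { EV_QUEUED, EV_SUCCESS, EV_FAILURE };

struct ev_speed_event {
	const char *ifname;
	uint32_t reported_speed; /* Mb/s, filled in by a provider that answers */
	enum ev_status status;
};

/* Hypothetical callback a provider uses to read the speed from its platform. */
typedef bool (*ev_read_speed_fn)(const char *ifname, uint32_t *speed);

/* Fake reader used only for illustration: always reports 10 Gb/s. */
bool ev_fake_read_10g(const char *ifname, uint32_t *speed)
{
	(void)ifname;
	*speed = 10000;
	return true;
}

/* Kernel-like provider: performs the query and records the outcome. */
void ev_kernel_provider(struct ev_speed_event *ev, ev_read_speed_fn read_speed)
{
	assert(ev != NULL && read_speed != NULL);
	if (!read_speed(ev->ifname, &ev->reported_speed)) {
		ev->status = EV_FAILURE;
		return;
	}
	ev->status = EV_SUCCESS;
}

/* A provider with nothing to do for this event leaves it untouched. */
void ev_other_provider(struct ev_speed_event *ev)
{
	(void)ev;
}

/*
 * Main-pthread side: the only place that touches zebra's own state.
 * Returns true only if the stored speed actually changed.
 */
bool ev_main_apply(const struct ev_speed_event *ev, uint32_t *zebra_speed)
{
	assert(ev != NULL && zebra_speed != NULL);
	if (ev->status != EV_SUCCESS || ev->reported_speed == *zebra_speed)
		return false;
	*zebra_speed = ev->reported_speed;
	return true;
}
```

In this sketch a timer would enqueue an `ev_speed_event`, each provider would see it in turn, and the result would come back up to be applied by `ev_main_apply()`.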
I don't think it's worth adding a header file just to rename a couple of status codes - is there a real benefit to changing those labels?
032e64e to
3350136
Compare
Thanks for the feedback.
812ba4f to
8481e29
Compare
For dplane_intf_speed, should we reuse dplane_intf_update_internal(ifp, DPLANE_OP_INTF_SPEED) or build a dedicated dplane ctx? In the latter case, should it have its own dplane stats (e.g. .dg_intfs_in/errors)? Thanks.
8481e29 to
576f554
Compare
This new API is used in a PR on the Grout side: DPDK/grout@9e24a62
df8b4ef to
15e016b
Compare
15e016b to
03a48b7
Compare
bcecb95 to
8555af7
Compare
The PR has been updated to resolve conflicts. It's ready for review.
zebra/interface.c
Outdated
&zebra_if->speed_update);
event_ignore_late_timer(zebra_if->speed_update);

zebra_if_schedule_speed_update(zebra_if, 15);
hmm, that 15 sticks out - do you know where that comes from?
dc7b3ca it comes from this commit.
From my understanding, the 15s timer is needed when FRR is starting. FRR probes all interfaces from the kernel. If an interface is already up but the link negotiation is not finished yet, if_up() will not be called, and the speed will never be updated. Since the speed is only queried when the interface transitions to up in if_up(), we can’t rely on that during startup. That’s why a timer is scheduled to query the speed again 15 seconds later in if_zebra_new_hook. Maybe we could start this timer only at startup, but it doesn’t seem we have a startup field in this hook.
Donald could probably shed some light on this point since he added the initial code — @donaldsharp ?
8555af7 to
2752d78
Compare
caa8a18 to
a727763
Compare
2b44b62 to
9a4179e
Compare
This pull request has conflicts, please resolve those before we can evaluate the pull request.
9a4179e to
432aaea
Compare
432aaea to
43c5b09
Compare
ci:rerun
Add dplane ctx helpers to carry interface speed. Signed-off-by: Maxime Leroy <maxime@leroys.fr>
kernel_get_speed() doesn't need the full interface object; it only needs the interface name and the VRF id to open the right socket/ioctl. Update the prototype and callers accordingly. This will be used in the next commit. Signed-off-by: Maxime Leroy <maxime@leroys.fr>
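To illustrate why the interface name is enough for the query itself, here is a minimal Linux-only sketch of an ethtool speed read. It is not FRR's `kernel_get_speed()`: the helper name `gs_get_speed_by_name` and its error convention are hypothetical, and the VRF handling mentioned in the commit message is left out (the socket is simply opened in the caller's network namespace).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

/*
 * Hypothetical helper: returns the ethtool-reported speed in Mb/s.
 * On failure it returns 0 and stores the errno value in *err.
 * Note: ethtool may report an "unknown" speed value while a link is down.
 */
uint32_t gs_get_speed_by_name(const char *ifname, int *err)
{
	struct ifreq ifr;
	struct ethtool_cmd ecmd;
	int fd;

	assert(ifname != NULL && err != NULL);
	*err = 0;

	/* Any datagram socket works as a handle for the ioctl. */
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0) {
		*err = errno;
		return 0;
	}

	memset(&ifr, 0, sizeof(ifr));
	memset(&ecmd, 0, sizeof(ecmd));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
	ecmd.cmd = ETHTOOL_GSET;
	ifr.ifr_data = (void *)&ecmd;

	if (ioctl(fd, SIOCETHTOOL, &ifr) < 0) {
		*err = errno; /* save errno before close() can overwrite it */
		close(fd);
		return 0;
	}

	close(fd);
	return ethtool_cmd_speed(&ecmd);
}
```

Nothing in the request depends on the rest of the interface object; only the name is copied into `struct ifreq`.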
This introduces DPLANE_OP_INTF_SPEED_GET so link speed is resolved in the dataplane via ethtool and reported to zebra. Zebra no longer performs synchronous speed reads; it simply applies the value provided by the dataplane. If speed is already known during interface creation or modification, it can be included in INTF_INSTALL/INTF_UPDATE and zebra will use it directly. If speed is not provided, the zebra main thread issues a follow-up INTF_SPEED_GET to request the dataplane to fetch the speed asynchronously. For dataplane providers that implement only INTF_INSTALL/INTF_UPDATE and do not support INTF_SPEED_GET, zebra relies on any speed value provided by install/update. If speed is missing, zebra attempts a single INTF_SPEED_GET query and stops if the operation is unsupported or fails. Signed-off-by: Maxime Leroy <maxime@leroys.fr>
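The decision logic in this commit message can be sketched as below. The names are hypothetical (`res_intf_update`, `res_on_install_or_update`, `res_on_speed_get_result`), and the `has_speed` field is only an illustrative way of signalling "speed was provided"; the actual change may represent that differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome codes for a follow-up speed query. */
enum res_result { RES_OK, RES_FAIL, RES_UNSUPPORTED };

/* What zebra does after an install/update notification. */
enum res_action { RES_USE_PROVIDED, RES_REQUEST_SPEED_GET };

/* Hypothetical view of an install/update notification. */
struct res_intf_update {
	bool has_speed; /* assumption: the provider says whether speed is set */
	uint32_t speed; /* Mb/s, meaningful only when has_speed is true */
};

/* Use the provided speed if present, otherwise ask for one follow-up query. */
enum res_action res_on_install_or_update(const struct res_intf_update *upd,
					 uint32_t *zebra_speed)
{
	assert(upd != NULL && zebra_speed != NULL);
	if (upd->has_speed) {
		*zebra_speed = upd->speed;
		return RES_USE_PROVIDED;
	}
	return RES_REQUEST_SPEED_GET;
}

/*
 * Handle the result of the single follow-up query. On failure or when the
 * operation is unsupported, nothing is applied and no retry is requested.
 * Returns true if the speed was applied.
 */
bool res_on_speed_get_result(enum res_result res, uint32_t speed,
			     uint32_t *zebra_speed)
{
	assert(zebra_speed != NULL);
	if (res != RES_OK)
		return false;
	*zebra_speed = speed;
	return true;
}
```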
The 15-second timer used to re-query interface speed is currently scheduled in if_zebra_new_hook() for every newly created interface. However, at that point the interface may not yet exist in the OS, and in some cases it may never be created. Because of this, the speed query will usually fail (e.g. INTERFACE_SPEED_ERROR_READ) since the interface doesn't exist. There is also a race condition: even if the interface is created, the timer may run before the RTM_NEWLINK message is processed. As a result, ifp->ifp_index can remain IFINDEX_INTERNAL (0). When if_add_update() calls zebra_ns_link_ifp(), the interface tree is updated with this incorrect index. If this happens for multiple interfaces, the tree can end up with duplicate keys, eventually causing a zebra crash. A check was added to zebra_ns_link_ifp() to avoid adding an interface with an IFINDEX_INTERNAL index, but the root cause remained. This change fixes the underlying issue by scheduling the speed-update timer only when a valid RTM_NEWLINK has been received. The scheduling logic is moved from if_zebra_new_hook() to zebra_if_dplane_ifp_handling(), and runs only once the interface has a correct ifindex. Fixes: dc7b3ca ("zebra: Add one-shot thread to recheck speed") Signed-off-by: Maxime Leroy <maxime@leroys.fr>
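The guard described in this commit message amounts to refusing to schedule the timer until the interface has a real ifindex. A minimal sketch, with hypothetical names (`idx_ifp`, `idx_maybe_schedule_speed_timer`) and a plain boolean standing in for the actual timer:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The commit message gives IFINDEX_INTERNAL as 0; mirrored here. */
#define IDX_IFINDEX_INTERNAL 0

/* Hypothetical, simplified interface record. */
struct idx_ifp {
	int ifindex;
	bool speed_timer_scheduled; /* stands in for a real timer handle */
};

/*
 * Schedule the speed re-check only once the interface is known to the OS,
 * i.e. after a notification has assigned it a valid ifindex.
 * Returns true if the timer was scheduled.
 */
bool idx_maybe_schedule_speed_timer(struct idx_ifp *ifp)
{
	assert(ifp != NULL);
	if (ifp->ifindex == IDX_IFINDEX_INTERNAL)
		return false; /* not created in the OS yet: nothing to query */
	ifp->speed_timer_scheduled = true;
	return true;
}
```

Called from the creation hook, this would be a no-op; called after the link notification has been processed, it would arm the timer.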
skip_kernel was only applied in the netlink batch send path. As a result, operations processed by dedicated handlers (notably DPLANE_OP_INTF_SPEED_GET) were still executed by the kernel provider even when a previous provider plugin requested to skip kernel updates. Handle skip_kernel early in kernel_dplane_process_func() so it applies to all kernel provider operations, and remove the scattered checks from the netlink helpers. Signed-off-by: Maxime Leroy <maxime@leroys.fr>
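A rough sketch of the "check once, early" structure follows. All names are hypothetical (`sk_ctx`, `sk_kernel_process`, and the two handlers), and marking a skipped context as successful is an assumption made for the illustration rather than something stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified operation and status codes. */
enum sk_op { SK_OP_ROUTE_INSTALL, SK_OP_INTF_SPEED_GET };
enum sk_status { SK_QUEUED, SK_SUCCESS };

struct sk_ctx {
	enum sk_op op;
	bool skip_kernel;    /* set by an earlier provider plugin */
	enum sk_status status;
	bool kernel_touched; /* records whether the kernel path ran */
};

/* Dedicated handler path (stand-in for an ethtool query). */
static void sk_handle_speed_get(struct sk_ctx *ctx)
{
	ctx->kernel_touched = true;
	ctx->status = SK_SUCCESS;
}

/* Batch path (stand-in for a netlink batch send). */
static void sk_batch_send(struct sk_ctx *ctx)
{
	ctx->kernel_touched = true;
	ctx->status = SK_SUCCESS;
}

/* The skip check happens before dispatch, so it covers every operation. */
void sk_kernel_process(struct sk_ctx *ctxs, size_t n)
{
	assert(ctxs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct sk_ctx *ctx = &ctxs[i];

		if (ctx->skip_kernel) {
			ctx->status = SK_SUCCESS; /* assumption: report success */
			continue;
		}
		switch (ctx->op) {
		case SK_OP_INTF_SPEED_GET:
			sk_handle_speed_get(ctx);
			break;
		default:
			sk_batch_send(ctx);
			break;
		}
	}
}
```

With the check in one place, neither the dedicated handler nor the batch helper needs its own skip test.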
43c5b09 to
d0c8451
Compare
I have updated this PR by removing the interface speed flag, as requested by Mark, to avoid introducing a Linux-specific / special-case behavior. As a result, a DPLANE plugin (such as the Grout one) must now return either SUCCESS or FAILURE for DPLANE_OP_INTF_SPEED_GET, even though the speed is already reported via DPLANE_OP_INTF_NEW/UPDATE. While doing this, I noticed that for DPLANE_OP_INTF_SPEED_GET the skip-kernel flag was ignored by the kernel provider. I therefore added an extra commit to fix this. With this change, skip-kernel applies to all kernel dplane operations (not only the netlink batch path), restoring the behavior from the original skip-kernel introduction (commit 9677961, “zebra: support skip-kernel for dataplane updates”). This updated API has also been tested with Grout (see DPDK/grout PR #292: DPDK/grout#292).
Add dplane ctx helpers to carry interface speed and skip ethtool polling when the dataplane provides a value. Avoid scheduling speed update timers in this case.