Skip to content

Ip full ip/v3/p9#1

Open
numansiddique wants to merge 9 commits intomainfrom
IP_FullIP/v3/p9
Open

Ip full ip/v3/p9#1
numansiddique wants to merge 9 commits intomainfrom
IP_FullIP/v3/p9

Conversation

@numansiddique
Copy link
Owner

No description provided.

local_binding_add_child(lbinding, child);
if (b_ctx_out->tracked_dp_bindings) {
tracked_binding_datapath_lport_add(
b_ctx_out->tracked_dp_bindings, pb, false);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not calling "local_datapath_update_lport_count()" here to increase the port count in local datapath?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

runtime_data_ovs_interface_handler(struct engine_node *node, void *data)
{
struct ed_type_runtime_data *rt_data = data;
struct ed_type_runtime_tracked_data *tracked_data = node->tracked_data;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call
en_runtime_clear_tracked_data(node->tracked_data);

here as well

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need. engine will take care of clearing the tracked data.

struct binding_ctx_in b_ctx_in;
struct binding_ctx_out b_ctx_out;
init_binding_ctx(node, rt_data, &b_ctx_in, &b_ctx_out);
tracked_data->tracked = true;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call
en_runtime_clear_tracked_data(node->tracked_data);

here as well


struct ed_type_runtime_tracked_data *tracked_data = node->tracked_data;
tracked_data->tracked = true;
b_ctx_out.tracked_dp_bindings = &tracked_data->tracked_dp_bindings;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why u missed
b_ctx_out.local_lports_changed = &tracked_data->local_lports_changed;

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because it is meant to track ovs interfaces with iface-id set.
This is required to update the condition monitoring.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, in the below sequence ..
1)ovn-nbctl ls-add sw0
2)ovs-vsctl add-port br-int p1 --
set Interface p1 external_ids:iface-id=sw0-port1
3)ovn-nbctl lsp-add sw0 sw0-port1

We step 3 will add the local binding and claim the port.
Step 2 will run runtime_data_ovs_interface_handler, but local binding is not created as port doesn't exist in SBDB
Step 3 will result in runtime_data_sb_port_binding_handler, which adds local binding. So in this case, we need to update b_ctx_out.local_lports_changed as this flag is used in flow_output_runtime_data_handler

Am I missing anything here?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local_binding is created in step 2.

…and binding_ctx_out.

No functional changes are introduced in this patch.

Signed-off-by: Numan Siddique <numans@ovn.org>
… state.

This patch adds a new data structure - 'struct local_binding' which represents
a probable local port binding. A new object of this structure is creared for
each interface present in the integration bridge (br-int) with the
external_ids:iface-id set. This struct has 2 main fields
 - 'struct ovsrec_interface *' and 'struct sbrec_port_binding *'. These fields
are updated during port claim and release.

A shash of the local bindings is maintained with the 'iface-id' as the hash key
in the runtime data of the incremental processing engine.

This patch helps in the upcoming patch to add incremental processing support
for OVS interface and SB port binding changes.

Signed-off-by: Numan Siddique <numans@ovn.org>
numansiddique and others added 7 commits April 15, 2020 14:00
…data.

This patch handles SB port binding changes and OVS interface changes
incrementally in the runtime_data stage which otherwise would have
resulted in calls to binding_run().

Prior to this patch, port binding changes were handled differently.
The changes were only evaluated to see if they need full recompute
or they can be ignored.

With this patch, the runtime data is updated with the changes (both
SB port binding and OVS interface) and ports are claimed/released in
the handlers itself, avoiding the need to trigger recompute of runtime
data stage.

Signed-off-by: Numan Siddique <numans@ovn.org>
This patch adds partial support of incremental processing of datapath binding.
If a datapath is deleted, then a full recompute is triggered if that
datapath is present in the 'local_datapaths' hmap of runtime data.

Signed-off-by: Numan Siddique <numans@ovn.org>
… actions have changed.

If ofctrl_check_and_add_flow(F') is called where flow F' has match-actions (M, A2)
and if there already exists a flow F with match-actions (M, A1) in the desired flow
table, then this function  doesn't update the existing flow F with new actions
actions A2.

This patch is required for the upcoming patch in this series which
adds incremental processing for OVS interface in the flow output stage.
Since we will be not be clearing the flow output data if these changes
are handled incrementally, some of the existing flows will be updated
with new actions. One such example would be flows in physical
table OFTABLE_LOCAL_OUTPUT (table 33). And this patch is required to
update such flows. Otherwise we will have incomplete actions installed.

Signed-off-by: Numan Siddique <numans@ovn.org>
…put stage.

This patch handles ct zone changes and OVS interface changes incrementally
in the flow output stage.

Any changes to ct zone can be handled by running physical_run() instead of running
flow_output_run(). And any changes to OVS interfaces can be either handled
incrementally (for OVS interfaces representing VIFs) or just running
physical_run() (for tunnel and other types of interfaces).

To better handle this, a new engine node 'physical_flow_changes' is added which
handles changes to ct zone and OVS interfaces.

Signed-off-by: Numan Siddique <numans@ovn.org>
With this patch, an engine node which is an input to another engine node
can provide the tracked data. The parent of this engine node can handle
this tracked data incrementally.

At the end of the engine_run(), the tracked data of the nodes are
cleared.

Signed-off-by: Numan Siddique <numans@ovn.org>
In order to handle runtime data changes incrementally, the flow outut
runtime data handle should know the changed runtime data.
Runtime data now tracks the changed data for any OVS interface
and SB port binding changes. The tracked data contains a hmap
of tracked datapaths (which changed during runtime data processing.

The flow outout runtime_data handler in this patch doesn't do much
with the tracked data. It returns false if there is tracked data available
so that flow_output run is called. If no tracked data is available
then there is no need for flow computation and the handler returns true.

Next patch in the series processes the tracked data incrementally.

Co-Authored-by: Venkata Anil <anilvenkata@redhat.com>
Signed-off-by: Venkata Anil <anilvenkata@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
This patch processes the logical flows of tracked datapaths
and tracked logical ports. To handle the tracked logical port
changes, reference of logical flows to port bindings is maintained.

Co-Authored-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Venkata Anil <anilvenkata@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
numansiddique pushed a commit that referenced this pull request Oct 22, 2020
The 'nexthop' that passed to ic_route_hash() is not fully initialized in
get_nexthop_from_lport_addresses(). 'nexthop' has type of 'struct v46_ip' which
contains a union to share space for ipv4 and ipv6 address.  If only ipv4
initialized where is a plenty of uninitialized space that goes to
hash_bytes(nexthop, sizeof *nexthop, basis).

Impact: there are two places where this function is called.

1. In add_to_routes_ad(), the nexthop is initialized in parse_route() before
   calling get_nexthop_from_lport_addresses(), luckily.

2. In add_network_to_routes_ad(), we are unlucky.  When a directly connected
network of a router is found to be advertised, if the route already existed in
the global IC-SB, it may not be found due to the hash difference, and results
in the existing route being deleted and the same one recreated, unnecessarily.

This patch fixes the problem by initializing the struct to zero before setting
the fields.

From Ilya's report:
> Report from MemorySanitizer:
>
> ==3074629==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x67177e in mhash_add__ ovs/./lib/hash.h:66:9
>     #1 0x671668 in mhash_add ovs/./lib/hash.h:78:12
>     #2 0x6701e9 in hash_bytes ovs/lib/hash.c:38:16
>     ovn-org#3 0x524b4a in add_network_to_routes_ad ic/ovn-ic.c:1095:5
>     ovn-org#4 0x51eea3 in route_run ic/ovn-ic.c:1424:21
>     ovn-org#5 0x51887b in main ic/ovn-ic.c:1674:17
>     ovn-org#6 0x7fd4ce7871a2 in __libc_start_main
>     ovn-org#7 0x49c90d in _start (ic/ovn-ic+0x49c90d)
>
>   Uninitialized value was created by an allocation of 'nexthop' in the
>   stack frame of function 'add_network_to_routes_ad'
>     #0 0x5245f0 in add_network_to_routes_ad ic/ovn-ic.c:1069

Reported-by: Ilya Maximets <i.maximets@ovn.org>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-October/376160.html
Fixes: 57b347c ("ovn-ic: Route advertisement.")
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Han Zhou <hzhou@ovn.org>
numansiddique pushed a commit that referenced this pull request Nov 24, 2020
'child_port_list' is an array of pointers that should be freed too.

  Direct leak of 30 byte(s) in 6 object(s) allocated from:
    #0 0x501fff in malloc (/tests/ovstest+0x501fff)
    #1 0x6227e6 in xmalloc /lib/util.c:138:15
    #2 0x6228b8 in xmemdup0 /lib/util.c:168:15
    ovn-org#3 0x8183d6 in parse_fwd_group_action /lib/actions.c:3374:30
    ovn-org#4 0x814b6e in parse_action /lib/actions.c:3610:9
    ovn-org#5 0x8139ef in parse_actions /lib/actions.c:3637:14
    ovn-org#6 0x8136a3 in ovnacts_parse /lib/actions.c:3672:9
    ovn-org#7 0x813c80 in ovnacts_parse_string /lib/actions.c:3699:5
    ovn-org#8 0x53a979 in test_parse_actions /tests/test-ovn.c:1372:21
    ovn-org#9 0x54e7a8 in ovs_cmdl_run_command__ /lib/command-line.c:247:17
    ovn-org#10 0x537c75 in test_ovn_main /tests/test-ovn.c:1630:5
    ovn-org#11 0x54e7a8 in ovs_cmdl_run_command__ /lib/command-line.c:247:17
    ovn-org#12 0x537359 in main /tests/ovstest.c:133:9
    ovn-org#13 0x7f06978f21a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

CC: Manoj Sharma <manoj.sharma@nutanix.com>
Fixes: edb2400 ("Forwarding group to load balance l2 traffic with liveness detection")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
numansiddique pushed a commit that referenced this pull request Nov 24, 2020
'dsts' should be freed in case of any error.

  Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x502378 in realloc (/tests/ovstest+0x502378)
    #1 0x622826 in xrealloc /lib/util.c:149:9
    #2 0x8194f4 in parse_select_action /lib/actions.c:1185:20
    ovn-org#3 0x814f49 in parse_set_action /lib/actions.c:3499:13
    ovn-org#4 0x814341 in parse_action /lib/actions.c:3554:9
    ovn-org#5 0x8139ef in parse_actions /lib/actions.c:3643:14
    ovn-org#6 0x8136a3 in ovnacts_parse /lib/actions.c:3678:9
    ovn-org#7 0x813c80 in ovnacts_parse_string /lib/actions.c:3705:5
    ovn-org#8 0x53a4e8 in test_parse_actions /tests/test-ovn.c:1321:17
    ovn-org#9 0x54e7a8 in ovs_cmdl_run_command__ /lib/command-line.c:247:17
    ovn-org#10 0x537c75 in test_ovn_main /tests/test-ovn.c:1630:5
    ovn-org#11 0x54e7a8 in ovs_cmdl_run_command__ /lib/command-line.c:247:17
    ovn-org#12 0x537359 in main /tests/ovstest.c:133:9
    ovn-org#13 0x7f9ce05ba1a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

CC: Han Zhou <hzhou@ovn.org>
Fixes: 85b3544 ("ovn-controller: A new action "select".")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
numansiddique pushed a commit that referenced this pull request Nov 24, 2020
'parse_ofp_meter_mod_str' allocates space for meter.bands that
should be freed.

  Direct leak of 448 byte(s) in 7 object(s) allocated from:
    #0 0x52100f in malloc (/controller/ovn-controller+0x52100f)
    #1 0x7523a6 in xmalloc /lib/util.c:138:15
    #2 0x6fd079 in ofpbuf_init /lib/ofpbuf.c:123:26
    ovn-org#3 0x6cba27 in parse_ofp_meter_mod_str /lib/ofp-meter.c:779:5
    ovn-org#4 0x5705b8 in add_meter_string /controller/ofctrl.c:1674:19
    ovn-org#5 0x56f736 in ofctrl_put /controller/ofctrl.c:2105:13
    ovn-org#6 0x59aebb in main /controller/ovn-controller.c:2627:25
    ovn-org#7 0x7f07873251a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

CC: Guoshuai Li <ligs@dtdream.com>
Fixes: c25094b ("ovn: OVN Support QoS meter")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
numansiddique pushed a commit that referenced this pull request Nov 24, 2020
'smap_clear()' doesn't free allocated memory, but 'smap_clone()'
re-initializes hash map clearing internal pointers and leaking
this memory.  'smap_destroy()' should be used instead.

Also, all the records and array of datapaths should be freed on
destruction of a cache entry.

  Direct leak of 16 byte(s) in 2 object(s) allocated from:
    #0 0x5211c7 in calloc (/controller/ovn-controller+0x5211c7)
    #1 0x752364 in xcalloc /lib/util.c:121:31
    #2 0x576e76 in sync_dns_cache /controller/pinctrl.c:2517:25
    ovn-org#3 0x5758fb in pinctrl_run /controller/pinctrl.c:3158:5
    ovn-org#4 0x59b06c in main /controller/ovn-controller.c:2642:25
    ovn-org#5 0x7fb570fc11a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

  Indirect leak of 26 byte(s) in 2 object(s) allocated from:
    #0 0x52100f in malloc (/controller/ovn-controller+0x52100f)
    #1 0x7523d6 in xmalloc /lib/util.c:138:15
    #2 0x7524a8 in xmemdup0 /lib/util.c:168:15
    ovn-org#3 0x73d8fc in smap_clone /lib/smap.c:314:45
    ovn-org#4 0x576e2f in sync_dns_cache /controller/pinctrl.c:2513:13
    ovn-org#5 0x5758fb in pinctrl_run /controller/pinctrl.c:3158:5
    ovn-org#6 0x59b06c in main /controller/ovn-controller.c:2642:25
    ovn-org#7 0x7fb570fc11a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

Fixes: 6b72068 ("ovn-controller: Add a new thread in pinctrl module to handle packet-ins.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
numansiddique pushed a commit that referenced this pull request Nov 24, 2020
shash contains pointers to the data that should be freed.

  Direct leak of 32 byte(s) in 2 object(s) allocated from:
    #0 0x52100f in malloc (/controller/ovn-controller+0x52100f)
    #1 0x752436 in xmalloc /lib/util.c:138:15
    #2 0x5a2f0b in add_pending_ct_zone_entry /controller/ovn-controller.c:548:45
    ovn-org#3 0x5a2d1d in update_ct_zones /controller/ovn-controller.c:668:9
    ovn-org#4 0x59d8c6 in en_ct_zones_run /controller/ovn-controller.c:1495:5
    ovn-org#5 0x5dade4 in engine_run /lib/inc-proc-eng.c:377:9
    ovn-org#6 0x59adf4 in main /controller/ovn-controller.c
    ovn-org#7 0x7f0799ef41a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

CC: xu rong <xu.rong@zte.com.cn>
Fixes: 252e164 ("ovn-controller: pending_ct_zones should be destroyed")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
@numansiddique numansiddique force-pushed the master branch 6 times, most recently from 38a5351 to 5be9e1e Compare November 24, 2020 13:20
numansiddique pushed a commit that referenced this pull request Nov 24, 2020
When a port binding of type "l3gateway" is claimed its remote peer
port_binding is also stored in local_datapath.peer_ports[].remote.

If the remote peer port_binding is deleted first (i.e., before the local
"l3gateway" one) then we need to remove the complete
local_datapath.peer_ports[] entry in order to avoid ending up using
dangling pointers to already freed port bindings.

Also, properly reset local_datapath->has_local_l3gateway in
remove_pb_from_local_datapath().

Ilya reported this issue found by AddressSanitizer during his testing:

==1816017==ERROR: AddressSanitizer: heap-use-after-free on address 0x6140000cb170 at pc 0x0000005ab574 bp 0x7fff68925a30 sp 0x7fff68925a28
READ of size 8 at 0x6140000cb170 thread T0
    #0 0x5ab573 in put_replace_chassis_mac_flows git/ovn/controller/physical.c:550:9
    #1 0x5a65eb in consider_port_binding git/ovn/controller/physical.c:1168:13
    #2 0x5a8764 in physical_run git/ovn/controller/physical.c:1607:9
    ovn-org#3 0x5a0064 in flow_output_physical_flow_changes_handler git/ovn/controller/ovn-controller.c:2127:9
    ovn-org#4 0x5db423 in engine_compute git/ovn/lib/inc-proc-eng.c:306:18
    ovn-org#5 0x5dae1f in engine_run_node git/ovn/lib/inc-proc-eng.c:352:14
    ovn-org#6 0x5dac74 in engine_run git/ovn/lib/inc-proc-eng.c:377:9
    ovn-org#7 0x59ad64 in main git/ovn/controller/ovn-controller.c
    ovn-org#8 0x7f39fa6491a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
    ovn-org#9 0x480b2d in _start (git/ovn/controller/ovn-controller+0x480b2d)

0x6140000cb170 is located 304 bytes inside of 408-byte region [0x6140000cb040,0x6140000cb1d8)
freed by thread T0 here:
    #0 0x520d07 in free (git/ovn/controller/ovn-controller+0x520d07)
    #1 0x712de7 in ovsdb_idl_db_track_clear git/ovs/lib/ovsdb-idl.c:1984:21
    #2 0x59b5cd in main git/ovn/controller/ovn-controller.c:2762:9
    ovn-org#3 0x7f39fa6491a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

Reported-by: Ilya Maximets <i.maximets@ovn.org>
Fixes: 354bdba ("ovn-controller: I-P for SB port binding and OVS interface in runtime_data.")
Tested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
@numansiddique numansiddique force-pushed the master branch 4 times, most recently from 176890d to 031af9a Compare November 24, 2020 15:53
numansiddique pushed a commit that referenced this pull request May 9, 2024
Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
   set: existing_ports.
2. For each localnet port in SB:
   2.1 Create a patch port on br-int. (if it already exists on br-int,
       do not create but remove the entry from exisitng_ports set).
   2.2 Create a peer patch port on br-phys-x. (if it already exists on the
       bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in ovn-org#3 it will be considered.

Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
(cherry picked from commit 2609cd9)
numansiddique pushed a commit that referenced this pull request May 9, 2024
Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
   set: existing_ports.
2. For each localnet port in SB:
   2.1 Create a patch port on br-int. (if it already exists on br-int,
       do not create but remove the entry from exisitng_ports set).
   2.2 Create a peer patch port on br-phys-x. (if it already exists on the
       bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in ovn-org#3 it will be considered.

Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
(cherry picked from commit 2609cd9)
numansiddique pushed a commit that referenced this pull request May 9, 2024
Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
   set: existing_ports.
2. For each localnet port in SB:
   2.1 Create a patch port on br-int. (if it already exists on br-int,
       do not create but remove the entry from exisitng_ports set).
   2.2 Create a peer patch port on br-phys-x. (if it already exists on the
       bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in ovn-org#3 it will be considered.

Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
(cherry picked from commit 2609cd9)
numansiddique pushed a commit that referenced this pull request May 9, 2024
Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
   set: existing_ports.
2. For each localnet port in SB:
   2.1 Create a patch port on br-int. (if it already exists on br-int,
       do not create but remove the entry from exisitng_ports set).
   2.2 Create a peer patch port on br-phys-x. (if it already exists on the
       bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in ovn-org#3 it will be considered.

Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
(cherry picked from commit 2609cd9)
numansiddique pushed a commit that referenced this pull request May 9, 2024
Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
   set: existing_ports.
2. For each localnet port in SB:
   2.1 Create a patch port on br-int. (if it already exists on br-int,
       do not create but remove the entry from exisitng_ports set).
   2.2 Create a peer patch port on br-phys-x. (if it already exists on the
       bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in ovn-org#3 it will be considered.

Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
(cherry picked from commit 2609cd9)
numansiddique pushed a commit that referenced this pull request May 15, 2024
Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
   set: existing_ports.
2. For each localnet port in SB:
   2.1 Create a patch port on br-int. (if it already exists on br-int,
       do not create but remove the entry from exisitng_ports set).
   2.2 Create a peer patch port on br-phys-x. (if it already exists on the
       bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in ovn-org#3 it will be considered.

Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
(cherry picked from commit 2609cd9)
numansiddique pushed a commit that referenced this pull request May 15, 2024
Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
   set: existing_ports.
2. For each localnet port in SB:
   2.1 Create a patch port on br-int. (if it already exists on br-int,
       do not create but remove the entry from exisitng_ports set).
   2.2 Create a peer patch port on br-phys-x. (if it already exists on the
       bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in ovn-org#3 it will be considered.

Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
(cherry picked from commit 2609cd9)
@numansiddique numansiddique force-pushed the main branch 5 times, most recently from d95180c to 0d7e867 Compare June 6, 2024 21:12
@numansiddique numansiddique force-pushed the main branch 3 times, most recently from e4222db to cf55c6c Compare July 8, 2024 21:21
numansiddique added a commit that referenced this pull request Sep 19, 2024
Commit 8d13579 creates a chassisredirect port for a logical switch
port (of type 'router') in certain scenarios and 'op->nbsp' can be NULL.
The following crash is reported by Sanitizer.

==84927==ERROR: AddressSanitizer: SEGV on unknown address
==84927==The signal is caused by a READ memory access.
==84927==Hint: address points to the zero page.
    #0 0x57ab3854f04a in hmap_first_with_hash ovn/ovs/./include/openvswitch/hmap.h:360:38
    #1 0x57ab3854fedf in smap_find__ ovn/ovs/lib/smap.c:421:5
    #2 0x57ab3854f7b8 in smap_get_node ovn/ovs/lib/smap.c:217:12
    ovn-org#3 0x57ab3854f74f in smap_get_def ovn/ovs/lib/smap.c:208:30
    ovn-org#4 0x57ab3854f726 in smap_get ovn/ovs/lib/smap.c:200:12
    ovn-org#5 0x57ab3854f862 in smap_get_int ovn/ovs/lib/smap.c:240:25
    ovn-org#6 0x57ab383222eb in ovn_port_assign_requested_tnl_id ovn/northd/northd.c:4372:27
    ovn-org#7 0x57ab383072fa in build_ports ovn/northd/northd.c:4454:9
    ovn-org#8 0x57ab38301457 in ovnnb_db_run ovn/northd/northd.c:18023:5
    ovn-org#9 0x57ab3841d99e in en_northd_run ovn/northd/en-northd.c:137:5
    ovn-org#10 0x57ab384599b2 in engine_recompute ovn/lib/inc-proc-eng.c:411:5
    ovn-org#11 0x57ab38459d6e in engine_run_node ovn/lib/inc-proc-eng.c:473:9
    ovn-org#12 0x57ab38459ec3 in engine_run ovn/lib/inc-proc-eng.c:524:9
    ovn-org#13 0x57ab38430c5d in inc_proc_northd_run ovn/northd/inc-proc-northd.c:420:5
    ovn-org#14 0x57ab3841bb2f in main ovn/northd/ovn-northd.c:970:32

This patch fixes this issue.  It also corrects some typo introduced by
the commit - changes the reference from "chassisresident" to
"chassisredirect".

Fixes: 8d13579 ("Add support for centralize routing for distributed gw ports.")
Reported-by: Felix Huettner <felix.huettner@mail.schwarz>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2024-August/416264.html
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
numansiddique added a commit that referenced this pull request Sep 20, 2024
Commit 8d13579 creates a chassisredirect port for a logical switch
port (of type 'router') in certain scenarios and 'op->nbsp' can be NULL.
The following crash is reported by Sanitizer.

==84927==ERROR: AddressSanitizer: SEGV on unknown address
==84927==The signal is caused by a READ memory access.
==84927==Hint: address points to the zero page.
    #0 0x57ab3854f04a in hmap_first_with_hash ovn/ovs/./include/openvswitch/hmap.h:360:38
    #1 0x57ab3854fedf in smap_find__ ovn/ovs/lib/smap.c:421:5
    #2 0x57ab3854f7b8 in smap_get_node ovn/ovs/lib/smap.c:217:12
    ovn-org#3 0x57ab3854f74f in smap_get_def ovn/ovs/lib/smap.c:208:30
    ovn-org#4 0x57ab3854f726 in smap_get ovn/ovs/lib/smap.c:200:12
    ovn-org#5 0x57ab3854f862 in smap_get_int ovn/ovs/lib/smap.c:240:25
    ovn-org#6 0x57ab383222eb in ovn_port_assign_requested_tnl_id ovn/northd/northd.c:4372:27
    ovn-org#7 0x57ab383072fa in build_ports ovn/northd/northd.c:4454:9
    ovn-org#8 0x57ab38301457 in ovnnb_db_run ovn/northd/northd.c:18023:5
    ovn-org#9 0x57ab3841d99e in en_northd_run ovn/northd/en-northd.c:137:5
    ovn-org#10 0x57ab384599b2 in engine_recompute ovn/lib/inc-proc-eng.c:411:5
    ovn-org#11 0x57ab38459d6e in engine_run_node ovn/lib/inc-proc-eng.c:473:9
    ovn-org#12 0x57ab38459ec3 in engine_run ovn/lib/inc-proc-eng.c:524:9
    ovn-org#13 0x57ab38430c5d in inc_proc_northd_run ovn/northd/inc-proc-northd.c:420:5
    ovn-org#14 0x57ab3841bb2f in main ovn/northd/ovn-northd.c:970:32

This patch fixes this issue.  It also corrects some typo introduced by
the commit - changes the reference from "chassisresident" to
"chassisredirect".

Fixes: 8d13579 ("Add support for centralize routing for distributed gw ports.")
Reported-by: Felix Huettner <felix.huettner@mail.schwarz>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2024-August/416264.html
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
(cherry picked from commit 9fbda4a)
numansiddique pushed a commit that referenced this pull request Feb 27, 2025
When connecting two LRPs directly with each other the LRP->peer value is
set to the LRP->name value of the respective other LRP.

If we have LRP1 and LRP2. Previously LRP1->peer = LRP2 && LRP2->peer = ""
would have been processed by northd as if LRP1 was connect to LRP2 but not
the other way round.

Additionally it was possible to set
LRP1->peer = LRP2 && LRP2->peer = LRP3 && LRP3->peer = LRP1

Both of these options are invalid and in the past have produced
use-after-frees in northd, as reported by ASAN below.

The issue is present at least back until 2bdf112. But probably it is
older than that.

==845947==ERROR: AddressSanitizer: heap-use-after-free on address 0x5120000060f8 at pc 0x585dd65a92ab bp 0x7fffffc7fab0 sp 0x7fffffc7faa8
WRITE of size 8 at 0x5120000060f8 thread T0
    #0 0x585dd65a92aa in ovn_port_cleanup [...]/ovn/northd/northd.c:1240:26
    #1 0x585dd65a868c in ovn_port_destroy_orphan [...]/ovn/northd/northd.c:1257:5
    #2 0x585dd65ac254 in ovn_port_destroy [...]/ovn/northd/northd.c:1276:9
    ovn-org#3 0x585dd658d602 in destroy_datapaths_and_ports [...]/ovn/northd/northd.c:18362:9
    ovn-org#4 0x585dd658ccad in northd_destroy [...]/ovn/northd/northd.c:18443:5
    ovn-org#5 0x585dd66b56b3 in en_northd_run [...]/ovn/northd/en-northd.c:122:5
    ovn-org#6 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#7 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#8 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#9 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#10 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#11 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#12 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#13 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#14 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

0x5120000060f8 is located 184 bytes inside of 312-byte region [0x512000006040,0x512000006178)
freed by thread T0 here:
    #0 0x585dd64f4eb2 in free.part.0 asan_malloc_linux.cpp.o
    #1 0x585dd65a88f3 in ovn_port_destroy_orphan [...]/ovn/northd/northd.c:1263:5
    #2 0x585dd65ac254 in ovn_port_destroy [...]/ovn/northd/northd.c:1276:9
    ovn-org#3 0x585dd658d602 in destroy_datapaths_and_ports [...]/ovn/northd/northd.c:18362:9
    ovn-org#4 0x585dd658ccad in northd_destroy [...]/ovn/northd/northd.c:18443:5
    ovn-org#5 0x585dd66b56b3 in en_northd_run [...]/ovn/northd/en-northd.c:122:5
    ovn-org#6 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#7 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#8 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#9 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#10 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#11 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#12 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#13 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#14 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

previously allocated by thread T0 here:
    #0 0x585dd64f6199 in calloc ([...]/ovn/northd/ovn-northd+0xab7199) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)
    #1 0x585dd6df15e2 in xcalloc__ [...]/ovn/ovs/lib/util.c:125:31
    #2 0x585dd6df16b4 in xzalloc__ [...]/ovn/ovs/lib/util.c:135:12
    ovn-org#3 0x585dd6df1a19 in xzalloc [...]/ovn/ovs/lib/util.c:169:12
    ovn-org#4 0x585dd65ac63c in ovn_port_create [...]/ovn/northd/northd.c:1207:27
    ovn-org#5 0x585dd667d344 in join_logical_ports [...]/ovn/northd/northd.c:2337:31
    ovn-org#6 0x585dd659540e in build_ports [...]/ovn/northd/northd.c:4236:5
    ovn-org#7 0x585dd658f77d in ovnnb_db_run [...]/ovn/northd/northd.c:18539:5
    ovn-org#8 0x585dd66b5835 in en_northd_run [...]/ovn/northd/en-northd.c:129:5
    ovn-org#9 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#10 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#11 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#12 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#13 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#14 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#15 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#16 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#17 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
numansiddique pushed a commit that referenced this pull request Mar 24, 2025
When connecting two LRPs directly with each other the LRP->peer value is
set to the LRP->name value of the respective other LRP.

If we have LRP1 and LRP2. Previously LRP1->peer = LRP2 && LRP2->peer = ""
would have been processed by northd as if LRP1 was connect to LRP2 but not
the other way round.

Additionally it was possible to set
LRP1->peer = LRP2 && LRP2->peer = LRP3 && LRP3->peer = LRP1

Both of these options are invalid and in the past have produced
use-after-frees in northd, as reported by ASAN below.

The issue is present at least back until 2bdf112. But probably it is
older than that.

==845947==ERROR: AddressSanitizer: heap-use-after-free on address 0x5120000060f8 at pc 0x585dd65a92ab bp 0x7fffffc7fab0 sp 0x7fffffc7faa8
WRITE of size 8 at 0x5120000060f8 thread T0
    #0 0x585dd65a92aa in ovn_port_cleanup [...]/ovn/northd/northd.c:1240:26
    #1 0x585dd65a868c in ovn_port_destroy_orphan [...]/ovn/northd/northd.c:1257:5
    #2 0x585dd65ac254 in ovn_port_destroy [...]/ovn/northd/northd.c:1276:9
    ovn-org#3 0x585dd658d602 in destroy_datapaths_and_ports [...]/ovn/northd/northd.c:18362:9
    ovn-org#4 0x585dd658ccad in northd_destroy [...]/ovn/northd/northd.c:18443:5
    ovn-org#5 0x585dd66b56b3 in en_northd_run [...]/ovn/northd/en-northd.c:122:5
    ovn-org#6 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#7 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#8 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#9 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#10 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#11 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#12 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#13 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#14 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

0x5120000060f8 is located 184 bytes inside of 312-byte region [0x512000006040,0x512000006178)
freed by thread T0 here:
    #0 0x585dd64f4eb2 in free.part.0 asan_malloc_linux.cpp.o
    #1 0x585dd65a88f3 in ovn_port_destroy_orphan [...]/ovn/northd/northd.c:1263:5
    #2 0x585dd65ac254 in ovn_port_destroy [...]/ovn/northd/northd.c:1276:9
    ovn-org#3 0x585dd658d602 in destroy_datapaths_and_ports [...]/ovn/northd/northd.c:18362:9
    ovn-org#4 0x585dd658ccad in northd_destroy [...]/ovn/northd/northd.c:18443:5
    ovn-org#5 0x585dd66b56b3 in en_northd_run [...]/ovn/northd/en-northd.c:122:5
    ovn-org#6 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#7 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#8 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#9 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#10 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#11 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#12 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#13 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#14 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

previously allocated by thread T0 here:
    #0 0x585dd64f6199 in calloc ([...]/ovn/northd/ovn-northd+0xab7199) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)
    #1 0x585dd6df15e2 in xcalloc__ [...]/ovn/ovs/lib/util.c:125:31
    #2 0x585dd6df16b4 in xzalloc__ [...]/ovn/ovs/lib/util.c:135:12
    ovn-org#3 0x585dd6df1a19 in xzalloc [...]/ovn/ovs/lib/util.c:169:12
    ovn-org#4 0x585dd65ac63c in ovn_port_create [...]/ovn/northd/northd.c:1207:27
    ovn-org#5 0x585dd667d344 in join_logical_ports [...]/ovn/northd/northd.c:2337:31
    ovn-org#6 0x585dd659540e in build_ports [...]/ovn/northd/northd.c:4236:5
    ovn-org#7 0x585dd658f77d in ovnnb_db_run [...]/ovn/northd/northd.c:18539:5
    ovn-org#8 0x585dd66b5835 in en_northd_run [...]/ovn/northd/en-northd.c:129:5
    ovn-org#9 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#10 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#11 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#12 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#13 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#14 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#15 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#16 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#17 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
(cherry picked from commit c1780ed)
numansiddique pushed a commit that referenced this pull request Apr 8, 2025
When connecting two LRPs directly with each other the LRP->peer value is
set to the LRP->name value of the respective other LRP.

If we have LRP1 and LRP2. Previously LRP1->peer = LRP2 && LRP2->peer = ""
would have been processed by northd as if LRP1 was connect to LRP2 but not
the other way round.

Additionally it was possible to set
LRP1->peer = LRP2 && LRP2->peer = LRP3 && LRP3->peer = LRP1

Both of these options are invalid and in the past have produced
use-after-frees in northd, as reported by ASAN below.

The issue is present at least back until 2bdf112. But probably it is
older than that.

==845947==ERROR: AddressSanitizer: heap-use-after-free on address 0x5120000060f8 at pc 0x585dd65a92ab bp 0x7fffffc7fab0 sp 0x7fffffc7faa8
WRITE of size 8 at 0x5120000060f8 thread T0
    #0 0x585dd65a92aa in ovn_port_cleanup [...]/ovn/northd/northd.c:1240:26
    #1 0x585dd65a868c in ovn_port_destroy_orphan [...]/ovn/northd/northd.c:1257:5
    #2 0x585dd65ac254 in ovn_port_destroy [...]/ovn/northd/northd.c:1276:9
    ovn-org#3 0x585dd658d602 in destroy_datapaths_and_ports [...]/ovn/northd/northd.c:18362:9
    ovn-org#4 0x585dd658ccad in northd_destroy [...]/ovn/northd/northd.c:18443:5
    ovn-org#5 0x585dd66b56b3 in en_northd_run [...]/ovn/northd/en-northd.c:122:5
    ovn-org#6 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#7 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#8 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#9 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#10 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#11 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#12 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#13 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#14 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

0x5120000060f8 is located 184 bytes inside of 312-byte region [0x512000006040,0x512000006178)
freed by thread T0 here:
    #0 0x585dd64f4eb2 in free.part.0 asan_malloc_linux.cpp.o
    #1 0x585dd65a88f3 in ovn_port_destroy_orphan [...]/ovn/northd/northd.c:1263:5
    #2 0x585dd65ac254 in ovn_port_destroy [...]/ovn/northd/northd.c:1276:9
    ovn-org#3 0x585dd658d602 in destroy_datapaths_and_ports [...]/ovn/northd/northd.c:18362:9
    ovn-org#4 0x585dd658ccad in northd_destroy [...]/ovn/northd/northd.c:18443:5
    ovn-org#5 0x585dd66b56b3 in en_northd_run [...]/ovn/northd/en-northd.c:122:5
    ovn-org#6 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#7 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#8 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#9 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#10 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#11 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#12 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#13 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#14 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

previously allocated by thread T0 here:
    #0 0x585dd64f6199 in calloc ([...]/ovn/northd/ovn-northd+0xab7199) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)
    #1 0x585dd6df15e2 in xcalloc__ [...]/ovn/ovs/lib/util.c:125:31
    #2 0x585dd6df16b4 in xzalloc__ [...]/ovn/ovs/lib/util.c:135:12
    ovn-org#3 0x585dd6df1a19 in xzalloc [...]/ovn/ovs/lib/util.c:169:12
    ovn-org#4 0x585dd65ac63c in ovn_port_create [...]/ovn/northd/northd.c:1207:27
    ovn-org#5 0x585dd667d344 in join_logical_ports [...]/ovn/northd/northd.c:2337:31
    ovn-org#6 0x585dd659540e in build_ports [...]/ovn/northd/northd.c:4236:5
    ovn-org#7 0x585dd658f77d in ovnnb_db_run [...]/ovn/northd/northd.c:18539:5
    ovn-org#8 0x585dd66b5835 in en_northd_run [...]/ovn/northd/en-northd.c:129:5
    ovn-org#9 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#10 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#11 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#12 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#13 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#14 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#15 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#16 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#17 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
(cherry picked from commit c1780ed)
numansiddique pushed a commit that referenced this pull request Apr 8, 2025
When connecting two LRPs directly with each other the LRP->peer value is
set to the LRP->name value of the respective other LRP.

If we have LRP1 and LRP2. Previously LRP1->peer = LRP2 && LRP2->peer = ""
would have been processed by northd as if LRP1 was connect to LRP2 but not
the other way round.

Additionally it was possible to set
LRP1->peer = LRP2 && LRP2->peer = LRP3 && LRP3->peer = LRP1

Both of these options are invalid and in the past have produced
use-after-frees in northd, as reported by ASAN below.

The issue is present at least back until 2bdf112. But probably it is
older than that.

==845947==ERROR: AddressSanitizer: heap-use-after-free on address 0x5120000060f8 at pc 0x585dd65a92ab bp 0x7fffffc7fab0 sp 0x7fffffc7faa8
WRITE of size 8 at 0x5120000060f8 thread T0
    #0 0x585dd65a92aa in ovn_port_cleanup [...]/ovn/northd/northd.c:1240:26
    #1 0x585dd65a868c in ovn_port_destroy_orphan [...]/ovn/northd/northd.c:1257:5
    #2 0x585dd65ac254 in ovn_port_destroy [...]/ovn/northd/northd.c:1276:9
    ovn-org#3 0x585dd658d602 in destroy_datapaths_and_ports [...]/ovn/northd/northd.c:18362:9
    ovn-org#4 0x585dd658ccad in northd_destroy [...]/ovn/northd/northd.c:18443:5
    ovn-org#5 0x585dd66b56b3 in en_northd_run [...]/ovn/northd/en-northd.c:122:5
    ovn-org#6 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#7 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#8 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#9 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#10 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#11 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#12 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#13 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#14 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

0x5120000060f8 is located 184 bytes inside of 312-byte region [0x512000006040,0x512000006178)
freed by thread T0 here:
    #0 0x585dd64f4eb2 in free.part.0 asan_malloc_linux.cpp.o
    #1 0x585dd65a88f3 in ovn_port_destroy_orphan [...]/ovn/northd/northd.c:1263:5
    #2 0x585dd65ac254 in ovn_port_destroy [...]/ovn/northd/northd.c:1276:9
    ovn-org#3 0x585dd658d602 in destroy_datapaths_and_ports [...]/ovn/northd/northd.c:18362:9
    ovn-org#4 0x585dd658ccad in northd_destroy [...]/ovn/northd/northd.c:18443:5
    ovn-org#5 0x585dd66b56b3 in en_northd_run [...]/ovn/northd/en-northd.c:122:5
    ovn-org#6 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#7 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#8 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#9 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#10 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#11 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#12 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#13 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#14 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

previously allocated by thread T0 here:
    #0 0x585dd64f6199 in calloc ([...]/ovn/northd/ovn-northd+0xab7199) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)
    #1 0x585dd6df15e2 in xcalloc__ [...]/ovn/ovs/lib/util.c:125:31
    #2 0x585dd6df16b4 in xzalloc__ [...]/ovn/ovs/lib/util.c:135:12
    ovn-org#3 0x585dd6df1a19 in xzalloc [...]/ovn/ovs/lib/util.c:169:12
    ovn-org#4 0x585dd65ac63c in ovn_port_create [...]/ovn/northd/northd.c:1207:27
    ovn-org#5 0x585dd667d344 in join_logical_ports [...]/ovn/northd/northd.c:2337:31
    ovn-org#6 0x585dd659540e in build_ports [...]/ovn/northd/northd.c:4236:5
    ovn-org#7 0x585dd658f77d in ovnnb_db_run [...]/ovn/northd/northd.c:18539:5
    ovn-org#8 0x585dd66b5835 in en_northd_run [...]/ovn/northd/en-northd.c:129:5
    ovn-org#9 0x585dd6780411 in engine_recompute [...]/ovn/lib/inc-proc-eng.c:430:5
    ovn-org#10 0x585dd6781796 in engine_compute [...]/ovn/lib/inc-proc-eng.c:469:17
    ovn-org#11 0x585dd677e75e in engine_run_node [...]/ovn/lib/inc-proc-eng.c:518:14
    ovn-org#12 0x585dd677d871 in engine_run [...]/ovn/lib/inc-proc-eng.c:543:9
    ovn-org#13 0x585dd6719cf3 in inc_proc_northd_run [...]/ovn/northd/inc-proc-northd.c:470:5
    ovn-org#14 0x585dd66a13e5 in main [...]/ovn/northd/ovn-northd.c:1055:32
    ovn-org#15 0x7822ced45e07 in __libc_start_call_main /usr/src/debug/glibc/glibc/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ovn-org#16 0x7822ced45ecb in __libc_start_main /usr/src/debug/glibc/glibc/csu/../csu/libc-start.c:360:3
    ovn-org#17 0x585dd6409d74 in _start ([...]/ovn/northd/ovn-northd+0x9cad74) (BuildId: 7943a43509efc7074c04a1d7d0a37692f00fa1c4)

Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
(cherry picked from commit c1780ed)
numansiddique pushed a commit that referenced this pull request Nov 4, 2025
When running a sequence of commands out of which some fail the
prerequisites check, ovn-dbctl didn't completely initialize the rest
of the commands.  This in turn caused the cleanup code to try to
access random memory.

Caught by AddressSanitizer when running:
  ovn-nbctl -- set logical_switch_port ln tag-request 100 -- show

  ==1857349==ERROR: AddressSanitizer: SEGV on unknown address (...)
  ==1857349==The signal is caused by a READ memory access.
  ==1857349==Hint: this fault was caused by a dereference of a high value address...
      #0 0x000000403d5b in __asan::Allocator::Deallocate(...)
      #1 0x0000004a4a9f in free (utilities/ovn-nbctl+0x4a4a9f)
      #2 0x00000095855b in ds_destroy ovs/lib/dynamic-string.c:371:5
      ovn-org#3 0x0000004ea6e1 in ovn_dbctl_main utilities/ovn-dbctl.c:246:13
      ovn-org#4 0x0000004f5bee in main utilities/ovn-nbctl.c:9096:12
      ovn-org#5 0x7fe6873e5574 in __libc_start_call_main (/lib64/libc.so.6+0x3574)
      ovn-org#6 0x7fe6873e5627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627)
      ovn-org#7 0x0000004011c4 in _start (utilities/ovn-nbctl+0x4011c4)

Fixes: 8fb54e1 ("ovn-nbctl: Cleanup allocated memory to keep valgrind happy.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ales Musil <amusil@redhat.com>
numansiddique pushed a commit that referenced this pull request Dec 22, 2025
This allow easy bitmap resizing to avoid the following address sanitizer
crashes if we are exceeding bitmap allocated size. This issue can occurs
since IP lflow support for logical routers does not take into account
ovn_lflow bitmap resize.

=================================================================
==52991==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7bce82430100 at pc 0x0000004eae83 bp 0x7ffd7306f980 sp 0x7ffd7306f978
READ of size 8 at 0x7bce82430100 thread T0
    #0 0x0000004eae82 in bitmap_is_set ovn/ovs/lib/bitmap.h:91
    #1 0x0000004eae82 in lflow_table_add_lflow northd/lflow-mgr.c:765
    #2 0x00000046c075 in build_adm_ctrl_flows_for_lrouter northd/northd.c:13759
    ovn-org#3 0x00000046c075 in build_lswitch_and_lrouter_iterate_by_lr northd/northd.c:18591
    ovn-org#4 0x000000479805 in lflow_handle_northd_lr_changes northd/northd.c:19299
    ovn-org#5 0x00000049bcc0 in lflow_northd_handler northd/en-lflow.c:158
    ovn-org#6 0x000000508895 in engine_compute lib/inc-proc-eng.c:474
    ovn-org#7 0x000000508895 in engine_run_node lib/inc-proc-eng.c:546
    ovn-org#8 0x000000508895 in engine_run lib/inc-proc-eng.c:575
    ovn-org#9 0x0000004e0cff in inc_proc_northd_run northd/inc-proc-northd.c:602
    ovn-org#10 0x0000004077b4 in main northd/ovn-northd.c:1104
    ovn-org#11 0x7fae83523574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#12 0x7fae83523627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#13 0x00000040ae74 in _start (ovn/northd/ovn-northd+0x40ae74) (BuildId: 5373e3f3823362d2c1a220082395cf8a9aae28bc)

0x7bce82430100 is located 0 bytes after 16-byte region [0x7bce824300f0,0x7bce82430100)
allocated by thread T0 here:
    #0 0x7fae83ee68a3 in calloc (/lib64/libasan.so.8+0xe68a3) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a)
    #1 0x00000077873e in xcalloc__ lib/util.c:125
    #2 0x00000077873e in xzalloc__ lib/util.c:135
    ovn-org#3 0x00000077873e in xzalloc lib/util.c:169
    ovn-org#4 0x0000004ea05b in bitmap_allocate ovn/ovs/lib/bitmap.h:51
    ovn-org#5 0x0000004ea05b in ovn_lflow_init northd/lflow-mgr.c:901
    ovn-org#6 0x0000004ea05b in do_ovn_lflow_add northd/lflow-mgr.c:1028
    ovn-org#7 0x0000004ea05b in lflow_table_add_lflow northd/lflow-mgr.c:730
    ovn-org#8 0x00000046c075 in build_adm_ctrl_flows_for_lrouter northd/northd.c:13759
    ovn-org#9 0x00000046c075 in build_lswitch_and_lrouter_iterate_by_lr northd/northd.c:18591
    ovn-org#10 0x000000479805 in lflow_handle_northd_lr_changes northd/northd.c:19299
    ovn-org#11 0x00000049bcc0 in lflow_northd_handler northd/en-lflow.c:158
    ovn-org#12 0x000000508895 in engine_compute lib/inc-proc-eng.c:474
    ovn-org#13 0x000000508895 in engine_run_node lib/inc-proc-eng.c:546
    ovn-org#14 0x000000508895 in engine_run lib/inc-proc-eng.c:575
    ovn-org#15 0x0000004e0cff in inc_proc_northd_run northd/inc-proc-northd.c:602
    ovn-org#16 0x0000004077b4 in main northd/ovn-northd.c:1104
    ovn-org#17 0x7fae83523574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#18 0x7fae83523627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#19 0x00000040ae74 in _start (ovn/northd/ovn-northd+0x40ae74) (BuildId: 5373e3f3823362d2c1a220082395cf8a9aae28bc)

Fixes: 9ec96d0 ("northd: Add and delete logical routers in en-lflow engine node.")
Fixes: f22a005 ("northd: Convert datapath array into vector.")
Reported-at: https://issues.redhat.com/browse/FDP-2643
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
numansiddique pushed a commit that referenced this pull request Dec 22, 2025
This allow easy bitmap resizing to avoid the following address sanitizer
crashes if we are exceeding bitmap allocated size. This issue can occurs
since IP lflow support for logical routers does not take into account
ovn_dp_group bitmap resize.

=================================================================
==52991==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7bce82430100 at pc 0x0000004eae83 bp 0x7ffd7306f980 sp 0x7ffd7306f978
READ of size 8 at 0x7bce82430100 thread T0
    #0 0x0000004eae82 in bitmap_is_set ovn/ovs/lib/bitmap.h:91
    #1 0x0000004eae82 in lflow_table_add_lflow northd/lflow-mgr.c:765
    #2 0x00000046c075 in build_adm_ctrl_flows_for_lrouter northd/northd.c:13759
    ovn-org#3 0x00000046c075 in build_lswitch_and_lrouter_iterate_by_lr northd/northd.c:18591
    ovn-org#4 0x000000479805 in lflow_handle_northd_lr_changes northd/northd.c:19299
    ovn-org#5 0x00000049bcc0 in lflow_northd_handler northd/en-lflow.c:158
    ovn-org#6 0x000000508895 in engine_compute lib/inc-proc-eng.c:474
    ovn-org#7 0x000000508895 in engine_run_node lib/inc-proc-eng.c:546
    ovn-org#8 0x000000508895 in engine_run lib/inc-proc-eng.c:575
    ovn-org#9 0x0000004e0cff in inc_proc_northd_run northd/inc-proc-northd.c:602
    ovn-org#10 0x0000004077b4 in main northd/ovn-northd.c:1104
    ovn-org#11 0x7fae83523574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#12 0x7fae83523627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#13 0x00000040ae74 in _start (ovn/northd/ovn-northd+0x40ae74) (BuildId: 5373e3f3823362d2c1a220082395cf8a9aae28bc)

0x7bce82430100 is located 0 bytes after 16-byte region [0x7bce824300f0,0x7bce82430100)
allocated by thread T0 here:
    #0 0x7fae83ee68a3 in calloc (/lib64/libasan.so.8+0xe68a3) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a)
    #1 0x00000077873e in xcalloc__ lib/util.c:125
    #2 0x00000077873e in xzalloc__ lib/util.c:135
    ovn-org#3 0x00000077873e in xzalloc lib/util.c:169
    ovn-org#4 0x0000004ea05b in bitmap_allocate ovn/ovs/lib/bitmap.h:51
    ovn-org#5 0x0000004ea05b in ovn_lflow_init northd/lflow-mgr.c:901
    ovn-org#6 0x0000004ea05b in do_ovn_lflow_add northd/lflow-mgr.c:1028
    ovn-org#7 0x0000004ea05b in lflow_table_add_lflow northd/lflow-mgr.c:730
    ovn-org#8 0x00000046c075 in build_adm_ctrl_flows_for_lrouter northd/northd.c:13759
    ovn-org#9 0x00000046c075 in build_lswitch_and_lrouter_iterate_by_lr northd/northd.c:18591
    ovn-org#10 0x000000479805 in lflow_handle_northd_lr_changes northd/northd.c:19299
    ovn-org#11 0x00000049bcc0 in lflow_northd_handler northd/en-lflow.c:158
    ovn-org#12 0x000000508895 in engine_compute lib/inc-proc-eng.c:474
    ovn-org#13 0x000000508895 in engine_run_node lib/inc-proc-eng.c:546
    ovn-org#14 0x000000508895 in engine_run lib/inc-proc-eng.c:575
    ovn-org#15 0x0000004e0cff in inc_proc_northd_run northd/inc-proc-northd.c:602
    ovn-org#16 0x0000004077b4 in main northd/ovn-northd.c:1104
    ovn-org#17 0x7fae83523574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#18 0x7fae83523627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#19 0x00000040ae74 in _start (ovn/northd/ovn-northd+0x40ae74) (BuildId: 5373e3f3823362d2c1a220082395cf8a9aae28bc)

Fixes: 9ec96d0 ("northd: Add and delete logical routers in en-lflow engine node.")
Fixes: f22a005 ("northd: Convert datapath array into vector.")
Reported-at: https://issues.redhat.com/browse/FDP-2643
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
numansiddique pushed a commit that referenced this pull request Jan 27, 2026
There was a theoretical (but unlikely to happen in practice) integer
overflow in sparse_array_len() in the case when
dynamic_bitmap_last_set() would return LONG_MAX.

Found by coverity:
  overflow: The expression idx + 1L is considered to have possibly
  overflowed.

  CID 501214: (#1 of 1): Overflowed return value (INTEGER_OVERFLOW)
  return_overflow: (idx > -1L) ? idx + 1L : 0L, which might have
  overflowed, is returned from the function

Fixes: 39eb5b5 ("Add sparse array.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ales Musil <amusil@redhat.com>
numansiddique pushed a commit that referenced this pull request Jan 27, 2026
Coverity complained that there's a chance that
ovn_datapath_type_from_string() returns a value >= DP_MAX in
lflow_table_sync_to_sb().

That's actually not possible, the report is a false positive, but to
make it explicit add an assert (as we do in other places that work on
datapaths that have been sanitized).

Coverity report:
 CID 501213: (#1 of 1): Out-of-bounds read (OVERRUN)
 overrun-local: Overrunning array of 336 bytes at byte offset
 336 by dereferencing pointer &dps[dp_type].

Fixes: fd4762a ("lflow-mgr: Use an array for lflow dp groups.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ales Musil <amusil@redhat.com>
numansiddique pushed a commit that referenced this pull request Jan 27, 2026
When running a sequence of commands out of which some fail the
prerequisites check, ovn-dbctl didn't completely initialize the rest
of the commands.  This in turn caused the cleanup code to try to
access random memory.

Caught by AddressSanitizer when running:
  ovn-nbctl -- set logical_switch_port ln tag-request 100 -- show

  ==1857349==ERROR: AddressSanitizer: SEGV on unknown address (...)
  ==1857349==The signal is caused by a READ memory access.
  ==1857349==Hint: this fault was caused by a dereference of a high value address...
      #0 0x000000403d5b in __asan::Allocator::Deallocate(...)
      #1 0x0000004a4a9f in free (utilities/ovn-nbctl+0x4a4a9f)
      #2 0x00000095855b in ds_destroy ovs/lib/dynamic-string.c:371:5
      ovn-org#3 0x0000004ea6e1 in ovn_dbctl_main utilities/ovn-dbctl.c:246:13
      ovn-org#4 0x0000004f5bee in main utilities/ovn-nbctl.c:9096:12
      ovn-org#5 0x7fe6873e5574 in __libc_start_call_main (/lib64/libc.so.6+0x3574)
      ovn-org#6 0x7fe6873e5627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627)
      ovn-org#7 0x0000004011c4 in _start (utilities/ovn-nbctl+0x4011c4)

Fixes: 8fb54e1 ("ovn-nbctl: Cleanup allocated memory to keep valgrind happy.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ales Musil <amusil@redhat.com>
(cherry picked from commit 45d7831)
numansiddique pushed a commit that referenced this pull request Jan 27, 2026
This allow easy bitmap resizing to avoid the following address sanitizer
crashes if we are exceeding bitmap allocated size. This issue can occurs
since IP lflow support for logical routers does not take into account
ovn_lflow bitmap resize.

=================================================================
==52991==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7bce82430100 at pc 0x0000004eae83 bp 0x7ffd7306f980 sp 0x7ffd7306f978
READ of size 8 at 0x7bce82430100 thread T0
    #0 0x0000004eae82 in bitmap_is_set ovn/ovs/lib/bitmap.h:91
    #1 0x0000004eae82 in lflow_table_add_lflow northd/lflow-mgr.c:765
    #2 0x00000046c075 in build_adm_ctrl_flows_for_lrouter northd/northd.c:13759
    ovn-org#3 0x00000046c075 in build_lswitch_and_lrouter_iterate_by_lr northd/northd.c:18591
    ovn-org#4 0x000000479805 in lflow_handle_northd_lr_changes northd/northd.c:19299
    ovn-org#5 0x00000049bcc0 in lflow_northd_handler northd/en-lflow.c:158
    ovn-org#6 0x000000508895 in engine_compute lib/inc-proc-eng.c:474
    ovn-org#7 0x000000508895 in engine_run_node lib/inc-proc-eng.c:546
    ovn-org#8 0x000000508895 in engine_run lib/inc-proc-eng.c:575
    ovn-org#9 0x0000004e0cff in inc_proc_northd_run northd/inc-proc-northd.c:602
    ovn-org#10 0x0000004077b4 in main northd/ovn-northd.c:1104
    ovn-org#11 0x7fae83523574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#12 0x7fae83523627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#13 0x00000040ae74 in _start (ovn/northd/ovn-northd+0x40ae74) (BuildId: 5373e3f3823362d2c1a220082395cf8a9aae28bc)

0x7bce82430100 is located 0 bytes after 16-byte region [0x7bce824300f0,0x7bce82430100)
allocated by thread T0 here:
    #0 0x7fae83ee68a3 in calloc (/lib64/libasan.so.8+0xe68a3) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a)
    #1 0x00000077873e in xcalloc__ lib/util.c:125
    #2 0x00000077873e in xzalloc__ lib/util.c:135
    ovn-org#3 0x00000077873e in xzalloc lib/util.c:169
    ovn-org#4 0x0000004ea05b in bitmap_allocate ovn/ovs/lib/bitmap.h:51
    ovn-org#5 0x0000004ea05b in ovn_lflow_init northd/lflow-mgr.c:901
    ovn-org#6 0x0000004ea05b in do_ovn_lflow_add northd/lflow-mgr.c:1028
    ovn-org#7 0x0000004ea05b in lflow_table_add_lflow northd/lflow-mgr.c:730
    ovn-org#8 0x00000046c075 in build_adm_ctrl_flows_for_lrouter northd/northd.c:13759
    ovn-org#9 0x00000046c075 in build_lswitch_and_lrouter_iterate_by_lr northd/northd.c:18591
    ovn-org#10 0x000000479805 in lflow_handle_northd_lr_changes northd/northd.c:19299
    ovn-org#11 0x00000049bcc0 in lflow_northd_handler northd/en-lflow.c:158
    ovn-org#12 0x000000508895 in engine_compute lib/inc-proc-eng.c:474
    ovn-org#13 0x000000508895 in engine_run_node lib/inc-proc-eng.c:546
    ovn-org#14 0x000000508895 in engine_run lib/inc-proc-eng.c:575
    ovn-org#15 0x0000004e0cff in inc_proc_northd_run northd/inc-proc-northd.c:602
    ovn-org#16 0x0000004077b4 in main northd/ovn-northd.c:1104
    ovn-org#17 0x7fae83523574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#18 0x7fae83523627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#19 0x00000040ae74 in _start (ovn/northd/ovn-northd+0x40ae74) (BuildId: 5373e3f3823362d2c1a220082395cf8a9aae28bc)

Fixes: 9ec96d0 ("northd: Add and delete logical routers in en-lflow engine node.")
Fixes: f22a005 ("northd: Convert datapath array into vector.")
Reported-at: https://issues.redhat.com/browse/FDP-2643
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
numansiddique pushed a commit that referenced this pull request Jan 27, 2026
This allow easy bitmap resizing to avoid the following address sanitizer
crashes if we are exceeding bitmap allocated size. This issue can occurs
since IP lflow support for logical routers does not take into account
ovn_dp_group bitmap resize.

=================================================================
==52991==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7bce82430100 at pc 0x0000004eae83 bp 0x7ffd7306f980 sp 0x7ffd7306f978
READ of size 8 at 0x7bce82430100 thread T0
    #0 0x0000004eae82 in bitmap_is_set ovn/ovs/lib/bitmap.h:91
    #1 0x0000004eae82 in lflow_table_add_lflow northd/lflow-mgr.c:765
    #2 0x00000046c075 in build_adm_ctrl_flows_for_lrouter northd/northd.c:13759
    ovn-org#3 0x00000046c075 in build_lswitch_and_lrouter_iterate_by_lr northd/northd.c:18591
    ovn-org#4 0x000000479805 in lflow_handle_northd_lr_changes northd/northd.c:19299
    ovn-org#5 0x00000049bcc0 in lflow_northd_handler northd/en-lflow.c:158
    ovn-org#6 0x000000508895 in engine_compute lib/inc-proc-eng.c:474
    ovn-org#7 0x000000508895 in engine_run_node lib/inc-proc-eng.c:546
    ovn-org#8 0x000000508895 in engine_run lib/inc-proc-eng.c:575
    ovn-org#9 0x0000004e0cff in inc_proc_northd_run northd/inc-proc-northd.c:602
    ovn-org#10 0x0000004077b4 in main northd/ovn-northd.c:1104
    ovn-org#11 0x7fae83523574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#12 0x7fae83523627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#13 0x00000040ae74 in _start (ovn/northd/ovn-northd+0x40ae74) (BuildId: 5373e3f3823362d2c1a220082395cf8a9aae28bc)

0x7bce82430100 is located 0 bytes after 16-byte region [0x7bce824300f0,0x7bce82430100)
allocated by thread T0 here:
    #0 0x7fae83ee68a3 in calloc (/lib64/libasan.so.8+0xe68a3) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a)
    #1 0x00000077873e in xcalloc__ lib/util.c:125
    #2 0x00000077873e in xzalloc__ lib/util.c:135
    ovn-org#3 0x00000077873e in xzalloc lib/util.c:169
    ovn-org#4 0x0000004ea05b in bitmap_allocate ovn/ovs/lib/bitmap.h:51
    ovn-org#5 0x0000004ea05b in ovn_lflow_init northd/lflow-mgr.c:901
    ovn-org#6 0x0000004ea05b in do_ovn_lflow_add northd/lflow-mgr.c:1028
    ovn-org#7 0x0000004ea05b in lflow_table_add_lflow northd/lflow-mgr.c:730
    ovn-org#8 0x00000046c075 in build_adm_ctrl_flows_for_lrouter northd/northd.c:13759
    ovn-org#9 0x00000046c075 in build_lswitch_and_lrouter_iterate_by_lr northd/northd.c:18591
    ovn-org#10 0x000000479805 in lflow_handle_northd_lr_changes northd/northd.c:19299
    ovn-org#11 0x00000049bcc0 in lflow_northd_handler northd/en-lflow.c:158
    ovn-org#12 0x000000508895 in engine_compute lib/inc-proc-eng.c:474
    ovn-org#13 0x000000508895 in engine_run_node lib/inc-proc-eng.c:546
    ovn-org#14 0x000000508895 in engine_run lib/inc-proc-eng.c:575
    ovn-org#15 0x0000004e0cff in inc_proc_northd_run northd/inc-proc-northd.c:602
    ovn-org#16 0x0000004077b4 in main northd/ovn-northd.c:1104
    ovn-org#17 0x7fae83523574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#18 0x7fae83523627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
    ovn-org#19 0x00000040ae74 in _start (ovn/northd/ovn-northd+0x40ae74) (BuildId: 5373e3f3823362d2c1a220082395cf8a9aae28bc)

Fixes: 9ec96d0 ("northd: Add and delete logical routers in en-lflow engine node.")
Fixes: f22a005 ("northd: Convert datapath array into vector.")
Reported-at: https://issues.redhat.com/browse/FDP-2643
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants