Summary
When two Pico W boards communicate over UDP, one as AP and one as STA, with both sides transmitting in a ping-pong manner, the wireless link degrades after a period of sustained traffic and the CYW43 firmware eventually enters an unrecoverable state.
Environment
- SDK version: 2.2.0
- Board: Raspberry Pi Pico W (both units); also tested with a custom PCB using the RM2 module
- Mode: One Pico W as AP (cyw43_arch_enable_ap_mode), one as STA (cyw43_arch_enable_sta_mode)
- Architecture: pico_cyw43_arch_poll (explicit poll mode)
- Power management: CYW43_NONE_PM set on both sides at the appropriate respective times
- Protocol: UDP, fixed-size packets, fixed send interval
Steps to Reproduce
- Clone my example repo and build it (the build produces both the AP and the Beacon firmware): https://github.com/mitchellcairns/udp-range-issue-pico-w
- Place both Pico W devices about 6 feet apart. It doesn't seem to matter if there's direct line of sight or not. Honestly at this distance it shouldn't matter. It doesn't for a bit, and then it suddenly does!
- Flash both Pico W devices with the sample firmwares.
- Open both devices in a serial monitor to see when the service degrades (you'll see the ping-pong timeout).
To reproduce faster, set the send interval to 1-2 ms.
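The ping-pong timeout mentioned in the last step can be counted with very little state. The following is a minimal host-side sketch, not the code from the example repo: the `pp_tracker_t` type, the `pp_*` functions and the timeout value are all hypothetical names and numbers chosen for illustration, and timestamps are assumed to come from a free-running 32-bit millisecond counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tracker; none of these names come from the example repo. */
typedef struct {
    uint32_t timeout_ms;     /* how long to wait for a reply */
    uint32_t sent_at_ms;     /* timestamp of the outstanding ping */
    bool     awaiting_reply; /* true while a ping is unanswered */
    uint32_t replies;        /* pings answered in time */
    uint32_t timeouts;       /* pings not answered in time */
} pp_tracker_t;

void pp_init(pp_tracker_t *t, uint32_t timeout_ms) {
    assert(t != NULL);
    t->timeout_ms = timeout_ms;
    t->sent_at_ms = 0;
    t->awaiting_reply = false;
    t->replies = 0;
    t->timeouts = 0;
}

/* Record that a ping was just sent. */
void pp_on_send(pp_tracker_t *t, uint32_t now_ms) {
    t->sent_at_ms = now_ms;
    t->awaiting_reply = true;
}

/* Call periodically. Returns true if a new timeout was recorded.
 * Unsigned subtraction keeps this correct across counter wraparound. */
bool pp_poll(pp_tracker_t *t, uint32_t now_ms) {
    if (!t->awaiting_reply) return false;
    if ((uint32_t)(now_ms - t->sent_at_ms) < t->timeout_ms) return false;
    t->awaiting_reply = false;
    t->timeouts++;
    return true;
}

/* Call when a reply arrives. Returns true if it arrived in time. */
bool pp_on_reply(pp_tracker_t *t, uint32_t now_ms) {
    if (!t->awaiting_reply) return false; /* late or duplicate reply */
    if (pp_poll(t, now_ms)) return false; /* arrived after the deadline */
    t->awaiting_reply = false;
    t->replies++;
    return true;
}

/* Share of completed exchanges that timed out, as a whole percentage. */
uint32_t pp_loss_percent(const pp_tracker_t *t) {
    uint32_t total = t->replies + t->timeouts;
    return total == 0 ? 0 : (t->timeouts * 100u) / total;
}
```

Logging `pp_loss_percent` every few seconds on both boards would give a timestamped view of when the degradation starts, which is easier to compare across runs than watching individual timeout messages.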
Expected Behaviour
The signal strength is maintained from boot for the entire duration of a single session.
Actual Behaviour
After a period of sustained bidirectional traffic (anywhere from 5 to 10 minutes at 16 ms intervals, sooner at higher rates), the quality of the wireless link severely degrades.
If the beacon reboots, the AP still exhibits the same reduced range. When I reboot the AP and get the beacon to reconnect, the connection strength seems to be restored, which makes me think the issue is on the AP side, in either the firmware or the SDK.
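For scale, the 5-10 minute window at a 16 ms interval works out to roughly 18,750 to 37,500 sends per side before the link degrades. The helper below is only a sketch of that arithmetic; `sends_in` is a hypothetical name, and the calculation assumes every send slot is actually used.

```c
#include <stdint.h>

/* Number of fixed-interval send slots that fit into `minutes`.
 * Hypothetical helper for back-of-the-envelope estimates only. */
uint32_t sends_in(uint32_t minutes, uint32_t interval_ms) {
    if (interval_ms == 0) return 0; /* avoid division by zero */
    return (minutes * 60u * 1000u) / interval_ms;
}
```

Comparing these counts with how quickly the failure appears at 1-2 ms intervals might help tell whether the trigger tracks packet count or elapsed time, though that remains untested.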
Key Observations
1. pbuf pools are not exhausted
lwIP memory stats (LWIP_STATS=1) were monitored throughout. pbuf_pool used/avail remained stable and did not climb toward exhaustion before or during the stall. This rules out an lwIP-level pbuf leak as the cause.
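The "stable, not climbing" check can be made mechanical. This is a minimal host-side sketch under stated assumptions: `pool_sample_t` and the `pool_*` functions are hypothetical names, and the samples are plain used/avail pairs. On the target they would presumably be read from lwIP's memp statistics for the pbuf pool (I believe `lwip_stats.memp[MEMP_PBUF_POOL]` when `MEMP_STATS` is enabled, but treat that as an assumption); that read is left out here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One periodic reading of the pbuf pool counters (hypothetical type). */
typedef struct {
    uint16_t used;  /* pbufs currently allocated from the pool */
    uint16_t avail; /* total pbufs in the pool */
} pool_sample_t;

/* Pool usage of one sample as a whole percentage. */
uint32_t sample_percent(pool_sample_t s) {
    if (s.avail == 0) return 100; /* treat an empty pool as fully used */
    return ((uint32_t)s.used * 100u) / s.avail;
}

/* Highest usage percentage seen across all samples. */
uint32_t pool_peak_percent(const pool_sample_t *samples, size_t n) {
    uint32_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t p = sample_percent(samples[i]);
        if (p > peak) peak = p;
    }
    return peak;
}

/* True if `used` never drops between samples and ends above where it began,
 * which is the pattern a leak would be expected to produce. */
bool pool_is_climbing(const pool_sample_t *samples, size_t n) {
    if (n < 2) return false;
    for (size_t i = 1; i < n; i++) {
        if (samples[i].used < samples[i - 1].used) return false;
    }
    return samples[n - 1].used > samples[0].used;
}

/* Stable means: peak usage stayed below the limit and there is no steady climb. */
bool pool_looks_stable(const pool_sample_t *samples, size_t n,
                       uint32_t limit_percent) {
    return pool_peak_percent(samples, n) < limit_percent &&
           !pool_is_climbing(samples, n);
}
```

The monotonic-climb test is deliberately strict; a real leak mixed with normal churn would need a smoothed trend instead, so this only illustrates the kind of check described above.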
2. Eventually there is a full failure
If the connection is kept running for 20-30 minutes, it eventually fails completely. No matter how close the devices are, the connection is unable to recover.
3. Time is the enemy...
In a previous issue it seemed like only some quick testing was done. At a quick glance the issue is not obvious, until it suddenly is and the wireless link becomes unusable. This is not a problem that can be caught in a 30-second test, but it shows up consistently over longer runs.
Relevant Issues
I'm making this a new issue because while the other issues describe one aspect of the failure, I believe after a lot of troubleshooting that this is a QoS issue and not necessarily a throughput issue. It could still be related to throughput somehow, if something is getting stuck in the CYW43 module or there's a bug there that I cannot see.
With my test code I do not get any UDP stall errors or other driver errors.