Skip to content

cpu/esp32: critcial section handling changed in FreeRTOS adaptation layer#11947

Merged
benpicco merged 4 commits intoRIOT-OS:masterfrom
gschorcht:cpu/esp32/freertos-critcial-sections
Sep 5, 2019
Merged

cpu/esp32: critcial section handling changed in FreeRTOS adaptation layer#11947
benpicco merged 4 commits intoRIOT-OS:masterfrom
gschorcht:cpu/esp32/freertos-critcial-sections

Conversation

@gschorcht
Copy link
Contributor

Contribution description

This PR changes the critical section handling of the FreeRTOS adaptation layer that is required by ESP32 SDK libraries for the WiFi interface.

Background

ESP32 SDK libraries for the WiFi interface require a number of FreeRTOS functions to work, e.g., crittical section handling, queues, mutexes and semaphores. To provide the ESP SDK libraries with these required FreeRTOS functions, an adaptation layer module in cpu/esp32/freertos realizes a mapping of these functions to RIOT's mechansims.

Critical section handling

The critical section handling of FreeRTOS for ESP32 realized by the portENTER_CRITICAL, taskENTER_CRITICAL, portEXIT_CRITICAL and taskEXIT_CRITICAL macros/functions uses mutexes. However, since these macros/functions are also called in interrupt context, the mutex based implementation of critical section handling does not work in RIOT.

This led to a number of instability problems when the WiFi interface was used (modules esp_wifi and/or esp_now), e.g., the problem with the multiheap corruption described in issue #11941. Therefore, the former mutex based implementation of critical section handling had to be changed.

Change

The mutex that is given as parameter for critical section handling macros/function is not used any longer. Instead, the basic default FreeRTOS mechanism for critical sections is realized by simply disabling interrupts. Once the interrupt is disabled by calling portENTER_CRITICAL or taskENTER_CRITICAL, the execution in critical section can't be interrupted. Since context switches for the ESP32 are also based on interrupts, there is no possibility that another thread will enter the critical section once the interrupts are disabled, even if an action within the critical section would trigger a thread_yield_higher.

I have tested the changes with ESP32 border router test configuration as described in #10929. While the CORRUPT multiheap error message as described in #11941 occured within several minutes without these changes, the configuration is working with the changes since more than one day without any error message.

Testing procedure

Setup an ESP-NOW network of two ESP32 nodes. For that purpose compile and flash two nodes with:

USEMODULE=esp_now make BOARD=esp32-wroom-32 -C examples/gnrc_networking flash

After that, connect to the nodes with a terminal program and use some commands like ifconfig and ping6 and wait. Without the changes in this PR, a CORRUPT message should come after several minutes. With the changes the same configuration should work stable.

Issues/PRs references

Fixes issue #11941.
Makes PR #11942 obsolete.

@gschorcht gschorcht added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Platform: ESP Platform: This PR/issue effects ESP-based platforms Area: cpu Area: CPU/MCU ports labels Aug 1, 2019
@gschorcht gschorcht changed the title Cpu/esp32/freertos critcial sections cpu/esp32/freertos critcial sections Aug 1, 2019
@gschorcht gschorcht changed the title cpu/esp32/freertos critcial sections cpu/esp32: critcial section handling changed in FreeRTOS adaptation layer Aug 1, 2019
@gschorcht
Copy link
Contributor Author

@JulianHolzwarth May I ask you to take look on this? This might be interesting for your work.

@JulianHolzwarth
Copy link
Contributor

JulianHolzwarth commented Aug 1, 2019

Thank you for letting me know.
It is interesting for my work and this change seems good to me.
Do you intend to change it to a spin lock in the future? (this comment: /* mutex_lock(mux); */ /* TODO should be only a spin lock */)
If not you could remove the parameter mux because it is not used and freertos does not have it in their official code (as far as I can see latest official release FreeRTOSv10.2.1/FreeRTOS/Source/tasks.c line 4220 void vTaskEnterCritical( void )in amazon github: link to line )
same for portENTER_CRITICAL.
(Maybe not this pr because changing all function calls could be a lot of work. What do you think?)

@gschorcht
Copy link
Contributor Author

Do you intend to change it to a spin lock in the future? (this comment: /* mutex_lock(mux); */ /* TODO should be only a spin lock */)

Probably not, since a spin-lock would not improve the current mechanism and could block an ISR too long. I will remove the comment with the spin lock, but I would prefer to leave the /* mutex_lock ...*/ line and the comment above there for documentation purpose.

If not you could remove the parameter mux because it is not used and freertos does not have it in their official code (as far as I can see latest official release FreeRTOSv10.2.1/FreeRTOS/Source/tasks.c line 4220 void vTaskEnterCritical( void )in amazon github: link to line )
same for portENTER_CRITICAL.

I know, FreeRTOS doesn't define the mutex parameter. Unfortunatly, the ESP32 port of FreeRTOS does. And even more, SDK headers and SDK libraries require this interface. Therefore, I would leave this parameter there for the moment, even it is not used. When you have finished your FreeRTOS adaptation layer, we have to think about it again.

@gschorcht gschorcht added State: WIP State: The PR is still work-in-progress and its code is not in its final presentable form yet and removed State: WIP State: The PR is still work-in-progress and its code is not in its final presentable form yet labels Aug 5, 2019
@JulianHolzwarth
Copy link
Contributor

Ok. Thank you.

@gschorcht gschorcht force-pushed the cpu/esp32/freertos-critcial-sections branch from 4be4260 to 146ad38 Compare August 12, 2019 09:22
Using a mutex for critical section handling with portENTER_CRITICAL and portEXIT_CRITICAL does not work for RIOT, as this function can also be called in the interrupt context. Therefore, the given mutex is not used. Instead, the basic default FreeRTOS mechanism for critical sections is used by simply disabling interrupts. Since context switches for the ESP32 are also based on interrupts, there is no possibility that another thread will enter the critical section once the interrupts are disabled.
@gschorcht gschorcht force-pushed the cpu/esp32/freertos-critcial-sections branch from 146ad38 to 8c5de9b Compare August 12, 2019 14:52
@benpicco benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Sep 4, 2019
Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vTaskEnterCritical() / vTaskExitCritical() are now just wrappers around irq_disable()/irq_restore().

Tested together with #11997 and ping -f.

@gschorcht gschorcht added this to the Release 2019.10 milestone Sep 5, 2019
@gschorcht
Copy link
Contributor Author

@benpicco All lights are green 😄

@benpicco benpicco merged commit dbd97b7 into RIOT-OS:master Sep 5, 2019
@gschorcht
Copy link
Contributor Author

@benpicco Thanks again for reviewing, testing and merging 😄

@gschorcht gschorcht deleted the cpu/esp32/freertos-critcial-sections branch September 5, 2019 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: cpu Area: CPU/MCU ports CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: ESP Platform: This PR/issue effects ESP-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants