pkg/tlsf: Early initialization of memory pool. #12032
pkg/tlsf: Early initialization of memory pool. #12032jcarrano wants to merge 2 commits intoRIOT-OS:masterfrom
Conversation
861453a to
95e73a7
Compare
|
@haukepetersen @benpicco can you have a look? |
benpicco
left a comment
There was a problem hiding this comment.
A quick grep through the newlib source code did not reveal any uses of sbrk outside nano-mallocr.c/mallocr.c/sbrkr.c, so this should be fine.
| extern "C" { | ||
| #endif | ||
|
|
||
| #define ROUND_DOWN4(x) (((x)/4)*4) /* Is this necessary??? */ |
|
This needs a rebase now. |
This test is currently failing because of RIOT-OS#4490, RIOT-OS#5796 and RIOT-OS#12021. When using TLSF as the system allocator it should be initialized - Automatically, as that is what the user expects. - Early in the boot process, since the C library mallocs internal buffers. Failing to do so will lead to a crash as the issues and this test shows. The test is blacklisted and will be whitelisted in the next commit with the fix.
The TLSF allocator needs to be initialized before use. This is an issue when it is used as a default system allocator since the user expects to be able to call malloc right away. This is made worse by the fact that the C library uses malloc internally to create buffers and that may happen before the user's code has a chance to run. As a consequence, even doing printf when using USEMODULE=tlsf-malloc will lead to a crash. A mechanism is needed to: 1. Initialize the pool early. 2. Determine which memory should be used as a heap and reserve it. Issue (1) is solved by adding the initializer to the C library's `.preinit_array`, which is a cross-file array of function pointers that run before the library is initialized -that is before _init(). See the newlib source code for more details. Point (2) is important because TLSF dows not support growing the pool, only adding new ones. We would like to initialize it with a pool as big as possible. In native (2) is handled by defining a static array of fixed size (given by TLSF_NATIVE_HEAPSIZE). Memory is plentiful in native and we down't care about the overhead of zeroing out this array. On embedded targets using newlib (this may be working on other plaforms, I only tested ARM) `sbrk()` is used to find the start of the heap and reserve it and the `_eheap` linker symbol is used to determine the end of the usable heap. An array is a bad choice here because the size would be board dependent and hard to determine without build-system magic and because it would be zeroed by default, making the boot sequence way longer. sbrk() does nothing more than move a pointer that marks the fraction of the space between _sheap and _eheap that is reserved. Since we are using the whole heap it might be tempting to just use the symbols to derive the pool location and size and to sidestep sbrk(). Especially since the memory allocation functions are expected to be the only users of such a feature. That "trick" would make the OS impossible to debug in case the was a mistake and some of the original allocation functions slipped through non-overriden. If sbrk is used to reserve the entirety of the space then that rogue function will try to call it and fail as no more heap is available. In fact this is how I found out that I was overriding the wrong functions (put a breakpoint int sbrk and show a traceback.) If sbrk is sidestepped one would have nasty and impossible to debug memory corruption errors. A third option could be to use the heap space directly and not define sbrk. This is beyond the scope of this change, but is probably the route to go for platform that do not define this call (but first do a thoroug investigation of how the libc works in that platform). Messing with the global system allocator is not an easy thing to do. I would say that tslf-malloc is ATM _only_ supported in native and cortex-m. Testing procedure: Run `tests/pkg_tlsf_malloc`. Fixes: RIOT-OS#4490, RIOT-OS#5796. Closes: RIOT-OS#12021
95e73a7 to
6945bba
Compare
|
@benpicco done. |
|
Murdock isn't very happy yet. |
|
What is happening with this, I don't know if @jcarrano is still active. Should we close or does someone want to take this over? |
|
@MrKevinWeiss sorry fo the late response. I'm not active. I'd gladly help, but I have no time. Even if this is not fixed right away, the existence of this issue in itself is good so that when people try to use TLSF and experience issues they can find this writeup. |
|
Good to hear from you @jcarrano, will leave it open until something properly resolves it. |
Contribution description
The TLSF allocator needs to be initialized before use. This is an issue whenit is used as a default system allocator since the user expects to be able to call malloc right away. This is made worse by the fact that the C library uses malloc internally to create buffers and that may happen before the user's code has a chance to run.
As a consequence, even doing printf when using USEMODULE=tlsf-malloc will lead to a crash.
A mechanism is needed to:
Initialization
Issue (1) is solved by adding the initializer to the C library's
.preinit_array, which is a cross-file array of function pointers that run before the library is initialized -that is before _init(). See the newlib source code for more details.Getting the heap and its size
Point (2) is important because TLSF dows not support growing the pool, only adding new ones. We would like to initialize it with a pool as big as possible.
Native
In native (2) is handled by defining a static array of fixed size (given by TLSF_NATIVE_HEAPSIZE). Memory is plentiful in native and we down't care about the overhead of zeroing out this array.
Newlib
On embedded targets using newlib (this may be working on other plaforms, I only tested ARM)
sbrk()is used to find the start of the heap and reserve it and the_eheaplinker symbol is used to determine the end of the usable heap.An array is a bad choice here because the size would be board dependent and hard to determine without build-system magic and because it would be zeroed by default, making the boot sequence way longer.
sbrk() does nothing more than move a pointer that marks the fraction of the space between _sheap and _eheap that is reserved. Since we are using the whole heap it might be tempting to just use the symbols to derive the pool location and size and to sidestep sbrk(). Especially since the memory allocation functions are expected to be the only users of such a feature. That "trick" would make the OS impossible to debug in case the was a mistake and some of the original allocation functions slipped through non-overriden. If sbrk is used to reserve the entirety of the space then that rogue function will try to call it and fail as no more heap is available. In fact this is how I found out that I was overriding the wrong functions (put a breakpoint int sbrk and show a traceback.) If sbrk is sidestepped one would have nasty and impossible to debug memory corruption errors.
Other
A third option could be to use the heap space directly and not define sbrk. This is beyond the scope of this change, but is probably the route to go for platform that do not define this call (but first do a thoroug investigation of how the libc works in that platform).
Final notes
Messing with the global system allocator is not an easy thing to do. I would say that tslf-malloc is ATM only supported in native and cortex-m.
Testing procedure
Run
tests/pkg_tlsf_malloc.Issues/PRs references
Fixes: #4490
Fixes #5796
Closes: #12021
Depends on: #12031