sys/xtimer: fix xtimer_mutex_lock_timeout corner cases #6441
lebrush wants to merge 3 commits into RIOT-OS:master
Can you provide a test application testing for the cases you drew out in #6428?
Will do. Please note that #6428 is still required.
Understood it as such ;-)
Rebase is required.
sys/include/xtimer.h
 * @return 0, when returned after mutex was locked
 * @return -1, when the timeout occcured
 * @return -1, mutex can't be locked right now (timeout <= XTIMER_BACKOFF)
 * @return -2, when the timeout occcured
Can you introduce an enum for the error values? This would make it easier to change this later on.
Alternatively, use errno.h. -EINVAL and -ETIMEDOUT seem fitting.
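To make the two suggestions above concrete, here is a minimal sketch that puts an enum next to the errno.h values. The enum `lock_result_t` and the helper `lock_result_to_errno` are hypothetical names invented for illustration; they are not part of xtimer.h, and mapping "can't be locked right now" to -EINVAL only follows the suggestion in the comment above.

```c
#include <errno.h>

/* Hypothetical enum for the two error cases discussed in this review.
 * None of these names exist in RIOT's xtimer.h. */
typedef enum {
    LOCK_OK          =  0, /* returned after the mutex was locked */
    LOCK_UNAVAILABLE = -1, /* mutex can't be locked right now */
    LOCK_TIMED_OUT   = -2, /* the timeout occurred */
} lock_result_t;

/* Map the enum onto negative errno.h values, as suggested as an alternative. */
static int lock_result_to_errno(lock_result_t res)
{
    switch (res) {
        case LOCK_OK:
            return 0;
        case LOCK_UNAVAILABLE:
            return -EINVAL;
        case LOCK_TIMED_OUT:
            return -ETIMEDOUT;
    }
    /* unknown value: treat as invalid argument */
    return -EINVAL;
}
```

Either variant keeps the numeric values out of the callers, so they can be changed later in one place.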
sys/xtimer/xtimer.c
/* timeout lower than XTIMER_BACKOFF might cause the code to spin rather
 * than set it up for interrupt */
if (timeout <= XTIMER_BACKOFF) {
    return (mutex_trylock(mutex) - 1);
Let's say A has the mutex and B calls xtimer_mutex_lock_timeout() with timeout <= XTIMER_BACKOFF. Now A receives an interrupt that causes it to release the mutex. B could do its spin now and lock the mutex, right?
Yes, I've a sketch of that locally, but still not fully happy with it... Will update it asap
92f5932 to 2d8a05b
I still have to provide some tests for it, but (I hope) all corner cases are addressed in this implementation. I removed the two error return values (-1 and -2); it now returns only 0 (success) and @miri64 using sizediffs: http://pastebin.com/bn6et69K
jnohlgard left a comment
One comment on premature optimization.
sys/xtimer/xtimer.c
t.arg = (void *)((mutex_thread_t *)&mt);
_xtimer_set64(&t, timeout, timeout >> 32);
if (locked || (timeout == 0)) {
    return (locked - 1) * ETIMEDOUT;
This is an unnecessary optimization in my eyes. Make it a simple if branch and return -ETIMEDOUT if it is locked, otherwise 0.
The current code hurts readability.
I know it's ugly... however, with an if branch it generates larger code for most boards:
    if (locked) {
        return 0;
    }
    else if (timeout == 0) {
        return -ETIMEDOUT;
    }
Here is the size diff: http://pastebin.com/VVPjZQfK (+4/+8 bytes larger for most boards but -10 for avr).
Removing the -ETIMEDOUT and returning -1 instead reduces the code size by at least 4 bytes for most boards (many 10 and even 16). Size diff here: http://pastebin.com/dnG7YCJ5
    if (locked || (timeout == 0)) {
        return (locked - 1);
    }
I would prefer this solution. What do you think?
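For readers comparing the two variants, the sketch below checks that the arithmetic form and the if-branch form return the same values. It assumes `locked` is always 0 or 1; the wrapper functions and the `NOT_DONE` sentinel are hypothetical and only exist so the early-return logic can be exercised on a host machine.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical sentinel: neither early-return condition applied, so the
 * real function would continue past this point. */
#define NOT_DONE 1

/* Arithmetic variant from the diff: (1 - 1) * ETIMEDOUT == 0 when locked,
 * (0 - 1) * ETIMEDOUT == -ETIMEDOUT when not locked and timeout == 0. */
static int early_return_arith(int locked, uint64_t timeout)
{
    if (locked || (timeout == 0)) {
        return (locked - 1) * ETIMEDOUT;
    }
    return NOT_DONE;
}

/* Explicit if-branch variant requested in the review. */
static int early_return_branch(int locked, uint64_t timeout)
{
    if (locked) {
        return 0;
    }
    else if (timeout == 0) {
        return -ETIMEDOUT;
    }
    return NOT_DONE;
}
```

Both functions agree for every combination of `locked` in {0, 1} and zero or non-zero `timeout`, so the choice between them is about readability versus code size, not behavior.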
ad9ae47 to 493c6f6
Example output of the test
For some reason when running on
493c6f6 to 8501eee
8501eee to 6c5b42d
Did anyone have a look at this? @OlegHahm are all your comments addressed?
On it now.
OlegHahm left a comment
Please also provide a pexpect script for the test.
sys/include/xtimer.h
 *
 * @note this requires core_thread_flags to be enabled
 * This will try to lock a mutex. If the mutex is not available immediately or
 * until a certain amount of time (timeout) the method will return -1
I would rephrase to:
Tries to lock the mutex for a maximum timespan of @p timeout microseconds.
I think your sentence is misleading in this case: it's not clear whether the mutex will stay locked only for the duration of the timeout or whether the function will keep trying to lock it during the timeout. I will use the @p in any case :-)
sys/include/xtimer.h
 * @return 0, when returned after mutex was locked
 * @return -1, when the timeout occcured
 * @param[in] mutex mutex to lock
 * @param[in] timeout timeout in microseconds relative
s/relative//
It reads a bit weird and I think it's common sense that a timeout is specified in relative numbers.
Test application hangs for me on native, when I tried it first at
I guess that according to the last mail by @kaspar030 we can dismiss the review by @OlegHahm. @miri64 do you approve the changes?
After 5/6 pings in 8 months I dismiss the review of @OlegHahm. Can someone else please have a look? @miri64 @gebart @vincent-d ?
postponed
@kaspar030 @gebart can you maybe have a look?
Ping @kaspar030 @gebart?
@lebrush I'm really sorry for what happened here. I was not really confident in reviewing this one at the time you opened it, and this stalled for no good reason. We encountered one of the issues which are solved here and a colleague opened #10872. I've run a couple of tests and, when trying to reproduce it, I encountered another issue which is also fixed by your PR. Would you mind rebasing this so that we can merge? If you don't have time, could someone else take this over?
vincent-d left a comment
Tested on nucleo-f207zg. Please rebase.
ACK
@lebrush ping?
/* a timeout lower than XTIMER_BACKOFF causes the xtimer to spin rather
 * than to set a timer for interrupt. Hence, we shall make the mutex_lock
 * call blocking only when the interrupt didn't occur yet. */
locked = _mutex_lock(mutex, (mt.timeout == NO_TIMEOUT));
While looking at the original code with @JulianHolzwarth, we noticed the issue with this one and looked for existing references. But actually this does not solve the concurrency issue completely.
Between the evaluation of == and the actual irq_disable in the function, the callback can be triggered.
He will provide a PR for the needed change in core to evaluate a volatile condition while interrupts are disabled.
However, this might make using 'mt.timeout' impossible and may require another variable.
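The race described above can be illustrated with a small host-side sketch: the cancellation flag is read only after interrupts are masked, so a timer callback cannot fire between the check and the lock attempt. This is not the change proposed for core. `fake_irq_disable`, `fake_irq_restore`, `fake_mutex_t` and `lock_unless_cancelled` are hypothetical stand-ins, and the sketch never blocks, whereas a real implementation would put the thread to sleep when the mutex is busy.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for interrupt masking on a host build. */
static unsigned irq_depth;
static unsigned fake_irq_disable(void) { return irq_depth++; }
static void fake_irq_restore(unsigned state) { irq_depth = state; }

/* Hypothetical, simplified mutex: just a flag. */
typedef struct {
    bool locked;
} fake_mutex_t;

/* Try to take the mutex unless the timeout callback already set *cancelled.
 * The volatile flag is evaluated while interrupts are masked, which closes
 * the window between "check the condition" and "disable interrupts".
 * Returns 1 if the mutex was taken, 0 if it was cancelled or busy. */
static int lock_unless_cancelled(fake_mutex_t *m, volatile const bool *cancelled)
{
    unsigned state = fake_irq_disable();
    int taken = 0;

    if (!*cancelled && !m->locked) {
        m->locked = true;
        taken = 1;
    }

    fake_irq_restore(state);
    return taken;
}
```

Passing a pointer to a dedicated flag rather than a value computed by the caller matches the remark above that another variable than 'mt.timeout' may be needed.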
Is this still being worked on? Is this fixed? @JulianHolzwarth @kaspar030 maybe you have some insight on this.
Then let's close this.
As discussed in #6428 (comment), if the timeout given is lower than XTIMER_BACKOFF, the timer might spin instead of setting an interrupt, and the mutex lock might then block until the mutex is released rather than applying the timeout. This tries to solve these cases (and improves the documentation).