xtimer: fix mutex unlocking in _mutex_timeout #6428
sys/xtimer/xtimer.c
Outdated
    mt->timeout = 1;
    sched_set_status(mt->thread, STATUS_PENDING);
    list_remove(&mt->mutex->queue, (list_node_t *)&mt->thread->rq_entry);
    if (mt->mutex->queue.next == NULL) {
Really good catch! 👍
I overlooked that MUTEX_LOCKED is not always the tip of the list! It is only when the mutex is owned by one thread. If there are more threads waiting, the tip is NULL. I guess this is done to be able to use thread_add_to_list cleanly.
There's still one race condition: if the timer fires between setting the timer and locking the mutex, it might create inconsistencies in the mutex queue. Just test that list_remove really removed a node:
list_node_t* node = list_remove(...)
if (node != NULL && mt->mutex->queue.next == NULL) {
// we were the last node in the queue but the mutex shall still be owned by someone else
mt->mutex->queue.next = MUTEX_LOCKED;
}
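To make the suggested guard concrete, here is a minimal standalone sketch of it. It is not the RIOT code: `node_t`, `sketch_list_remove`, `sketch_timeout_remove` and `SKETCH_LOCKED` are hypothetical stand-ins for `list_node_t`, `list_remove`, the timeout callback and `MUTEX_LOCKED`, and the marker is modelled as the address of a static node so that the list can be walked safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a singly linked list node. */
typedef struct node {
    struct node *next;
} node_t;

/* Assumption: the "locked, nobody waiting" marker is the address of a
 * static node whose next pointer is NULL. */
static node_t locked_marker;
#define SKETCH_LOCKED (&locked_marker)

/* Unlink `node` from `list`; return it if it was found, NULL otherwise. */
static node_t *sketch_list_remove(node_t *list, node_t *node)
{
    assert((list != NULL) && (node != NULL));
    while (list->next != NULL) {
        if (list->next == node) {
            list->next = node->next;
            return node;
        }
        list = list->next;
    }
    return NULL;
}

/* Guarded removal: restore the marker only if we really were queued and
 * the queue became empty. Returns 1 if the waiter was removed, else 0. */
static int sketch_timeout_remove(node_t *queue, node_t *waiter)
{
    node_t *node = sketch_list_remove(queue, waiter);

    if ((node != NULL) && (queue->next == NULL)) {
        /* we were the last waiter, but someone else still owns the mutex */
        queue->next = SKETCH_LOCKED;
    }
    return node != NULL;
}
```

If the timer fires before the waiter was ever queued, the removal returns NULL and the queue head is left alone, so an unlocked mutex is not marked as locked by mistake.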
sys/xtimer/xtimer.c
Outdated
    mt->timeout = 1;
    list_node_t *node = list_remove(&mt->mutex->queue,
                                    (list_node_t *)&mt->thread->rq_entry);
    if (node != NULL && mt->mutex->queue.next == NULL) {
Please put each condition into parentheses.
b9fd4b7 to eacffdf
@OlegHahm comment addressed and squashed
Please wait before merging this one
@lebrush, waiting for what?
Sorry for the delay. I wanted to make sure that all bugs here are addressed, since I introduced the feature and the bugs :-) After reviewing the code once more I made a table with the following corner cases: what happens if the timer fires before calling
This patch solves the case
The implementation of xtimer might decide to spin if the timeout is too small. Hence we need to solve the if (timeout > XTIMER_BACKOFF) With these two fixes then we are:
The last consideration I could think of is the case that the timer triggers between the return of I think it is necessary to include a fix for the
TL;DR: I would merge this one and address the other cases elsewhere, e.g. in #6441
What was the initial reason not to call
Thread A has the mutex, Thread B waits for it. Thread C sets a mutex lock with timeout. If the callback occurs before A has released the mutex and you call
Ah, I think I misunderstood
Good point. Will improve the docs in #6441 |
    list_node_t *node = list_remove(&mt->mutex->queue,
                                    (list_node_t *)&mt->thread->rq_entry);
    if ((node != NULL) && (mt->mutex->queue.next == NULL)) {
        mt->mutex->queue.next = MUTEX_LOCKED;
This has to be MUTEX_LOCKED because we assume that someone still owns the mutex, right?
Exactly.
When the mutex is unlocked, queue.next is NULL: mutex -> NULL
When the mutex is locked by 1 thread, queue.next is MUTEX_LOCKED: mutex -> LOCKED
When 2+ threads are involved (one owner plus waiters), queue.next is the mt->thread->rq_entry of a thread and the last .next is NULL: mutex -> thread -> (thread -> ...) NULL
If node != NULL, we were removed from the list, hence 2+ threads were involved. If, at the same time, queue.next is NULL, we were the last thread in the list and we need to set queue.next to MUTEX_LOCKED, as there is still 1 thread owning the mutex.
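The three queue states listed above can be told apart by looking only at queue.next. The following is a small standalone sketch of that classification, not code from RIOT: `node_t`, `sketch_state_t`, `sketch_state` and `SKETCH_LOCKED` are hypothetical names, and the marker is again modelled as the address of a static node rather than whatever value `MUTEX_LOCKED` really has.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a singly linked list node. */
typedef struct node {
    struct node *next;
} node_t;

/* Assumption: marker for "owned, nobody waiting". */
static node_t locked_marker;
#define SKETCH_LOCKED (&locked_marker)

typedef enum {
    SKETCH_UNLOCKED,            /* mutex -> NULL */
    SKETCH_OWNED_NO_WAITERS,    /* mutex -> LOCKED */
    SKETCH_OWNED_WITH_WAITERS   /* mutex -> thread -> ... -> NULL */
} sketch_state_t;

/* Classify the mutex by inspecting the head of its wait queue. */
static sketch_state_t sketch_state(const node_t *queue)
{
    assert(queue != NULL);
    if (queue->next == NULL) {
        return SKETCH_UNLOCKED;
    }
    if (queue->next == SKETCH_LOCKED) {
        return SKETCH_OWNED_NO_WAITERS;
    }
    return SKETCH_OWNED_WITH_WAITERS;
}
```

Seen this way, removing the last waiter without restoring the marker would move the mutex from "owned with waiters" straight to "unlocked", although the owner has not released it.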
Yes, otherwise you could have locked it and then the timeout would never fire.
Changes proposed in #6427.