Single-Hart O(1) Enhancement #23
base: main
Conversation
Validate with proper statistics.
include/sys/task.h
    /* Scheduler attribution */
    typedef struct sched {
        volatile uint32_t ready_bitmap; /* 8-bit priority bitmap */
Why is this variable set to volatile?
An ISR might modify this value, so I added volatile to ensure the bitmap is always read directly from memory.
I'm a bit doubtful whether this is the right approach, since it sounds like you're using volatile to handle a synchronization problem. Even with the volatile keyword, accesses to the variable are not guaranteed to be atomic, so a race condition can still occur. Should we reconsider how synchronization is handled here? Maybe something like CRITICAL_ENTER/LEAVE() would be more appropriate.
Reference: https://docs.kernel.org/process/volatile-considered-harmful.html
I appreciate your feedback and agree with your point of view. CRITICAL_ENTER()/CRITICAL_LEAVE() is a better approach to protect the bitmap's correctness. I'll make sure the bitmap is only manipulated inside a critical section.
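For example, a minimal sketch of what I mean (the helper name sched_mark_ready() is illustrative, not part of the actual patch):

    /* Illustrative sketch: protect the read-modify-write of ready_bitmap with
     * the kernel's critical section instead of relying on volatile. */
    static void sched_mark_ready(sched_t *sched, uint8_t prio)
    {
        CRITICAL_ENTER();                     /* block ISRs / other writers */
        sched->ready_bitmap |= (1U << prio);  /* safe read-modify-write */
        CRITICAL_LEAVE();
    }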
        CRITICAL_LEAVE();
        return ERR_TASK_CANT_SUSPEND;
    }
Please avoid adding or removing blank lines unnecessarily.
Please rebase your branch to keep the git history clean.
    /* Scheduler attribution */
    typedef struct sched {
        uint32_t ready_bitmap; /* 8-bit priority bitmap */
Hmm, the comment says this is an 8-bit bitmap, but it's declared as uint32_t?
Will there be more priority levels in the future? If not, I'll modify this declaration.
I think if it's necessary in the future, we can easily come back and modify this.
Regarding your question in the PR description about the use of sched_t and the changes to kcb_t: I'm not sure why you split kcb_t and sched_t into two separate structures, and I don't see an explanation for this in the commit message. From my perspective, both seem to store data needed by the scheduler, though perhaps you had other reasons or considerations in mind?
Hi @visitorckw, the reason I split the data structure is described in the commit message below:
Introduce sched_t structure to encapsulate O(1) scheduler state. sched_t maintains ready_queues separated by priority, enabling task selection in O(1) complexity. kcb now contains a sched_t instance, allowing per-hart scheduler control and future SMP extensions.
New tasks created by `mo_task_spawn()` are enqueued into their priority's ready queue. The very first task is special-cased: it becomes `task_current` and its state is set to `TASK_RUNNING` (it is not enqueued). `sched_select_next_task()` first returns the running task to its ready queue, then finds the highest non-empty priority by scanning the `ready_bitmap`, pops one node from that queue, and assigns it to `task_current`. The corresponding priority bit in the bitmap is cleared when its queue becomes empty. This reduces selection from O(n) to O(1) while keeping the existing function calls.
`sched_enqueue_task()`/`sched_dequeue_task()` add or remove a task from the corresponding priority queue and check whether that queue is empty afterwards to update the bitmap. The bitmap macros used here are kept simple and can be revised in further development. The enqueue and dequeue steps in `mo_task_spawn()`/`sched_select_next_task()` now all go through `sched_enqueue_task()` and `sched_dequeue_task()`.
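To make the description concrete, here is a rough sketch of the enqueue/dequeue and selection paths, assuming a `ready_queues[]` array indexed by priority, a `task_current` field in `sched_t`, and generic list helpers (`list_push_back()`, `list_pop_front()`, `list_remove()`, `list_is_empty()`). All of these names, and the bit-0-is-highest-priority convention, are illustrative rather than the exact code in the patch:

    /* Sketch only: O(1) selection via a per-priority ready queue array plus a
     * ready_bitmap whose set bits mark non-empty queues. */
    static void sched_enqueue_task(sched_t *sched, tcb_t *task)
    {
        list_push_back(&sched->ready_queues[task->prio], &task->node);
        sched->ready_bitmap |= (1U << task->prio);       /* queue non-empty */
    }

    static void sched_dequeue_task(sched_t *sched, tcb_t *task)
    {
        list_remove(&sched->ready_queues[task->prio], &task->node);
        if (list_is_empty(&sched->ready_queues[task->prio]))
            sched->ready_bitmap &= ~(1U << task->prio);  /* last one left */
    }

    static tcb_t *sched_select_next_task(sched_t *sched)
    {
        /* Return the running task to its ready queue first. */
        sched->task_current->state = TASK_READY;
        sched_enqueue_task(sched, sched->task_current);

        /* One possible convention: bit 0 = highest priority. */
        uint8_t top = (uint8_t) __builtin_ctz(sched->ready_bitmap);
        tcb_t *next = list_pop_front(&sched->ready_queues[top]);
        if (list_is_empty(&sched->ready_queues[top]))
            sched->ready_bitmap &= ~(1U << top);         /* queue drained */

        next->state = TASK_RUNNING;
        sched->task_current = next;
        return next;
    }

The bit-clear logic appears in both the dequeue helper and the selection path only to mirror the description above; it could of course live in one place.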
I don't have a strong opinion on the implementation, but I have a question about the design. It appears there might be some redundancy. sched_t stores a per-hart ready_queue, while kcb_t retains the tasks list and task_current. When I looked for an explanation for this separation, I couldn't find one. Could we clarify the design? It seems we should clearly define what data is per-hart and belongs in sched_t, versus what data is shared among all harts and should be in kcb_t.
This design also raises a question about task affinity. If we add a task to a specific hart's queue, is the intention for it to be bound to that hart, or is there a mechanism for tasks to be rebalanced across different hart queues later?
Hi @visitorckw, thanks for your feedback; let me clarify my original design.
Taking one hart as an example, the diagram attached to this comment illustrates my design more clearly.
But since we expect to modify current_task in kcb_t into an array where each hart has its own entry, would it be more like per-hart information that belongs in sched_t? I do see that you have indeed added a current_task field in sched_t, but what are the respective roles of it and task_current in kcb_t?
Yes, it does raise a performance issue. I plan to bind tasks to each hart. For the rebalancing problem across different harts, I think I will use an IPI mechanism to rearrange which hart a task belongs to, but the implementation details still need to be discussed. Currently, I'm focused on designing an experiment to prove that the new design works and outperforms the current O(n) round-robin scheduler. If you think my idea is achievable, I'll keep working on validation and revise the code based on what we discussed previously.
You are correct: I tried to make the context switch per-hart, which means I wouldn't access kcb during a per-hart context switch. But I found that your design in #6 is more backward compatible and more efficient, so I'll modify mine accordingly.
I'm still not sure if it's necessary to maintain separate task lists in both kcb_t and sched_t. Consider an API like find_task_by_id(), which needs to iterate over every task across all harts. To do this safely, we must prevent any hart from modifying its ready queue during the iteration to avoid race conditions. This means we would need a mechanism to lock all queues anyway. If that's the case, we can simply iterate through each per-hart list from sched_t without needing the global list in kcb_t, like this:

    for (i = 0; i < hart_nums; i++) {
        list_for_each(...) {
            if (node->id == id) {
                return node;
            }
        }
    }

This approach seems to work. Am I missing a more complex use case that justifies keeping the global task list in kcb_t?
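To spell out the locking idea, a hypothetical find_task_by_id() guarded by the kernel's critical section could look roughly like this; hart_nums, the per-hart sched[] array, task_list, and the list iteration helpers are all illustrative names rather than the current API:

    /* Hypothetical sketch: lock out all queue mutation, then walk each hart's
     * own task list from sched_t; no global list in kcb_t is needed. */
    static tcb_t *find_task_by_id(uint16_t id)
    {
        tcb_t *found = NULL;
        CRITICAL_ENTER();                          /* freeze all queues */
        for (uint32_t i = 0; i < hart_nums && !found; i++) {
            list_node_t *node;
            list_for_each (node, &sched[i].task_list) {
                tcb_t *task = list_entry(node, tcb_t, node);
                if (task->id == id) {
                    found = task;
                    break;
                }
            }
        }
        CRITICAL_LEAVE();
        return found;
    }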
I guess you mean the list in kcb_t->task_list, i.e. the redundant one to be split into local_hart_task_list:

    sched_t {
        list local_hart_task_list;   // Originally maintained by kcb->task_list,
                                     // now split into a per-hart list
        list local_hart_ready_queue; // Ready queue holding tasks in the
                                     // TASK_RUNNING and TASK_READY states
        ...
    }
I'm just thinking that each task should only appear in one list at a time, and never be duplicated across two different lists. For running and ready tasks, using a per-hart data structure seems appropriate, as it would be efficient for the scheduler on each hart. I'm less certain about tasks in other states. It's unclear to me whether placing them in a single global list or in per-hart lists would be more suitable. This decision likely depends on how we intend to handle task affinity.
Yes, but I think there still needs to be another list to handle the remaining tasks in other states, and the ready queue needs external nodes to link the tasks whose state is READY or RUNNING. I would also suggest keeping the task list sorted by task id. To solve task affinity, I think maybe we can add a way to specify which hart a task must run on.
I thought a running task would not appear in the ready queue, but instead each hart would keep track of its own currently running task node. Sorry if I misunderstood your description — could you clarify or correct my misunderstanding?
I am not necessarily against this, but I don't see why we need to keep the task id in sorted order or what benefit we gain from it. Do we have any feature that depends on maintaining this ordering?
No, I didn't mean adding a feature that lets the user specify which hart a task must run on — that should be relatively easy. What I meant is that the scheduler decides when a task will be placed onto a particular hart's ready queue, which determines which hart it will run on.
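Purely as an illustration of that idea, placement could be an enqueue-time decision made by the scheduler, for example picking the least-loaded hart; sched_place_task(), ready_count and hart_nums below are hypothetical names:

    /* Hypothetical sketch: the scheduler, not the user, chooses the target hart
     * when a task becomes ready, e.g. the hart with the shortest ready queue. */
    static void sched_place_task(tcb_t *task)
    {
        uint32_t target = 0;
        for (uint32_t i = 1; i < hart_nums; i++) {
            if (sched[i].ready_count < sched[target].ready_count)
                target = i;                       /* simplest load metric */
        }
        sched_enqueue_task(&sched[target], task); /* binds it to 'target' */
    }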
Once the RUNNING task needs to go back to READY, it must go through the "enqueue" function to ensure it returns to the tail of the ready queue. This is a redundant and cycle-costly process because it relies on list_push_back. To solve this issue, a pointer traces the current RUNNING task inside the ready queue, and this pointer simply advances to the next node using the list API.
When a task needs to be deleted, I think each hart can just remove it from its local ready queue without any further operation, if the task list is held globally.
I don't think this is the reason why we must keep the running task in the ready queue. If we are really concerned about the efficiency of list_push_back, we can simply add a pointer to keep track of both ends of the list to make the insertion operation have an O(1) time complexity.
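Just to illustrate the idea with a simplified singly-linked list (not the project's actual list implementation), keeping a tail pointer makes the append O(1):

    /* Minimal sketch: with head and tail pointers, appending to the ready
     * queue needs no traversal, so re-enqueueing the running task is cheap. */
    typedef struct list_node {
        struct list_node *next;
    } list_node_t;

    typedef struct {
        list_node_t *head;
        list_node_t *tail;
    } list_t;

    static void list_push_back(list_t *list, list_node_t *node)
    {
        node->next = NULL;
        if (list->tail)
            list->tail->next = node;  /* append after the current tail */
        else
            list->head = node;        /* list was empty */
        list->tail = node;            /* O(1): no walk from the head */
    }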
Still confused. I still don't see the reason why the list needs to be kept sorted. At the same time, I'm also not quite clear on why this approach allows us to delete a task by only removing it from each hart's queue. |
Thanks for your suggestion, I'll follow this idea.
Yes, currently there is no feature that depends on whether the task list is sorted; I was just trying to keep the structure as close as possible to the current version and focus on the single-hart scheduler case. Maybe we can run experiments later to compare the performance of holding tasks per hart versus globally.
Hi @jserv,
This is still a draft. I've implemented O(1) task selection using a priority bitmap and per-priority queues. Key design points:
- mo_task_spawn() enqueues new tasks, except the very first one, which is assigned to task_current with state TASK_RUNNING.
- sched_select_next_task() reinserts the running task into its ready queue, then scans ready_bitmap to select the highest-priority ready queue in O(1).
Could you confirm if this approach aligns with the project's design expectations? In particular, I'd like feedback on:
- The sched_t data structure.
- Whether sched_t or kcb would be preferred to better support the future SMP design.