
Conversation

@amken3d commented Nov 16, 2025

This PR proposes:

  • Support for CPU core pinning and affinity for tasks and goroutines.
  • A scheduler update that respects affinity constraints, with separate queues for pinned and shared tasks.
  • New runtime API functions: LockToCore, UnlockFromCore, GetAffinity, and CurrentCPU.
  • An example program demonstrating core pinning and unpinned execution behavior.

API Functions

runtime.NumCPU() int

Returns the number of CPU cores available (returns 2 on RP2040/RP2350).

runtime.CurrentCPU() int

Returns the current CPU core number (0 or 1).

runtime.LockToCore(core int)

Pins the current goroutine to the specified core:

  • core = 0 - Pin to core 0
  • core = 1 - Pin to core 1
  • core = -1 - Unpin (allow running on any core)

Panics if core is invalid (not -1, 0, or 1).

runtime.UnlockFromCore()

Unpins the current goroutine, allowing it to run on any core.
Equivalent to runtime.LockToCore(-1).

runtime.GetAffinity() int

Returns the current goroutine's CPU affinity:

  • Returns -1 if not pinned (can run on any core)
  • Returns 0 or 1 if pinned to that specific core
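
Putting the API together, here is a minimal usage sketch (illustrative only, not the shipped example program; it assumes the functions behave exactly as documented above):

package main

import (
	"runtime"
	"time"
)

func main() {
	println("cores:", runtime.NumCPU()) // 2 on RP2040/RP2350

	runtime.LockToCore(0)                            // pin main to core 0
	println("main affinity:", runtime.GetAffinity()) // prints 0

	done := make(chan struct{})
	go func() {
		runtime.LockToCore(1) // pin this worker to core 1
		for i := 0; i < 5; i++ {
			println("worker", i, "on CPU", runtime.CurrentCPU())
			time.Sleep(100 * time.Millisecond)
		}
		close(done)
	}()

	<-done
	runtime.UnlockFromCore()                         // same as runtime.LockToCore(-1)
	println("main affinity:", runtime.GetAffinity()) // prints -1
}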

An example program is included in the examples directory.

  • Tested on both pico and pico2

  • Output of the example program:

=== Core Pinning Example ===                                             
Number of CPU cores: 2                                                   
Main starting on core: 0                                                 
                                                                         
Main pinned to core: 0

Core 0 (main): 0 on CPU 0
Worker pinned to core: 1
  Core 1 (worker): 0 on CPU 0
Unpinned worker starting, affinity: 0
    Unpinned worker: 0 on CPU 0
Core 0 (main): 1 on CPU 0
    Unpinned worker: 1 on CPU 0
Core 0 (main): 2 on CPU 0
  Core 1 (worker): 2 on CPU 1
    Unpinned worker: 2 on CPU 0                                                 
Core 0 (main): 3 on CPU 0                                                       
  Core 1 (worker): 3 on CPU 1
Core 0 (main): 4 on CPU 0                                                       
  Core 1 (worker): 4 on CPU 1                                                   
    Unpinned worker: 3 on CPU 0                                                 
Core 0 (main): 5 on CPU 0                                                       
  Core 1 (worker): 5 on CPU 1                                                   
    Unpinned worker: 4 on CPU 0                                                 
Core 0 (main): 6 on CPU 0                                                       
  Core 1 (worker): 6 on CPU 1                                                   
Core 0 (main): 7 on CPU 0                                                       
  Core 1 (worker): 7 on CPU 1                                                   
    Unpinned worker: 5 on CPU 0                                                 
Core 0 (main): 8 on CPU 0                                                       
  Core 1 (worker): 8 on CPU 1                                                   
    Unpinned worker: 6 on CPU 0                                                 
Core 0 (main): 9 on CPU 0                                                       
  Core 1 (worker): 9 on CPU 1                                                   
    Unpinned worker: 7 on CPU 0                                                 
                                                                                
Main unpinned, affinity: -1                                                     
Unpinned main on CPU 0                                                          
  Core 1 worker finished                                                        
Unpinned main on CPU 0                                                          
    Unpinned worker: 8 on CPU 0                                                 
Unpinned main on CPU 0                                                          
    Unpinned worker: 9 on CPU 0                                                 
Unpinned main on CPU 0                                                          
Unpinned main on CPU 1                                                          
    Unpinned worker finished                                                    
                                                                                
Example complete!

@eliasnaur (Contributor) commented

Do you actually care about the particular core? If not, are the existing runtime.LockOSThread and runtime.UnlockOSThread calls sufficient to lock/unlock a goroutine to a core?

@amken3d (Author) commented Nov 17, 2025

This is what I see for LockOSThread:

// LockOSThread wires the calling goroutine to its current operating system thread.
// Stub for now
// Called by go1.18 standard library on windows, see golang/go#49320
func LockOSThread() {
}

// UnlockOSThread undoes an earlier call to LockOSThread.
// Stub for now
func UnlockOSThread() {
}

There seems to be no implementation behind it.

For the RP2, since it is a symmetric multiprocessor, it probably doesn't matter which exact core. But for something like the STM32H7, it would matter which core you pin to. (I know we don't support multicore on it yet.)

@eliasnaur (Contributor) commented

I know. What I'm saying is to change LockOSThread to mean "lock the current goroutine to a core" (when using the cores scheduler).

@amken3d (Author) commented Nov 17, 2025

Fair point. That seems reasonable to me. I can use those function names instead.
The only issue I see is that LockOSThread can't take any arguments. I'd like to be able to specify which core to lock to.

@eliasnaur (Contributor) commented

Right. So LockOSThread is enough for use cases where you only care about exclusive access to some core. For heterogeneous cores, I suggest:

  • Move the API to package machine.
  • Drop CurrentCPU - it's racy (its result may be invalidated at any time).
  • Drop GetAffinity - it seems that code that pins itself to a particular core should know what it's doing.
  • Drop the -1 special case from LockToCore.
  • Rename LockToCore to LockCore to mimic LockOSThread naming. Rename UnlockFromCore to UnlockCore for the same reason.

An important issue to think about is what happens if the requested core is busy? LockOSThread doesn't have this problem (some core must be running it).
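
For concreteness, usage under the suggested shape might look like this (a sketch only; machine.LockCore and machine.UnlockCore are the proposed names, not a shipped API, and doTimingCriticalWork is a placeholder):

package main

import "machine" // assumes the proposed LockCore/UnlockCore land here

func main() {
	// Pin this goroutine to core 1 for a timing-critical section.
	machine.LockCore(1) // panics if core is out of range; may wait if core 1 is busy
	doTimingCriticalWork()
	machine.UnlockCore() // the scheduler may migrate this goroutine again
}

func doTimingCriticalWork() {
	// ... work that must not migrate between cores ...
}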

- Dropped CurrentCPU
- Dropped GetAffinity
- Renamed LockToCore to LockCore to mimic LockOSThread naming.
- Updated examples program
@amken3d (Author) commented Nov 18, 2025

Looks like it passed all checks except the macOS 13 test, with this error:

This is a scheduled macos-13 brownout. The macOS-13 based runner images are being deprecated. For more details, see actions/runner-images#13046.

@deadprogram (Member) commented

@amken3d I just created #5093 to address the macOS 13 runner deprecation.

@eliasnaur (Contributor) left a comment:

Thanks. I've commented on the implementation, but I'm still not a fan of the LockCore API, because it may block indefinitely if a long-running goroutine is running on the target core. In a sense, LockCore acts as a per-core mutex that some arbitrary other goroutine may have taken, with the usual deadlock risks.

One way of getting around this issue is by requiring LockCore to be called before any other goroutine has started. A good place would be in an init function.
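
As a sketch of that pattern (again assuming the proposed machine.LockCore):

package main

import "machine"

// init runs before main and before any user goroutine has started,
// so the target core cannot be busy with another goroutine.
func init() {
	machine.LockCore(1)
}

func main() {
	// The main goroutine is now pinned to core 1.
}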

// Stub for now
// On microcontrollers with multiple cores (e.g., RP2040/RP2350), this pins the
// goroutine to the core it's currently running on.
// Called by go1.18 standard library on windows, see https://github.com/golang/go/issues/49320

While here, remove this now irrelevant comment.

Comment on lines +101 to +102
// On microcontrollers with multiple cores (e.g., RP2040/RP2350), this pins the
// goroutine to the core it's currently running on.

Is it more precise to say "with the 'cores' scheduler"?


const numCPU = 2 // RP2040 and RP2350 both have 2 cores

// LockCore implementation for the cores scheduler.

This needs a more detailed description. For example, it doesn't say what happens if the target core is busy. I believe LockCore returns. If so, this is surprising to me; I would expect that once LockCore returns, the calling goroutine is running on the target core.

…behavior, and limitations with the "cores" scheduler. Updated LockOSThread and UnlockOSThread comments to reflect core pinning behavior on RP2040/RP2350.
@eliasnaur (Contributor) left a comment:

@aykevl should probably take a look.

//
// Valid core values are 0 and 1. Panics if core is out of range.
//
// Only available on RP2040 and RP2350 with the "cores" scheduler.

Superfluous comment. The function is only available on rpxxxx by build tags.

// After calling UnlockCore, the scheduler is free to schedule the goroutine on
// any core for automatic load balancing.
//
// Only available on RP2040 and RP2350 with the "cores" scheduler.

Superfluous comment.

Comment on lines +16 to +18
// To avoid potential blocking on a busy core, consider calling LockCore in an
// init function before any other goroutines have started. This guarantees the
// target core is available.

I think this should be a hard requirement; that is, LockCore should panic if any other goroutine has started.

Comment on lines +10 to +14
// Important: LockCore sets the affinity but does not immediately migrate the
// goroutine to the target core. The actual migration happens at the next
// scheduling point (e.g., channel operation, time.Sleep, or Gosched). After
// that point, the goroutine will wait in the target core's queue if that core
// is busy running another goroutine.
@eliasnaur (Contributor) commented Nov 19, 2025

Exposing this implementation detail seems like an unnecessary burden on the caller. Why can't LockCore switch the goroutine over to the target core before returning? For instance, why can't LockCore call Gosched?
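
The caller-side equivalent of that suggestion would be something like the following sketch (assuming the deferred-migration behavior documented above):

package main

import (
	"machine"
	"runtime"
)

func main() {
	machine.LockCore(1) // sets affinity; migration is deferred to the next scheduling point
	runtime.Gosched()   // yield so the scheduler moves this goroutine to core 1 now
	// From here on, this goroutine runs on core 1 (once core 1 is free).
}

Folding the Gosched into LockCore itself would give callers the expected guarantee without exposing the scheduling detail.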

