Designing a way to check for pending signals with detached thread state #68

@zackw

gh-133465 proposes that there should be a way for a C extension module to check for pending signals that doesn’t require the caller to have an attached thread state. During development of patches to implement that proposal, several API design issues came up that require careful handling, and I was asked to pull them out and file them here.

Background

Compiled-code modules that implement time-consuming operations that don’t require manipulating Python objects are supposed to call PyErr_CheckSignals frequently throughout each such operation, so that if the user interrupts the operation with control-C, it is cancelled promptly.

In the common case, with no signals pending, PyErr_CheckSignals is cheap. However, callers must have an attached thread state. Compiled-code modules are also supposed to detach the thread state during all time-consuming operations that don’t require manipulating Python objects, especially in traditional builds where “has an attached thread state” means “holds the global interpreter lock” and no other threads can use the C-API. The overhead of re-attaching a thread state just to call PyErr_CheckSignals, and then detaching it again afterward, sufficiently often for reasonable user responsiveness, can be substantial: see https://research.owlfolio.org/pubs/2025-pyext-ctrlc-talk/#/18/0/2 and subsequent slides for hard numbers (generated using Python 3.12).
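The re-attach-per-check pattern described above can be sketched as a small self-contained model. This is only an illustration: `attach`, `detach`, and `check_signals_attached` are hypothetical stand-ins for what `Py_BLOCK_THREADS`, `Py_UNBLOCK_THREADS`, and `PyErr_CheckSignals` do, and nothing here calls the real C-API. The model only counts attach/detach transitions; it says nothing about how expensive each one is.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state, not part of the CPython C-API. */
static bool attached = true;        /* does this thread have an attached thread state? */
static unsigned long transitions;   /* number of attach + detach operations so far */

static void detach(void) { attached = false; transitions++; }
static void attach(void) { attached = true;  transitions++; }

/* Stand-in for PyErr_CheckSignals: only valid with an attached thread
   state. Returns 0 when no handler raised, -1 on misuse in this model. */
static int check_signals_attached(void)
{
    return attached ? 0 : -1;
}

/* Process `nchunks` chunks of work, checking for signals after every
   `interval` chunks. Returns 0 on completion, nonzero if a check failed. */
static int long_operation(size_t nchunks, size_t interval)
{
    int status = 0;
    if (interval == 0) {
        interval = 1;               /* avoid dividing by zero below */
    }
    detach();                       /* like Py_BEGIN_ALLOW_THREADS */
    for (size_t i = 0; i < nchunks; i++) {
        /* ... work on chunk i, touching no Python objects ... */
        if ((i + 1) % interval == 0) {
            attach();               /* like Py_BLOCK_THREADS */
            status = check_signals_attached();
            detach();               /* like Py_UNBLOCK_THREADS */
            if (status != 0) {
                break;
            }
        }
    }
    attach();                       /* like Py_END_ALLOW_THREADS */
    return status;
}
```

In this model every signal check costs two transitions even when no signal is pending, which is the overhead the proposal aims to remove.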

The actual check for pending signals can be carried out safely without an attached thread state. Only the work that PyErr_CheckSignals does when there is a pending signal (i.e. running handlers, which may be implemented in Python) requires an attached thread state. Therefore, a variant of PyErr_CheckSignals whose internal structure was something like

int PyErr_CheckSignalsDetached(void)
{
    // signals_pending() and run_handlers() are placeholders for interpreter internals
    int status = 0;
    if (signals_pending()) {
        Py_BLOCK_THREADS
        // have to check again after attaching the thread state
        if (signals_pending()) {
            status = run_handlers();
        }
        Py_UNBLOCK_THREADS
    }
    return status;
}

would eliminate one of the reasons why lots of compiled-code modules currently don’t include as many signal checks as they ought to.
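To make the intended benefit concrete, here is a minimal runnable model of the check-then-attach idea, assuming the pending flag is a plain C11 atomic. All names (`signal_pending`, `attach`, `detach`, `run_handlers`, `check_signals_detached`) are hypothetical and stand in for interpreter internals; the model deliberately ignores the design issues discussed in the next section.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model state, not part of the CPython C-API. */
static atomic_bool signal_pending;  /* set asynchronously, e.g. by a C-level signal handler */
static bool attached;               /* does this thread have an attached thread state? */
static unsigned long attaches;      /* how many times we had to attach */

static void attach(void) { attached = true; attaches++; }
static void detach(void) { attached = false; }

/* Stand-in for running the Python-level handlers. It re-checks the flag
   by atomically clearing it, and returns -1 to model a handler that
   raised an exception (as the default SIGINT handler does). */
static int run_handlers(void)
{
    if (!attached) {
        return -2;                  /* misuse: handlers need an attached thread state */
    }
    return atomic_exchange(&signal_pending, false) ? -1 : 0;
}

/* Model of the proposed function: callable while detached. */
static int check_signals_detached(void)
{
    int status = 0;
    if (atomic_load(&signal_pending)) {   /* cheap, needs no thread state */
        attach();
        status = run_handlers();          /* second check happens while attached */
        detach();
    }
    return status;
}
```

With no signal pending, the function returns without attaching at all; the attach cost is paid only on the rare path where there is actually work to do.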

API Design Issues

The sketch implementation of PyErr_CheckSignalsDetached won’t work as shown, for several reasons. Addressing all of those reasons requires making the following design decisions:

  1. Should the new function take a PyThreadState* argument?

    If it does, it's substantially easier to implement, but callers need to supply that argument, which can be difficult, both because it might need to be passed down through several layers of intermediate functions and because Py_BEGIN_ALLOW_THREADS does not officially give one access to a PyThreadState (the _state variable it declares is documented, but only as an implementation detail, and no existing API expects it to be used directly).

  2. Should one be allowed to call the new function with or without an attached thread state, or only without?

    Again, this is a tradeoff between implementation and usage convenience. Implementation would be easier if we required users of the new function to not have an attached thread state. As far as I can tell, the only function with the semantics of "give me an attached thread state if I don't already have one, do nothing if I do" is PyGILState_Ensure, which, per PEP 788, we'd rather not introduce new uses of. PyEval_RestoreThread is documented to deadlock if called by a thread that already has an attached thread state; PyThreadState_Swap is documented to require the outgoing thread state to be attached.

    However, it would be substantially easier for extension module authors to use the new API if they didn't have to worry about whether they did or didn't hold an attached thread state at each point where they want to introduce a signals check/cancellation point.

  3. Should the new function run the cycle collector?

    PyErr_CheckSignals only actually checks for signals if it's running in the main interpreter on the main thread. However, regardless of which interpreter or thread it's running on, it may run the cycle collector (the "do we need to run the cycle collector soon?" flag is also checkable without an attached thread state). This is not currently documented.

    If the new function shouldn't try to run the cycle collector, it's easier to implement, because it would have no work to do at all unless it was invoked from the main thread for the main interpreter. However, the rationale for PyErr_CheckSignals having the side effect of running the cycle collector is:

    "Opportunistically check if the GC is scheduled to run and run it if we have a request. This is done here because native code needs to call this API if is going to run for some time without executing Python code to ensure signals are handled. Checking for the GC here allows long running native code to clean cycles created using the C-API even if it doesn't run the evaluation loop"

    which would seem to apply equally to the new PyErr_CheckSignalsDetached.
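The effect of design question 3 on the implementation can be illustrated with a small decision function. This is a hedged sketch: `struct check_ctx`, the `WORK_*` flags, and `needed_work` are hypothetical names, and the fields model only the conditions described above, not the interpreter's actual data structures.

```c
#include <stdbool.h>

/* Hypothetical bit flags describing what a signals check has to do. */
enum { WORK_NONE = 0, WORK_SIGNALS = 1, WORK_GC = 2 };

/* Hypothetical snapshot of the state that can be read without an
   attached thread state. */
struct check_ctx {
    bool main_thread_of_main_interp; /* only this thread handles signals */
    bool signal_pending;             /* "a signal has arrived" flag */
    bool gc_scheduled;               /* "run the cycle collector soon" flag */
};

/* Decide which work requires attaching a thread state.
   `run_gc` selects the answer to question 3. */
static unsigned needed_work(const struct check_ctx *c, bool run_gc)
{
    unsigned work = WORK_NONE;
    if (c->main_thread_of_main_interp && c->signal_pending) {
        work |= WORK_SIGNALS;
    }
    if (run_gc && c->gc_scheduled) {
        work |= WORK_GC;
    }
    return work;
}
```

If `run_gc` is false, the result is always `WORK_NONE` for any thread other than the main thread of the main interpreter, so such callers would never need to attach. If it is true, every caller may occasionally have to attach in order to run the collector.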

Prior and concurrent discussion

I gave a talk about the challenges involved in writing reliably interruptible extension modules at PyCon US 2025: slides and notes, video.

I raised all of the above questions already in PR #133466. Discussion focused on question 1, which is the most important, but did not reach consensus; questions 2 and 3 were only briefly considered.

My personal preference for questions 1 and 2 would be to maximize caller convenience: allow the new API to be used with or without an attached thread state, and do not require the caller to supply a thread state. However, it is not clear to me that this is possible: note in particular the nasty example in this comment, involving subinterpreters.

I am largely indifferent to the answer to question 3, but note that the argument here for why it is unnecessary to call the cycle collector from the new API applies only if the answer to question 2 is “new API can only be used without an attached thread state.”

I have opened a thread on discuss.python.org, https://discuss.python.org/t/ergonomics-of-signal-checks-with-detached-thread-state/94355, soliciting feedback from authors of extension modules about what would make the hypothetical new API most ergonomic for them.
