[6.2] Add support for TBI masking optimization for nonisolated(nonsending) #84986
Conversation
@swift-ci test
…inTypeKind. This is an NFC change. The idea is that this will make it easier to add new BuiltinTypeKinds. I missed this code when adding Builtin.ImplicitIsolationActor. By adding a covered switch, we can make sure this code is updated in the future. (cherry picked from commit 05536d7)
…piling for a triple that supports AArch64 TBI. Just breaking down layers of a larger patch to make it easier to review. (cherry picked from commit bfecaa3)
With some changes that I am making, we will no longer need this flag at the SIL level, so we can just make it an IRGen flag (which it really should have been in the first place). (cherry picked from commit 361e63c)
…run. Specifically, I renamed the old llvm-as, which printed with optimizations, to llvm-as-opt and made llvm-as just test IRGen. That is really the true reason that I created this utility many years ago: to test how IRGen lowers specific SIL instruction sequences. How LLVM optimizes such sequences once lowered is a different problem. I updated the one test that used sil-llvm-gen and required optimizations to use llvm-as-opt so that the test's output did not change. (cherry picked from commit 5c8a95f)
(cherry picked from commit ca2fb09)
Specifically, we ignore the -Xllvm bit and put the option into FrontendOptions.LLVMArgs so that the parts of CompilerInvocation that assume -Xllvm flags will be placed there actually find them there. We still pass the argument through to llvm::cl so it gets set that way as well. (cherry picked from commit d0d2163)
(cherry picked from commit 3a96e99)
This is currently not wired up to anything. I am going to wire it up in subsequent commits. The reason why we are introducing this new Builtin type is to represent that we are going to start stealing bits from the protocol witness table pointer of the Optional<any Actor> that this type is bitwise compatible with. The type will ensure that this value is only used in places where we know that it will be properly masked, giving us certainty that this value will not be used in any manner without first having its stolen bits cleared and being transformed back to Optional<any Actor>. (cherry picked from commit 2fa3908) Conflicts: lib/AST/ASTPrinter.cpp lib/IRGen/IRGenModule.cpp
…tional<any Actor> as an expected executor. We want SILGen to have a simplified view of its executor and know that whenever one sees an Actor, it is an actual actor instead of a Builtin.Executor. This just simplifies code. Also, we should eventually have an invariant that Builtin.Executor should only be allowed in LoweredSIL after LowerHopToExecutor has run. But that is a change for another day. (cherry picked from commit 81885a6)
This instruction converts Builtin.ImplicitActor to Optional<any Actor>. In the process of doing so, it masks out the bits we may have stolen from the witness table pointer of Builtin.ImplicitActor. The bits that we mask out are the bottom two bits of the top nibble of the TBI space on platforms that support TBI (that is, bits 60 and 61 on arm64). On platforms that do not support TBI, we just use the bottom two tagged pointer bits (0 and 1). By using an instruction, we avoid having to represent the bitmasking that we are performing at the SIL level and can instead make the emission of the bitmasking an IRGen detail. It also allows us to move the detection of whether we are compiling for AArch64 into an IRGen flag instead of a LangOpts flag. The instruction is a guaranteed forwarding instruction since we want to treat its result as a borrowed projection from the Builtin.ImplicitActor. (cherry picked from commit fe9c21f)
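To make the bit layout concrete, here is a minimal sketch of the masking this instruction performs, assuming a 64-bit witness table word. This is illustrative arithmetic only; the names (maskWitnessTableWord, platformHasTBI) are invented here and this is not the compiler's actual IRGen code.

```swift
// Minimal sketch of the masking performed when converting
// Builtin.ImplicitActor back to Optional<any Actor>.
let tbiStolenBits: UInt64 = (1 << 60) | (1 << 61) // bits 60 and 61 (TBI platforms)
let taggedStolenBits: UInt64 = 0b11               // bits 0 and 1 (no TBI)

// A single and-against-immediate clears whichever stolen bits apply.
func maskWitnessTableWord(_ word: UInt64, platformHasTBI: Bool) -> UInt64 {
    word & ~(platformHasTBI ? tbiStolenBits : taggedStolenBits)
}
```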
…changing the pass. Specifically, I am refactoring out the code that converts actor/Optional<any Actor> to an executor in preparation for adding code to LowerHopToExecutor that handles Builtin.ImplicitIsolationActor. The only actual functional change is that I made getExecutorForOptionalActor support being invoked while generating code (i.e., when its SILBuilder has an insertion point at the end of a block). It previously assumed that it would always have a real SILInstruction as an insertion point. The changes can be seen in the places where we now check whether the insertion point equals the end of a block. It is very minor and, due to conditional control flow, doesn't have any actual impact given the manner in which the code is generated today. This came up in a subsequent commit where I reuse this code to generate a helper function for converting Builtin.ImplicitIsolationActor to Builtin.Executor. (cherry picked from commit 21915ae) Conflicts: lib/SILOptimizer/Mandatory/LowerHopToActor.cpp
…represent the implicit isolated parameter. NOTE: We are not performing any bitmasking at all now. This is so that we can transition the code base/tests to expect Builtin.ImplicitActor instead of Optional<any Actor>. NOTE: The actual test changes are in the next commit. I did this to make it easier to review the changes. This should not have any user visible changes. (cherry picked from commit 788abd0)
(cherry picked from commit dc19306) Conflicts: test/Concurrency/isolation_macro_sil.swift test/Concurrency/nonisolated_nonsending_optimize_hoptoexecutor.swift
…ed pointer bits otherwise. Specifically, when TBI is available we use the bottom two bits of the top nibble (bits 60,61). On platforms without TBI, we use the bottom two tagged pointer bits (bits 0, 1). rdar://156525771 (cherry picked from commit 390afe3)
Force-pushed from baa0b97 to abc2e30.
I am expecting a failure on the IR test (I am testing on my own machine on the side) due to a difference between main and release/6.2. I am going to let it fail so I can make sure that on the bots it is the only issue.
We got through all of the x86_64 tests... we hadn't run the arm64 tests yet, so the test did not hit. Better to just start it now.
Description: In this change, we introduce bit masking of the witness table pointer of any Actor. We are doing this so that in a future release we can steal those bits and use them to signal information to the runtime to dynamically eliminate hops in nonisolated(nonsending) code. These bits are currently unused (bits 60 and 61 in the TBI byte if we have it, or bits 0 and 1 of the tagged pointer bits otherwise), so the risk is very small. We are doing it this way so that in a future release we can enable the optimization we want to enable here. I describe the scheme and what we are doing in more detail below.
LONGER EXPLANATION: For nonisolated(nonsending) we use a different ABI than we normally use for async functions. Normally, an async function has no guarantee at entry or exit about which actor it is on, so it must hop in the callee to get onto the correct actor and hop again in the caller upon return. In contrast, nonisolated(nonsending) functions can avoid these hops: by ABI, they assume on entry that they are already on the appropriate actor and guarantee that they will still be on that same actor upon return.
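As an illustration of the two conventions just described (the actor, functions, and comment-annotated hops below are invented for this sketch, not taken from the patch):

```swift
actor Worker {
    var count = 0

    // Ordinary async actor method: the compiler hops to Worker's executor
    // on entry, and the caller hops back to its own executor on return.
    func ordinary() async {
        count += 1
    }
}

// nonisolated(nonsending): by ABI, this function runs on whatever actor the
// caller was on at entry and is still on that actor when it returns, so
// neither the entry hop nor the caller-side return hop is needed.
nonisolated(nonsending) func onCallersActor() async {
    // ... work that stays on the caller's isolation ...
}
```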
There is one drawback though: if we hop out of the nonisolated(nonsending) world to another actor, there are cases where hops that we used to be able to eliminate can no longer be eliminated. Specifically, this occurs where we would eliminate a hop at function exit because we knew our caller was guaranteed to hop, so no hop was necessary there. We now need that trailing hop, since we guarantee that on return we will be on the original isolation domain of the function. An example follows.
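A sketch of the case in question, with invented names; the surviving hop is described in comments since the compiler inserts it implicitly:

```swift
actor Logger {
    func record() {}
}

let logger = Logger()

nonisolated(nonsending) func work() async {
    await logger.record() // hops to logger's executor for the call
    // By ABI, we must return on the caller's original isolation domain,
    // so a trailing hop back is emitted here; it cannot be statically
    // removed the way it could for an ordinary async function whose
    // caller is guaranteed to hop after the call anyway.
}
```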
To recover this optimization dynamically, we want to allow callers to signal to a nonisolated(nonsending) callee that even though the caller is not going to hop as part of the convention for returning from the callee, the caller has some other hop, unrelated to the callee, that is going to occur before any isolated work needs to happen. In such a case, the callee can see this information and know that it can avoid performing the trailing hop.
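A purely hypothetical sketch of that dynamic check: the bit position matches the stolen bits described in this PR, but the names and the check itself are invented here for illustration and are not the runtime's actual mechanism.

```swift
// Hypothetical signal bit in the implicit actor's witness table word.
let callerPromisesLaterHopBit: UInt64 = 1 << 60

func calleeMaySkipTrailingHop(_ implicitActorWord: UInt64) -> Bool {
    // A set signal bit would mean the caller promises some unrelated hop
    // before any isolated work runs, so the trailing hop can be elided.
    (implicitActorWord & callerPromisesLaterHopBit) != 0
}
```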
In order to do this safely, we need to ensure that we mask out these bits in all places where the implicit actor value can be accessed. This only occurs in a few ways: via the #isolation macro, either used inline or as a default argument (an illustrative sketch of these two surfaces appears at the end of this description). To enable us to do this optimization later, we insert masking code in these parts of the codebase. This comes down to just a single and instruction against an immediate, which is very small from a code-size perspective.
Scope/Impact: Minimal. This only masks bits that are not used anywhere today, and only masks them when we use #isolation.
Risk: Low. This can only mask bits that we already know are never set by the compiler today. That being said, we are changing 6.2, so there is always some risk.
Testing: CI testing/added tests. Checked the released SDK for usages of nonisolated(nonsending) that would expose the bits without masking.
Reviewed by: @rjmccall
Original PR: #83346
Radar: rdar://156525771
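As referenced in the description above, here is a sketch of the two #isolation surfaces where the masking fires. The #isolation default-argument pattern is standard Swift (SE-0420); the function names are hypothetical.

```swift
// 1. #isolation as a default argument.
func log(_ message: String,
         isolation: isolated (any Actor)? = #isolation) async {
    print(message)
}

// 2. #isolation used inline in a function body.
func currentIsolation() async -> (any Actor)? {
    #isolation
}
```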