Open
Labels
- A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
- A-SIMD: Area: SIMD (Single Instruction Multiple Data)
- A-codegen: Area: Code generation
- A-target-feature: Area: Enabling/disabling target features like AVX, Neon, etc.
- C-bug: Category: This is a bug.
- I-slow: Issue: Problems and improvements with respect to performance of generated code.
- O-Arm: Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state
- T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
Description
While working on armv7 neon support for simdutf8 I ran across inlining problems for functions with #[target_feature(enable = "neon")]. One of them is that get_unchecked() is never inlined in such functions.
More info:
- It is also not inlined with the armv7-linux-androideabi target
- It is inlined with the thumbv7neon-unknown-linux-gnueabihf target
- It is inlined if compiled with RUSTFLAGS="-Ctarget_feature=+neon"
#![feature(arm_target_feature)]

#[target_feature(enable = "neon")]
pub unsafe fn get_unchecked_range(x: &[u8]) -> &[u8] {
    // do more neon stuff
    unsafe { x.get_unchecked(3..) }
    // do more neon stuff
}

#[target_feature(enable = "neon")]
pub unsafe fn get_unchecked_one(x: &[u8]) -> &u8 {
    // do more neon stuff
    unsafe { x.get_unchecked(3) }
    // do more neon stuff
}
The root cause might be the same as for #102220.
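For comparison, a baseline without the attribute may help isolate the problem. The sketch below repeats the two function bodies from the report with #[target_feature] removed. The names baseline_range and baseline_one are hypothetical and are not part of simdutf8 or the original reproduction. If get_unchecked() is inlined here but not in the attributed versions when both are built for the same target, that would point at the attribute rather than at get_unchecked() itself.

```rust
// Baseline for comparison: same bodies as in the report, but without
// #[target_feature(enable = "neon")]. The function names are hypothetical.

/// Returns the subslice starting at index 3.
///
/// # Safety
/// The caller must guarantee `x.len() >= 3`.
pub unsafe fn baseline_range(x: &[u8]) -> &[u8] {
    // No bounds check is performed; an out-of-range start is undefined behavior.
    unsafe { x.get_unchecked(3..) }
}

/// Returns a reference to the element at index 3.
///
/// # Safety
/// The caller must guarantee `x.len() > 3`.
pub unsafe fn baseline_one(x: &[u8]) -> &u8 {
    // No bounds check is performed; an out-of-range index is undefined behavior.
    unsafe { x.get_unchecked(3) }
}
```

Since no nightly feature gate is needed here, the baseline also builds on a stable toolchain, which makes it easy to compare the generated assembly across the targets listed above.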