Commit 0d2fed9

zhaozhou authored and chenglulu326 committed
LoongArch: optimize half of vector copy for V4DFmode.

Replace xvpermi with xvbsrl when copying the high 64 bits of a V4DFmode vector to the low 64 bits, reducing the delay by 2 insns.

gcc/ChangeLog:

    * config/loongarch/lasx.md (lasx_xvbsrl_d_f): New template.
    * config/loongarch/loongarch.cc (emit_reduc_half): Replace insn.

gcc/testsuite/ChangeLog:

    * gcc.target/loongarch/vec_reduc_half.c: New test.

1 parent 522f07d commit 0d2fed9

3 files changed, +21 -1 lines changed
gcc/config/loongarch/lasx.md

Lines changed: 10 additions & 0 deletions
@@ -2702,6 +2702,16 @@
   [(set_attr "type" "simd_shift")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "lasx_xvbsrl_d_f"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+        (unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")
+                      (match_operand 2 "const_uimm5_operand" "")]
+                     UNSPEC_LASX_XVBSRL_V))]
+  "ISA_HAS_LASX"
+  "xvbsrl.v\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
 (define_insn "lasx_xvbsll_<lasxfmt>"
   [(set (match_operand:ILASX 0 "register_operand" "=f")
         (unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")

gcc/config/loongarch/loongarch.cc

Lines changed: 1 addition & 1 deletion
@@ -10092,7 +10092,7 @@ emit_reduc_half (rtx dest, rtx src, int i)
       if (i == 256)
         tem = gen_lasx_xvpermi_d_v4df (dest, src, GEN_INT (0xe));
       else
-        tem = gen_lasx_xvpermi_d_v4df (dest, src, const1_rtx);
+        tem = gen_lasx_xvbsrl_d_f (dest, src, GEN_INT (0x8));
       break;
     case E_V32QImode:
     case E_V16HImode:

gcc/testsuite/gcc.target/loongarch/vec_reduc_half.c

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mlasx" } */
+
+double
+foo_1 (double *a, double *b)
+{
+  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
+}
+
+/* { dg-final { scan-assembler-times "xvpermi.d" 1} } */
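
To illustrate why the test expects exactly one `xvpermi.d`, here is a rough scalar sketch of the two-step halving reduction that `emit_reduc_half` appears to drive for V4DF. It is not part of the commit. The functions `model_xvpermi_d`, `model_xvbsrl_v`, `check_low_element` and `model_reduc_plus` are hypothetical names introduced here. The instruction semantics are assumptions based on my reading of the LASX instruction descriptions: `xvpermi.d` selects each 64-bit element by a 2-bit field of the immediate, and `xvbsrl.v` shifts each 128-bit lane right by a byte count, filling with zeros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed model of xvpermi.d: destination element i is the source element
   selected by bits [2i+1:2i] of the immediate.  */
void model_xvpermi_d(double dst[4], const double src[4], unsigned imm)
{
    double tmp[4];
    for (int i = 0; i < 4; i++)
        tmp[i] = src[(imm >> (2 * i)) & 3];
    memcpy(dst, tmp, sizeof tmp);
}

/* Assumed model of xvbsrl.v: each 128-bit lane is shifted right by `bytes`
   bytes independently, with zeros shifted in at the top of the lane.  */
void model_xvbsrl_v(double dst[4], const double src[4], unsigned bytes)
{
    uint8_t in[32], out[32] = {0};
    memcpy(in, src, sizeof in);
    for (int lane = 0; lane < 2; lane++)
        for (unsigned b = 0; b + bytes < 16; b++)
            out[lane * 16 + b] = in[lane * 16 + b + bytes];
    memcpy(dst, out, sizeof out);
}

/* Both the old form (xvpermi.d with immediate 1) and the new form
   (xvbsrl.v by 8 bytes) place source element 1 in element 0, which is
   the only element the final reduction step reads.  */
void check_low_element(const double src[4])
{
    double a[4], b[4];
    model_xvpermi_d(a, src, 0x1);
    model_xvbsrl_v(b, src, 8);
    assert(a[0] == src[1]);
    assert(b[0] == src[1]);
}

/* Hypothetical halving reduction of four doubles.  */
double model_reduc_plus(const double src[4])
{
    double half[4], sum[4];

    /* i == 256 step: move elements 2 and 3 down to 0 and 1 (xvpermi.d 0xe).  */
    model_xvpermi_d(half, src, 0xe);
    for (int k = 0; k < 4; k++)
        sum[k] = src[k] + half[k];

    /* i == 128 step: move element 1 down to element 0 (xvbsrl.v by 8).  */
    model_xvbsrl_v(half, sum, 8);
    for (int k = 0; k < 4; k++)
        sum[k] += half[k];

    return sum[0];
}
```

Under these assumptions only the 256-bit step still needs `xvpermi.d`, which is consistent with the `scan-assembler-times "xvpermi.d" 1` check in the new test. The upper elements differ between the two forms (zeros versus copies of element 0), but the reduction result is read from element 0 only.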
