microblaze: Fix -Os right shift optimization is allowed into delay slot #37

alpsayin · 2024-10-18T16:53:54Z

Picolibc testing showed that the -Os right-shift optimization emits a
multi-instruction sequence that the compiler can place into a single delay
slot. When that happens, only the first instruction of the sequence runs and
the rest are skipped.
The sequence is generated by this pattern:

  [(set (match_operand:SI 0 "register_operand" "=&d")
       (ashiftrt:SI (match_operand:SI 1 "register_operand"  "d")
                   (match_operand:SI 2 "immediate_operand" "I")))]
  "(INTVAL (operands[2]) > 5 && optimize_size)"
  {
    operands[3] = gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM);

    output_asm_insn ("ori\t%3,r0,%2", operands);
    if (REGNO (operands[0]) != REGNO (operands[1]))
        output_asm_insn ("addk\t%0,%1,r0", operands);

    output_asm_insn ("addik\t%3,%3,-1", operands);
    output_asm_insn ("bneid\t%3,.-4", operands);
    return "sra\t%0,%0";
  }
  [(set_attr "type"    "arith")
  (set_attr "mode"    "SI")
  (set_attr "length"  "20")]
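
To make the shape of the emitted sequence concrete, here is a minimal Python sketch that mirrors the output_asm_insn calls of the pattern above. The function name emit_sra_loop is hypothetical, and the default temporary register r18 is an assumption taken from operands[3] in the example further down, not from the definition of MB_ABI_ASM_TEMP_REGNUM.

```python
def emit_sra_loop(dst, src, count, tmp=18):
    """Return the assembly lines the pattern prints for dst = src >> count.

    tmp stands in for MB_ABI_ASM_TEMP_REGNUM (assumed to be r18 here).
    """
    lines = [f"ori\tr{tmp},r0,{count}"]           # tmp = shift count
    if dst != src:
        lines.append(f"addk\tr{dst},r{src},r0")   # copy source into destination
    lines.append(f"addik\tr{tmp},r{tmp},-1")      # decrement the loop counter
    lines.append(f"bneid\tr{tmp},.-4")            # loop back to addik while tmp != 0
    lines.append(f"sra\tr{dst},r{dst}")           # delay slot of bneid: shift by one bit
    return lines


if __name__ == "__main__":
    seq = emit_sra_loop(dst=24, src=30, count=20)
    print("\n".join(seq))
    # Five 4-byte instructions, which matches the "length" attribute of 20.
    print(len(seq) * 4, "bytes")
```

The point of the sketch is that the pattern is one insn from the scheduler's point of view but up to five machine instructions in the output, so it cannot fit in a one-instruction delay slot.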

However, the arith type is not among the types excluded from delay slots:

gcc/gcc/config/microblaze/microblaze.md

Line 466 in 428d8d7

 [(and (eq_attr "type" "!branch,call,jump,icmp,multi,no_delay_arith,no_delay_load,no_delay_store,no_delay_imul,no_delay_move,darith") 
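
As a rough illustration of why the pattern is eligible, the type test quoted above can be modelled as a set-membership check. This sketch covers only the type list shown on that line; the real define_delay condition may contain further terms that are not quoted here, and the names NO_DELAY_TYPES and type_allows_delay_slot are hypothetical.

```python
# Types listed after the "!" in the eq_attr test quoted above.
NO_DELAY_TYPES = {
    "branch", "call", "jump", "icmp", "multi",
    "no_delay_arith", "no_delay_load", "no_delay_store",
    "no_delay_imul", "no_delay_move", "darith",
}


def type_allows_delay_slot(insn_type):
    """True if the type passes the quoted eq_attr "type" "!..." test."""
    return insn_type not in NO_DELAY_TYPES


if __name__ == "__main__":
    for t in ("arith", "multi"):
        print(t, type_allows_delay_slot(t))
```

Under this model, an insn typed arith passes the test and an insn typed multi does not, which is the basis for the fix described below.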

The code generated by the optimization spans addresses 191b8 to 191c8:

   191a8:	bc830194 	bgti	r3, 404		// 1933c
    if (subnormal_y) { /* subnormal y */
   191ac:	b0007ff0 	imm	32752
   191b0:	a47c0000 	andi	r3, r28, 0
   191b4:	be2301ec 	bneid	r3, 492		// 193a0
   191b8:	a2400014 	ori	r18, r0, 20
   191bc:	131e0000 	addk	r24, r30, r0
   191c0:	3252ffff 	addik	r18, r18, -1
   191c4:	be32fffc 	bneid	r18, -4		// 191c0
   191c8:	93180001 	sra	r24, r24
...
        iy = (hy >> 20) - 1023;
   193a0:	b810fe40 	brid	-448		// 191e0
   193a4:	3318fc01 	addik	r24, r24, -1023

where operands are:

operands[0] = r24
operands[1] = r30
operands[2] = 20
operands[3] = r18

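When the bneid at 191b4 is taken, only the ori at 191b8 executes in its delay slot and the copy and shift loop are skipped. As a result, this code returns an iy (r24) value of whatever was already in r24, minus 1023.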
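
The effect can be reproduced with a small register-level simulation. This is a sketch under assumptions: the register names follow the disassembly above, the input values (a stale r24 of 12345 and a high word of 0x400921FB) are made up for illustration, and the helper names are hypothetical.

```python
def sra32(value, count):
    """Arithmetic right shift of a 32-bit two's-complement value."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    return value >> count


def run_shift_sequence(regs, only_first_insn):
    """Model the sequence at 191b8..191c8 on a copy of regs."""
    regs = dict(regs)
    regs["r18"] = 20                          # ori   r18, r0, 20
    if only_first_insn:
        return regs                           # branch taken: the rest never runs
    regs["r24"] = regs["r30"]                 # addk  r24, r30, r0
    while True:
        regs["r18"] -= 1                      # addik r18, r18, -1
        taken = regs["r18"] != 0              # bneid r18, -4
        regs["r24"] = sra32(regs["r24"], 1)   # sra   r24, r24 (delay slot)
        if not taken:
            break
    return regs


def compute_iy(regs, only_first_insn):
    """iy = (hy >> 20) - 1023, with hy in r30 and the result in r24."""
    regs = run_shift_sequence(regs, only_first_insn)
    return regs["r24"] - 1023                 # addik r24, r24, -1023


if __name__ == "__main__":
    start = {"r24": 12345, "r30": 0x400921FB}  # illustrative values only
    print("full sequence: ", compute_iy(start, only_first_insn=False))
    print("delay slot only:", compute_iy(start, only_first_insn=True))
```

With these inputs the full sequence yields 1, while the truncated one yields 11322, which is the stale r24 minus 1023.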

The fix is simple: I've redeclared the size-optimized right shift as type multi, which is not allowed in a delay slot.
This is the same type already used by the other shift optimizations (the left-shift ones):
gcc/gcc/config/microblaze/microblaze.md

Line 2070 in 428d8d7

[(set_attr "type" "multi")
gcc/gcc/config/microblaze/microblaze.md

Line 2483 in 428d8d7

[(set_attr "type" "multi")

Currently under test via zephyrproject-rtos/sdk-ng#647

In picolibc testing it's found that this produces code that compiler squeezes into a single delay slot. And thus only the first instruction emitted by this optimization is run and the rest is skipped.

Signed-off-by: Alp Sayin <[email protected]>

This was referenced Oct 18, 2024

microblaze: pull in gcc/bintuils/gdb patches from meta-xilinx zephyrproject-rtos/sdk-ng#647

Open

MicroBlaze Port zephyrproject-rtos/zephyr#53576

Draft

alpsayin force-pushed the zephyr-gcc-12.2.0-bad-Os-shift-optimisation branch from 39eb6ac to 0e3007a Compare October 18, 2024 17:06
