Difference of generated code regarding GNURX for RXv3 sqrtf(x) using -mrxv2-fsqrt between with and without -mdfpu (code – 2022/10/25)
Hello support team,
I’m confused by the difference in the code that GNURX generates for sqrtf(x) on RXv3 with -mrxv2-fsqrt, depending on whether -mdfpu is also specified, as shown below. Is this intended?
By contrast, CC-RX with the -library=intrinsic option (the default) and ICCRX without the --sqrt_must_set_errno option both generate the simplest possible code: a single FSQRT instruction. Is there any plan to support such code generation in GNURX?
SOURCE:
#include <math.h>
float RXv3_FPU_sqrtf(float x)
{
    return sqrtf(x);
}
CASE GNURX A: using -mrxv2-fsqrt without -mdfpu
rx-elf-gcc -std=gnu99 -O3 -misa=v3 -mrxv2-fsqrt -Wa,-adln=rxv3_fpu_sqrtf.lst -c rxv3_fpu_sqrtf.c
1 .file "rxv3_fpu_sqrtf.c"
2 .section P,"ax"
3 .global _RXv3_FPU_sqrtf
5 _RXv3_FPU_sqrtf:
6 0000 7E A7 push.l r7
7 0002 FC A3 17 fsqrt r1, r7
8 0005 FD 72 11 00 00 00 00 fcmp #0x0, r1
9 000c 27 07 bn .L5
10 .balign 8,3,1
11 .L1:
12 000e EF 71 mov.L r7, r1
13 0010 3F 77 01 rtsd #4, r7-r7
14 .L5:
15 0013 05 00 00 00 bsr _sqrtf
16 0017 2E F7 bra .L1
18 0019 FD 70 40 00 00 00 80 .ident "GCC: (GCC_Build_20220528) 8.3.0.202202-GNURX 20190222"
CASE GNURX B: using -mrxv2-fsqrt with -mdfpu
rx-elf-gcc -std=gnu99 -O3 -misa=v3 -mdfpu -mrxv2-fsqrt -Wa,-adln=rxv3_fpu_sqrtf.lst -c rxv3_fpu_sqrtf.c
1 .file "rxv3_fpu_sqrtf.c"
2 .section P,"ax"
3 .global _RXv3_FPU_sqrtf
5 _RXv3_FPU_sqrtf:
6 0000 75 B0 01 dpushm.d dr0-dr1
7 0003 60 40 sub #4, r0
8 0005 F9 03 03 00 00 00 00 dmov.D #0, drh0
9 000c FD 77 81 1A ftod r1, dr1
10 0010 FC A3 15 fsqrt r1, r5
11 0013 76 90 18 10 dcmpun dr0, dr1
12 0017 75 90 1B mvfdr
13 001a 21 12 bnz .L1
14 001c 76 90 08 61 dcmple dr1, dr0
15 0020 75 90 1B mvfdr
16 0023 11 bz .L1
17 0024 E3 05 mov.L r5, [r0]
18 0026 05 00 00 00 bsr _sqrtf
19 002a EC 05 mov.L [r0], r5
20 .balign 8,3,1
21 .L1:
22 002c EF 51 mov.L r5, r1
23 002e 62 40 add #4, r0
24 0030 75 B8 01 dpopm.d dr0-dr1
25 0033 02 rts
27 0034 76 10 01 00 .ident "GCC: (GCC_Build_20220528) 8.3.0.202202-GNURX 20190222"
CASE CC-RX: with -library=intrinsic option (default)
00000000 _RXv3_FPU_sqrtf:
00000000 FCA311 FSQRT R1, R1
00000003 02 RTS
CASE ICCRX: without --sqrt_must_set_errno option
\ _RXv3_FPU_sqrtf:
\ 000000 FC A3 11 FSQRT R1,R1
\ 000003 02 RTS
Best regards,
NoMaY
Hello NoMaY-san,
There seems to be a bug in the inline expansion of the sqrtf function, thank you for bringing it to our attention.
The -mdfpu option also makes doubles 8 bytes long, but it should not affect the sqrtf function.
We have raised an internal bug ticket and we hope to fix this in a future release.
Regarding the generation of simpler code, optimizations are always in the scope of future releases.
__
Best regards,
The Open Source Tools Team
Hello NoMaY-san,
To disable the setting of the errno variable, please use the -fno-math-errno compile option.
Please let us know if we can be of further assistance.
__
Best regards,
The Open Source Tools Team
Hello support team,
Thank you for the reply. With -fno-math-errno, I get the following code in both cases (with and without -mdfpu).
1 .file "rxv3_fpu_sqrtf.c"
2 .section P,"ax"
3 .global _RXv3_FPU_sqrtf
5 _RXv3_FPU_sqrtf:
6 0000 FC A3 11 fsqrt r1, r1
7 0003 02 rts
9 .ident "GCC: (GCC_Build_20220528) 8.3.0.202202-GNURX 20190222"
Best regards,
NoMaY
P.S.
I found the following pages, which may help me in the future.
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fno-math-errno
https://stackoverflow.com/questions/57676693/which-functions-are-affected-by-fno-math-errno
https://stackoverflow.com/questions/57673825/how-to-force-gcc-to-assume-that-a-floating-point-expression-is-non-negative
Best regards,
NoMaY