English Amiga Board


Old 04 January 2022, 10:54   #1
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 2,002
Casting floats to uint (or rather not) with VBCC

Hi,

What I want to do is to have a single precision float number and endian swap it before I store it in a PCI register.

I have something like

StorePCI((ULONG)pciaddress, (float)driver->xvalue);

and StorePCI being:

StorePCI(__reg("a0") ULONG address, __reg("d0") ULONG value)="\trol.w\t#8,d0\n\tswap\td0\n\trol.w\t#8,d0\n\tmove.l\td0,(a0)\n";

What I need is something like:

fmove.s #1,fp0 //let's say value stored in driver->xvalue
fmove.s fp0,d0
move.l #pciaddress,a0

rol.w #8,d0
swap d0
rol.w #8,d0
move.l d0,(a0)

What I end up with is an fmove.d fp0,d0, so the integer value ($1) is stored in d0 instead of the raw floating-point bit pattern ($3f800000).

How can I byteswap $3f800000 instead of $1 with VBCC? I have tried all kinds of casts and I am a bit lost.
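
To make the difference between the two results concrete, here is a minimal C sketch. It assumes a 32-bit IEEE single-precision float; the function names are made up for illustration and are not part of the original source. A cast converts the value, while copying the bytes preserves the bit pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value conversion: this is what a (ULONG) cast, or passing a float
   to a ULONG parameter, does. 1.0f becomes the integer 1. */
uint32_t float_to_int_value(float f)
{
    return (uint32_t)f;
}

/* Bit-pattern copy: the raw IEEE representation, which is what the
   PCI register is supposed to receive (before byte swapping).
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_to_raw_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical self-check for the two cases from the post. */
void raw_bits_demo(void)
{
    assert(float_to_int_value(1.0f) == 1u);          /* $1        */
    assert(float_to_raw_bits(1.0f) == 0x3f800000u);  /* $3f800000 */
}
```

Whether a given compiler turns the memcpy into a plain register move depends on the compiler and optimization level.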
Old 04 January 2022, 11:12   #2
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,233
Does vbcc actually support single-precision IEEE numbers? I'm asking because SAS/C does not; it only supports mathffp numbers and double-precision IEEE numbers.

However, in case it does - and check for the right math options - the following should do it:
Code:
union {
 float u_float;
 uint32_t u_int;
} u;
u.u_float = f; /* put the float value into the union */
i = (u.u_int >> 24) | ((u.u_int >> 8) & 0xff00) | ((u.u_int << 8) & 0xff0000) | ((u.u_int << 24)); /* perform endian swap */
Note that a C compiler has no way to reinterpret a float as int, a cast will always round, so you have to go through a temporary in memory. A C++-compiler has the possibility to do so via reinterpret_cast<int>(float). Also, there is no endian-swap primitive in C or C++, though a couple of modern compilers recognize the above idiom and generate ideal code, such as the GNU compiler.
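
As a rough cross-check, the rol.w/swap/rol.w sequence from the first post and the shift-and-mask idiom above can be modelled in plain C and compared. This is only a sketch; the helper names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "rol.w #8,d0": rotate the low 16 bits by 8,
   leaving the upper word untouched. */
uint32_t rol_w_8(uint32_t d0)
{
    uint32_t lo = d0 & 0xffffu;
    lo = ((lo << 8) | (lo >> 8)) & 0xffffu;
    return (d0 & 0xffff0000u) | lo;
}

/* Model of "swap d0": exchange the upper and lower 16-bit words. */
uint32_t swap_words(uint32_t d0)
{
    return (d0 << 16) | (d0 >> 16);
}

/* The three-instruction sequence used in the StorePCI macro. */
uint32_t bswap_asm_model(uint32_t d0)
{
    d0 = rol_w_8(d0);
    d0 = swap_words(d0);
    d0 = rol_w_8(d0);
    return d0;
}

/* The shift-and-mask idiom from the post. */
uint32_t bswap_shift_mask(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0xff00u) |
           ((v << 8) & 0xff0000u) | (v << 24);
}

/* Hypothetical self-check: both give the same byte-reversed value. */
void bswap_demo(void)
{
    assert(bswap_asm_model(0x11223344u) == 0x44332211u);
    assert(bswap_shift_mask(0x11223344u) == 0x44332211u);
    assert(bswap_asm_model(0x3f800000u) == 0x0000803fu);
}
```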
Old 04 January 2022, 11:20   #3
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 2,002
Quote:
Originally Posted by Hedeon View Post

What I need is something like:

fmove.s driver->xvalue,fp0 //changed this to make it a bit more clear, not real code
fmove.s fp0,d0
move.l #pciaddress,a0
Even better would be:

move.l driver->xvalue, d0
move.l #pciaddress,a0

removing both fmoves.
Old 04 January 2022, 11:23   #4
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 2,002
Quote:
Originally Posted by Thomas Richter View Post
Does vbcc actually support single-precision IEEE numbers? I'm asking because SAS/C does not; it only supports mathffp numbers and double-precision IEEE numbers.

However, in case it does - and check for the right math options - the following should do it:
Code:
union {
 float u_float;
 uint32_t u_int;
} u;
u.u_float = f; /* put the float value into the union */
i = (u.u_int >> 24) | ((u.u_int >> 8) & 0xff00) | ((u.u_int << 8) & 0xff0000) | ((u.u_int << 24)); /* perform endian swap */
Note that a C compiler has no way to reinterpret a float as int, a cast will always round, so you have to go through a temporary in memory. A C++-compiler has the possibility to do so via reinterpret_cast<int>(float). Also, there is no endian-swap primitive in C or C++, though a couple of modern compilers recognize the above idiom and generate ideal code, such as the GNU compiler.
Thanks Thomas, I'll give it a try. It is old source that used to compile with gcc2.95 m68k and PPC. Those seem to generate code even skipping the fmove.s opcodes and load d0 directly with the $3f800000 value.

Now I am trying to compile with VBCC and the result is different. I did not change the source except the asm macro.

(The same happens on PPC: the old program compiled with gcc uses stfs where vbcc uses stfd (single versus double), or the value just gets loaded with an lwz.)

Last edited by Hedeon; 04 January 2022 at 12:51.
Old 04 January 2022, 12:57   #5
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 2,002
So the differences:

If float calculations are done, GCC in the end stores it in d0 using a fmove.s. When the value/parameter is directly used with no calculations it just uses move.l to d0

VBCC after calculations does a fmove.d to store the result in d0. When the value is taken directly, it casts using fmove.d to d0
Old 04 January 2022, 13:56   #6
vbc
Registered User
 
Join Date: Jan 2021
Location: Germany
Posts: 18
Quote:
Originally Posted by Hedeon View Post
Hi,
[...]
StorePCI(__reg("a0") ULONG address, __reg("d0") ULONG value)="\trol.w\t#8,d0\n\tswap\td0\n\trol.w\t#8,d0\n\tmove.l\td0,(a0)\n";
[...]

How can I actually byteswap $3f800000 instead of $1 with VBCC. I have tried all different kinds of casting. I am a bit lost.
You have to declare value as float rather than ULONG to prevent a float=>int conversion. Unfortunately, vbcc does not allow specifying a data register for a float value when generating code for the FPU. (I am not sure what the rationale was originally. I did a quick test to change that and it seems to work, but maybe there is some code path in the backend that does not expect it and has to be adapted.)

Anyway, why not use one of those:

Best for soft-float only:

Code:
StorePCI(__reg("a0") ULONG address, __reg("d0") float value)="\trol.w\t#8,d0\n\tswap\td0\n\trol.w\t#8,d0\n\tmove.l\td0,(a0)\n";
Best for FPU only:

Code:
StorePCI(__reg("a0") ULONG address, __reg("fp0") float  value)="\tfmove.s\tfp0,d0\n\trol.w\t#8,d0\n\tswap\td0\n\trol.w\t#8,d0\n\tmove.l\td0,(a0)\n";
Slightly less efficient, but should work in both cases:

Code:
StorePCI(__reg("a0") ULONG address, float  value)="\tmove.l\t(a7),d0\n\trol.w\t#8,d0\n\tswap\td0\n\trol.w\t#8,d0\n\tmove.l\td0,(a0)\n";
Or use the most efficient one based on the selected target:
Code:
#if __FPU__>68000
void StorePCI(__reg("a0") ULONG address, __reg("fp0") float value)="\tfmove.s\tfp0,d0\n\trol.w\t#8,d0\n\tswap\td0\n\trol.w\t#8,d0\n\tmove.l\td0,(a0)\n";
#else
void StorePCI(__reg("a0") long address, __reg("d0") float value)="\trol.w\t#8,d0\n\tswap\td0\n\trol.w\t#8,d0\n\tmove.l\td0,(a0)\n";
#endif

Old 04 January 2022, 14:07   #7
vbc
Registered User
 
Join Date: Jan 2021
Location: Germany
Posts: 18
Quote:
Originally Posted by Thomas Richter View Post
Does vbcc actually support single-precision IEEE numbers? I'm asking because SAS/C does not; it only supports mathffp numbers and double-precision IEEE numbers.
Yes, unless code for kickstart 1.x is generated.

Quote:

However, in case it does - and check for the right math options - the following should do it:
Code:
union {
 float u_float;
 uint32_t u_int;
} u;
u.u_float = f; /* put the float value into the union */
i = (u.u_int >> 24) | ((u.u_int >> 8) & 0xff00) | ((u.u_int << 8) & 0xff0000) | ((u.u_int << 24)); /* perform endian swap */
The union hack is not correct C. It is only allowed to read the member of a union that has been written most recently. On higher optimization levels some compilers (including vbcc) will make use of C aliasing rules and code like this may not work.

What you can do is use a char * pointer to the float variable and read out the individual bytes. Note that only char * is allowed here without breaking aliasing rules.
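
A minimal sketch of the character-pointer route, assuming a 4-byte float whose byte order in memory matches that of a 32-bit integer on the same machine. The function name is hypothetical. The bytes are read through an unsigned char pointer, reversed, and assembled into an integer, so the result is already endian-swapped.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the float's object representation with its bytes reversed.
   Only character-type accesses touch the float object itself. */
uint32_t float_bytes_reversed(const float *pf)
{
    const unsigned char *src = (const unsigned char *)pf;
    unsigned char rev[4];
    uint32_t out;

    rev[0] = src[3];
    rev[1] = src[2];
    rev[2] = src[1];
    rev[3] = src[0];

    /* Copy the reversed bytes into an integer of the same size. */
    memcpy(&out, rev, sizeof out);
    return out;
}

/* Hypothetical self-check: 1.0f is $3f800000, reversed $0000803f. */
void char_pointer_demo(void)
{
    float one = 1.0f;
    assert(float_bytes_reversed(&one) == 0x0000803fu);
}
```

The generated code may well be less compact than the inline-assembly variants, so this is more of a portable fallback than a replacement for them.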
Old 04 January 2022, 14:47   #8
paraj
Registered User
 
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,107
Quote:
Originally Posted by vbc View Post
Yes, unless code for kickstart 1.x is generated.

The union hack is not correct C. It is only allowed to read the member of a union that has been written most recently. On higher optimization levels some compilers (including vbcc) will make use of C aliasing rules and code like this may not work.

What you can do is use a char * pointer to the float variable and read out the individual bytes. Note that only char * is allowed here without breaking aliasing rules.

Isn't it explicitly allowed in C99 and later? §6.5.2.3.3:


Quote:
A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member, 82) and is an lvalue if the first expression is an lvalue. If the first expression has qualified type, the result has the so-qualified version of the type of the designated member.
Where footnote 82 says:
Quote:
If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
Some discussion on stackoverflow: https://stackoverflow.com/a/25672839/786653


P.S. In C++ reinterpret_cast won't work, you either have to use memcpy or std::bit_cast (from C++20)
Old 04 January 2022, 14:48   #9
phx
Natteravn
 
 
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
Quote:
Originally Posted by Hedeon View Post
What I want to do is to have a single precision float number and endian swap it before I store it in a PCI register.

I have something like

StorePCI((ULONG)pciaddress, (float)driver->xvalue);
The m68k has no instruction to transfer the IEEE single precision value from an FPU register directly into a data register. fmove to Dn always means convert to integer.

So, as Thomas correctly pointed out, you have to go over a temporary memory location. In C this is usually done with a union, as shown in his example.

If you want it as an assembler inline function, it could look like this:
Code:
void StorePCI(__reg("a0") void *addr, __reg("fp0") float val) =
  "\tfmove.s\tfp0,-(sp)\n"
  "\tmove.l\t(sp)+,d0\n"
  "\trol.w\t#8,d0\n"
  "\tswap\td0\n"
  "\trol.w\t#8,d0\n"
  "\tmove.l\td0,(a0)";
Quote:
Originally Posted by Thomas Richter View Post
Does vbcc actually support single-precision IEEE numbers? I'm asking because SAS/C does not; it only supports mathffp numbers and double-precision IEEE numbers.
Yes it supports single precision IEEE, which is ideal for OS2/3. But for the Kickstart 1.x target I had to implement conversion routines from/to mathffp.

Quote:
Originally Posted by Hedeon View Post
If float calculations are done, GCC in the end stores it in d0 using a fmove.s. When the value/parameter is directly used with no calculations it just uses move.l to d0
AFAIK gcc's m68k backend only knows the V.4-ABI, which makes a function always return float results in data registers (d0 for single, d0/d1 for double precision), while the AmigaOS-ABI prefers to use fp0 when compiled with an FPU-option. There is the -no-fp-return option to switch vbcc to V.4-ABI for floating point return values.

Last edited by phx; 04 January 2022 at 14:52. Reason: EDIT: My answer was delayed by a 1h phone call. You can safely ignore it and refer to Volker's posting!
Old 04 January 2022, 15:03   #10
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,233
Quote:
Originally Posted by vbc View Post
The union hack is not correct C. It is only allowed to read the member of a union that has been written most recently. On higher optimization levels some compilers (including vbcc) will make use of C aliasing rules and code like this may not work.
*Cough* Aliasing between members of unions does not apply. There is a specific clause for that in C. The result is, of course, undefined, but what else can the C standard say?


See, for example:


https://stackoverflow.com/questions/...through-unions
Old 04 January 2022, 15:04   #11
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,233
Quote:
Originally Posted by phx View Post
The m68k has no instruction to transfer the IEEE single precision value from an FPU register directly into a data register.
Sure it does. "fmove.s fpx,dy" works nicely. It transfers the bit-pattern of the source fpu register, rounded to 32-bit single precision, to the data register dy. Would be rather bad if this wouldn't work because the processor libraries are full of such code. (-;
Old 04 January 2022, 15:21   #12
phx
Natteravn
 
 
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
Quote:
Originally Posted by Thomas Richter View Post
"fmove.s fpx,dy" works nicely. It transfers the bit-pattern of the source fpu register, rounded to 32-bit single precision, to the data register dy.
Indeed! Completely forgot about it.
Old 04 January 2022, 15:35   #13
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 2,002
I made stuff more complicated to try to compile for both ppc and m68k. The ppc does not have direct fpu register to general register modes. So yesterday I ended up with sending the (ULONG*)&driver->xvalue to StorePCI and made value in the asm macro also ULONG* and added a normal load from memory to the data register first. Compiled for at least 68020 and fpu 68881. This also worked for PPC (with -lm).

However, while the speed of the vbcc-generated m68k code was comparable to gcc (around 5% slower on my 68060), the PPC code took a 40% speed hit. My guess was that this is because the number of opcodes in the macro went from 4 to 5 on 68k, but from 1 (a single stwbrx) to 2 on PPC. Also, seeing that vbcc uses more (double precision) FPU opcodes where gcc uses general registers directly, I wondered whether the mistake was in the casting, as I really wanted to get rid of those extra opcodes in the asm macros for speed reasons. This function is used a lot. Hence the post, which focused on m68k because there is more expertise there.

Looking at the answers, it seems that for PPC, and for speed, I have to go back to gcc 2.95 anyway. Sorry for the somewhat misleading first post.
Old 04 January 2022, 16:32   #14
vbc
Registered User
 
Join Date: Jan 2021
Location: Germany
Posts: 18
Quote:
Originally Posted by paraj View Post
Isn't it explicitly allowed in C99 and later? §6.5.2.3.3:
This footnote is not in my copy of the standard and it was apparently added later through a defect report. While I agree that the wording is somewhat misleading, the defect report suggests that it was intended as clarification, perhaps to allow a trap representation in all cases.

There still is 6.5p7 which lists all allowed types for accessing an lvalue:
Quote:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types (footnote 73):
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type.
Old 04 January 2022, 16:52   #15
vbc
Registered User
 
Join Date: Jan 2021
Location: Germany
Posts: 18
Quote:
Originally Posted by Thomas Richter View Post
*Cough*
Gesundheit!

Quote:
Aliasing between members of unions does not apply. There is a specific clause for that in C.
Which clause?

Quote:
The result is, of course, undefined, but what else can the C standard say?
It could say "implementation defined result". I am not sure what clause you are referring to, but if the C standard mentions "undefined" that means the code is illegal and the compiler can ignore such a case without having to diagnose it or handle it in any meaningful way.

That seems to contain a lot of opinions without any backing. comp.std.c was the place to get decent information on such topics. Unfortunately newsgroups are not much in fashion any more.
Old 04 January 2022, 17:15   #16
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,233
Quote:
Originally Posted by vbc View Post
Which clause?

See here for the references:



https://stackoverflow.com/questions/...-what-does-not


Quote:
Originally Posted by vbc View Post
That seems to contain a lot of opinions without any backing. comp.std.c was the place to get decent information on such topics. Unfortunately newsgroups are not much in fashion any more.
I suggest you go and check there, then, if you don't believe me. I ran into this issue a while ago, which is why I mention it. Storing in memory and going through a pointer cast is indeed not going to work; I had issues with that with, for example, the icc compiler. The "union hack", as you call it, solves that type of problem.
Old 04 January 2022, 17:56   #17
paraj
Registered User
 
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,107
Quote:
Originally Posted by vbc View Post
This footnote is not in my copy of the standard and it was apparently added later through a defect report. While I agree that the wording is somewhat misleading, the defect report suggests that it was intended as clarification, perhaps to allow a trap representation in all cases.

There still is 6.5p7 which lists all allowed types for accessing an lvalue:
Sorry, I should have mentioned that I was looking at N1256 (final draft of TC3 from 2007).

I have to admit the issue is less clear than I remembered it, and it might be the case that the standard technically doesn't require type-punning through unions to be supported (this stackoverflow answer makes a persuasive argument).

You're also right that it was added through a defect report (DR283). Following the linked discussions, I think it's quite clear (from proposal N980) that the intention was to allow it though. It's certainly widely believed to be, as evidenced by this thread.
Old 04 January 2022, 19:12   #18
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 2,002
Quote:
Originally Posted by phx View Post
AFAIK gcc's m68k backend only knows the V.4-ABI, which makes a function always return float results in data registers (d0 for single, d0/d1 for double precision), while the AmigaOS-ABI prefers to use fp0 when compiled with an FPU-option. There is the -no-fp-return option to switch vbcc to V.4-ABI for floating point return values.
Tried the following with that option:

int main(void)
{
    float x;
    x = 1.0;
    return (int)x;
}

result is moveq #1,d0 and sadly not move.l #$3f800000,d0

The union approach has me changing a lot of the code. Will take a while.
Old 04 January 2022, 19:49   #19
vbc
Registered User
 
Join Date: Jan 2021
Location: Germany
Posts: 18
Quote:
Originally Posted by Thomas Richter View Post
This is a discussion about details in the gcc documentation. The only mentioned parts of the C standard that I found are the non-normative footnote that was already mentioned in this thread and the part from 6.5 that I quoted. One poster even states: "What gcc says is it relaxes the rules a bit, and allows type-punning through unions even though the standard doesn't require it to"

Quote:
I suggest then to go checking there if you don't believe me.
This is what I did 20-25 years ago.

Quote:
I ran into this issue a while ago, which is why I mention it. Storing in memory and going through a pointer cast is indeed not going to work; I had issues with that with, for example, the icc compiler. The "union hack", as you call it, solves that type of problem.
It may make that specific code work on specific versions of specific compilers. But it still violates the C standard and relies on internals of a specific compiler without the need to do so. There are alternatives which do not cause undefined behaviour, like using char-pointers.
Old 04 January 2022, 19:54   #20
paraj
Registered User
 
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,107
Quote:
Originally Posted by Hedeon View Post
Tried the following with that option:

int main(void)
{
    float x;
    x = 1.0;
    return (int)x;
}

result is moveq #1,d0 and sadly not move.l #$3f800000,d0

The union approach has me changing a lot of the code. Will take a while.

Is there any reason why you couldn't just do this?

Code:
#ifdef __M68K__
#ifdef __GNUC__
void StorePCIFloat(ULONG address, float f)
{
    union {
        float f;
        ULONG u;
    } u = { .f = f };
    *(volatile ULONG *)address = __builtin_bswap32(u.u);
}
#else
void StorePCIFloat(__reg("a0") ULONG address, __reg("fp0") float value)="\tfmove.s\tfp0,d0\n\trol.w\t#8,d0\n\tswap\td0\n\trol.w\t#8,d0\n\tmove.l\td0,(a0)\n";
#endif

#else
// Insert equivalent PPC magic

#endif
(Yes you have to change every place you want to write a float, but at least any further changes will be localized).
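
For a target without a hand-written variant, a plain C fallback along the same lines could look like the sketch below. The function name and the use of a pointer parameter instead of a ULONG address are assumptions for illustration; it uses memcpy rather than a union or a builtin, and it makes no claim about how well a particular compiler optimizes it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable fallback: copy the float's bit pattern into
   an integer, reverse the byte order, and store it with a single
   volatile 32-bit write. Assumes a 32-bit IEEE single float. */
void store_pci_float_portable(volatile uint32_t *reg, float value)
{
    uint32_t bits;

    memcpy(&bits, &value, sizeof bits);
    *reg = (bits >> 24) | ((bits >> 8) & 0xff00u) |
           ((bits << 8) & 0xff0000u) | (bits << 24);
}

/* Hypothetical self-check using an ordinary variable in place of a
   real PCI register. */
void store_pci_demo(void)
{
    volatile uint32_t fake_register = 0;

    store_pci_float_portable(&fake_register, 1.0f);
    assert(fake_register == 0x0000803fu);
}
```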


The above generates very sensible code with both Bebbo's GCC and VBCC.


GCC even manages to convert StorePCIFloat(0x1234, 1.0f) into move.l #32831,4660.w.
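
The decimal constants in that instruction can be checked by hand: 4660 is 0x1234, and 32831 is 0x0000803f, which is the byte-reversed form of 0x3f800000 (1.0f). A small sketch of that arithmetic, with invented helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
uint32_t reverse_bytes32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0xff00u) |
           ((v << 8) & 0xff0000u) | (v << 24);
}

/* Returns 1 if the constants in "move.l #32831,4660.w" match the
   call StorePCIFloat(0x1234, 1.0f), where 1.0f is $3f800000. */
int gcc_constants_match(void)
{
    return reverse_bytes32(0x3f800000u) == 32831u && 0x1234 == 4660;
}
```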
 

