English Amiga Board


Old 12 November 2022, 16:26   #1
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Betatesting SoftIEEE FPU emulator

Folks,


As we run out of native CPU hardware, FPU-less CPUs such as the 68LC040 and 68LC060 are becoming more popular. Unfortunately, some software depends on the presence of an FPU, and thus, over the past months, I created a software-based FPU emulator named SoftIEEE.


I'm looking for beta testers who would help me identify problems or compatibility issues with this software. You can find the software here:



http://eab.abime.net/attachment.php?...1&d=1668265877


Installation: Unpack the .lha and copy softieee.library to LIBS: and SoftIEEE to C:. To start the emulation, just run SoftIEEE from the command line or the startup-sequence.


Requirements: Any FPU-less machine will do. That is, not only a 68LC060 or 68LC040, but also a 68030 or 68020 without a 68881/882 coprocessor will work. Even a 68010 or plain 68000 can be equipped with a virtual FPU.


How does it work: The program receives the line-F traps, decodes the FPU instructions, and then uses the mathematical algorithms in the softieee.library to perform the necessary calculations.
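
As a rough illustration of the decoding step described above, the sketch below splits a line-F opcode word and its extension word into fields, following the coprocessor instruction layout documented by Motorola. It is written in Python purely for readability. The function name `decode_fpu_general` and the returned dictionary are hypothetical and say nothing about how SoftIEEE is structured internally; the sketch also only covers the register-to-register and memory-to-register forms.

```python
# Illustrative only: field layout of a 68881/882 "general" instruction.
# All names are hypothetical; this is not SoftIEEE's code.

SOURCE_FORMATS = ["long", "single", "extended", "packed", "word", "double", "byte"]

def decode_fpu_general(opcode, ext):
    """Split a 16-bit opcode word and its 16-bit extension word into fields."""
    if (opcode >> 12) != 0xF:
        raise ValueError("not a line-F opcode")
    fields = {
        "cp_id":   (opcode >> 9) & 7,   # coprocessor ID, 1 for the FPU
        "type":    (opcode >> 6) & 7,   # 0 = general instruction
        "ea_mode": (opcode >> 3) & 7,   # effective address mode, 3 = (An)+
        "ea_reg":  opcode & 7,          # effective address register
        "from_ea": bool(ext & 0x4000),  # R/M bit: source is <ea>, not an FP register
        "src":     (ext >> 10) & 7,     # source format or source FP register
        "dst":     (ext >> 7) & 7,      # destination FP register
        "opmode":  ext & 0x7F,          # operation, 0x28 = FSUB
    }
    # For memory sources, "src" selects the operand format (7 is FMOVECR).
    if fields["from_ea"] and fields["src"] < len(SOURCE_FORMATS):
        fields["src_format"] = SOURCE_FORMATS[fields["src"]]
    return fields

# fsub.x (a7)+,fp0 is encoded as the words $F21F $4828
decoded = decode_fpu_general(0xF21F, 0x4828)
```

A real emulator would go on to fetch the operand through the decoded effective address and dispatch on `opmode`; that part is omitted here.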


What is emulated: SoftIEEE and the softieee.library emulate the 68881/882 FPU completely, with no compromises made. That is, full 80-bit extended precision, selectable precision (single, double or 80-bit extended), selectable rounding mode, the packed decimal data type, emulation of FPU exceptions, and proper handling of access faults ("MuForce hits"), i.e. MuForce will identify the FPU instruction as the source of the fault, not the emulator core.
The precision of SoftIEEE matches that of the real FPU for elementary algebra (0.5 ulp - 0.5 least significant bit for +, -, *, / and sqrt), and exceeds(!) the precision of the native 68881/882 for all transcendental functions. Whereas the native FPU only guarantees double precision (approximately 8 ulp - 8 least significant bits are wrong), the softieee.library provides for most functions a precision of 1 ulp (i.e. at most the least significant bit is wrong).
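
To make the "ulp" figures above concrete, here is a small sketch of how an error in ulps can be measured. It assumes Python 3.9 or later (for `math.ulp`) and works on ordinary 64-bit doubles rather than the 80-bit extended format, so it only illustrates the unit of measurement, not SoftIEEE itself. The helper name `error_in_ulps` is made up for this example.

```python
from fractions import Fraction
import math

def error_in_ulps(computed, exact):
    """Distance between a double and the exact value, in ulps at the exact value."""
    ulp = Fraction(math.ulp(float(exact)))
    return abs(Fraction(computed) - exact) / ulp

exact = Fraction(3, 10)
correctly_rounded = error_in_ulps(0.3, exact)        # nearest double to 3/10
accumulated       = error_in_ulps(0.1 + 0.2, exact)  # several rounding errors add up
```

A correctly rounded result, as claimed above for +, -, *, / and sqrt, is off by at most 0.5 ulp. The second value comes out larger than 0.5 ulp because the inputs 0.1 and 0.2 are themselves already rounded.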



The autodocs of the softieee.library provide more details on the mathematical algorithms used and the precision guaranteed by the implementation. The library (softieee.library) can also be used without the SoftIEEE patch if precise floating point mathematics is required.



Restrictions: You should not expect the emulation to be fast, thus you should not really run a raytracer through it. Software emulation is clearly slow, and SoftIEEE does not do miracles. Also, due to emulation, at most 400 additional bytes of stack space are needed.


How to help: If your system lacks an FPU, download the software, install it, and give it a try with programs requiring an FPU. The system should now identify the presence of an FPU, and the programs should run flawlessly, though probably (quite) slowly.
Attached Files
File Type: lha SoftIEEE.lha (48.3 KB, 332 views)
Old 12 November 2022, 17:52   #2
PeterK
Registered User
 
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,365
That's really a great idea to create a soft-FPU for all those Amigas which don't have one.

Edit: Sorry for my tests with WinUAE; FPU exceptions are not supported by WinUAE, as Toni Wilen pointed out below.


In my very first tests it already looks quite good. The results of the trigonometric functions seem to be correct, except for some sqrt results (just garbage) in my first run of the IEEE-DT tests; in the second pass sqrt was OK, too (strange).

Unfortunately, IPrefs crashed during system boot with $80000008, and then later when I tested SoftIEEE with my FPUconstants program I found a real problem. It seems that many constants are not emulated at all, like ln(2), ln(10), log10(e), all the 10^x, etc. That's why I stopped further testing, because these constants are required by some programs. Tested with 68020 + JIT under WinUAE 4.4.0.

Update: No, those FPUconstants are only missing or return random data when I use the JIT, but without JIT all constants are generated correctly.

But now, without JIT, all the IEEE-DT tests just return trash and the system freezes. Maybe this is a WinUAE problem with exceptions?

Last edited by PeterK; 12 November 2022 at 19:18.
Old 12 November 2022, 18:40   #3
waldiamiga
morphos.pl
 
Join Date: Aug 2014
Location: Kraków, Poland
Posts: 104
I think a lot of people with a TF1260 with an LC060 will be satisfied. Among them is my friend, who put together a TF with an LC060 rev. 4 yesterday.
Old 12 November 2022, 18:44   #4
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,502
JIT does not (can not and also not really worth the trouble) support correct FPU exception stack traces.

Unfortunately, my cputester can't be used to test this because it needs to take over all exception vectors, but the emulator itself also needs them, and there is no way to know which exception the tester should validate and which should be ignored and forwarded back to the original vector.
Old 12 November 2022, 19:36   #5
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,118
This is pretty neat. How much of the execution time is due to handling the trap? Is there any benefit in patching the offending unimplemented instruction with a more direct jump to the handling code, à la CyberPatcher/OxyPatcher?
Old 12 November 2022, 19:49   #6
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Hi Peter,


thanks for your tests, I just saw that I flipped entries B and C in the constant ROM table, which I will fix in the next version.
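
For readers unfamiliar with the constant ROM, the sketch below lists the FMOVECR offsets as documented for the 68881/68882. Reading "entries B and C" as offsets $0B and $0C, i.e. log10(2) and e, is an assumption on the editor's part. The dictionary name `CONSTANT_ROM` is hypothetical, and the values are ordinary Python floats and integers rather than 80-bit extended numbers.

```python
import math

# Constant ROM offsets as documented for FMOVECR on the 68881/68882.
# Values are doubles here; the real ROM holds extended-precision numbers.
CONSTANT_ROM = {
    0x00: math.pi,
    0x0B: math.log10(2),        # assumed to be "entry B"
    0x0C: math.e,               # assumed to be "entry C"
    0x0D: math.log2(math.e),
    0x0E: math.log10(math.e),
    0x0F: 0.0,
    0x30: math.log(2),          # ln(2)
    0x31: math.log(10),         # ln(10)
    0x32: 1,                    # 10^0
}

# Offsets $33..$3F hold 10^1, 10^2, 10^4, ..., 10^4096.
# Integers are used because 10^4096 does not fit into a double.
for n in range(13):
    CONSTANT_ROM[0x33 + n] = 10 ** (2 ** n)
```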


I cannot reproduce the other issues here. However, if you have more concrete test cases, for example where sqrt() fails, please let me know. In particular, you may observe that the result of a real 68882 and SoftIEEE may be slightly different because SoftIEEE computes more precisely (yes, really. The 68882 has 67 bits of internal precision, SoftIEEE has 96 bits internal precision for many functions).



In general, it is advisable to test with real hardware. SoftIEEE is particularly tricky; the stack and exception management especially is a bit involved. For example, in order to improve multitasking, SoftIEEE switches back from the exception to user mode when running most numerical algorithms of the softieee.library, so that tasks can be preempted while the algorithm is running.


The same goes for the mathieeeXXXX.libraries, which read and write FPU registers through the stack. This means that SoftIEEE also needs to adjust the stack, sometimes even "under" the exception. SoftIEEE is prepared for such cases, and I did my best to test on real hardware (with an FPU ID different from the default = 1), but how well this works under emulation I do not really know.
Old 12 November 2022, 20:02   #7
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Quote:
Originally Posted by Karlos
This is pretty neat. How much of the execution time is due to handling the trap? Is there any benefit in patching the offending unimplemented instruction with a more direct jump to the handling code, à la CyberPatcher/OxyPatcher?

At this point, I cannot really tell. No matter how much time is "wasted" there, the overall execution speed of SoftIEEE is "slow". I would expect that for instructions such as "fmove" the exception overhead is the dominating part as the instruction needs to be decoded, the ea *may* be decoded (the 68LC040 helps in many, but not all cases by already decoding the ea), and the ea needs to be fetched.


For other instructions such as the transcendental functions, most of the time is spent in the actual implementation (here the classical CORDIC in 96 bits).
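
For readers who have not met CORDIC before, here is a minimal rotation-mode sketch that computes sine and cosine using only shifts and adds on fixed-point integers. It is not the softieee.library implementation: it uses 30 fractional bits instead of 96, builds its arctangent table with `math.atan` instead of stored constants, and only accepts angles in roughly [-pi/2, pi/2]. All names are hypothetical.

```python
import math

FRAC_BITS = 30                 # fixed-point fraction bits (96 in the real thing)
ONE = 1 << FRAC_BITS
N_ITER = 30

# atan(2^-i) in fixed point; a real implementation would store these constants.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_ITER)]

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
# starting from the inverse of the total gain compensates for that.
gain = 1.0
for i in range(N_ITER):
    gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
K_FIXED = round(ONE / gain)

def cordic_sin_cos(angle):
    """Return (sin, cos) of an angle in radians, |angle| <= pi/2."""
    x, y = K_FIXED, 0
    z = round(angle * ONE)     # remaining angle to rotate by
    for i in range(N_ITER):
        if z >= 0:             # rotate counter-clockwise by atan(2^-i)
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:                  # rotate clockwise by atan(2^-i)
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return y / ONE, x / ONE

s, c = cordic_sin_cos(0.5)
```

Each iteration contributes roughly one more bit of result, which is why a 96-bit CORDIC needs many iterations per function call and dominates the run time for transcendental functions.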



Note that there is quite some difference between SoftIEEE and "normal" unimplemented instructions on the 040 and 060, which can be handled through MuRedox and the fpsp.resource. Thus, the existing architecture for FPU instruction trap bypass is not exactly suitable here and I need to think about something else.



The fpsp.resource has an interface where FPU values are passed through the fp0 FPU register, however for SoftIEEE, fp0 does not even exist in reality, it is rather a memory location in the (actually, "a", not "the") softieee.library base, and even instructions such as "fmove to fp0" require emulation, so the current design is not feasible for this purpose.
Old 12 November 2022, 21:41   #8
Chucky
Registered User
 
Join Date: Mar 2015
Location: Karlstad / Sweden
Age: 52
Posts: 1,210
Just tested with my TF1260 with an LC CPU. It says I already have an FPU.
Old 12 November 2022, 21:44   #9
Chucky
Registered User
 
Join Date: Mar 2015
Location: Karlstad / Sweden
Age: 52
Posts: 1,210
Or maybe not. I start it via an icon (as there is no keyboard on my lab machine), and on the first click nothing happens. So this error appears on my 2nd click.
Old 12 November 2022, 22:12   #10
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Exactly. You get the FPU on the first click, and thus, if you click once more, it will not give you another FPU. (-;
Old 12 November 2022, 22:14   #11
Chucky
Registered User
 
Join Date: Mar 2015
Location: Karlstad / Sweden
Age: 52
Posts: 1,210
Sadly, all FPU code I tested (demos) crashed, except SysInfo, which tells me I have a 060 FPU, but I will do more serious testing tomorrow.
Old 12 November 2022, 22:15   #12
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Quote:
Originally Posted by Toni Wilen
JIT does not (can not and also not really worth the trouble) support correct FPU exception stack traces.

Well, just trying to reproduce Peter's issue, probably not with the latest version, but Toni, if you want to have a look....


If I have an instruction such as
Code:
fsub (a7)+,fp0
then upon entry of the line-f exception stack frame, register d7 is trashed (actually with the usp value). It seems to be exclusive to this addressing mode.
Old 12 November 2022, 22:17   #13
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Quote:
Originally Posted by Chucky
Sadly, all FPU code I tested (demos) crashed, except SysInfo, which tells me I have a 060 FPU, but I will do more serious testing tomorrow.
FPU demos will take over the system, and most likely trash everything in the vbr and memory, including the softieee code - so I would not count on anything like that. Try system software, not demo software.
Old 12 November 2022, 22:46   #14
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
Appreciate the work, but I'm confused. Isn't this exactly how accel libs etc work already?

Quote:
Originally Posted by Thomas Richter
FPU demos will take over the system, and most likely trash everything in the vbr and memory, including the softieee code - so I would not count on anything like that. Try system software, not demo software.
It's very unlikely a demo using FPU started from CLI will trash the VBR/memory. Track-loaded demos are likely to, but the combination of track-loaded + using FPU is pretty rare.
Old 12 November 2022, 22:50   #15
Matt_H
Registered User
 
Join Date: Jul 2008
Location: Boston, MA
Posts: 943
This is beyond my ability to test, but I just want to say kudos for this vital piece of software.
Old 13 November 2022, 09:30   #16
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,502
Quote:
Originally Posted by Thomas Richter
Code:
fsub (a7)+,fp0
Fixed, thanks. If CPU was 68060 and FPU instruction's addressing mode was -(An) or (An)+ and FPU was disabled, original An was restored twice but second restore used invalid register index which selected D7. (Second restore was done by shared FPU unimplemented instruction exception handling code which didn't expect to get already restored An)

EDIT: This only affects 68060 mode. 68040 should work correctly.

Last edited by Toni Wilen; 13 November 2022 at 15:34.
Old 13 November 2022, 12:04   #17
Flash951
Registered User
 
Join Date: Feb 2015
Location: Lier / Norway
Posts: 103
I've tested on my A3000 with Chucky 3660 060LC and EGS RTG system. Small programs like a screensaver that need an FPU work, great. TVPaint 3.0 EGS and TVPaint Jr. crashed when trying to start.
Old 13 November 2022, 12:38   #18
PeterK
Registered User
 
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,365
Quote:
Originally Posted by Thomas Richter
... thanks for your tests, I just saw that I flipped entries B and C in the constant ROM table, which I will fix in the next version.
Confirmed! Oops, I did not even notice that the 2nd and 3rd were swapped.

Quote:
I cannot reproduce the other issues here. However, if you have more concrete test cases, for example where sqrt() fails, please let me know.
No, forget about the sqrt() problem, because that function is the first in the IEEE-DT tests, and it probably failed in the first pass just because the JIT was not running yet. Some function calls later, and in the second pass, the JIT was already active, and it probably does not generate FPU exceptions. Maybe the JIT just used the x86 FPU code, although the FPU configuration was set to "None"? Edit: The reason: I had forgotten to disable FPU support in the JIT settings, too.

Quote:
In particular, you may observe that the result of a real 68882 and SoftIEEE may be slightly different because SoftIEEE computes more precisely (yes, really. The 68882 has 67 bits of internal precision, SoftIEEE has 96 bits internal precision for many functions).
Oh, I didn't know that a real 68882 is less accurate than the 68881. That explains why some roundings on a user's system with 68030 + 882 were recently a bit different from those on WinUAE.

Quote:
In general, it is advisable to test with real hardware.
That's certainly true, but my A2000 with A2620 died more than 10 years ago and has already gone to the dump. My tests with WinUAE are meaningless, since the problems were caused by the missing support for FPU exceptions.

Last edited by PeterK; 19 November 2022 at 13:09.
Old 13 November 2022, 13:01   #19
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Quote:
Originally Posted by PeterK
Oh, I didn't know that a real 68882 is less accurate than the 68881. That explains why some roundings on a user's system with 68030 + 882 were recently a bit different from those on WinUAE.
No, the 68881 and 68882 compute equally precisely, but less precisely than the software. That is, for elementary algebra, hardware and software are equally precise, but for the additional functions SoftIEEE offers a higher precision.
Old 13 November 2022, 13:01   #20
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Quote:
Originally Posted by Flash951
I've tested on my A3000 with Chucky 3660 060LC and EGS RTG system. Small programs like a screensaver that need an FPU work, great. TVPaint 3.0 EGS and TVPaint Jr. crashed when trying to start.

Sorry, there was still some confusion about the exec scheduler patch for the 68060, and the patch there was wrong. The attached version should hopefully work better.


Also, please keep an eye on the 68060.library. Not all libraries will, in the absence of an FPU, install all necessary system patches. The 68060.library provided by MMULib is fine, however, and should be preferred.
Attached Files
File Type: lha SoftIEEE.lha (48.8 KB, 83 views)

Last edited by Thomas Richter; 13 November 2022 at 14:31.
 

