#141 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,517
|
@Thomas Richter
Could a variant version that has reduced precision be faster? I appreciate this isn't the goal but it seems to me that a lot of users with faster 060s tend to use their FPU for gaming rather than anything requiring full extended precision. |
#142 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,326
|
I'm afraid you are expecting too much. Even if it ran at the speed of mathieeesingbas, it would still be too slow for gaming - please do the math yourself. You would still be below 6 fps.
Anyhow, the framework is there, the architecture is open, the interface is documented. All that needs to be done is a re-implementation of the softieee.library. The complicated parts such as the FPU emulation, instruction decoding and online jitting are already taken care of by the SoftIEEE binary (not the library) and MuRedox. These binaries do not care how the math core works - and the math core is the softieee.library. |
#143 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,517
|
Fair enough. Fixed point builds on EC parts for the win then.
|
#144 | |
Registered User
Join Date: Nov 2018
Location: Belfast
Posts: 1,542
|
Quote:
|
|
#145 | |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 56
Posts: 2,050
|
Quote:
|
|
#146 |
Paranoid Amigoid
Join Date: Mar 2008
Location: Athens/Greece
Age: 45
Posts: 1,978
|
Thomas, I see version 40.6 is listed on Aminet (since yesterday), but the archive contains version 40.5 (binary + library).
|
#147 |
Registered User
Join Date: Jun 2010
Location: PL?
Posts: 2,888
|
@THOR - apologies upfront for my question - I'm curious how feasible it is, from your perspective, to implement such emulation on an ISA other than MC68K. That is, emulate the 881/882 in software on additional SW/HW while still keeping your MC68K frontend. In other words, do the actual float calculation in a separate solution and use such a virtual 881/882 from the native CPU.
I still have the impression that I'm unable to express my question clearly, so here is an example: your SoftIEEE library, but with the float numerics implemented in software on different HW connected to the Amiga (for example one of the cheap SoCs using the RISC-V or ARM ISA, provided it has a float co-processor and perhaps a DSP, and runs at something like 300...400MHz). How feasible is such a hybrid implementation from your perspective? Let's leave the numeric (i.e. non-MC68K) part out of the question. Thx! |
#148 |
Alien Bleed
Join Date: Aug 2022
Location: UK
Posts: 4,517
|
I think that question has been raised already. Someone asked about offloading the instruction to PPC (I think that's what they said), but certainly in that case there's a lot to contend with. Anything along the WarpOS route would be many orders of magnitude slower than the current software implementation.
Unless you have an extremely low latency way to do it, I don't think you'll get away with offloading externally. |
#149 | |
Registered User
Join Date: Jun 2010
Location: PL?
Posts: 2,888
|
Quote:
Nowadays small SoCs are equipped with HW float (albeit 32 bit) and usually a DSP, and some also include a dedicated NPU for fast low-precision integer work. Such a SoC costs 2..3$ and, apart from glue logic, has everything needed for this functionality. So this was my question: a small SoC incapable of full MC68K emulation but capable of offloading, for example, float calculation at a fraction of the cost of an original 881/882 (not to mention the 40/60, where the coprocessor interface may not even be implemented on the board). Creating some API standard and separating the frontend from the physical implementation of the float calculation could be something interesting.
Ages ago there was, for example, the WEITEK company, which produced many solutions that appear as simple I/O in the CPU address space... so this is a question about something similar, performing 4..6 times faster than an MC68K float implementation. Or something like this https://micromegacorp.com/umfpu64.html which is easy to hook up even to an MC68000. There is a report comparing softfloat on an 8 bit uC with the same float work done externally https://micromegacorp.com/downloads/...g%20WinAVR.pdf - the limitation is of course the SPI interface, but even in that case the difference is obvious. Assuming a different way of connecting such external HW that significantly reduces the communication overhead, this may be a sane option for replacing the 881/882 with SoftIEEE and getting better results.
I was curious about THOR's opinion on whether, from his perspective, it is feasible to separate the physical calculation implementation from the frontend, so that he is not responsible for any foreign bugs but can still control SoftIEEE as owner and, for example, focus on his pure MC68K float implementation. So a semi-open standard. Last edited by pandy71; 09 January 2023 at 13:37. |
|
#150 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,326
|
|
#151 | ||
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,326
|
Quote:
However, the resulting solution would still not be on par with a hardware FPU. Let's make a couple of computations: a hardware multiplication on the 68060 is ~2 cycles, if I recall. The MuRedox call-in overhead is roughly one order of magnitude larger (~20 cycles), that of SoftIEEE through exception processing a lot larger (~200 cycles). On top of this, the softieee.library still has to forward parameters to the hardware and perform the operation there. For example, for the 68882, you need to emulate the coprocessor interface in software (probably another 20 cycles) and then the 68882 has to execute the multiplication (which is another >20 cycles), so in the end you are at about 60 to 100 cycles minimum. That's almost two orders of magnitude slower than the 68060. The softieee.library multiplication engine is probably 200 cycles (just ballpark figures), so it is slower, but not that much slower.
This is also the reason why the 68882-based "hardware accelerator" solutions were not really working well. The communication overhead to the FPU eats up the performance improvements of the FPU. The 68882 only works well with the 68020/030, where the coprocessor interface is implemented in hardware.
Quote:
The speed would be, according to this estimate, approximately on par with the mathieeedoubbas.library. |
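The cycle budget from this post can be added up in a few lines. This is only a rough sketch: every constant is a ballpark figure quoted above, not a measurement, and summing the stages linearly (the helper `total_cycles` is a made-up name) is an assumption of this sketch.

```python
# Ballpark cycle counts quoted in the post above (estimates, not measurements).
HW_MUL_68060 = 2                 # hardware multiply on the 68060
MUREDOX_CALL_IN = 20             # MuRedox call-in overhead
SOFTIEEE_EXCEPTION = 200         # SoftIEEE entry through exception processing
COPRO_INTERFACE_EMULATION = 20   # coprocessor interface emulated in software
MUL_68882 = 20                   # multiply on the 68882, quoted as ">20"
SOFTIEEE_LIBRARY_MUL = 200       # softieee.library multiplication engine


def total_cycles(entry_overhead, *stages):
    """Sum the entry overhead and all following stages (assumed additive)."""
    return entry_overhead + sum(stages)


def slowdown(cycles):
    """How many times slower than the 68060 hardware multiply."""
    return cycles / HW_MUL_68060


# Offloading to a 68882 behind MuRedox: lower bound of the "60 to 100" range.
offload_68882 = total_cycles(MUREDOX_CALL_IN, COPRO_INTERFACE_EMULATION, MUL_68882)
# Pure software multiply reached through MuRedox.
software_via_muredox = total_cycles(MUREDOX_CALL_IN, SOFTIEEE_LIBRARY_MUL)
# Pure software multiply reached through the exception path.
software_via_exception = total_cycles(SOFTIEEE_EXCEPTION, SOFTIEEE_LIBRARY_MUL)

if __name__ == "__main__":
    for name, cycles in [
        ("68882 offload via MuRedox", offload_68882),
        ("softieee.library via MuRedox", software_via_muredox),
        ("softieee.library via exception", software_via_exception),
    ]:
        print(f"{name}: {cycles} cycles, {slowdown(cycles):.0f}x the 68060 multiply")
```

With these figures the 68882 offload sits at its 60-cycle lower bound, and the software multiply through MuRedox is within a factor of four of that, which is the "slower, but not that much slower" point.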
||
#152 | |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,326
|
Quote:
Pretty much. For PPC offloading, you would again be slower than the 68882 solution because you need to communicate with the external CPU - some form of message passing is required. This does not pay off; it already killed the performance of PowerUp and WarpUp and made these hybrid PPC/68K solutions impractical. |
|
#153 | |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,326
|
Quote:
See above. I'm as open as possible on the interface to make such a thing possible, and the interface of the library is as simple as it can be (two pointers to floating point numbers), but even if the actual computation were immediate, there is still code between "your code" and the actual computation, and that is the MuRedox "trampoline code". It stores essential registers trashed by the softieee.library on the stack (d0-d2/a0-a1/a6, the ccr and the PC), loads the source operands (in the easiest case directly in the softieee.library) and calls the library. Like it or not, this type of overhead will not go away, no matter how smart your hardware is, and it is already one order of magnitude larger than the 68060 hardware multiplication. Even with instant operation, you would be down to the speed of a 68882, and that is really a *very* optimistic estimate. |
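To make the split between frontend and math core concrete, here is a toy model of it. It is not the real MuRedox or softieee.library code: `FPNumber`, `MathCore`, `make_software_core` and `trampoline` are hypothetical names, a mutable object stands in for a pointer to a floating point number, and writing the result back into the destination operand is an assumption. Only the list of preserved registers and the two-operand shape of the call come from the post.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FPNumber:
    """Stand-in for a floating point number in memory; a reference to it plays the role of a pointer."""
    value: float


# A math core maps an operation name to a routine taking (source, destination).
MathCore = Dict[str, Callable[[FPNumber, FPNumber], None]]

# Registers the post says the trampoline has to preserve.
PRESERVED = ("d0", "d1", "d2", "a0", "a1", "a6", "ccr", "pc")


def make_software_core(regs: dict) -> MathCore:
    """Hypothetical math core; it deliberately trashes scratch registers."""
    def fmul(src: FPNumber, dst: FPNumber) -> None:
        regs["d0"] = 0xDEAD          # a core is free to clobber scratch registers
        dst.value *= src.value       # result written to the destination (assumption)

    def fadd(src: FPNumber, dst: FPNumber) -> None:
        regs["d1"] = 0xBEEF
        dst.value += src.value

    return {"fmul": fmul, "fadd": fadd}


def trampoline(core: MathCore, op: str, regs: dict, src: FPNumber, dst: FPNumber) -> None:
    """Save the preserved registers, call the core, then restore them."""
    saved = {name: regs[name] for name in PRESERVED}
    try:
        core[op](src, dst)
    finally:
        regs.update(saved)


def demo():
    regs = {name: 0 for name in PRESERVED}
    regs["d0"] = 42
    core = make_software_core(regs)
    dst = FPNumber(1.5)
    trampoline(core, "fmul", regs, FPNumber(4.0), dst)
    return regs["d0"], dst.value


if __name__ == "__main__":
    print(demo())
```

Swapping `make_software_core` for a core that talks to external hardware would not change `trampoline` at all, which is both why the interface is open to such experiments and why the save/restore cost stays no matter how fast the core is.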
|
#154 |
Total Chaos forever!
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,200
|
@pandy71 That FPU you linked has a serial interface. That would probably limit performance on an 040+. On a 68000 though.... :-)
|
#155 | ||||||
Registered User
Join Date: Jun 2010
Location: PL?
Posts: 2,888
|
Quote:
Quote:
The original 881/882 are rather slow HW FPUs, and any FPU emulation on a typical MC68k will be even slower (due to the low clock, for example). Some hybrid solution could fill the gap between high-priced, reputable but nearly unobtainable HW and salvaged or fake chips... Quote:
The 68882 is OK but quite slow - slower even than an 80287 running at half the clock - and the 040 and 060 still implement only a subset of the 881/882 instructions, so might a hybrid FPU approach still be beneficial even if it is subpar compared with a real HW FPU wired to the CPU through the coprocessor interface? Quote:
Quote:
Anyway thanks for your time and hard work. Quote:
4MHz SPI can be replaced with 80MHz SPI or with a parallel interface. The problem with a real FPU for the Amiga is the high price if it comes from a reputable source, or the high risk of a fake or faulty chip salvaged from some junk in China, India or Africa if it is bought on the internet... The MC68000 can use the 881/882, as Motorola pointed out in their application note AN947, and a similar scheme could be used for hybrid emulation - nowadays there are many 4...6$ SoCs with a HW FPU (usually single precision) clocked at 100...400MHz. This thread triggered my curiosity - what is missing is Amiga/Commodore documentation on this interesting topic, something like the Apple SANE documentation "Apple_Numerics_Manual_Second_Edition_1988.pdf" |
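For a feel of what the serial link alone costs, the raw transfer time can be estimated as below. This is a back-of-the-envelope sketch with loud assumptions: two 64 bit operands go out and one 64 bit result comes back, command bytes and protocol overhead are ignored, and the host CPU is taken to run at 50 MHz. None of these figures come from the thread except the 4MHz and 80MHz SPI clocks.

```python
CPU_HZ = 50_000_000      # assumed host CPU clock (not from the thread)
OPERAND_BITS = 64        # assumed operand width
OPERANDS_SENT = 2        # assumed: two source operands go to the external FPU
RESULTS_RECEIVED = 1     # assumed: one result comes back


def transfer_cycles(spi_hz, cpu_hz=CPU_HZ):
    """CPU cycles spent just shifting operands and result over SPI."""
    bits = OPERAND_BITS * (OPERANDS_SENT + RESULTS_RECEIVED)
    # One bit per SPI clock; integer math keeps the result exact.
    return bits * cpu_hz // spi_hz


if __name__ == "__main__":
    for spi_hz in (4_000_000, 80_000_000):
        print(f"SPI at {spi_hz // 1_000_000} MHz: {transfer_cycles(spi_hz)} CPU cycles per operation")
```

Under these assumptions even the 80MHz link spends more CPU cycles on the transfer than the roughly 60 to 100 cycles estimated earlier in the thread for the software-driven 68882 path, so a parallel interface looks like the more interesting of the two options.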
||||||
#156 | ||||
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,326
|
Quote:
Quote:
Quote:
Quote:
The mathffp/mathtrans libraries are based on Motorola library code for math functions. |
||||
#157 | ||
Registered User
Join Date: Jun 2010
Location: PL?
Posts: 2,888
|
Quote:
With respect to the 060 - yes, but if you have a full 060 then it seems this package is not for you but for people with an LC060. Quote:
Thx! |
||
#158 |
Registered User
Join Date: Nov 2022
Location: #Amigaland
Posts: 156
|
Just a heads up, with SoftIEEE enabled, MacOS crashes in Shapeshifter.
|
#159 |
WhatIFF? Amiga Magazine
Join Date: Feb 2021
Location: Chiba, Japan
Age: 46
Posts: 500
|
Has anyone tried this with a TF1260 LC and the FPU version of Lightwave 3.5? I followed the installation instructions but get a guru error when running Lightwave. Not sure what to do next.
|
#160 |
Paranoid Amigoid
Join Date: Mar 2008
Location: Athens/Greece
Age: 45
Posts: 1,978
|
|