24 December 2021, 15:27 | #1 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
68060 Rev.1 Superscalar bug?
For the last few days I have been debugging a wrong FPU result, which became correct as soon as I single-stepped it with BDebug (real A3000, CSPPC-060).
Today I had the idea that it might be a superscalar issue and started inserting NOPs between instructions in the relevant part. The bug disappeared! Finally I isolated the problem to the following instructions (part of a dot-product operation, compiler-generated code):
Code:
fmove.s (4,a1),fp3
fmove.s (4,a0),fp0
fmul.x fp3,fp0
fadd.x fp0,fp4
The problem also disappears when I clear bit 0 (ESS) of the PCR register, which disables superscalar execution. Is this a known bug for a revision 1 68060? Is it fixed in later revisions? My PCR is: $04300121. |
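For readers who want to take the quoted PCR value apart, here is a minimal Python sketch. It assumes the field layout from the 68060 User's Manual (identification in bits 31-16, revision number in bits 15-8, DFP in bit 1, ESS in bit 0). Bit 5 is not documented there; it is treated here as the load/store bypass disable bit, as discussed later in this thread. The function name `decode_pcr` is made up for illustration.

```python
def decode_pcr(pcr: int) -> dict:
    """Split a 68060 PCR value into the fields discussed in this thread."""
    return {
        "id": (pcr >> 16) & 0xFFFF,               # 0x0430 identifies a full MC68060
        "revision": (pcr >> 8) & 0xFF,            # revision number field
        "bypass_disabled": bool(pcr & (1 << 5)),  # undocumented bit 5 (assumption, see thread)
        "fpu_disabled": bool(pcr & (1 << 1)),     # DFP
        "superscalar": bool(pcr & (1 << 0)),      # ESS
    }


fields = decode_pcr(0x04300121)
for name, value in fields.items():
    print(f"{name}: {value:#06x}" if name == "id" else f"{name}: {value}")
```

For $04300121 this reports revision 1 with both ESS and bit 5 set. Clearing ESS, as described above, corresponds to `pcr & ~1`.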
24 December 2021, 16:49 | #2 | |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,233
|
Quote:
If you want to, I can test this on my rev.6 68060, but it would require a more complete test sequence. The NOP stalls the pipeline long enough to avoid the result forwarding for the second fmove. Actually, this also indicates that the 68060.library you are using is not careful enough to disable superscalar execution on the rev1 cpus, which it should really do. |
|
24 December 2021, 18:34 | #3 |
Moderator
Join Date: Dec 2010
Location: Wisconsin USA
Age: 60
Posts: 841
|
Hmm...
Did you try using FNOP instead of NOP? But either way, I assume you want to keep Superscalar mode enabled.
Last edited by SpeedGeek; 24 December 2021 at 20:52. |
24 December 2021, 22:30 | #4 | |||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
Quote:
Code:
fmul.x fp1,fp4
move.l (8+l691,a7),a1
move.l (88+l691,a7),a0
fmove.s (4,a1),fp3
fmove.s (4,a0),fp0
Quote:
Quote:
Is it really the recommended workaround to disable superscalar execution on rev.1 chips and downgrade the performance considerably? It's the first time I noticed a problem in decades. |
|||
24 December 2021, 22:38 | #5 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
|
25 December 2021, 10:38 | #6 | |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,233
|
Quote:
Of course you can insert a NOP or FNOP in multiple places, but this degrades the performance of your application on all CPUs, whereas disabling the load-store buffer bypass only degrades performance on affected CPUs. I believe the GNU compiler follows the first approach, by injecting a lot of NOPs. |
|
25 December 2021, 11:51 | #7 | |||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
Quote:
Quote:
Now we are talking about disabling Superscalar (PCR bit 0), right? BTW, is there a better 68060.library you would recommend, which can be easily downloaded for my system? I may update to 3.2 in the future, but currently I want to run 3.9 for testing purposes.
Quote:
|
|||
25 December 2021, 14:48 | #8 | |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
Quote:
With Superscalar enabled:
Code:
BYTEmark (tm) Native Mode Benchmark ver. 2 (10/95)
NUMERIC SORT: Iterations/sec.: 14.887315 Index: 0.381795
STRING SORT: Iterations/sec.: 1.533946 Index: 0.685409
BITFIELD: Iterations/sec.: 4624770.943366 Index: 0.793311
FP EMULATION: Iterations/sec.: 1.694917 Index: 0.813300
FOURIER: Iterations/sec.: 597.684677 Index: 0.679745
ASSIGNMENT: Iterations/sec.: 0.259337 Index: 0.986824
IDEA: Iterations/sec.: 38.285396 Index: 0.585565
HUFFMAN: Iterations/sec.: 25.484308 Index: 0.706680
NEURAL NET: Iterations/sec.: 0.279023 Index: 0.448229
LU DECOMPOSITION: Iterations/sec.: 9.138496 Index: 0.473421
...done...
===========OVERALL============
INTEGER INDEX: 0.682454
FLOATING-POINT INDEX: 0.524446
(90 MHz Dell Pentium = 1.00)
==============================
With Superscalar disabled:
Code:
BYTEmark (tm) Native Mode Benchmark ver. 2 (10/95)
NUMERIC SORT: Iterations/sec.: 10.661501 Index: 0.273421
STRING SORT: Iterations/sec.: 1.193346 Index: 0.533220
BITFIELD: Iterations/sec.: 2800908.938665 Index: 0.480455
FP EMULATION: Iterations/sec.: 1.370070 Index: 0.657423
FOURIER: Iterations/sec.: 576.291257 Index: 0.655414
ASSIGNMENT: Iterations/sec.: 0.209205 Index: 0.796062
IDEA: Iterations/sec.: 30.637904 Index: 0.468598
HUFFMAN: Iterations/sec.: 20.308745 Index: 0.563162
NEURAL NET: Iterations/sec.: 0.269402 Index: 0.432774
LU DECOMPOSITION: Iterations/sec.: 7.997249 Index: 0.414299
...done...
===========OVERALL============
INTEGER INDEX: 0.515503
FLOATING-POINT INDEX: 0.489816
(90 MHz Dell Pentium = 1.00)
============================== |
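To put a number on the difference between the two BYTEmark runs, here is a small Python sketch that computes the relative drop of the overall indices. It assumes the first run is the one with superscalar enabled and the second the one with it disabled; the helper name `percent_drop` is made up for illustration.

```python
# Overall BYTEmark indices copied from the two runs above.
first_run = {"integer": 0.682454, "floating_point": 0.524446}
second_run = {"integer": 0.515503, "floating_point": 0.489816}


def percent_drop(before: float, after: float) -> float:
    """Relative loss from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


for name in first_run:
    drop = percent_drop(first_run[name], second_run[name])
    print(f"{name}: {drop:.1f}% lower in the second run")
```

With these figures the integer index falls by roughly 24% and the floating-point index by roughly 7%, so on this benchmark the cost shows up mainly in integer code.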
|
25 December 2021, 15:25 | #9 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,233
|
Yes, my bad. I confused superscalar execution with the load/store buffer bypass. I disable the latter, but not the former, on earlier CPUs.
|
25 December 2021, 17:42 | #10 |
Registered User
Join Date: Feb 2017
Location: Denmark
Posts: 1,104
|
PCR=$04300521 means rev 5 in this parlance, right? (asking because 68060UM says "The first revision is 00000000" for the Revision Number bits).
In that case let me know if you need some testing done on that revision. I have the Quake data files, and can build vbcc+patches if needed. P.S.: I'm using the 68060.library from Thomas' MMUlib (helpfully linked here). Since bit 5 is set, I'm guessing it's not rev6. |
25 December 2021, 20:48 | #11 |
Registered User
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,581
|
This is bringing back some not-so-happy memories of the 68060 on my A3000. What a mess!
|
25 December 2021, 22:49 | #12 | |||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
Quote:
Quote:
Quote:
But, honestly I'm not sure if the bit should be set or cleared, as it is not documented in my 68060UM. On my system it is also set, although I'm running the NoBypass patch (from Simon Goodwin?) directly after SetPatch. I know that it does a bchg #5, so it would revert the workaround when the 3.9 SetPatch already did it. I have to check that some day... |
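The concern about `bchg #5` can be illustrated with plain bit arithmetic. This is only a Python sketch of toggling versus setting PCR bit 5, assuming the patch really toggles the bit as described; it is not taken from the NoBypass source, and the function names are made up.

```python
BYPASS_DISABLE = 1 << 5  # PCR bit 5, the load/store bypass disable bit


def toggle_bit(pcr: int) -> int:
    """Flip bit 5 whatever its current state (what a bchg #5 would do)."""
    return pcr ^ BYPASS_DISABLE


def set_bit(pcr: int) -> int:
    """Force bit 5 to 1 (what a bset #5 would do); safe to apply twice."""
    return pcr | BYPASS_DISABLE


for start in (0x04300101, 0x04300121):
    print(f"start ${start:08X}: toggle -> ${toggle_bit(start):08X}, "
          f"set -> ${set_bit(start):08X}")
```

Starting from $04300121, where the bit is already set, the toggle clears it again, while the set leaves it alone. A patch that toggles is therefore only correct if nothing else has set the bit before it runs.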
|||
25 December 2021, 23:32 | #13 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,233
|
The CPU command of 3.2 should tell you what the revision is. Concerning the load-store buffer bypass: The bypass is disabled (and by that the workaround is enabled) if the bit is 1. This problem applies also to the first revision, the 1F43G mask, not only to rev.5.
|
26 December 2021, 02:09 | #14 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
Ok, thanks.
Seems I already forgot most of the errata again. |
26 December 2021, 15:12 | #15 | |
Moderator
Join Date: Dec 2010
Location: Wisconsin USA
Age: 60
Posts: 841
|
Quote:
What is the source of your information? If you looked at the Motorola 68060 Errata you should have seen that the same mask set affected by F6 was also affected by I14 & I15. So not only Rev. 5 but Rev. 1 should have Load/Store bypass disabled too. Neither the 3.9 SetPatch nor the P5 68060.library does anything with Load/Store Bypass disable (PCR register bit #5). You were fortunate to have the NoBypass patch installed, which BTW does nothing for a Rev. 5 CPU.
Last edited by SpeedGeek; 26 December 2021 at 15:17. |
|
26 December 2021, 21:17 | #16 | |||
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
My bad memory.
Quote:
Quote:
Quote:
|
|||
26 December 2021, 22:35 | #17 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,233
|
In case of my 68060.library, the answer is quite simple: It runs an instruction sequence that triggers the defect, and it disables the load-store buffer bypass in case the erratum could be reproduced. Thus, knowledge of the relation between mask sets and revisions is not necessary.
|
27 December 2021, 14:03 | #18 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,500
|
Nice!
But as far as I understand, it neither runs an F6 test sequence yet, nor disables superscalar in any situation? |
27 December 2021, 14:21 | #19 |
Registered User
Join Date: Jan 2019
Location: Germany
Posts: 3,233
|
Correct. The F6 test sequence is currently only run by the CPU command. I'm a bit unclear what I should do about it. The OS math libraries are not affected (they don't trigger the defect), so the OS is, as such, fine. Unfortunately, it affects direct usage of the FPU in some cases.
|
27 December 2021, 14:53 | #20 | ||
Moderator
Join Date: Dec 2010
Location: Wisconsin USA
Age: 60
Posts: 841
|
Quote:
https://www.amibay.com/forum/amibaye...mation-request
Note: You will need to log in to Amibay to view this thread. Of course, that information was not available when Simon Goodwin released his NoBypass patch (on Aminet).
Quote:
Last edited by SpeedGeek; 27 December 2021 at 15:49. |
||