12 August 2016, 19:59 | #61 | |||
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
I enjoy the journey, not the destination!
Quote:
Quote:
Quote:
Just consider for instance, LEA d8(An,Rn),An... Now the Rn can be an An or a Dn, and the bit that selects it can be considered the high bit in a four-bit register field. So far so good. So what about LEA to Dn? There, the A/D bit has to be on the other side of the register... in other words it still looks like a 4 bit register field but with the bits in a different order. And that's before you get onto the possibility of "Data register indirect" addressing modes, for which there is just not enough encoding space.

Anyway it certainly involves "special cases" to handle unsigned byte offsets/data. Short branches use signed byte offsets, as do the d8(An,Rn) addressing modes mentioned earlier, and even the venerable moveq.l #n,Dn sign extends its byte data.

EDIT: also of course we already have the "special case" of the stack pointer (A7), which increments and decrements by 2 instead of the usual 1 using (A7)+/-(A7) on byte sized operations.

Last edited by Mrs Beanbag; 12 August 2016 at 20:15. |
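To make the "four-bit register field" remark concrete, here is a minimal sketch of how the brief extension word of d8(An,Rn) can be picked apart on a plain 68000. The function name and the returned dictionary keys are made up for illustration; only the bit positions (D/A in bit 15, register number in bits 14-12, W/L in bit 11, signed displacement in bits 7-0) come from the 68000 encoding.

```python
def decode_brief_extension(word: int) -> dict:
    """Decode a 68000 brief extension word as used by d8(An,Rn).

    Hypothetical helper for illustration only; not taken from any emulator.
    """
    da = (word >> 15) & 1           # 0 = index is Dn, 1 = index is An
    reg = (word >> 12) & 0b111      # index register number 0-7
    long_index = (word >> 11) & 1   # 0 = sign-extended word index, 1 = long index
    d8 = word & 0xFF
    if d8 >= 0x80:                  # the byte displacement is signed
        d8 -= 0x100
    return {
        "index": f"{'A' if da else 'D'}{reg}",
        "field4": (word >> 12) & 0xF,   # D/A bit + register seen as one 4-bit field
        "size": "L" if long_index else "W",
        "d8": d8,
    }


if __name__ == "__main__":
    print(decode_brief_extension(0x90FE))
    print(decode_brief_extension(0x2804))
```

Seen this way, D/A plus the register number really does behave like one 4-bit field with the A/D selector on top, and the displacement byte is sign extended like the other byte-sized fields mentioned above.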
|||
12 August 2016, 21:07 | #62 | |||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
But does it matter ? Quote:
Quote:
Quote:
Quote:
That's not a clever thing and no consideration other than 68000 compatibility can justify it. |
|||||
12 August 2016, 21:45 | #63 | |||
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Quote:
i'm not saying it would be a good idea. Just that it wouldn't be very difficult or complicated. Quote:
Yeah there are some cases where it's kind of our fault if that happens, because we should have been more strict in what we would accept and throw an exception or something otherwise, and then they'd know they'd done it wrong. But in this case we're in a pickle because the programmer threw an exception on purpose in order to achieve what they wanted. Quote:
it has architectural advantages as well, because two single-port register files might be simpler to implement than a dual-port register file, so instructions like move (A0)+,D0 can write both results simultaneously. So i kind of waver in my support of it. |
|||
12 August 2016, 22:08 | #64 | ||||||
Banned
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
|
Quote:
Code:
mvz.w d0,d0
ff1 d0
sub.l #16,d0 ; 6 bytes without OP.L #data.w,Dn
Quote:
68060: IPC=1.3, pipe length=8, avg instruction size=3 bytes, ICache line=16 bytes
We throw away 1.3 * 8 = 10.4 instructions on average
This is 10.4 * 3 = 31.2 bytes of code fetched and cached
This is 31.2 / 16 ≈ 2 ICache lines replaced

The Apollo-core may have an IPC of 3+ and a deeper pipeline.
IPC=3, pipe length=10, avg instruction size=3 bytes, ICache line=16 bytes
We throw away 3 * 10 = 30 instructions on average
This is 30 * 3 = 90 bytes of code fetched and cached
This is 90 / 16 ≈ 6 ICache lines replaced

There is also the delay in cycles of the pipe length and any data accessed would go into the DCache. We will do this misprediction twice before we turn around a 2bit saturating prediction. I hope we can quickly see that the cost of those little branches is much higher when they are mispredicted. Advanced processors are not like the 68020/68030 anymore.
Quote:
Quote:
Everything adds special cases. It is important to avoid the big tables/muxes and decoding info not in the first word. This is what I consider dirty. Quote:
Quote:
|
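The misprediction arithmetic earlier in this post can be sketched in a few lines of Python. This only restates the post's own back-of-envelope model (IPC times pipeline length, times average instruction size, divided by the ICache line size); the function names are hypothetical, the Apollo-core figures are the post's guesses rather than measurements, and the 2-bit counter is assumed to start in the "strongly taken" state.

```python
def misprediction_cost(ipc, pipe_length, avg_insn_bytes=3, icache_line_bytes=16):
    """Rough cost of one mispredicted branch, following the post's arithmetic."""
    insns_discarded = ipc * pipe_length              # work in flight that is thrown away
    bytes_fetched = insns_discarded * avg_insn_bytes
    icache_lines = bytes_fetched / icache_line_bytes
    return insns_discarded, bytes_fetched, icache_lines


def mispredictions_until_flip(counter=3):
    """Mispredictions of a 2-bit saturating counter (states 0..3, where 2 and 3
    predict 'taken') when the branch starts being not taken."""
    misses = 0
    while counter >= 2:      # still predicting 'taken'
        misses += 1          # ...but the branch was not taken
        counter -= 1         # counter moves one step towards 'not taken'
    return misses


if __name__ == "__main__":
    for name, ipc, depth in (("68060", 1.3, 8), ("Apollo-core (assumed)", 3, 10)):
        insns, nbytes, lines = misprediction_cost(ipc, depth)
        print(f"{name}: {insns:.1f} instructions, {nbytes:.1f} bytes, "
              f"{lines:.2f} ICache lines")
    print("mispredictions before the prediction flips:", mispredictions_until_flip())
```

The unrounded line counts come out at 1.95 and 5.625, which is where the "2" and "6" ICache lines come from.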
||||||
12 August 2016, 22:37 | #65 | |||||||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
Quote:
So for me bit #0 of a branch isn't "free". And that's all. Quote:
Quote:
And the misprediction will occur... twice. So even if it adds 100 clocks, that'll be 100 clocks added to the overall program execution. Big deal. Quote:
Quote:
Bit #0 isn't free. Quote:
Or things that "reuse" a bit that wasn't previously free. Quote:
Quote:
If your car has a small scratch, do you take a hammer and add more, just because it's not exactly in mint condition and so you can't ruin it ? |
|||||||||
13 August 2016, 20:07 | #66 | ||
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Quote:
Although on balance, i think i agree with you on leaving this bit be. Quote:
Btw Matthey how did the team manage to get 3 IPC? Is it many-way superscalar or is this thanks to opcode fusion &c? |
||
13 August 2016, 20:18 | #67 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Deliberately throwing the illegal opcode exception is perfectly valid, but there is a specific opcode for doing that (ILLEGAL, aka $4AFC).
|
13 August 2016, 20:27 | #68 | |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Quote:
Plus 68020 already broke some old code by not throwing exceptions on odd data accesses! So Motorola seems to think that wasn't such a legitimate technique. Last edited by Mrs Beanbag; 13 August 2016 at 20:35. |
|
13 August 2016, 20:47 | #69 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
They didn't have much choice. The 68020 is 32-bit so "odd address" becomes meaningless, and trapping on all misaligned accesses would have broken 90% of existing programs. |
|
13 August 2016, 20:54 | #70 | ||
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Quote:
Quote:
|
||
13 August 2016, 22:00 | #71 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Let me remind you that TRAPcc doesn't work on the 68000.
Quote:
They made the right choice and it's good for coding flexibility, code density, and even sometimes performance. It was really worth the limited compatibility issue. It's always a matter of trade-offs. The inability of the 68000 to do misaligned accesses was a real pain. Not many programs trigger address errors on purpose; in fact i haven't found any. So the change is ok. On the other hand, even a very small compatibility threat for just a near useless branch hint bit isn't worth it. |
|
13 August 2016, 22:17 | #72 | |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
oh yeah
Quote:
But indeed... it is the uselessness of the branch hint bit that swings it for me. I don't really care so much that a few old demos and games won't work. Tbh i would be willing to compromise everything in supervisor mode and let the operating system be recompiled if it would help, i only care about compatibility with user mode software, anything that trashes the OS in order to run is something i'd prefer not to run on a computer with too much power! |
|
15 August 2016, 20:21 | #73 |
Registered User
Join Date: May 2014
Location: inside the emulator
Posts: 377
|
My post got eaten. Short version:
Hint bits are not worth it in a modern processor. They can be worth it in simple processors or to shave a few clocks from run-once code (exceptions etc.). The bit can be used for better things, my version:

0000 0000 -> 16 bit displacement
1111 1111 -> 32 bit displacement
0000 0001 -> 64 bit displacement
xxxx xxx0 -> normal 8 bit displacement
xxxx xxx1 -> available for extension

Any incompatible treatment of the LSb should be disabled by default and enabled by the OS if needed. |
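As a rough illustration of the table above, the proposed treatment of the 8-bit displacement field could be classified like this. The function name and the returned strings are made up for the sketch; the $00 and $FF cases match the existing 68020+ encoding, while the 64-bit case and the "available for extension" space are only this post's proposal.

```python
def classify_displacement_byte(b: int) -> str:
    """Classify the 8-bit displacement field of a Bcc per the proposal above.

    Hypothetical sketch; the order of the checks matters because the three
    special values would otherwise fall into the generic even/odd cases.
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("displacement field is a single byte")
    if b == 0x00:
        return "16-bit displacement follows"
    if b == 0xFF:
        return "32-bit displacement follows"
    if b == 0x01:
        return "64-bit displacement follows (proposed)"
    if b & 1 == 0:
        return "normal 8-bit displacement"
    return "available for extension"


if __name__ == "__main__":
    for value in (0x00, 0xFF, 0x01, 0xFE, 0x03):
        print(f"${value:02X}: {classify_displacement_byte(value)}")
    free = sum(classify_displacement_byte(v) == "available for extension"
               for v in range(256))
    print("odd encodings left over for extensions:", free)
```

Under this reading, 126 odd byte values stay unassigned, which is the encoding space the post would rather keep for extensions than spend on a hint bit.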
16 August 2016, 10:03 | #74 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
What the heck could be the use for 64 bit displacement ?
Programs are never that large ! |
16 August 2016, 12:44 | #75 |
Registered User
Join Date: May 2014
Location: inside the emulator
Posts: 377
|
|
17 August 2016, 23:10 | #76 | |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Quote:
But seriously... when i look at this little table it makes me realise something else about why using Bit 0 as a hint bit (or indeed anything else) is really dirty... Because $FE is a legitimate 8-bit branch. So then how do you put a hint bit on it? Then it becomes $FF which means a 32-bit branch... Then again $FE would be a branch to itself, causing a total lock-up, so maybe let's not use that anyway. |
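The $FE observation can be checked with a couple of lines. A short 68k branch adds its sign-extended displacement byte to the address of the branch plus 2, so $FE (that is, -2) lands back on the branch itself, and setting bit 0 turns it into $FF, the 32-bit displacement marker. The function name below is hypothetical and the address is arbitrary.

```python
def short_branch_target(branch_addr: int, disp_byte: int) -> int:
    """Target of a 68k short branch: (address of branch + 2) + signed 8-bit displacement."""
    disp = disp_byte - 0x100 if disp_byte >= 0x80 else disp_byte
    return branch_addr + 2 + disp


if __name__ == "__main__":
    addr = 0x1000
    print(hex(short_branch_target(addr, 0xFE)))   # same address as the branch itself
    print(hex(0xFE | 1))                          # bit 0 set: the 32-bit marker value
```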
|
18 August 2016, 07:34 | #77 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
A branch with $FE is likely to be quick'n'dirty error handling, so it had better be predicted as not taken. Hey, wait -- it's a backward branch so the default would be taken and it needs the hint bit to reverse that...
So now we have:
- $FE unable to get the hint bit (and always mispredicted)
- $01 useless
- BRA and BSR not using the hint bit
- the hint bit moves away in case of larger branches
Who said anything about 'special cases'? |
18 August 2016, 19:22 | #78 | ||||
Banned
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
|
Quote:
Quote:
Quote:
I documented the least significant bit of the displacement as reserved for these instructions. The CPU handles these differently than Bcc even if the encoding is similar. Quote:
I wish to reduce my time spent posting here. I do not view this thread as productive and the ISA is dead end anyway. It only has historical significance as these were the ideas which the "Apollo non-Team" came up with and were discarded with minimal evaluation before a 68k+MMX bolt-on was decided by Gunnar. RIP 68k. |
||||
18 August 2016, 20:46 | #79 | ||||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
In some way, yes.
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
The 68k remains the best existing ISA, even if you consider it dead. Gunnar changed his mind when facing the gruesome facts for the emulation library and it seems the current version does not have this silly c2p/pixmerge stuff he once wanted to add (and not even additional data registers). He is likely to change again when seeing the uselessness of what he has added. |
||||||
18 August 2016, 21:33 | #80 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
any given design could be run at fast or slow clock speeds in principle so this is not a very good criterion. It's modern if it uses modern ideas. FPGAs will get faster and we could even see homebrew silicon wafers in the future. Instructions per clock is a better measure. Although even that is open to debate because not all modern applications demand speed.
So. Imho there are better ways to mitigate branch penalties than explicit hint bits, which are alien to 68k architecture. i'd rather we started thinking outside the box a bit more. |