English Amiga Board

English Amiga Board (https://eab.abime.net/index.php)
-   Coders. Asm / Hardware (https://eab.abime.net/forumdisplay.php?f=112)
-   -   68k details (https://eab.abime.net/showthread.php?t=93770)

NorthWay 09 March 2021 17:17

Quote:

Originally Posted by meynaf (Post 1468577)
But if you have an IPC of say 4, you can hardly have something better.

"A Funny Thing Happened on the Way to the Forum" as they say:
One of the comments I read about the new Mac (and its soon(?) forthcoming bigger brethren/sisters) is that it is so fast because it can actually issue 8 instructions per cycle, which goes against the conventional wisdom that max IPC tops out in the 3 to 4 region.
The crux here was that RISC was making a comeback as the complexity of x86/CISC decoding was rising exponentially, and as such going further than decoding 4 instructions per cycle was Unobtainium for Intel/AMD.
Correct me if I understood it wrong.

What I find much more interesting is that Mill Computing are throwing the rulebook on IPC out the window and aiming way higher with their IPC numbers. They do a lot of stuff to make that happen, both for the actual decode and for handling parallelism.

meynaf 09 March 2021 17:44

Quote:

Originally Posted by NorthWay (Post 1468850)
"A Funny Thing Happened on the Way to the Forum" as they say:
One of the comments I read about the new Mac (and its soon(?) forthcoming bigger brethren/sisters) is that it is so fast because it can actually issue 8 instructions per cycle, which goes against the conventional wisdom that max IPC tops out in the 3 to 4 region.
The crux here was that RISC was making a comeback as the complexity of x86/CISC decoding was rising exponentially, and as such going further than decoding 4 instructions per cycle was Unobtainium for Intel/AMD.
Correct me if I understood it wrong.

Even if this is true - I have my doubts - this won't make it faster than x86, as they still have to issue 3 instructions instead of 1 to do the same work.
Also don't forget that theoretical IPC is a maximum. In reality it often drops to a mere 1 because of dependencies.
For me that M1 chip is taking the path IBM's POWER took before - optimize to death and ultimately fail because the advantages of RISC don't work anymore.
That doesn't mean ARM won't take over the market - but the reason will not be raw CPU speed.
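The "drops to a mere 1" point can be illustrated with a toy issue model (a deliberate simplification added here: in-order issue, single-cycle latency, no renaming or forwarding):

```python
# Toy model: a 4-wide CPU that can issue up to 4 independent instructions
# per cycle, but a dependent instruction must wait for its input.
# An instruction is (dest_register, list_of_source_registers).

def cycles_needed(program, width=4):
    """Greedy in-order issue: each cycle, issue up to `width` instructions
    whose source registers are already computed."""
    ready_at = {}          # register -> cycle when its value is available
    cycle = 0
    i = 0
    while i < len(program):
        issued = 0
        while i < len(program) and issued < width:
            dest, srcs = program[i]
            if all(ready_at.get(s, 0) <= cycle for s in srcs):
                ready_at[dest] = cycle + 1   # single-cycle latency
                issued += 1
                i += 1
            else:
                break                        # in-order: stall on dependency
        cycle += 1
    return cycle

# 8 independent operations: IPC reaches the issue width.
independent = [(f"r{n}", []) for n in range(8)]
# 8 operations where each needs the previous result: IPC collapses to 1.
chain = [("r0", [])] + [(f"r{n}", [f"r{n-1}"]) for n in range(1, 8)]

print(len(independent) / cycles_needed(independent))  # 4.0
print(len(chain) / cycles_needed(chain))              # 1.0
```

On the dependent chain the 4-wide core gets no benefit from its width at all, which is the point being made.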


Quote:

Originally Posted by NorthWay (Post 1468850)
What I find much more interesting is that Mill Computing are throwing the rulebook on IPC out the window and aiming way higher with their IPC numbers. They do a lot of stuff to make that happen, both for the actual decode and for handling parallelism.

This is a nice theory on paper. It does not look like it has been successful in reality: there have been attempts at VLIW, there was the Cell, and look at where they all are today.

grond 09 March 2021 17:57

Quote:

Originally Posted by meynaf (Post 1468872)
Even if this is true - I have my doubts - this won't make it faster than x86, as they still have to issue 3 instructions instead of 1 to do the same work.

I addressed this point above: they do not issue 3 instructions instead of 1 for doing the same work, they identify a bundle of 3 instructions and then issue this bundle as 1 instruction.

meynaf 09 March 2021 18:35

Quote:

Originally Posted by grond (Post 1468876)
I addressed this point above: they do not issue 3 instructions instead of 1 for doing the same work, they identify a bundle of 3 instructions and then issue this bundle as 1 instruction.

That does not change the fact that the stream of a RISC CPU has to contain more instructions to do the same work. And a CISC CPU can of course merge them too.
If you do 8 per clock but have to execute 18, you're slower than if you execute 4 per clock and need only 6.
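The arithmetic behind that last sentence, spelled out (assuming, as a simplification, that issue width is the only limit and that a partially filled issue group still costs a full cycle):

```python
# What matters is total cycles, i.e. instruction count divided by issue
# width - not the issue width alone.

def cycles(instruction_count, issue_width):
    # Ceiling division: a partially filled issue group still takes a cycle.
    return -(-instruction_count // issue_width)

wide_sparse = cycles(18, 8)   # 8-wide core, 18-instruction stream -> 3 cycles
narrow_dense = cycles(6, 4)   # 4-wide core, 6-instruction stream  -> 2 cycles
print(wide_sparse, narrow_dense)  # 3 2
```

So the denser stream wins here despite the narrower machine, exactly as the post argues.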

grond 09 March 2021 18:50

Quote:

Originally Posted by meynaf (Post 1468899)
That does not change the fact that the stream of a RISC CPU has to contain more instructions to do the same work. And a CISC CPU can of course merge them too.
If you do 8 per clock but have to execute 18, you're slower than if you execute 4 per clock and need only 6.

You don't seem to understand it. A modern CISC CPU can execute, say, 4 CISC instructions per clock cycle. A modern RISC CPU can execute 4 bundles of RISC instructions per clock cycle, which may correspond to anything between 4 and 20 RISC instructions in total. In the end both CPUs do the same amount of real work. The CISC CPU needs a more complex decode stage because of the arbitrary length of CISC instructions; the RISC CPU needs extra complexity for identifying bundles of RISC instructions.
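The bundling idea can be sketched with a deliberately naive model - real decoders fuse only specific instruction patterns, so the greedy three-at-a-time grouping below is purely illustrative, and the mnemonics are made up:

```python
def bundle(stream, max_bundle=3):
    """Group a linear RISC instruction stream into bundles that would
    each issue as a single unit. Real hardware fuses only particular
    dependent patterns; this greedy grouping is just for illustration."""
    bundles, current = [], []
    for insn in stream:
        current.append(insn)
        if len(current) == max_bundle:
            bundles.append(tuple(current))
            current = []
    if current:                      # flush a trailing partial bundle
        bundles.append(tuple(current))
    return bundles

# Two CISC-style "add memory, register" operations expand to six RISC
# instructions, but issue as two bundles - the same number of issue
# slots as the two CISC instructions.
stream = ["load r1,(a0)", "add r1,d0", "store r1,(a0)",
          "load r2,(a1)", "add r2,d1", "store r2,(a1)"]
print(len(bundle(stream)))  # 2
```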

meynaf 09 March 2021 19:21

Quote:

Originally Posted by grond (Post 1468906)
You don't seem to understand it. A modern CISC CPU can execute, say, 4 CISC instructions per clock cycle. A modern RISC CPU can execute 4 bundles of RISC instructions per clock cycle, which may correspond to anything between 4 and 20 RISC instructions in total. In the end both CPUs do the same amount of real work. The CISC CPU needs a more complex decode stage because of the arbitrary length of CISC instructions; the RISC CPU needs extra complexity for identifying bundles of RISC instructions.

I perfectly understand it. You just overestimate the effect.

grond 09 March 2021 19:27

Quote:

Originally Posted by meynaf (Post 1468912)
I perfectly understand it. You just overestimate the effect.

Congratulations, you just won an argument about a rational thing by moving it into the sphere of the irrational.

meynaf 09 March 2021 20:09

Quote:

Originally Posted by grond (Post 1468913)
Congratulations, you just won an argument about a rational thing by moving it into the sphere of the irrational.

There is nothing irrational in what I wrote. But here we clearly see your attempt to move it there.

NorthWay 10 March 2021 16:03

Quote:

Originally Posted by meynaf (Post 1468872)
Even if this is true

And right on cue I found this today: https://www.osnews.com/story/133140/...ture-research/ which links to https://dougallj.github.io/applecpu/firestorm.html
Whether that proves or disproves anything I don't know.

meynaf 10 March 2021 17:15

The only sure thing is that another cpu war has begun. Only time will tell how it ends.

grond 10 March 2021 17:34

The bad thing about the Apple chip is that only Apple has it and most probably Apple will keep it to themselves. Other ARM processors aren't likely to kill off x86 any time soon. And Apple isn't going to eliminate x86-based Windows PCs either.

Bruce Abbott 11 March 2021 05:09

Quote:

Originally Posted by grond (Post 1469151)
The bad thing about the Apple chip is that only Apple has it and most probably Apple will keep it to themselves. Other ARM processors aren't likely to kill off x86 any time soon. And Apple isn't going to eliminate x86-based Windows PCs either.

Why is that bad?

Quote:

Originally Posted by NorthWay
And right on cue I found this... Whether that proves or disproves anything I don't know.

Quote:

Elimination

Certain instructions do not need to issue:

mov x0, 0 (handled by renaming)
mov x0, x1 (usually handled by renaming)
movi v0.16b, #0 (handled by renaming)
mov v0.16b, v1.16b (usually handled by renaming)
mov imm/movz/movn (handled by renamer at a max of 2 per 8 instructions, includes all tested "mov")
nop (never issues)
Um...
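The "handled by renaming" entries are less mysterious than they look: a register-to-register move can be absorbed by the rename table without ever using an execution unit. A toy sketch of that idea (the Renamer class and its methods are invented for illustration, not Apple's actual mechanism):

```python
# Sketch of move elimination via register renaming: "mov x0, x1" need not
# issue, because the rename table can simply map the architectural
# register x0 to the physical register that already holds x1's value.

class Renamer:
    def __init__(self):
        self.table = {}        # architectural register -> physical register
        self.next_phys = 0

    def write(self, reg):
        """A real result: allocate a fresh physical register."""
        self.table[reg] = self.next_phys
        self.next_phys += 1

    def mov(self, dst, src):
        """Eliminated move: alias dst to src's physical register.
        No execution unit is involved."""
        self.table[dst] = self.table[src]

r = Renamer()
r.write("x1")        # some instruction produces x1
r.mov("x0", "x1")    # mov x0, x1 handled entirely in the renamer
print(r.table["x0"] == r.table["x1"])  # True: both name the same value
```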

grond 11 March 2021 09:04

Quote:

Originally Posted by Bruce Abbott (Post 1469289)
Why is that bad?

Well, one of the two is bad. A lot of manufacturers could benefit from an ARM implementation as fast as x86 processors, but Apple will keep it to themselves and not sell it as a part for others to use. This could even lead to competitors switching from ARM to x86. On the other hand, I believe those two markets are already strongly divided and nobody is going to switch because of processor speed. If Apple computers use ARM, well, it is going to be 5% less market share for x86. That doesn't seem a big deal.


Quote:

Um...
Well spotted, this really does seem like nonsense.

litwr 20 March 2021 12:52

Quote:

Originally Posted by Photon (Post 1464487)
I'm new to this thread, but "people" chose Intel not because of Intel at all. It obviously helped Intel that PC buyers didn't care about the "hardware inside"; the real question is how they couldn't tell the difference from other brands and still forked up big cash.

80x86 was some horrible extension of 8-bit, and 680x0 was a CISC, i.e. "wishlist", CPU design. ARM showed the way early, and both Intel and Motorola adopted RISC, probably spending millions and millions adapting their horrible designs...

Pentium is pure sh*t and 68060 is a beautiful CPU. But by this time idiots had paid $1500 for crap and hundreds more to expand it. Let's call it the IBM "ecosystem". ;)

PowerPC and Alpha were interesting but remained so for the technical few.

Maybe Commodore was shut down in 1994 precisely because they could have given the 68060 a chance when there were strong players against it.

Bruce Abbott 21 March 2021 04:53

Quote:

Originally Posted by litwr (Post 1471640)
Maybe Commodore was shut down in 1994 precisely because they could have given the 68060 a chance when there were strong players against it.

Commodore was going down anyway, due to PCs taking over the market. After Apple switched to PowerPC there was practically no market left for the 68060.

litwr 23 March 2021 18:04

Quote:

Originally Posted by Bruce Abbott (Post 1471841)
Commodore was going down anyway, due to PCs taking over the market. After Apple switched to PowerPC there was practically no market left for the 68060.

Commodore could have provided some ground for the 68060. The top Amiga models were among the best personal computers for video processing at that time, so 68060-based hardware could easily have found its customers. Escom sold thousands of Amiga 4000s based on the 68060; market share meant nothing for this.
We can also dream a little. For example, it was quite possible that a cheap Amiga based on the 68040 might have been released in 1994 or 1995.

Bruce Abbott 23 March 2021 19:09

Quote:

Originally Posted by litwr (Post 1472503)
Commodore could have provided some ground for the 68060.

They did. Both the A3000 and A4000 had a CPU board connector (which is how I was able to have a 68060 in my A3000). The A3000 was released in 1991, so you could say they provided for the 68060 quite early on.

The 68060 was released in late 1994 when Commodore was already gone, so there wasn't much else they could have done to 'provide some ground' for it. However they did produce an 040 board for the A4000 which they could easily have modified to take an 060. If Commodore had survived for a few more years I'm betting they would have produced one.

Thomas Richter 27 March 2021 14:05

There is a strange race condition on the 68060 as in "you cannot restore the CPU state in all conditions". Typically, you want a debugger to be able to save the CPU state, fiddle around with its registers, and then restore the state back to how it was before.

Interestingly, this is not always possible on the 68060, which is really a bizarre exception throughout the motorola series.

The condition is as follows: If the FPU is in NULL state, and a non-implemented floating point exception is taken because the first opcode the FPU ever sees is something it does not handle in hardware, then the FPU is still in NULL state, but the FPIAR register points to the instruction to be emulated (has to, actually, as the FPSP code needs to grab and interpret the instruction from there).

However, this FPU state cannot be restored. Either you place a value back into the FPIAR, but this results in the FPU being in IDLE state, not in NULL state; or you place the FPU back in NULL state, but then the next read from FPIAR returns 0, not the faulting instruction.

This is different from the 68040 - a non-implemented FPU instruction there creates a rather lengthy exception stack frame and the proper address in FPIAR, and both can be restored.

In reality, it does not matter too much whether the FPU is in IDLE or in NULL state, though. For the 68060, the stack frame is the same size anyhow.
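The dead end can be modelled as a small state machine (a sketch of the argument only - the class and method names are invented, and the real FSAVE/FRESTORE frame handling is more involved):

```python
# Toy model of the situation described above: after an unimplemented-FP
# exception the FPU is in (state=NULL, FPIAR=fault address), but each of
# the two available restore paths loses one half of that pair.

class FPU:
    def __init__(self):
        self.state = "NULL"
        self.fpiar = 0

    def unimplemented_exception(self, fault_addr):
        # First FP opcode is unimplemented: FPIAR points at it so the
        # emulation code can fetch it, but the state stays NULL.
        self.fpiar = fault_addr

    def restore_via_fpiar_write(self, addr):
        self.fpiar = addr
        self.state = "IDLE"     # writing FPIAR leaves the FPU in IDLE state

    def restore_to_null(self):
        self.state = "NULL"
        self.fpiar = 0          # restoring a NULL frame clears FPIAR

fpu = FPU()
fpu.unimplemented_exception(0x1234)
saved = (fpu.state, fpu.fpiar)          # ('NULL', 0x1234)

a = FPU(); a.restore_via_fpiar_write(saved[1])
b = FPU(); b.restore_to_null()
print((a.state, a.fpiar))  # ('IDLE', 4660): FPIAR kept, wrong state
print((b.state, b.fpiar))  # ('NULL', 0): right state, FPIAR lost
# Neither restore path reproduces ('NULL', 0x1234).
```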

meynaf 27 March 2021 14:34

Quote:

Originally Posted by Thomas Richter (Post 1473459)
However, this FPU state cannot be restored. Either you place a value back into the FPIAR, but this results in the FPU being in IDLE state, not in NULL state; or you place the FPU back in NULL state, but then the next read from FPIAR returns 0, not the faulting instruction.

And what happens if you place the FPU back in NULL state, then restart the instruction that put the FPU into that "non-restorable" state in the first place?

Thomas Richter 27 March 2021 15:22

Quote:

Originally Posted by meynaf (Post 1473468)
And what happens if you place the FPU back in NULL state, then restart the instruction that put the FPU into that "non-restorable" state in the first place?


In this particular use case, that is not an option. But it does not matter too much; it was more an observation and a curiosity than a problem statement.

