#161
Registered User
Join Date: May 2023
Location: Norwich
Posts: 502
Quote:
Instead it became quicker and easier to just read CISC-like instructions from RAM, convert them to RISC-like microcode and run them at much higher speeds there. Transferring less data across the slow memory bus (plus the increased cache density of storing CISC instructions) outweighs the complexity. Intel really lucked out in that their attempts to replace x86 with just about anything else failed because no customer was willing to do the work of migrating existing code. If Motorola had met similar resistance and been forced to stick with 68k, they might have actually done better in the long run.
#162
Registered User
Join Date: Sep 2013
Location: Poland
Posts: 885
@AestheticDebris - sure, but...

1. Current-gen x86 still has an absurd amount of cache (L3) - more than the total memory size of machines three decades ago. So you could basically run a late-90s application entirely out of cache.

2. All high-performance x86 designs add an L0 / uop trace cache. It's relatively small, but it holds already-translated instructions, so if your entire loop fits there it gets executed much faster. That's how a RISC built upon current x86 execution units would most likely run - and it's also why Apple was able to match x86 general performance with their ARM implementation.

3. Of course a 64-bit 68k with a fully superscalar, pipelined architecture and vector (SIMD) instructions would do fairly fine - better if it actually kept evolving under competition. Intel didn't bring out the Core uArch out of the goodness of their hearts; they had to, because Itanium sucked and AMD made their own x86 enhancement which was warmly welcomed by the market. Motorola did nothing to allow 3rd parties onto their turf... hence, most likely, an exclusive Motorola 68k would have failed in the long run anyway.
#163
Total Chaos forever!
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,218
Caches don't automatically get bigger because of poor code density. Address pins don't automatically exceed 32 bits because memory sizes grow. Even embedded controllers use more than 32 bits nowadays.
#164
Registered User
Join Date: Jul 2024
Location: France
Posts: 23
Quote:

The simpler decode advantage still holds to this day - it's how Apple was able to make the M1 so fast and put it in a laptop. They can decode a whole row of instructions at once because the instruction stream is simple and predictable. x86 has a hard time doing that.

If CISC is such an advantage, why does nobody other than x86 use it anymore? Arm used a memory-saving instruction set in Thumb mode, but it was removed in Arm64. If it was such an advantage, why remove it?

In fact,

Quote:
Intel really lucked out in that their attempts to replace x86 with just about anything else failed because no customer was willing to do the work of migrating existing code. If Motorola had met similar resistance and been forced to stick with 68k, they might have actually done better in the long run.

They'd have evolved it for sure, but I suspect they'd have done things like produce a simpler version like x86-64, or what they actually did with ColdFire.
#165
Registered User
Join Date: May 2023
Location: Norwich
Posts: 502
@Promilus

1) Well, yeah, obviously. Modern machines have enormous amounts of RAM compared to 30-year-old machines, so they cache enormous amounts of data in comparison. But RISC architectures would need even more caching, because they need more instructions and throughput is critical.

2) Modern ARM is full of not-very-RISC things, because like everyone else they've seen that RISC doesn't really give the benefit it was envisioned to. It's why they have SIMD instructions and other "combined" instructions like FMA that do more than one thing at once, which is literally the opposite design philosophy to RISC. The whole RISC/CISC argument died a long time ago as both sides borrowed the best bits of the other. The thing everyone cares about these days is really power efficiency, and ARM is mostly at an advantage because x86 carries a lot more legacy baggage (which Intel is very keen to ditch).

3) I'm not saying they wouldn't have failed for other reasons, just that abandoning their CISC architecture and going all-in on a totally different RISC design might not have been necessary in the long run. The fact that Apple was one of their biggest customers and was pushing hard for a RISC design (because like everyone else they thought it was the future) probably made ditching 68k seem like a good option at the time, and maybe it was on those grounds alone.
#166
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,747
My take on this is that the Amiga thrives in the magical space of the emerging workstation. <3

Conclusion:

I think FPGAs and RPis are extremely useful, e.g. providing modern and fast I/O interfaces and peripherals like RGB2HDMI, and much better than a microcontroller for superfast big flash cards, Ethernet, etc. The sky really is the limit and imagination can take us anywhere! This is what I'd like to see them used for. I include Zorro-compatible gfx cards, GPUs and sound cards in this, although I think with too many expansions (even back-in-the-day cards) - again, it's just like another PC (a pile of cards, not a platform).

(There is something about having a platform to turn to that is different from the 'devices' we use every day. I include a web browser, notifications, updates, needless software change (degradation away from power usage) etc. in this. The usage is different. The computer is there. You have power. What do you want to do today?)

E.g. I'm not an Amiga OS4 aficionado, despite all their stellar work(!), because I want something to love that's different from everything else, and oldskool Amigas fill that space. I can turn to it, there are no distractions, and it's a very welcome change from 'devices'. The same goes for many other platforms that have been left alone, so you can love them, e.g. 8-bit, Archimedes and much more.

And it seems these PiStorms, Vampires, and sandboxed Amiberry boxes by any other name are doing the same as OS4/PPC - but starting over on their own with different platforms, with a worse result, and without getting all the work put into OS4.

There's also AROS, which is sort of 'doing a NeXTSTEP' (but seemingly better) and would then be able to use the full power of the (is >4GB RAM possible?) PC CPU. But again, it wouldn't be the same, so same as for OS4.

~

My dream Amiga would be one with every custom chip remade in modern components (and finished, not WIP), and remade with modesty and taste, and the same with the 68060 - but since producing a CPU takes a hundred man-years, this will never be finished, and so I don't want anything beyond the 68060.

I will never need higher res than the highest progressive mode in AGA, but the sound and Blitter were never improved, so I would love it if we all agreed on a (performant DMA) simple sample sound card.

^ This kind of Amiga would be much more in line with Jay Miner's Vision. <3
#167
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,488
Because memory throughput still relates to performance.

Quote:

Later 68k iterations also got pipelined and got wider buses and cache... reducing the MIPS advantage to higher clock speeds only. In a cycle-by-cycle comparison, MIPS had no speed advantage vs. a contemporary 040 or 060 - just the opposite.

Emu68 can translate 68k code to aarch64 at a level of 2:1, meaning two ARM instructions per 68k instruction, in the absolute very best scenarios. So these are not only twice as many instructions, but each of them is also longer than a typical 68k instruction.

Quote:

I only gave a speed reference - a new machine would need something that is 10x faster than any current 68080 implementation as a minimum. I am not against RISC CPUs at all. Whatever works.
#168
Registered User
Join Date: Sep 2013
Location: Poland
Posts: 885
Quote:

Yes, dynamic recompilation is often nearly as fast as running native code. But... if you want to emulate the correct behaviour of the CPU (i.e. all the CCR flags etc.) then there's a penalty.

Quote:

Quote:
#169
Registered User
Join Date: Jul 2024
Location: France
Posts: 23
Quote:

Quote:

When the dual-issue 060 came out, MIPS had a quad-issue processor at 90MHz. The Alpha was still dual-issue, but at 300MHz, and at 500MHz the following year.

Quote:

Modern ARM decoders can issue many instructions per cycle. They have caches running at hundreds of gigabytes per second to feed this.

Quote:
I only gave a speed reference - a new machine would need something that is 10x faster than any current 68080 implementation as a minimum.

You set the bar far too low. At 10x faster, the 080 could issue 3.2 billion instructions per second. An Apple M3 can issue 32 billion.
#170
Registered User
Join Date: Nov 2018
Location: Germany
Posts: 119
The very best real-world scenario can go much better than your predicted 2:1… RC5-72 calculations go as high as 1:1 (or very slightly less than this)…
#171
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,488
Quote:

But also rather an edge case? What are your real-world scenarios on average, and worst case vs. best case? How much bigger is a typical translated aarch64 code section in comparison to the original 68k section?
#172
Registered User
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,488
#173
Registered User
Join Date: Nov 2018
Location: Germany
Posts: 119
Quote:

Best case is simple - an endianness conversion of a 32-bit value (ror + swap + ror): Emu68 can detect it and replace three m68k opcodes with a single aarch64 opcode. So the best case (not common though) is 1:3 (1 ARM vs 3 m68k). Second best are move to/from memory instructions, which can sometimes be merged, resulting in a 1:2 ratio (1 ARM vs 2 m68k). The worst case is supervisor instructions, since they need to check the supervisor bit and possibly throw an exception. There you can easily hit a 50:1 ratio (50 ARM vs 1 m68k).

Regarding code size, you cannot really compare that. Emu68 translates code starting from each used entry point up to a place where the translation needs to break. It means that even for a small portion of code there can be dozens of translated units. The good thing is, however, that I have almost never managed to fill the JIT cache (64MB) to a point where Emu68 needs to recover memory by flushing JIT translations.
Currently Active Users Viewing This Thread: 1 (0 members and 1 guests) | |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
68080/68060 discussion, comparisons etc | lord of time | support.Hardware | 226 | 14 October 2020 11:32 |
APOLLO CORE 68080 emulation in WinUAE ? | biozzz | support.WinUAE | 10 | 29 June 2018 13:22 |
68080 CPU on WinUAE | AMIGASYSTEM | support.WinUAE | 6 | 04 April 2017 18:51 |
vasm with Apollo Core 68080 and AMMX support | phx | News | 11 | 17 February 2017 23:22 |
Your Valued opinion please | synchro | Retrogaming General Discussion | 32 | 05 May 2007 22:35 |
|
|