Old 16 October 2019, 11:17   #16
meynaf
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 46
Posts: 3,616
Quote:
Originally Posted by NorthWay View Post
I was thinking of modern out-of-order CPUs in general. The additional logic needed to keep hundreds of instructions in flight at any one time takes so many transistors, so much heat, and so much complexity that it dwarfs much of the rest of the logic (cache etc. notwithstanding).
Yes, but does the complexity of the OoO part drop significantly when the instruction set is trimmed? If I remember correctly, it does not depend that much on the ISA.


Quote:
Originally Posted by NorthWay View Post
Well, the failure might just as well have been because the hw designers made their new toy and kicked it over to the sw side with a "make this work", and as such it was a natural consequence. There have been many opinions on the Itanium - a number of which point to its 'design by committee' nature - but I'm not sure it is _fundamentally_ flawed and can't be done right. But I also don't expect to find out, with the possible exception of Mill.
I think that if it is developer-hostile, it is fundamentally flawed.


Quote:
Originally Posted by NorthWay View Post
Compiler writers don't care much about any "beauty" of the instruction set; they want anything that makes their life easier.
But it amounts to the same thing in the end; what makes the beauty here is the efficiency...


Quote:
Originally Posted by NorthWay View Post
Well, as I said, the Mill designers heavily oppose this thinking and aim to up the parallelism a lot on regular code.
Then they haven't studied regular code...


Quote:
Originally Posted by NorthWay View Post
Yes. The size of the individual opcodes themselves, but not the parallel "packet" they are delivered in. Their worry is that to execute so many opcodes in parallel they need smaller opcodes, so they can feed and decode more of them without blowing cache/line sizes.
I still think they would be better off by using the same opcode on different data, rather than many opcodes at once.


Quote:
Originally Posted by NorthWay View Post
Well, they force you to use an intermediate Mill instruction set that is then finalized for the Mill type you are running it on. That is IMO the weakest link in their plans - not necessarily for technical reasons but for getting mental buy-in from their target audience.
But I do agree with their thinking; if you want to replace the extra complexity of modern cpus with logic that does more, then you need to expose internals to the sw side and you need to come up with new ideas to help this along. But I'll say Mill is brave to think they can do this.

(Yes, I like the Mill architecture a lot. It isn't any single (or few) thing(s), and therefore not something you can gift to whatever other cpu family, but a long long line of interconnected and coherent ideas that feel fresh and exciting. I _do_ worry about what clock speed they can achieve though.)
Well, I admit my POV is exactly the opposite. As an asm programmer, I would *not* want to code on that...

I fear this is again theoretical power on paper.

They want to execute many things in parallel, but how many tasks are CPU-bound today?
They may just implement it, and then discover that the memory interface does not keep up and their performance level is not significantly better than the others'.

Or find out that most of the instructions in their blocks are NOPs due to inter-instruction dependencies.
If the Mill is an in-order CPU, it will be beaten by OoO designs on regular code regardless of how many things it can do at once, as it WILL have to wait for intermediate results to come.


Quote:
Originally Posted by AmigaHope View Post
Eh I still find 68k the best in terms of code density *AND* readability, while still being compiler-friendly. Since it's cleaner than x86, the CISC -> microop pipeline in a theoretical modern 68k CPU could be a lot cleaner and smaller, yet still provide a lot of the advantages that x86 code has today.
It would actually beat the crap out of x86 in many places, but it's not going to happen.


Quote:
Originally Posted by AmigaHope View Post
Of course now that we're reaching the true hard limits of silicon, we might see a future where a new process (either electronic with a new chemistry, or maybe photonic?) provides 10X-100X the clock rate but with fewer transistors, and RISC might become the future again.
The current limit on clock rate is more the generated heat (which does not scale linearly with speed but worse, IIRC) than the switching speed of individual transistors.
I don't see how this could become different in the future.