News of Free 060 Like Apollo Core License - Page 2

grond · 09 November 2017, 14:34

Quote:

Originally Posted by fpgaarcade

The instruction decoder is a massive pain, but the instruction set is highly ordered (it was originally PLA decoded) and the same tricks are possible now. It's only a 16 bit opcode for goodness sake!

Last time I checked the 68k required a 16 to 176 bit decoder.

Megol · 09 November 2017, 14:55

Quote:

Originally Posted by grond

Well, anyone would appreciate options but to be realistic: the 080 is incredibly complex and the likelihood of anybody else doing anything that comes even remotely close to it is zero. I'm a microchip developer by profession and I have seen the VHDL code. It is too complex for me to even understand more than the rough structure or tiny isolated fragments of it. This is also why I laugh at all demands for open-sourcing the code. There just isn't anybody who could pick it up. It is as it is: a competitive CPU isn't something that some hobbyists can develop. We should be grateful that some equally brilliant and persistent professional CPU developers picked the Amiga and 68k CPUs for a hobby.

Doing an advanced processor is hard work, sure. But it isn't rocket science - at least not on the level of advanced we are talking about here. There are several projects where individuals and small groups have created advanced processor designs in the past and no reason why there will not be new ones in the future.

Do you have a good understanding of how a processor works internally? That is, do you understand the different forms of dependencies, and the workarounds for them, in a pipelined processor in a _practical_ system? Do you have good experience in reading and understanding VHDL? Do you have advanced FPGA design experience?

If you honestly respond "no" to any of the questions above then you frankly haven't got enough knowledge to estimate the complexity of doing a processor. And if you can answer "yes" to all of them, I'd conclude that the codebase is a mess rather than that complexity being an inherent feature of a processor design.

When many small groups of people have succeeded in making superscalar processors, deeply pipelined RISC processors and wide VLIW processors, your basic statement can't be true. But that's not all.
People have succeeded in making working out-of-order processors for FPGA designs - something I'd estimate to be at least 2 orders of magnitude harder than making a simple processor. And there are several people who have done it themselves, years ago when FPGAs were smaller and less powerful than today. And at least two have done it alone as a hobby project in a relatively short time.

So not only is making a processor possible, it is common. High performance designs are less common but they are out there.

--
Please don't take this as a personal attack - it isn't intended to be. It _is_ intended to be an attack on the negativity of your post, and I stand by that 100%.

Megol · 09 November 2017, 14:57

Quote:

Originally Posted by grond

Last time I checked the 68k required a 16 to 176 bit decoder.

For what? It has been a while since I last looked at decoding 68k instructions, but what I really wanted was an equivalent to a 16-input PAL for detecting misc. stuff.

grond · 09 November 2017, 15:09

Quote:

Originally Posted by Megol

There are several projects where individuals and small groups have created advanced processor designs in the past and no reason why there will not be new ones in the future.

How many of those were CISC ISAs with 2-operand code and complex ea computation?

Quote:

If you honestly respond "no" to any of the questions above then you frankly haven't got enough knowledge to estimate the complexity of doing a processor.

I do have to answer "no" to at least one of the questions but I still consider myself qualified to estimate the complexity of doing such a processor.

Quote:

When many small groups of people have succeeded in making superscalar processors, deeply pipelined RISC processors and wide VLIW processors, your basic statement can't be true. But that's not all.
People have succeeded in making working out-of-order processors for FPGA designs - something I'd estimate to be at least 2 orders of magnitude harder than making a simple processor. And there are several people who have done it themselves, years ago when FPGAs were smaller and less powerful than today. And at least two have done it alone as a hobby project in a relatively short time.

Since you were one of the 68k-ISA visionaries on the natami-forum, I guess your FPGA-implemented 68k core is close to completion?

Quote:

Originally Posted by Megol

For what? It has been a while since I last looked at decoding 68k instructions, but what I really wanted was an equivalent to a 16-input PAL for detecting misc. stuff.

A 68k instruction can be up to 22 bytes long, which is why a decoder for a fast superscalar 68k is a completely different beast from that for some RISC. After all, you'd want to decode more than one instruction per clock cycle (of course, you need not decode 22 bytes at once, but just looking at 16 bits' worth of instructions at a time would be far from fast, albeit indeed simple).

fpgaarcade · 09 November 2017, 15:52

For the moment I am going for a simple pipeline and maximum clock speed.

A single core at 200MHz will usually outperform a twin superscalar at 100MHz, and is a significantly smaller and easier-to-maintain design. Perhaps your roadmap takes you to 4 or 8 execution units, but the dispatcher complexity will most likely limit your core clock rate even further.
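As a rough illustration of that trade-off, here is a minimal sketch using the simple model "throughput = clock x instructions per cycle". The IPC values are made-up placeholders, not measurements of any real core, and the function names are hypothetical.

```python
def throughput_mips(clock_mhz: float, ipc: float) -> float:
    """Millions of instructions per second under the model clock * IPC."""
    return clock_mhz * ipc


def break_even_ipc(clock_a_mhz: float, ipc_a: float, clock_b_mhz: float) -> float:
    """IPC that design B must sustain to match design A's throughput."""
    return clock_a_mhz * ipc_a / clock_b_mhz


# Hypothetical numbers: a single-issue pipeline averaging 0.8 IPC at 200MHz
# versus a dual-issue design averaging 1.3 IPC at 100MHz.
single = throughput_mips(200.0, 0.8)
dual = throughput_mips(100.0, 1.3)
needed = break_even_ipc(200.0, 0.8, 100.0)
print(single, dual, needed)
```

Under these assumed figures the dual-issue core would need to sustain about 1.6 instructions per cycle just to draw level, which is a large fraction of its theoretical maximum of 2.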

Megol · 09 November 2017, 17:33

Quote:

Originally Posted by grond

How many of those were CISC ISAs with 2-operand code and complex ea computation? I do have to answer "no" to at least one of the questions but I still consider myself qualified to estimate the complexity of doing such a processor.

EA = base + index*scale + offset32 = base + (index << n) + offset32

3 input addition with a (small) shift isn't complex? Do you mean that it can be described in several ways or that the base can be updated?
The base being updated means adding a dedicated addition/subtraction unit and making sure the register file can support multiple writes per cycle.
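The address calculation above can be sketched in a few lines. This is only an illustration of the arithmetic, assuming 32-bit wraparound and a scale of 1, 2, 4 or 8 (n = 0..3); the function name and argument names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base: int, index: int, n: int, offset32: int) -> int:
    """EA = base + (index << n) + offset32, truncated to 32 bits."""
    if not 0 <= n <= 3:
        raise ValueError("scale shift must be 0..3 (scale 1, 2, 4 or 8)")
    scaled_index = (index << n) & MASK32
    # A three-input add; masking models 32-bit wraparound and also
    # handles a negative offset passed as a Python int.
    return (base + scaled_index + offset32) & MASK32


print(hex(effective_address(0x1000, 3, 2, 0x10)))
```

A base update (post-increment or pre-decrement) would be a second, separate addition on the base register, which is the extra adder and extra register write mentioned above.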

The problem isn't the 2-operand ISA, nor that it's complex, but that the instruction format is complex. Or actually not even that, but that there are so many instruction formats - there are special cases everywhere.

But decoding is one part of the pipeline, something that, while important (of course!), doesn't really determine the complexity of doing a processor.

The 68080/Apollo seems to use a relatively simple pipelined design that does many things in a few pipeline stages. That makes things more complicated.

But that's their choice.

Quote:

Since you were one of the 68k-ISA visionaries on the natami-forum, I guess your FPGA-implemented 68k core is close to completion?

LOL!
If you'd read anything I wrote you'd know that I dropped any work on a 68k design ages ago; the only effort spent nowadays is the occasional hour thinking about some specific aspect of it.

And I've given the reasons several times before - lack of interest, lack of a realistic project for cooperation and, more importantly, lack of concentration. I have severe problems with working memory. That makes some aspects of design take too long to be worth it.

If you were trying to insult me or imply that I don't know what I'm talking about, well... Not impressed, you could do better. Like claiming that my AGU design wasn't real, as someone else once did.

Quote:

A 68k instruction can be up to 22 bytes long, which is why a decoder for a fast superscalar 68k is a completely different beast from that for some RISC. After all, you'd want to decode more than one instruction per clock cycle (of course, you need not decode 22 bytes at once, but just looking at 16 bits' worth of instructions at a time would be far from fast, albeit indeed simple).

I don't see why the maximum length is interesting. Length _is_ interesting especially in a superscalar design but maximum length isn't a problem.

In one clock cycle one needs to decode the length of one instruction. The things that can vary are the size of the immediate field and the size of the EA extension words. I'll ignore 32 bit instructions and also instructions with two addresses.

The immediate size, and whether there is at least one EA extension word (brief or full extension word), can be seen in the first 16 bits of an instruction.
The problem then becomes determining whether there's an additional EA word, where it is located, and decoding the number of words following it.
This requires decoding (instruction), muxing (extension word) followed by decoding (extension word), which is costly.
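To make the two-step dependency concrete, here is a small sketch that counts the extension words for a single EA field. It follows the 68020-style encoding as I understand it (mode/register field in the opcode, bit 8 of the extension word selecting brief or full format, BD SIZE in bits 5-4, outer displacement size in the low bits of I/IS). The function names are hypothetical, reserved encodings are only partly checked, and it makes no claim about how any real core does this.

```python
from typing import Optional

# Words occupied by the base displacement, keyed by the BD SIZE field
# (bits 5-4 of a full extension word); the value 0 is reserved.
BD_WORDS = {1: 0, 2: 1, 3: 2}
# Words occupied by the outer displacement, keyed by bits 1-0 of I/IS.
# Reserved I/IS patterns are not rejected in this sketch.
OD_WORDS = {0: 0, 1: 0, 2: 1, 3: 2}


def indexed_words(ext: int) -> int:
    """Words used by an indexed EA, counting the extension word itself."""
    if not ext & 0x0100:          # bit 8 clear: brief format, d8 is in the word
        return 1
    bd_size = (ext >> 4) & 0x3
    if bd_size == 0:
        raise ValueError("reserved base displacement size")
    return 1 + BD_WORDS[bd_size] + OD_WORDS[ext & 0x3]


def ea_words(mode: int, reg: int, imm_words: int = 1,
             ext: Optional[int] = None) -> int:
    """Extension words that follow the opcode for one EA field."""
    if mode < 5:                  # Dn, An, (An), (An)+, -(An)
        return 0
    if mode == 5:                 # (d16,An)
        return 1
    if mode == 6 or (mode == 7 and reg == 3):
        # Indexed modes: the opcode alone cannot give the length.
        if ext is None:
            raise ValueError("length depends on the extension word")
        return indexed_words(ext)
    if mode == 7 and reg in (0, 2):   # abs.W, (d16,PC)
        return 1
    if mode == 7 and reg == 1:        # abs.L
        return 2
    if mode == 7 and reg == 4:        # immediate; size comes from the opcode
        return imm_words
    raise ValueError("invalid EA encoding")
```

For most modes the count falls out of the first 16 bits, but the indexed modes need the extension word fetched and decoded first. With a full extension word of 0x0133 (long base displacement, long outer displacement) `ea_words(6, 0, ext=0x0133)` is 5, and an opcode word plus two such operands gives 1 + 5 + 5 = 11 words, which matches the 22-byte maximum quoted above.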

For a decoder to be able to decode two full 68k instructions it becomes even more complicated, as the above has to be done in a chain. It'd be easier to support a full instruction in the first "slot" and a simpler one in the second "slot". That's AFAIK the way the 68080 design does it? No documentation is available of course, and facts are prone to change over there.

Decoding being done over several cycles isn't a problem assuming there's a good branch predictor available - a sub-percent performance hit at most.
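A back-of-the-envelope version of that estimate, assuming each extra decode stage only costs one additional cycle per mispredicted branch. All the numbers below are placeholders chosen for illustration, not figures for any real 68k core.

```python
def decode_stage_slowdown(branch_fraction: float, mispredict_rate: float,
                          extra_stages: int, base_cpi: float = 1.0) -> float:
    """Fractional slowdown caused by a longer refill after mispredicts."""
    # Extra cycles per instruction: only mispredicted branches pay
    # for the additional decode stages.
    extra_cpi = branch_fraction * mispredict_rate * extra_stages
    return extra_cpi / base_cpi


# Hypothetical workload: 15% branches, 3% of them mispredicted,
# two extra decode stages, baseline of one cycle per instruction.
print(decode_stage_slowdown(0.15, 0.03, 2))
```

With these assumed inputs the slowdown comes out just under one percent; a worse predictor or a branchier workload would push it higher.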

The 68080 may use some form of pre-decode data to reduce work in the critical path. Gunnar has claimed to use it before, but again things change over there so it may not currently be true.

I'm realizing that this is getting more and more OT, sorry. If someone wants to discuss this, open up a new thread.

Megol · 10 November 2017, 12:15

I think this is a good announcement even if not great. There are problems finding '60 CPUs, and those that can be found are commonly salvaged from old hardware and of unknown reliability. Remarking of 68060 chips is also common.

There is a need for a high performance 68k processor and this cut-down Apollo core can provide that. But not for free - it is a closed source design, which opens up the problem of future availability. And it isn't 100% 68060 compatible as there is no FPU and no MMU. That they may be provided in a theoretical future redesign doesn't matter - as it is, a 68060 compatible chip without an MMU means systems using memory protection can't use it.

So in short this is good, filling an existing availability hole, but it can also be seen as blocking potential efforts to make an open source 68k design that is truly 68060 compatible. Why do something when a reasonable alternative already exists?

But reality beats theory every time: thanks Gunnar and co., didn't see this coming!

Kelv · 10 November 2017, 18:12

With the limitations, would this mean that Aros68K would run at a reasonable speed?

Would be nice to see a useful common base platform.
I'm assuming this is being driven by a need to get Apollo-core specific extensions into compilers and supported by devs etc.
This could be a good thing as long as it is to everyone's advantage.
Obviously the premium performance would be from Apollo hardware, but it depends on the requirements imposed for OS hardware implementations, i.e. would it end up costing nearly as much to produce as just buying an Apollo-core product? (Those standalones at Amiga32 looked sweet.)

wawa · 11 November 2017, 01:54

aros68k has been demoed to run at reasonable speed with the native rtg graphics and the like but nothing in this respect has been committed back to aros repo. so the state of this is unknown.

grond · 11 November 2017, 10:31

Quote:

Originally Posted by wawa

aros68k has been demoed to run at reasonable speed with the native rtg graphics and the like but nothing in this respect has been committed back to aros repo. so the state of this is unknown.

I think this speed-up was caused entirely by the P96 driver for the Vampire which in itself is not AROS-specific.

wawa · 11 November 2017, 12:48

im not sure if aros68k vampire aros driver is a native hidd or it is the same p96 driver that is used for the genuine os, in case of aros accessed via p96 wrapper.

perhaps speedup (and stability) might have been achieved avoiding the wrapper, which still has some quirks afair. opening further screens with limited ram resources as example.

one way or the other more communication could be helpful here, not necessarily sharing the code, if there is no wish to do so, aros license doesnt even require it afaik, just to know where we are heading and to coordinate efforts. just saying, aros rtg subsystem or drivers are rubbish doesnt lead anywhere particularly.

btw. is michael still active? is the driver released or is it in limbo?

grond · 11 November 2017, 13:09

I'm pretty sure it was just the new P96 driver that was used with AROS in those videos without any modification to adapt it to AROS. This is a driver written by flype and buggs. If it is still closed-source, this would be because it isn't complete yet.

wawa · 11 November 2017, 14:40

i was under impression aros driver was in works by mness. but he hasnt posted on dev ml for some time nor responded. if its just p96 driver then its fine. it can stay closed. so, can you enlighten me?

grond · 11 November 2017, 15:11

I haven't seen mness on IRC in some time. He was trying to fix the IDE driver issue. I think he fixed it in some way to make it work better but it is still not as reliable as it should be. Most of all its reliability still seems to depend a lot on timings which seems to indicate that the root cause of the problem hasn't been found yet.
