AGA integration in Apollo core : work in progress - Page 13

meynaf · 29 June 2017, 18:17

Quote:

Originally Posted by grond

Then tell us the good reason. I will: transistors were scarce and nobody could imagine having two or more CPUs on a single chip.

Again trying to argue against today's technology with 1980s technical arguments.

Then tell me why today's machines, while being multi-core, still use DMA.
These are not 1980s technical arguments.

Quote:

Originally Posted by grond

And again zero sustainable technical arguments from you.

I will try to ignore this.

Quote:

Originally Posted by Signman

Looks like somebody is mad because he was kicked out of the treehouse.
Give back your decoder ring.

Leaving was my choice.

Quote:

Originally Posted by DamienD

Again, if you cannot get along I will just close the thread as I don't have hours free to deal with this constant shite.

I have a strange desire to write "yes, please do" here. How odd.

Quote:

Originally Posted by DamienD

Last warning to all involved, keep on topic without name calling / fighting etc. We as Global Mods have discussed this, if it doesn't stop then we will start giving repeat offenders 1 / 2 week bans; unfortunately nothing else seems to work...

This is fair.

Quote:

Originally Posted by grond

That's obviously the plan: if you can't stop the product, you can at least stop the threads about it.

Apparently not, as more are open all the time

grond · 29 June 2017, 18:26

Quote:

Originally Posted by meynaf

Then tell me why today's machines, while being multi-core, still use DMA.
These are not 1980s technical arguments.

Because today's machines run multi-processor, multi-threading operating systems and can put an additional CPU core to better use than just copying data around. Not a 1980s technical argument?

Samurai_Crow · 29 June 2017, 18:41

Quote:

Originally Posted by grond

Because today's machines run multi-processor, multi-threading operating systems and can put an additional CPU core to better use than just copying data around. Not a 1980s technical argument?

Adding to this, the Intel i-series uses both multicore and hyperthreading techniques. A quad-core i7, for example, has eight threads.

Gorf · 29 June 2017, 18:53

Quote:

Originally Posted by grond

Because today's machines run multi-processor, multi-threading operating systems and can put an additional CPU core to better use than just copying data around. Not a 1980s technical argument?

That is one reason - the main reason is still memory-bandwidth. You want your CPUs to do as few memory operations as possible, since moving something that is not in cache is still horribly slow. The problem increases of course with more cores/threads.

Gorf · 29 June 2017, 19:01

Quote:

Originally Posted by grond

Then tell us the good reason. I will: transistors were scarce and nobody could imagine having two or more CPUs on a single chip.

Again trying to argue against today's technology with 1980s technical arguments. And again zero sustainable technical arguments from you.

You asked about DMA vs. virtual Core.

For just one specific task, there is not so much difference: you can use a DMA to offload the work or a second CPU.
But since hyper threading does not exactly double your speed, you will slow down the other thread a little bit while doing so.

It becomes more complicated if you want to offload more than one task.
You can have a dedicated DMA controller for each purpose and things will get done without bothering the CPU.

If you want to do different DMA tasks with one thread of your HT CPU ... well, then you end up needing an extra scheduler for this or some support in the OS or both.
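
To make the "extra scheduler" point concrete, here is a minimal sketch in Python. It is not Apollo or AmigaOS code: the job queue, the `helper_thread` function and the two example transfers are all hypothetical. It only shows that once a single helper thread stands in for several DMA channels, something has to queue the jobs and signal completion.

```python
import queue
import threading

def helper_thread(jobs):
    """Drain copy jobs one at a time, like one helper thread replacing several DMA channels."""
    while True:
        job = jobs.get()
        if job is None:            # sentinel: no more work
            break
        src, dst, done = job
        dst[:] = src               # the actual "transfer"
        done.set()                 # completion flag, as a DMA channel would raise

jobs = queue.Queue()               # this queue is the "extra scheduler"
worker = threading.Thread(target=helper_thread, args=(jobs,))
worker.start()

# Two hypothetical transfers that would each have had their own DMA channel.
audio_src, audio_dst = bytearray(b"audio"), bytearray(5)
disk_src, disk_dst = bytearray(b"disk!"), bytearray(5)
audio_done, disk_done = threading.Event(), threading.Event()

jobs.put((audio_src, audio_dst, audio_done))
jobs.put((disk_src, disk_dst, disk_done))   # has to wait behind the audio job

audio_done.wait()
disk_done.wait()
jobs.put(None)
worker.join()
```

With dedicated DMA controllers the two transfers would proceed independently; here the second one is serialized behind the first, which is the cost being described.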

daxb · 29 June 2017, 19:02

@Overflow:
Thanks for post #215

How does Vampire/Apollo IDE work? How much CPU time is used? Isn't the CPU much too slow to have fast IDE? My FastATA A1200 uses nearly all CPU time to be "fast" (~8MB/sec.). As far as I've heard, the old DMA or UDMA standards don't use much CPU time. Will the standalone get a decent SATA controller so we get standard transfer speeds?

An MMU is a basic part of most 68k CPUs (030, 040, 060). Apollo is or will be 680x0 compatible. Hence it isn't unusual to assume there will be one. The same goes for the FPU, but later, after people noticed and asked, nothing was promised. "What you get is only a gift" is the answer. Of course this leads to discussions.

@OlafSch:
What is an MMU application, or what do you think it is? As far as I know an MMU is a hardware unit/technique (a kind of layer between CPU and memory) that can be used or not. So an application can only make use of it. You can program the MMU directly or use mmu.library to access it. So an "MMU application" does not exist.

That an MMU slows things down much I can't confirm. In fact I don't notice it with 040/40 and mmu.library/MuForce running. If an MMU slows down the Apollo core much, then the FPGA is too small/slow or the design is not good enough and would need improvement/changes. Overall I'm of the opinion that speed is not everything.

If someone says that an MMU is only useful for developers, then that is wrong. Maybe I'm the only one, but I'm not a developer and I use tools that make use of the MMU.

Gorf · 29 June 2017, 19:15

About the Hyper-Thingy

I don't think HT on Apollo/Vampire is a bad idea. In fact I was asking for more cores in the FPGA myself. HT is at least a small step towards more cores.

What for? For one there is AROS-SMP, which might possibly be adapted ... once it is stable enough.

And there is sandboxing. But of course this would need an exposed MMU.
But then it would allow running something like a second instance of Exec in a sandbox on the second core - very nice for development and testing, but also for security.

But this wish was low priority - FPU and MMU would have been much more important I think.
Sure, this is up to Gunnar - but I can understand that some people might be a little disappointed.

meynaf · 29 June 2017, 19:26

Quote:

Originally Posted by Gorf

And there is sandboxing. But of course this would need an exposed MMU.
But then it would allow running something like a second instance of Exec in a sandbox on the second core - very nice for development and testing, but also for security.

I see no reason why sandboxing would need a second core ?

Lord Aga · 29 June 2017, 19:27

Quote:

Originally Posted by StingRay

Well, WHDLoad uses the MMU quite extensively!

What does it use it for? Honest question, I have no idea.
I had an accelerated system without an MMU, and now I have one with an MMU, and I didn't notice any difference with WHDLoad.

Gorf · 29 June 2017, 19:34

Quote:

Originally Posted by meynaf

I see no reason why sandboxing would need a second core ?

You are right of course. It just needs an MMU.
But it would also be a way to make use of a second core (or more) on AmigaOS without breaking compatibility:

Take a raytracer or a decoding datatype and offload portions of the task to different sandboxes and cores - fetch the data when finished.
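
The "offload portions, fetch the data when finished" pattern can be sketched roughly as below. This is only an illustration in Python: the worker threads stand in for the sandboxes/cores, and `render_rows` is a made-up placeholder for real raytracing work, not anything from AmigaOS or the Apollo core.

```python
from concurrent.futures import ThreadPoolExecutor

def render_rows(first_row, last_row, width):
    """Hypothetical stand-in for a raytracer: 'shade' rows [first_row, last_row)."""
    return [[(x + y) % 256 for x in range(width)] for y in range(first_row, last_row)]

def render_image(height, width, workers):
    """Split the image into bands, hand each band to a worker, then collect the results."""
    rows_per_worker = (height + workers - 1) // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(render_rows, start, min(start + rows_per_worker, height), width)
            for start in range(0, height, rows_per_worker)
        ]
        image = []
        for future in futures:          # "fetch the data when finished"
            image.extend(future.result())
    return image

image = render_image(height=8, width=4, workers=2)
```

The main program never shares state with the workers while they run; it only hands out a band and picks up the finished rows, which is what would keep such a scheme compatible with software that knows nothing about a second core.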

Amiga1992 · 29 June 2017, 19:37

Quote:

Originally Posted by Lord Aga

What does it use it for? Honest question, I have no idea.

Mostly developer stuff, not stuff one would notice as a user/gamer.
http://whdload.de/docs/en/mmu.html

Quote:

it uses the MMU for memory protection, cache management and some special features like Snooping and resload_Protect#?.

Gorf · 29 June 2017, 19:40

Quote:

Originally Posted by Akira

Mostly developer stuff, not stuff one would notice as a user/gamer.
http://whdload.de/docs/en/mmu.html

Yes.

If you get lucky, the protection the MMU offers will allow you to recover from a crash within WHDLoad without rebooting.
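
As a rough picture of what "protection" means here, the toy model below marks pages read-only and turns a stray write into a catchable fault instead of silent memory corruption. It is a simplified Python sketch, not how WHDLoad or a 68k MMU is actually programmed; `ToyMmu`, `AccessFault` and the 4 KB page size are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model

class AccessFault(Exception):
    """Raised instead of letting a write land in a protected page."""

class ToyMmu:
    def __init__(self, size):
        self.memory = bytearray(size)
        self.read_only_pages = set()

    def protect(self, address):
        # Mark the whole page containing this address as read-only.
        self.read_only_pages.add(address // PAGE_SIZE)

    def write(self, address, value):
        if address // PAGE_SIZE in self.read_only_pages:
            raise AccessFault(f"write to protected address {address:#x}")
        self.memory[address] = value

mmu = ToyMmu(4 * PAGE_SIZE)
mmu.protect(0)                  # e.g. guard the lowest page
mmu.write(PAGE_SIZE, 0x42)      # an ordinary write goes through

try:
    mmu.write(4, 0xFF)          # a stray write is trapped ...
    recovered = False
except AccessFault:
    recovered = True            # ... and the caller gets a chance to clean up
```

Without the check, the stray write would simply have overwritten memory and the damage would only show up later, typically as a crash that needs a reboot.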

grond · 29 June 2017, 20:05

Well, the 080 core has normal DMA (start address, destination address, number of bytes, pull trigger, go) already in case facts about the present state are of any interest here. DMA should go through the cache for coherency such that there is again no real difference when comparing to a second CPU. Just try to see the 2nd thread as a flexibly programmable DMA controller and one that uses all resources that are currently not used by the main thread: it's completely for free! You could also use it as a software blitter doing pixel format conversion on-the-fly and much more.
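
The "start address, destination address, number of bytes, pull trigger, go" description maps onto a very small model. The sketch below is Python and purely illustrative: `DmaDescriptor` and `run_dma` are hypothetical names, and nothing here reflects the real register layout of the 080 core.

```python
from dataclasses import dataclass

@dataclass
class DmaDescriptor:
    source: int        # start address
    destination: int   # destination address
    length: int        # number of bytes

def run_dma(memory, desc):
    """'Pull the trigger': copy desc.length bytes inside a flat memory model."""
    src_end = desc.source + desc.length
    dst_end = desc.destination + desc.length
    if (desc.length < 0 or min(desc.source, desc.destination) < 0
            or max(src_end, dst_end) > len(memory)):
        raise ValueError("transfer outside of memory")
    # The right-hand slice is copied first, so overlapping ranges are safe.
    memory[desc.destination:dst_end] = memory[desc.source:src_end]

memory = bytearray(b"ABCD" + bytes(4))
run_dma(memory, DmaDescriptor(source=0, destination=4, length=4))
```

A plain block copy like this needs no decisions while it runs, which is why it can be described by three numbers and a trigger.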

meynaf · 29 June 2017, 20:24

What happens if both threads need the same resource at the same time ?

Gorf · 29 June 2017, 20:31

Quote:

Originally Posted by grond

Well, the 080 core has normal DMA (start address, destination address, number of bytes, pull trigger, go) already in case facts about the present state are of any interest here. DMA should go through the cache for coherency such that there is again no real difference when comparing to a second CPU. Just try to see the 2nd thread as a flexibly programmable DMA controller and one that uses all resources that are currently not used by the main thread: it's completely for free! You could also use it as a software blitter doing pixel format conversion on-the-fly and much more.

As I pointed out already: it is not completely for free:

On a real second core it would occupy some memory bandwidth - so you have to code and schedule things very carefully, so nothing gets slowed down.
This is a problem you can observe on every multicore system out there.

It gets worse if you just have two virtual cores through hyper-something:
the second thread will slow down the first one, since not everything can be absorbed just by a superscalar design.

Samurai_Crow · 29 June 2017, 20:35

Quote:

Originally Posted by meynaf

What happens if both threads need the same resource at the same time ?

The main thread takes precedence over the slave thread.

Gorf · 29 June 2017, 20:41

Quote:

Originally Posted by Samurai_Crow

The main thread takes precedence over the slave thread.

That does not help if, and this is the common case, both threads need the same functional units in the core.
These units are of course limited - there may even be some redundancy on a superscalar CPU, but once all the units needed are assigned to the operations in the pipeline, one thread has to wait.
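
A toy model makes the two positions easier to compare. The Python sketch below assumes strict priority for the main thread and a single kind of functional unit; the function name and the numbers are invented for illustration and say nothing about the real Apollo pipeline.

```python
def helper_finish_cycle(units, main_demand, helper_ops):
    """Return the 1-based cycle in which the helper thread retires its last op.

    units       -- functional units available per cycle
    main_demand -- units the main thread wants in each cycle (it is served first)
    helper_ops  -- operations the helper thread has queued
    Returns None if the helper is still not done when main_demand runs out.
    """
    remaining = helper_ops
    for cycle, wanted in enumerate(main_demand, start=1):
        used_by_main = min(wanted, units)
        free = units - used_by_main          # leftovers go to the helper thread
        remaining -= min(free, remaining)
        if remaining == 0:
            return cycle
    return None

# Main thread leaves one unit idle per cycle: the helper rides along "for free".
slack = helper_finish_cycle(units=2, main_demand=[1, 1, 1, 1], helper_ops=4)

# Main thread saturates both units: the helper makes no progress at all.
starved = helper_finish_cycle(units=2, main_demand=[2, 2, 2, 2], helper_ops=4)
```

In this model `slack` is 4 and `starved` is `None`: with strict priority the main thread is never slowed, but the second thread only gets what is left over, and that can be nothing.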

Lord Aga · 29 June 2017, 20:52

Quote:

Originally Posted by Akira

Mostly developer stuff, not stuff one would notice as a user/gamer.
http://whdload.de/docs/en/mmu.html

Quote:

Originally Posted by Gorf

If you get lucky, the protection the MMU offers will allow you to recover from a crash within WHDLoad without rebooting.

Oh, that's cool, thanks guys. It's not a huge deal, but I guess it's better to have it than to not have it

TrashyMG · 29 June 2017, 20:54

That said, I've never encountered any hard crashes or reboots with WHDLoad on any of the games I've played on my Vampired A600. Of course I haven't tried any AGA games yet, waiting for the Gold 3 core for that.

pandy71 · 29 June 2017, 20:54

Short: meynaf is right.
Long: DMA can provide a time-deterministic response to time-critical events, which is usually beyond the capabilities of modern CPUs (unless the CPU is specially designed for it). Nowadays DMA engines usually include basic data processing, so they can be considered special-case CPUs - they can even assist decoding for modern video codecs such as H.264 or detect particular bitstream errors (classic examples are the DMA channels frequently called FDMA - Flexible DMA - in modern multimedia player SoCs).
With clever (proper) system design, DMA may use cycles wasted by the CPU or may use the available bus time significantly more efficiently than the CPU.
A general-purpose CPU usually executes code with lots of conditional cases, whereas DMA is usually block-oriented with a relatively simple data flow (no conditionals, or only highly limited ones - error detection etc.).
DMA can be emulated by a CPU, but usually at the cost of additional cycles.
Bus access is always a problem, so you need a good architectural design for arbitrating concurrent accesses (a good bus arbiter is crucial for overall system performance).
A good illustration of this is the general trend in modern CPUs (those with long queue lengths) to avoid interrupts in favor of polling...
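
For readers who have not met a bus arbiter before, here is one simple policy as a sketch. It is a round-robin arbiter in Python, chosen only because it is easy to follow; the post does not say which policy a real design should use, and the master names `"cpu"` and `"dma"` are placeholders.

```python
def round_robin_arbiter(requests, masters):
    """Grant the bus to one master per cycle, rotating priority after each grant.

    requests -- one set per cycle containing the masters that want the bus
    masters  -- fixed list of bus masters, in initial priority order
    Returns the master granted in each cycle (None when the bus stays idle).
    """
    grants = []
    last = len(masters) - 1                 # so masters[0] is first in line
    for wanting in requests:
        granted = None
        for offset in range(1, len(masters) + 1):
            idx = (last + offset) % len(masters)
            if masters[idx] in wanting:
                granted = masters[idx]
                last = idx                  # the winner goes to the back of the line
                break
        grants.append(granted)
    return grants

grants = round_robin_arbiter(
    [{"cpu", "dma"}, {"cpu", "dma"}, {"cpu"}, set(), {"dma"}],
    masters=["cpu", "dma"],
)
```

When both masters compete, the grants alternate, so neither the CPU nor the DMA channel can be locked out; a fixed-priority arbiter would instead always favour one side, which gives determinism to that side at the other's expense.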
