Blithog behavior

ovale · 10 January 2015, 23:32

Hello,

The HRM says "If DMAF_BLITHOG is a 0, the DMA manager will monitor the 68000 cycle requests. If the 68000 is unsatisfied for three consecutive memory cycles, the blitter will release the bus for one cycle."

Assuming:
- 5 bitplanes are active
- a blit with 4 channels is active (no idle cycles)
- BLITHOG is zero
- the code is in chip RAM

Then how is a movem.w to chip RAM sequenced?

Is this sequence correct?

vvvbvrvbvvvrvbvbvvvwvbvbvvvw...
v = video DMA
b = Blitter DMA
r = CPU read access
w = CPU write access

I ask because this behavior seems to really penalize the 68k compared to the Blitter.

Thanks
Ovale

mc6809e · 11 January 2015, 02:46

Ideally the blitter would always yield to the CPU, since the CPU can always be stopped with a STOP instruction if need be.

Perhaps there was some technical reason for the three cycle rule. At least two cycles seem to be required since the CPU asserts *AS around the time Agnus is setting things up for a transfer on the next cycle and the next cycle itself must complete before letting the CPU back in.

Still, I don't think the CPU is too heavily penalized.

Toni Wilen · 11 January 2015, 11:52

I checked using my logic analyzer and it seems to always repeat the same sequence after it has "synced":

B4C2B351 C4B2B351 C4B2B351 C4B2B351

(B = blitter, C = CPU, number = plane)

The logic I have used in emulation still seems to work: if the CPU has waited at least 4 cycles, give any future blitter cycle to the CPU. Reset the counter when the CPU gets any free cycle.
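As a rough illustration of that rule (a sketch, not WinUAE's actual code), the small Python model below replays it over the 8-cycle fetch unit visible in the trace. The names `FETCH_UNIT`, `CPU_WAIT_LIMIT` and `simulate` are made up for this example. It assumes the CPU requests the bus on every cycle, the blitter never idles, and the wait counter counts every cycle the CPU was not served, bitplane slots included; none of these details are confirmed in the thread.

```python
# Slot layout of one 8-cycle fetch unit with 5 bitplanes, read off the
# logic-analyzer trace above: digits are bitplane fetches, None marks a
# slot that is left over for the blitter or the CPU.
FETCH_UNIT = [None, "4", None, "2", None, "3", "5", "1"]

CPU_WAIT_LIMIT = 4  # "if the CPU has waited at least 4 cycles"


def simulate(units, waited=0):
    """Return the bus owner of each cycle, one group per fetch unit.

    Assumptions (not from the thread): the CPU requests the bus on every
    cycle, the blitter wants every free slot, and `waited` counts every
    cycle in which the CPU was not served, including bitplane slots.
    """
    groups = []
    for _ in range(units):
        group = ""
        for slot in FETCH_UNIT:
            if slot is not None:
                # Bitplane DMA owns this slot; the CPU keeps waiting.
                group += slot
                waited += 1
            elif waited >= CPU_WAIT_LIMIT:
                # CPU has waited long enough: it takes the blitter's slot.
                group += "C"
                waited = 0
            else:
                # Otherwise the blitter keeps the free slot.
                group += "B"
                waited += 1
        groups.append(group)
    return " ".join(groups)


if __name__ == "__main__":
    # Starting with the CPU already 3 cycles into its wait.
    print(simulate(4, waited=3))
```

Under these assumptions, starting with the CPU already three cycles into its wait reproduces the captured pattern, and in the steady state the CPU ends up with one of the three non-bitplane slots in each fetch unit.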

ovale · 11 January 2015, 15:05

Thanks for the answers.

Assume, this time, that the code is in fast RAM. The 68k will still get 3 wait states (6 DMA cycles) for every write.

Is there any way to use fast RAM and prefetch to gain better parallelism between a plain 68k and the Blitter when copying data to chip RAM?

I guess not, but I'm not an expert, so it's better to ask.

mc6809e · 12 January 2015, 06:46

Intentionally using some of the slower channel combinations might improve concurrency in some situations and improve overall speed.

For example, a B->D copy has one idle cycle. These idle cycles, combined with BLITHOG turned off, will allow the CPU to touch chip RAM five times for every six blitter accesses during bitplane fetch of a five-bitplane display.

(BTW, technically every DMA cycle that blocks the CPU introduces two waits.)

ovale · 12 January 2015, 08:05

Yes, I thought of Blitter idle cycles too, but I hoped for some 68k trick.

By the way, is there any way to obtain idle cycles when using all 4 Blitter channels?

Thanks

Edit: I mean, I hoped that some of the techniques used in c2p routines could be used in this case too



