English Amiga Board


Old 28 July 2014, 21:46   #1
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,658
Under which circumstances is an extra-read required?

...before relying on the value.

Specifically: the extra read before relying on the result of the blitter busy flag. You know, this one.

Code:
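	tst.w DMACONR		; extra read before relying on BBUSY
.l:	btst #6,DMACONR		; byte access: bit 6 of the high byte = BBUSY (bit 14)
	bne.s .l		; loop while the blitter is busy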
1) What is the actual reason and on which models is it required?
2) Is it required if the code is running in chipmem? (Such code works fine without extra read on A1200-060, with FMODE=0.)
3) Why isn't it required for VHPOS, INTREQ, etc? (also pollable regs set by custom chips)
Old 28 July 2014, 21:48   #2
Galahad/FLT
Going nowhere
 
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,017
It's a bug in some German A1000s, isn't it?
Old 28 July 2014, 21:48   #3
Photon
That's the original reason, at least, from the HRM.
Old 28 July 2014, 23:15   #4
phx
Natteravn
 
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,545
IIRC the A1000 Agnus sets the BBUSY bit only when the first Blitter DMA slot after a Blitter start is reached. There might be situations where the CPU is fast enough to check BBUSY before that happens (Fast RAM?).

Later Agnus versions (Fat Agnus?) set the BBUSY bit at the same time the BLTSIZE is written.

Toni Wilen knows the details for sure...
Old 29 July 2014, 01:37   #5
Lonewolf10
AMOS Extensions Developer
 
Join Date: Jun 2007
Location: near Cambridge, UK
Age: 44
Posts: 1,924
Quote:
Originally Posted by phx View Post
Toni Wilen knows the details for sure...
Yes he does.

I'm sure this was only discussed a few weeks back, though it may have been when a thread topic drifted a little so it's not something easily found.

Perhaps this thread should be made sticky??
Old 29 July 2014, 02:01   #6
Photon
Quote:
Originally Posted by phx View Post
IIRC the A1000 Agnus sets the BBUSY bit only when the first Blitter DMA slot after a Blitter start is reached. There might be situations where the CPU is fast enough to check BBUSY before that happens (Fast RAM?).

Later Agnus versions (Fat Agnus?) set the BBUSY bit at the same time the BLTSIZE is written.

Toni Wilen knows the details for sure...
This theory isn't true, because if it were only about speed, tst'ing any read-only custom register would suffice. The solution specifically states to read DMACONR once before relying on its contents.

I've tipped off Toni in the hope that his probe will reveal the truth.
Old 29 July 2014, 08:33   #7
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,573
I think it was simply much easier to document it as "read DMACONR twice" than to give some long and boring explanation which everyone would ignore or misunderstand anyway.
Old 29 July 2014, 09:49   #8
Photon
I'm more interested in how it works than how it was documented, unless the documentation is perfect and complete of course, and we're done here?

The goal is of course to remove as many unnecessary extra reads as possible (since blits are often in inner loops) while staying compatible under the right circumstances. For example, the extra read doesn't seem to be required because of CPU speed or because of prefetch.
Old 29 July 2014, 11:11   #9
phx
In fact you can access any custom chip register to generate the required delay. For example in my WAITBLIT I'm setting the blitter priority bit before checking BBUSY, which has the same effect. So it comes for free.
Code:
        macro   WAITBLIT
        move.w  #$8400,DMACON(a6)       ; SET + BLTPRI: give the blitter priority
.\@:    btst    #6,DMACONR(a6)          ; BBUSY (bit 14, tested as bit 6 of the high byte)
        bne.b   .\@
        move.w  #$0400,DMACON(a6)       ; clear BLTPRI again
        endm
Old 29 July 2014, 13:14   #10
Photon
phx: never heard of that. Has it been tested on such A1000s?

It's not about generating a delay. Otherwise, would no blit wait work on an A1000 with an accelerator? I don't think so!

Plus it's slower, so it's not a very good replacement.
Old 29 July 2014, 20:02   #11
Leffmann
 
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
It is about generating delay. Everything suggests it's exactly what it looks like, that BBUSY is not set until the Blitter performs its first memory access.

Adding an accelerator to the system doesn't change the delay introduced by touching the hardware registers, it doesn't work that way. The bus where the hardware sits can't serve requests at infinite speed, and there's always a minimum delay introduced, regardless of whether you are running on a 7 MHz 68000 or a 100 MHz 68060.

Also, when you say phx's method of waiting for the Blitter would be slower, is that because of those extra move instructions before and after, and because you always keep Blitter priority disabled anyway?
Old 29 July 2014, 20:11   #12
Mrs Beanbag
Glastonbridge Software
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
I have wondered if one could just do this:

Code:
        macro   WAITBLIT
        move.w  #$8400,DMACON(a6)
        move.w  #$0400,DMACON(a6)
        endm
The second instruction should stall the CPU until the blitter finishes anyway.
Old 29 July 2014, 20:56   #13
Photon
The blitter doesn't stall the CPU, it takes cycle slots on the chipmem bus. BLTPRI=1 with a clear-blit only takes half of them. There are also gaps in the slot sequence at the beginning and end of a blit, as the blitter waits for the source/dest channel's assigned slot to come up for loading/unloading a word.

This would make phx's macro fail on these A1000s if it's not as he claims but as it's written in HRM - the btst will have been reached a few words into the clear-blit without triggering the desired effect inside the old Agnus.

Your macro is not a waitblit macro.

For both macros, if BLTPRI=0 when called, BLTPRI will be changed after the blit is started. I don't know if the blitter reads BLTPRI in the middle of a blit.
Old 29 July 2014, 21:14   #14
phx
Quote:
Originally Posted by Photon View Post
The blitter doesn't stall the CPU, it takes cycle slots on the chipmem bus.
Right. MrsBeanbag's macro might work in a Chip-RAM-only system, when all four Blitter DMA channels are busy. But I wouldn't count on that.

Quote:
This would make phx's macro fail on these A1000s if it's not as he claims but as it's written in HRM - the btst will have been reached a few words into the clear-blit without triggering the desired effect inside the old Agnus.
As I understand writing to DMACON lets the CPU wait for the next free Chip RAM access cycle in the same way as reading DMACONR would.

Would be interesting to test that. Solid Gold uses my macro. Does anybody own an A1000 with at least 1MB RAM? Mine has only 512K.

Quote:
I don't know if the blitter reads BLTPRI in the middle of a blit.
AFAIK it does. Otherwise the whole macro makes no sense. But please correct me, anybody.
Old 29 July 2014, 21:22   #15
Toni Wilen
Quote:
Originally Posted by phx View Post
Right. MrsBeanbag's macro might work in a Chip-RAM-only system, when all four Blitter DMA channels are busy. But I wouldn't count on that.
There are demos that do something like that. It does work if the nasty bit (BLTPRI) is set and the blitter cycle sequence uses all cycles (no idle cycles); a D=A copy blit also uses all cycles.

Quote:
As I understand writing to DMACON lets the CPU wait for the next free Chip RAM access cycle in the same way as reading DMACONR would.
Any custom chipset access (read or write) or chipram access works in place of the "dummy" DMACONR read.

Quote:
AFAIK it does. Otherwise the whole macro makes no sense. But please correct me, anybody.
It does work in real-time. This method is not that rare, I have seen it used in some games and demos.
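
To make the point about the dummy read concrete, here is a minimal sketch (not taken from any post in the thread) of a blit start followed by a wait, where an ordinary custom register write stands in for the dummy DMACONR read. It assumes a6 = $dff000, the usual register offsets, and that the blitter pointers and modulos are already set up. The label StartBlitAndWait, the COLOR00 write and the contents of d0/d1 are illustrative choices only.

```asm
DMACONR equ     $002                    ; DMA control read
BLTSIZE equ     $058                    ; writing this starts the blit
COLOR00 equ     $180                    ; background colour (illustrative choice)

; Assumed on entry: a6 = $dff000, d0 = BLTSIZE value, d1 = colour value
StartBlitAndWait:
        move.w  d0,BLTSIZE(a6)          ; start the blit
        move.w  d1,COLOR00(a6)          ; any custom register access; replaces tst.w DMACONR(a6)
.wait:  btst    #6,DMACONR(a6)          ; BBUSY = bit 14 = bit 6 of the high byte
        bne.s   .wait                   ; loop while the blitter is busy
        rts
```

The idea is only that the intervening access does something useful instead of being a throwaway read. Whether this is safe on the affected A1000s is exactly what the thread has not yet tested on real hardware.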
Old 29 July 2014, 21:28   #16
Photon
Mrs. Beanbag's macro doesn't check BBUSY. If the very next instruction writes to a blitter register, I'm sure it would mess the blit up, because of the gaps in the slot sequence when assigned channel slots are reading their first input words, even for four-channel blits.

Quote:
Originally Posted by phx View Post
Right. MrsBeanbag's macro might work in a Chip-RAM-only system, when all four Blitter DMA channels are busy. But I wouldn't count on that.

As I understand writing to DMACON lets the CPU wait for the next free Chip RAM access cycle in the same way as reading DMACONR would.

Would be interesting to test that. Solid Gold uses my macro. Does anybody own an A1000 with at least 1MB RAM? Mine has only 512K.

AFAIK it does. Otherwise the whole macro makes no sense. But please correct me, anybody.
Well, your macro (as written) does not allow the BLTPRI write to be prefetched and executed. However, in a sequence of blits using your macro after them, BLTPRI would be 0 going in, since you reset it (which is at least unnecessary since you're not blitting in between end-of-blitwait and before setting BLTSIZE next time). This would in all cases set BLTPRI=1 after the third cycle slot used by the blitter, if not sooner. Which in turn means all blits with this macro after it are run with BLTPRI=1 until BBUSY=0.
Old 29 July 2014, 21:44   #17
phx
Quote:
Originally Posted by Photon View Post
Well, your macro (as written) does not allow the BLTPRI write to be prefetched and executed.
I don't understand what that means. Why does it have to be prefetched?

Quote:
However, in a sequence of blits using your macro after them, BLTPRI would be 0 going in, since you reset it
This might be a misunderstanding.
I will always use that macro before a Blit, to ensure that the Blitter is available. In the time before I execute the macro I want the Blitter to run in parallel to the CPU. But when I do nothing else than waiting for BBUSY the Blitter should finish as fast as possible. So I set its priority flag.
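
The usage pattern described above might look roughly like the following sketch. The macro body is phx's from earlier in the thread. Everything else (the labels NextBlit and DoCpuWork, the value in d0, and a6 = $dff000) is assumed here for illustration, and the setup of the other blitter registers is omitted.

```asm
DMACONR equ     $002
BLTSIZE equ     $058
DMACON  equ     $096

        macro   WAITBLIT
        move.w  #$8400,DMACON(a6)       ; SET + BLTPRI: blitter gets priority while we only wait
.\@:    btst    #6,DMACONR(a6)          ; BBUSY
        bne.b   .\@
        move.w  #$0400,DMACON(a6)       ; clear BLTPRI so CPU and blitter share the bus again
        endm

; Assumed on entry: a6 = $dff000, d0 = BLTSIZE value for the next blit
NextBlit:
        bsr.s   DoCpuWork               ; CPU work runs in parallel with the previous blit
        WAITBLIT                        ; only now make sure the blitter is free
        ; ... set up pointers, modulos and BLTCONx for the next blit here ...
        move.w  d0,BLTSIZE(a6)          ; start the next blit
        rts

DoCpuWork:                              ; hypothetical placeholder for real work
        rts
```

The wait sits immediately before the next blit setup rather than after the previous BLTSIZE write, so the blitter priority bit is only raised for the time the CPU has nothing else to do.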
Old 29 July 2014, 22:25   #18
Mrs Beanbag
Quote:
Originally Posted by phx View Post
I will always use that macro before a Blit, to ensure that the Blitter is available. In the time before I execute the macro I want the Blitter to run in parallel to the CPU. But when I do nothing else than waiting for BBUSY the Blitter should finish as fast as possible. So I set its priority flag.
Right.

There is no point in the CPU stealing cycles from the Blitter to do nothing but check if the Blitter is finished.
Old 29 July 2014, 22:28   #19
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 373
Are there any drawbacks to simply polling bit 6 in INTREQR?

The only thing I can think of is that it might be slightly slower because of the write to INTREQ to clear the bit before starting the blit.

And isn't there an issue with the BBUSY bit possibly being reset before the last write by the blitter? If bit 6 in INTREQR is set after the last blitter write then that would further argue for its use especially in code that mixes blitter and CPU writes to the same area of memory.
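
For reference, polling the blitter-finished bit could be sketched as below. This is an illustration of the question, not a tested replacement. It assumes a6 = $dff000, that the blitter interrupt is not enabled in INTENA (otherwise an interrupt handler would normally clear the bit first), and that the blitter registers are already set up. The label BlitAndWaitIrq and the value in d0 are made up for the example.

```asm
INTREQR equ     $01e                    ; interrupt request read
INTREQ  equ     $09c                    ; interrupt request write
BLTSIZE equ     $058

; Assumed on entry: a6 = $dff000, d0 = BLTSIZE value
BlitAndWaitIrq:
        move.w  #$0040,INTREQ(a6)       ; clear bit 6 (BLIT) before starting; this is the extra write
        move.w  d0,BLTSIZE(a6)          ; start the blit
.wait:  btst    #6,INTREQR+1(a6)        ; bit 6 lives in the low byte, hence the +1
        beq.s   .wait                   ; loop until "blitter finished" is set
        rts
```

Note the inverted branch compared with a BBUSY wait: the loop exits when the bit becomes set, not when it becomes clear.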
Old 29 July 2014, 22:48   #20
Photon
Quote:
Originally Posted by phx View Post
I don't understand what that means. Why does it have to be prefetched?

This might be a misunderstanding.
I will always use that macro before a Blit, to ensure that the Blitter is available. In the time before I execute the macro I want the Blitter to run in parallel to the CPU. But when I do nothing else than waiting for BBUSY the Blitter should finish as fast as possible. So I set its priority flag.
No misunderstanding. Prefetch is one way in which instructions you don't expect to run (before a hardware event) get executed anyway. In my post it's followed by a "however": instructions you don't expect will be executed even with BLTPRI=1.

A tight loop, in which the blit wait follows directly after the write to BLTSIZE, is simply the worst case, and that's what we need to test to know anything.

Quote:
Originally Posted by mc6809e View Post
Are there any drawbacks to simply polling bit 6 in INTREQR?

The only thing I can think of is that it might be slightly slower because of the write to INTREQ to clear the bit before starting the blit.

And isn't there an issue with the BBUSY bit possibly being reset before the last write by the blitter? If bit 6 in INTREQR is set after the last blitter write then that would further argue for its use especially in code that mixes blitter and CPU writes to the same area of memory.
Yeah, it's the real reason for requiring an extra read of BBUSY before relying on its value that we're trying to find out (I hope). If the claims of it just being a delay are true, that's great because you could remove the extra read and put a useful instruction there instead.

Basically it's a question about whether "extra read before blitwait on some A1000s" and "clear INTREQ twice to work on A4000(T?)" are the ONLY compatibility patches.

Me, I just want to remove that pesky extra read of blitwait and find the circumstances under which it's safe to do so (and under which circumstances it's safe to omit the blitwait completely, which I think I know already).