English Amiga Board


Old 17 April 2012, 23:37   #1
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
B channel DMA enabled during line mode?

I'm trying to get some information about what happens when SRCB in BLTCON0 is turned on in line mode. Does Agnus read from the address pointed to by BLTBPT? And if so, does the fetched data get shifted once, continuously, or not at all?
Old 18 April 2012, 01:11   #2
Photon
Moderator

 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 4,781
I can only guess at what you're trying to do but isn't a simple experiment in order?

From my experience, the source setup in line mode is hard-coded (each source has a specific use) and there isn't much to do about that.

But there might be an exploit waiting to be discovered. I think the state-machine loop is custom for linedrawing, but in the case that B can function as a normal source it should work as in all other blits; the entire line of words pointed to by B is shifted left/right according to DESC. The problem is that BLTSIZE is special for linedrawing, so how does it know how many words to read "as in normal mode"? If you get what I mean.

Last edited by Photon; 18 April 2012 at 01:26. Reason: nothing's impossible! but some things are.
Old 18 April 2012, 04:52   #3
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
Quote:
Originally Posted by Photon View Post
I can only guess at what you're trying to do but isn't a simple experiment in order?

From my experience, the source setup in line mode is hard-coded (each source has a specific use) and there isn't much to do about that.

But there might be an exploit waiting to be discovered. I think the state-machine loop is custom for linedrawing, but in the case that B can function as a normal source it should work as in all other blits; the entire line of words pointed to by B is shifted left/right according to DESC. The problem is that BLTSIZE is special for linedrawing, so how does it know how many words to read "as in normal mode"? If you get what I mean.
I'm all for experiment, but all I have left of my A500 is the dead motherboard, so everything I do now is via emulator and I don't think the emulator has a complete enough model of the blitter. I just saw an A500 on ebay for about $139US. Maybe I should just buy it.

For what I want, I really don't need a width for B. What I'm interested in right now is a C2P routine. If B fetches, but the shift stays constant, line mode should be able to grab a bit from each chunky pixel while A shifts it into place at the destination. By writing $8080 into A, two chunky pixels can have one bit extracted at a time.

But I'm starting to think the shift amount of B's barrel shifter changes whether B is fetched from memory or not. That would spoil my idea.

Your comment does make me wonder, though, about just what happens when the width is something other than two. Maybe something interesting happens.
Old 18 April 2012, 09:03   #4
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,359
Quote:
Originally Posted by mc6809e View Post
I'm trying to get some information about what happens when SRCB in BLTCON0 is turned on in line mode. Does Agnus read from the address pointed to by BLTBPT? And if so, does the fetched data get shifted once, continuously, or not at all?
I think I checked this with logic analyzer years ago. Unfortunately I don't remember what it did (if anything, I am quite sure it didn't affect line shape). I can recheck.
Old 18 April 2012, 19:47   #5
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
Quote:
Originally Posted by Toni Wilen View Post
I think I checked this with logic analyzer years ago. Unfortunately I don't remember what it did (if anything, I am quite sure it didn't affect line shape). I can recheck.
Well if a fetch occurs into BLTBDAT then C2P might become easier.

My idea was to take advantage of the automatic changes in shift values to A and B that occur in line mode. I think it's already possible to take advantage of this for C2P using standard methods. In line mode, B is incrementally shifted, even if A is not (like during a vertical line). Load B with the two chunky pixels, and A with $8080. Now draw a vertical line. As B is shifted, A extracts the correct bits from B.

If the bitplanes are properly interleaved in memory, and the slope of the line is set correctly (should have a slope of -8 I think), it should be possible to process two chunky pixels at a time with a single line draw. The trouble of course is that BLTBDAT must be reloaded each time for every two chunky pixels so there is overhead involved with reloading. Still, it might be worth it if the CPU is made to work on other pixels at the same time since line mode leaves every other cycle open (-C-D-C-D is the pattern if I recall). This creates a perfect interleaving of CPU and blitter.
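
The bit-extraction part of this idea can be checked in software. A minimal Python sketch, assuming a plain left shift of the 16-bit B word stands in for the blitter's per-pixel shift (how the real line-mode shifter behaves is exactly what this thread is trying to find out); `extract_bits` is a hypothetical helper name.

```python
MASK = 0x8080  # value proposed for A: the top bit of each byte


def extract_bits(p0, p1):
    """Return [(p0_bit, p1_bit), ...] for bit 7 down to bit 0.

    p0 and p1 are two 8-bit chunky pixels packed into one 16-bit word,
    as they would sit in BLTBDAT. Each step shifts the word one more
    place and masks it with $8080, leaving one bit per pixel.
    """
    b = ((p0 & 0xFF) << 8) | (p1 & 0xFF)
    pairs = []
    for shift in range(8):
        masked = ((b << shift) & 0xFFFF) & MASK
        pairs.append(((masked >> 15) & 1, (masked >> 7) & 1))
    return pairs
```

For p0 = $A5 and p1 = $3C the first pair is (1, 0), i.e. bit 7 of each pixel, and the last pair is (1, 0) for bit 0.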
Old 22 April 2012, 05:50   #6
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
Learned a few things by hunting around old usenet postings: when channel B DMA is turned on, BLTBDAT is indeed loaded with the data pointed to by BLTBPT with BLTBPT incremented by BLTBMOD every cycle. B's shifter is also changed every cycle so the value seen in B is the shifted value.

This assumes horizontal width is set to 2. My guess is that other values simply cause BLTxPT values to be incremented by 2. Not sure what happens with the barrel shifters. They may or may not stop. It's possible they're triggered at the same time MOD values are added to the PTR's.
mc6809e is offline  
Old 26 April 2012, 20:08   #7
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,359
I finally managed to do some logic analyzer tests.

If B channel is enabled, it really starts doing DMA reads and line mode cycle diagram changes to:

BC-BD-BC-BD- (normal line mode is C-D-C-D-)

I am not exactly sure what happens with B channel.

First two B channel reads come from same address (Data where B points can be seen in capture in "both B cycles"), following reads are either same data forever or something else. Seems to depend on line direction?

B MOD value seems to be ignored; it never seems to read B+2, whatever the B MOD value.

No idea about the address because I can't capture the address lines (at least not the full space); 32 channels is not enough for RGA+Data+address bus.

Fun tests:

Width set to 1: cycle diagram is -D-D-D (or -BD-BD if B set). Does one write to address pointed by C and all following writes to same address pointed by D. Not very useful...

Width set to >2: cycle diagram is (width - 1) * "-C" + "-D". (3 = C-C-D-, 4 = C-C-C-D- and so on..) Not sure what happens, line mode seems to get quite confused.
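
The cycle diagrams listed in this post can be collected into one small lookup. This is only a transcription of the captures reported here, written as a Python sketch; `line_cycle_pattern` is a made-up name, and combinations that were not reported (B enabled with width > 2) deliberately raise instead of guessing.

```python
def line_cycle_pattern(width, b_enabled=False):
    """Return the repeating per-pixel cycle string reported in the post.

    '-' is an idle bus cycle, B/C are DMA reads, D is the DMA write.
    Only the combinations that were actually captured are covered.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    if width == 1:
        return "-BD" if b_enabled else "-D"
    if width == 2:
        return "BC-BD-" if b_enabled else "C-D-"
    if b_enabled:
        raise ValueError("B enabled with width > 2 was not reported")
    # width > 2: (width - 1) C fetches, then one D write
    return "C-" * (width - 1) + "D-"
```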

More tests needed?
Old 27 April 2012, 07:56   #8
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
Quote:
Originally Posted by Toni Wilen View Post
I finally managed to do some logic analyzer tests.

If B channel is enabled, it really starts doing DMA reads and line mode cycle diagram changes to:

BC-BD-BC-BD- (normal line mode is C-D-C-D-)

I am not exactly sure what happens with B channel.
What an odd pattern! Very interesting!

Quote:
Originally Posted by Toni Wilen View Post
First two B channel reads come from same address (Data where B points can be seen in capture in "both B cycles"), following reads are either same data forever or something else. Seems to depend on line direction?

B MOD value seems to be ignored; it never seems to read B+2, whatever the B MOD value.

No idea about the address because I can't capture the address lines (at least not the full space); 32 channels is not enough for RGA+Data+address bus.
So there are situations where BLTBPT actually does change and sometimes not. When it does change, is it always to B+2 or did I misunderstand?

Quote:
Originally Posted by Toni Wilen View Post

Fun tests:

Width set to 1: cycle diagram is -D-D-D (or -BD-BD if B set). Does one write to address pointed by C and all following writes to same address pointed by D. Not very useful...
This seems connected in some way to an interesting feature you mentioned a while back concerning the first write of a pixel in line mode. I think I remember you writing that the first pixel is plotted to the address in D, but following pixels go to the address pointed to by C.

Quote:
Originally Posted by Toni Wilen View Post


Width set to >2: cycle diagram is (width - 1) * "-C" + "-D". (3 = C-C-D-, 4 = C-C-C-D- and so on..) Not sure what happens, line mode seems to get quite confused.
What happens to the addresses? Does C change for each access? This is all so fascinating! Wish I still had that old Lecroy and a working 500.

Quote:
Originally Posted by Toni Wilen View Post
More tests needed?
If you're offering.

I'd love to know what happens to B's address when a strictly horizontal or vertical line is drawn while B DMA is on.

Is there any chance you can fill each chipram address with its own address MOD 65536? Then reading the data gets you most of the address it came from -- at least until that address is trashed by the blitter. Might be useful in figuring out the addresses of B C and D.
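
The address-tagging trick can be sketched as follows. This is a Python illustration, not code that runs on the Amiga; the function names are hypothetical, and the 512 KB chip RAM size is an assumption used as a default.

```python
def fill_with_address_tags(base_addr, n_words):
    """Return n_words 16-bit words, each holding its own byte address mod 65536."""
    return [(base_addr + 2 * i) & 0xFFFF for i in range(n_words)]


def candidate_addresses(word, chip_ram_size=0x80000):
    """List every chip RAM address whose tag equals a captured data word.

    Word addresses are even, so an odd data word cannot be an intact tag
    (the location was overwritten, e.g. by the blitter itself).
    """
    if word & 1:
        return []
    return list(range(word, chip_ram_size, 0x10000))
```

A captured word therefore narrows the source down to one address per 64 KB bank; knowing roughly where the pointer started resolves the rest.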

The patterns involving C-C-C-D, etc, are very interesting. Does D use C's address? And what happens to C?
Old 27 April 2012, 09:37   #9
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,359
Quote:
Originally Posted by mc6809e View Post
What an odd pattern! Very interesting!
Very. I think this 100% proves that line mode is a hack: the "normal" blitter logic is still there, just partially disabled but still causing interesting side-effects.

Quote:
So there are situations where BLTBPT actually does change and sometimes not. When it does change, is it always to B+2 or did I misunderstand?
I never managed to get any accesses from B+2. B+0, B+0, B+<less or more than 16 at least>

I'll retest this by setting B to middle of 64k WORD array, each word containing unique value and then checking in logic analyzer capture which value(s) are read. (Can "capture" chip memory addresses without extra hardware)

Quote:
This seems connected in some way to an interesting feature you mentioned a while back concerning the first write of a pixel in line mode. I think I remember you writing that the first pixel is plotted to the address in D, but following pixels go to the address pointed to by C.
Yeah. "C to D copy" seems to always happen in line mode.

Quote:
What happens to the addresses? Does C change for each access? This is all so fascinating! Wish I still had that old Lecroy and a working 500.
No idea. I'll do above 64k array test using C channel too.

Quote:
I'd love to know what happens to B's address when a strictly horizontal or vertical line is drawn while B DMA is on.
Will be done

Quote:
Is there any chance you can fill each chipram address with its own address MOD 65536? Then reading the data gets you most of the address it came from -- at least until that address is trashed by the blitter. Might be useful in figuring out the addresses of B C and D.
You had same idea as I had after I posted my previous findings..

Single 64k word array should be more than enough to confirm first few cycles, the rest of the cycles are identical anyway, except possibly the very last cycle.

Quote:
The patterns involving C-C-C-D, etc, are very interesting. Does D use C's address? And what happens to C?
I'll check this too.
Old 27 April 2012, 19:41   #10
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,359
Height = 3, vertical line, "standard" 40 byte (320px) wide bitmap.

After each C fetch C-PTR is increased by 40. Because there are now 2 C fetches before D write, C and D get out of "sync".

Read C+0
Read C+40
Write to D+0 ("background" from C+40) This is original D PTR. After this happens single D=C copy.
Read C+80
Read C+120
Write to D+40 ("background" from C+120)
and so on..

Line is still drawn correctly but because C and D are not in sync, old "background" data comes from wrong address.
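
The access sequence listed above can be generated mechanically. A minimal Python sketch that only reproduces the reported order (two C reads, then one D write, per pixel); it does not model the single D=C copy mentioned after the first write, and `width3_vertical_trace` is a hypothetical name.

```python
def width3_vertical_trace(c_ptr, d_ptr, pixels, stride=40):
    """Return the bus accesses reported for a vertical line with size field 3.

    Each entry is ("read", address) or ("write", address, background_source),
    where background_source is the C address whose data ends up in the write.
    """
    trace = []
    for n in range(pixels):
        trace.append(("read", c_ptr))
        c_ptr += stride                      # C advances after every fetch
        trace.append(("read", c_ptr))
        background = c_ptr                   # the second fetch supplies the background
        c_ptr += stride
        trace.append(("write", d_ptr + n * stride, background))
    return trace
```

With both pointers at 0, the first two pixels give reads at 0, 40, a write at 0 using data from 40, then reads at 80, 120 and a write at 40 using data from 120, matching the list in the post.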

More tests later..

Interesting side-effect, not sure if it can be used for anything useful..

ADDED:

B channel, if enabled, is never incremented by 2. C-channel in above height > 2 mode also is never incremented by 2. I guess all +2/-2 adders are disabled when in line mode.

Last edited by Toni Wilen; 27 April 2012 at 22:58.
Old 27 April 2012, 23:47   #11
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
I think I might have the explanation for some of what's happening with BPTR and the BC-BD-BC-BD sequence.

This morning I was considering the constants loaded into AMOD, APTR, and BMOD and I think I now understand how they're used.

It's known that the blitter uses Bresenham's line algorithm to draw lines, but the meaning of some of the constants and how they're used isn't obvious.

BMOD is somewhat obvious. In the integer version of the algorithm, 2dy is added to the error term for every increment of the x-coordinate (assume dy<dx and plotting from left to right). This comes from multiplying everything by 2dx. Since the slope of the line is dy/dx, multiplying by 2dx gets rid of the fraction leaving 2dy. The need to multiply everything by 2dx instead of just dx comes from the use of 0.5 in the floating point or fixed point version of the algorithm's error comparison. Simply multiplying by dx will leave dx/2, so we must multiply everything by 2dx. And since the bottom bit isn't used in the blitter registers, all values must be left-shifted giving 4dy.

That explains the 4dy in BMOD. AMOD and APT are more difficult. The error term after multiplying everything by 2dx gives dx. Shifting left gives 2dx. The only thing that looks like it might be related to this is the expression in APTR which is 4dy-2dx. So why the 4dy at the beginning? How is the current error compared with this value?

The answer is that it isn't compared. Like many CPUs, the blitter doesn't have a multi-bit comparison circuit. Instead, it performs an arithmetic operation and looks to see if the result is negative. That's where the -2dx in APTR comes from. If the current error is added to -2dx, and the answer is negative, then the error is still small and no adjustment is made.

But why 4dy-2dx in APTR instead of just -2dx? Because both are added to the error at the same time to avoid two separate additions. If the answer is positive, then the error is large.

This helps explain the 4dy-4dx in AMOD. If error is large, then both 4dy and -4dx are added to the error value instead of just -4dx to avoid extra additions. If the error is still small, the 4dy in BMOD is added to the error instead.
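
The three constants discussed so far can be dropped into an ordinary integer Bresenham loop to see that they fit together as argued. A minimal Python sketch for the first octant only (0 <= dy <= dx), assuming a single accumulator that starts at 4dy-2dx and whose sign selects which constant is added next; the function names are made up, and nothing here claims to mirror the blitter's internal wiring.

```python
def line_constants(dx, dy):
    """Register values as described in the post (dx = major, dy = minor delta)."""
    return {"BMOD": 4 * dy, "APT": 4 * dy - 2 * dx, "AMOD": 4 * dy - 4 * dx}


def bresenham_first_octant(dx, dy):
    """Plot from (0, 0) to (dx, dy) using only sign tests and additions."""
    k = line_constants(dx, dy)
    err = k["APT"]
    x = y = 0
    points = [(x, y)]
    for _ in range(dx):
        if err < 0:
            err += k["BMOD"]     # error still small: stay on this row
        else:
            err += k["AMOD"]     # error large: step the minor axis too
            y += 1
        x += 1
        points.append((x, y))
    return points
```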

I think we can explain the BC-BD-BC-BD pattern if we assume the error is stored in BPTR. Some of the logic of the circuit associated with channel B really is used twice per pixel. The first time is for the comparison -- APTR is added to the error in BPTR to check if it's gotten too large. The second time BPTR is altered after learning the result of the previous addition of APTR to BPTR. If the error is still small, BMOD is added to BPTR otherwise AMOD is added to BPTR.

This might also explain the width=2 requirement. My guess is that it controls the two phases of B.

It occurs to me that the conditional addition performed by the blitter during line draw resembles shift/add multiplication. A slight change to the blitter with the shifting APTR instead of adding to BPTR might have worked. Had the contents of the PTR registers been available for storage in memory, the blitter could have been a full blown streaming GPU supporting addition and multiplication of large vectors. Ah, so close.

(BLTDDAT isn't any help, is it?)

Added: still working out SUD, etc bits and coordinate change logic...

Last edited by mc6809e; 28 April 2012 at 00:01.
Old 28 April 2012, 06:37   #12
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
Okay. I think I have a test for checking if BPTR is the error term.

Load all registers for a line draw from 0,0 to 79,1.

If B DMA is turned on and BPTR holds the error term, then BPTR should increase by 4 for each pixel drawn until it reaches about 160. Then it should suddenly go negative to about -160, counting up towards 0 as the line continues to 79,1.
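
The numbers predicted for this test can be worked out ahead of time. A Python sketch of the hypothesis only (error kept in a separate register starting at 0, compared by adding 4dy-2dx and checking the sign); it models the guess made in this post, not measured blitter behaviour, and the function name is hypothetical.

```python
def hypothesised_error_trace(dx, dy):
    """Error value before each pixel and after the last, under the hypothesis."""
    bmod, apt, amod = 4 * dy, 4 * dy - 2 * dx, 4 * dy - 4 * dx
    err = 0
    trace = [err]
    for _ in range(dx):
        if err + apt < 0:
            err += bmod          # comparison negative: error still small
        else:
            err += amod          # comparison non-negative: correct the error
        trace.append(err)
    return trace
```

For the 0,0 to 79,1 line this climbs in steps of 4 to 156, drops to -156 on the 40th step, and climbs back to 0 at the end.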
Old 28 April 2012, 11:45   #13
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,359
Quote:
Originally Posted by mc6809e View Post
Added: still working out SUD, etc bits and coordinate change logic...
Just check UAE source?
AFAIK it uses 100% matching pixel perfect algorithm and does not need BPT (BMOD is added to APT or AMOD is subtracted from APT) and also does not need any "internal" temp variables.

(Of course it assumes width=2 and B channel not enabled)

Last edited by Toni Wilen; 02 May 2012 at 12:03. Reason: height=width
Old 28 April 2012, 21:14   #14
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
Quote:
Originally Posted by Toni Wilen View Post
Just check UAE source?
AFAIK it uses 100% matching pixel perfect algorithm and does not need BPT (BMOD is added to APT or AMOD is subtracted from APT) and also does not need any "internal" temp variables.

(Of course it assumes height=2 and B channel not enabled)
I'll look at the source, but subtracting AMOD from APT just doesn't make sense to me.

Suppose the value in AMOD really needed to be subtracted from APT. This requires negating the value from AMOD. Why waste gates on a layer of inverters and the conversion of a half adder to a full adder when you can simply require that the value to be loaded in AMOD be negated by the CPU before being stored there? Then there would be no need for subtraction logic. The hardware could just add the already negated value to what is in APT and it's done.

The other issue I have with the subtraction idea is that this layer of inverters would have to be on a signal path that already includes a MUX to select from BMOD or AMOD and that result is fed to an adder which then goes back to APT. That's a lot of distance. And adders are notorious for ending up on the critical path (if they don't, you make them wider). I don't think a designer would do that.
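
The two alternatives being weighed here give the same 16-bit result, which a few lines of Python can confirm. This only illustrates the two's-complement arithmetic; it says nothing about how Agnus is actually built, and the helper names are invented.

```python
def add16(a, b, carry_in=0):
    """16-bit adder with a carry input; the result wraps at 65536."""
    return (a + b + carry_in) & 0xFFFF


def sub16_via_adder(a, b):
    """Subtract using the adder: invert b and set carry-in (a + ~b + 1)."""
    return add16(a, ~b & 0xFFFF, 1)


def sub16_prenegated(a, b):
    """Subtract by adding a value the CPU negated before storing it."""
    negated = (-b) & 0xFFFF
    return add16(a, negated)
```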

On the other hand, additional subtraction logic is unnecessary if BPTR is used for the error term. And BPTR does seem to change.
Old 28 April 2012, 21:29   #15
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,359
Quote:
Originally Posted by mc6809e View Post
I'll look at the source, but subtracting AMOD from APT just doesn't make sense to me.
But the blitter does have both addition and subtraction support: modulos are added when in ascending mode and subtracted when in descending mode. AFAIK the hardware is already there.

Quote:
On the other hand, additional subtraction logic is unnecessary if BPTR is used for the error term. And BPTR does seem to change.
I modified BPT while line draw blit was active but nothing happened, line was still drawn correctly (I tried multiple different values and line coordinates)

<few minutes passed> I now have proof that BPT does not change during linedraw:

I set up line draw normally, pointed BPT to specific address, started line draw, waited for blit to finish.

Then I set normal D=B copy blit, didn't modify BPT, set DPT, BLTCON0/1 (0x05cc = BLTCON0) and started 1x1 blit. Data pointed by B was copied to D.

-> BPT didn't change.
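
The control word 0x05cc used for the copy in this test can be decoded to confirm it really is a plain D=B blit. A small Python sketch using the standard BLTCON0 bit layout (shift in bits 15-12, USEA/USEB/USEC/USED in bits 11-8, minterms in bits 7-0); the function names are made up for illustration.

```python
def decode_bltcon0(value):
    """Split a BLTCON0 word into its fields."""
    return {
        "ASH": (value >> 12) & 0xF,
        "USEA": bool(value & 0x0800),
        "USEB": bool(value & 0x0400),
        "USEC": bool(value & 0x0200),
        "USED": bool(value & 0x0100),
        "LF": value & 0xFF,
    }


def minterm_output(lf, a, b, c):
    """Output bit of the logic function for one combination of A, B, C bits."""
    return (lf >> ((a << 2) | (b << 1) | c)) & 1
```

Decoding 0x05cc gives USEB and USED set, A and C off, and minterm $CC, whose output equals the B input for every A/C combination.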
Old 29 April 2012, 02:49   #16
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
Quote:
Originally Posted by Toni Wilen View Post
But the blitter does have both addition and subtraction support: modulos are added when in ascending mode and subtracted when in descending mode. AFAIK the hardware is already there.

I modified BPT while line draw blit was active but nothing happened, line was still drawn correctly (I tried multiple different values and line coordinates)

<few minutes passed> I now have proof that BPT does not change during linedraw:

I set up line draw normally, pointed BPT to specific address, started line draw, waited for blit to finish.

Then I set normal D=B copy blit, didn't modify BPT, set DPT, BLTCON0/1 (0x05cc = BLTCON0) and started 1x1 blit. Data pointed by B was copied to D.

-> BPT didn't change.
You're right of course about the hardware already being there. Hmm. BPT doesn't change, but something appears on the bus claiming to be from channel B's DMA! (You mentioned addresses from B looking like B+0, B+0, then B+<something less or more than 16 at least>)

I wonder what's generating those addresses. And does the blitter actually load data into BDAT from these addresses even though BPT isn't responsible?

Is there any way to verify whether or not APT actually changes?

BTW, thanks for indulging my curiosity, Toni. I'm fascinated by the internals of the blitter.
Old 29 April 2012, 03:32   #17
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 304
Went to take a look at patent #4874164 http://www.google.com/patents/US4874...ner%22&f=false. In figure 4, there is a single adder (complete with invert and carry in for subtraction) that takes at most one pointer register and one modulo register. There's no way for my idea to have worked. Bah.
Old 29 April 2012, 09:32   #18
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,359
Quote:
Originally Posted by mc6809e View Post
You're right of course about the hardware already being there. Hmm. BPT doesn't change, but something appears on the bus claiming to be from channel B's DMA! (You mentioned addresses from B looking like B+0, B+0, then B+<something less or more than 16 at least>)
I meant in "normal" line mode, without setting B DMA enable bit in BLTCON0, BPT does not change.

Quote:
Is there any way to verify whether or not APT actually changes?
Same method I used with B should work fine.

EDIT: APT does change. I didn't check if it changes exactly like in UAE but it does change.

Last edited by Toni Wilen; 29 April 2012 at 16:14.
Old 02 May 2012, 11:21   #19
TheDarkCoder
Registered User
 
Join Date: Dec 2007
Location: Dark Kingdom
Posts: 211
Quote:
Originally Posted by Toni Wilen View Post
Height = 3, vertical line, "standard" 40 byte (320px) wide bitmap.
With height you mean (as customary) the bits 6-15 of BLTSIZE? I ask because in the previous messages you were speaking of setting the line width (bits 0-5 of BLTSIZE) to a value different than 2 (the "prescribed" value)
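
For reference, the two BLTSIZE fields in question can be packed and unpacked like this. A Python sketch assuming the OCS layout (height in bits 15-6, width in words in bits 5-0, with 0 standing for the maximum of 1024 or 64); the function names are hypothetical.

```python
def pack_bltsize(height, width):
    """Build a BLTSIZE word from height (1..1024) and width in words (1..64)."""
    if not (1 <= height <= 1024 and 1 <= width <= 64):
        raise ValueError("height must be 1..1024 and width 1..64")
    return ((height & 0x3FF) << 6) | (width & 0x3F)


def unpack_bltsize(value):
    """Return (height, width); a zero field means the maximum value."""
    height = ((value >> 6) & 0x3FF) or 1024
    width = (value & 0x3F) or 64
    return height, width
```

In line mode the height field carries the line length and the width field is the value normally fixed at 2, which is the field the experiments in this thread change.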

Many thanks to Toni and mc6809e for this interesting study going on...
Old 02 May 2012, 12:03   #20
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 44
Posts: 23,359
Quote:
Originally Posted by TheDarkCoder View Post
With height you mean (as customary) the bits 6-15 of BLTSIZE? I ask because in the previous messages you were speaking of setting the line width (bits 0-5 of BLTSIZE) to a value different than 2 (the "prescribed" value)
Yeah, I meant bits 0-5.