Questions on C2P using blitter only
Hi there,
I want to write a C2P routine (my first, so excuse me if I'm ignorant =p) that uses only the blitter, because I want the CPU fully available to calculate the screen. Basically I ported an 8x8 grid engine which maps and interpolates a 128x128x4-bit texture on an 8x8 grid. It is used for roto zoomers, wobblers, tunnels and all those nerdy 2D demo effects =p

Now to my idea. The resolution is 320x256.

Original texture:
Code:
abcd efgh ijkl mnop
Code:
a000b000 c000d000
Code:
aeimbfjn cgkodhlp
Code:
aeim quyC
Code:
D = (A & C) | ((B>>4) & (~C))
Code:
BLTCDAT = $f000, $0f00, $f000, $0f00
Code:
BLTSIZH = 1

(8 cycles * 256 lines * 20 words * 4 blitter operations) / 7.09 MHz (PAL clock) = 23.1086 ms

That is the timing without blitter setup and the like, of course. Unfortunately it is a little more than one frame time (20 ms at 50 Hz PAL), so this will display at half the frame rate =/

What do you think? Completely senseless? What I tried was omitting the first passes by using a prescrambled texture. I want to use double buffering of course, so the blitter is merging/displaying while the CPU is calculating the new screen. I'm aware of the MOVEP instruction, but since I'm on a 68020 (without turbo) I cannot use that.

Thanks for some opinions. |
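The timing estimate in the post above can be reproduced with a few lines of C. This is only a sketch: `blit_time_ms` and `PAL_FRAME_MS` are hypothetical names, and the 8 cycles per word and the 7.09 MHz clock are simply the figures assumed in the post, not measured values. Blitter setup and chipmem contention with the CPU are ignored.

```c
#include <assert.h>

/* One PAL frame at 50 Hz lasts 20 ms. */
#define PAL_FRAME_MS 20.0

/* Rough blitter time in milliseconds for a multi-pass blit.
 * cycles_per_word and clock_mhz are assumptions taken from the post
 * (8 cycles, 7.09 MHz); setup overhead and bus contention are ignored. */
double blit_time_ms(unsigned cycles_per_word, unsigned lines,
                    unsigned words_per_line, unsigned passes,
                    double clock_mhz)
{
    assert(clock_mhz > 0.0);
    double cycles = (double)cycles_per_word * lines * words_per_line * passes;
    /* cycles divided by MHz gives microseconds; divide again for ms */
    return cycles / clock_mhz / 1000.0;
}
```

Calling `blit_time_ms(8, 256, 20, 4, 7.09)` gives roughly 23.1 ms, which is above `PAL_FRAME_MS`, so under these assumptions the four passes alone do not fit into one frame.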
I just noticed that BLTSIZV may not be big enough for that resolution, so I would have to stick to 256x256 or add a swap pass first, so that I can write a full word of bitplane data (without having BLTDMOD = -1).
|
You are on the right track. There are a couple of things which you need to consider though.
The blitter reads and writes words, and these memory accesses have to be to even (word-aligned) addresses. You can't make the blitter write misaligned words. If you want the blitter to write a unit smaller than a word, you need to read from the output area and include that in the blit operation as another masking step. You are out of blitter channels in your case, so you can't do that in any sensible way.

Blitter performance with two active source DMA channels and one active destination DMA channel is about 40 kB per frame on a stock A1200. If you run ordinary CPU code at the same time, it shares chipmem bandwidth with the blitter, so both the CPU code and the blit operation will run correspondingly slower.

When you do a merge operation such as the one you described above, you need to be able to shift sometimes to the left and sometimes to the right. You shift to the left by performing a descending blit. In your example you would do blits #1 and #3 ascending, and blits #2 and #4 descending.

Given the above (particularly 'the blitter always reads and writes words'), your current data format will require two blitter passes: one with a 4-bit shift and another with an 8-bit shift. You can get rid of the 8-bit blitter pass by doing it on the CPU, by using MOVEP writes (this works on all machines except the 68060; bear in mind that MOVEP.W to chipmem performs twice as many write accesses as a normal MOVE.W), or by having longword-pixel-sized prescrambled textures. Or perhaps in other ways. |
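The merge step and the two shift directions described above can be modelled on a plain CPU to check masks and shift counts before writing any blitter setup code. The following C sketch is a simplified model, not the actual hardware behaviour in every detail: `merge_ascending` and `merge_descending` are hypothetical names, first/last word masks and modulos are left out, and only the word-wise shift with carry between neighbouring words is modelled. It computes D = (A & C) | (shifted B & ~C), with C held constant as in the BLTCDAT idea from the first post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ascending pass: words are processed from low to high address and B is
 * shifted RIGHT; bits shifted out of one word enter the next word. */
void merge_ascending(const uint16_t *a, const uint16_t *b, uint16_t *d,
                     size_t n, unsigned shift, uint16_t mask)
{
    assert(shift < 16);
    uint16_t prev = 0;                      /* previously fetched B word */
    for (size_t i = 0; i < n; i++) {
        uint32_t pair = ((uint32_t)prev << 16) | b[i];
        uint16_t shifted = (uint16_t)(pair >> shift);
        d[i] = (uint16_t)((a[i] & mask) | (shifted & (uint16_t)~mask));
        prev = b[i];
    }
}

/* Descending pass: words are processed from high to low address and B is
 * effectively shifted LEFT; bits come in from the word processed before,
 * which is the one at the next higher address. */
void merge_descending(const uint16_t *a, const uint16_t *b, uint16_t *d,
                      size_t n, unsigned shift, uint16_t mask)
{
    assert(shift < 16);
    uint16_t prev = 0;
    for (size_t i = n; i-- > 0; ) {
        uint32_t pair = ((uint32_t)b[i] << 16) | prev;
        uint16_t shifted = (uint16_t)((pair << shift) >> 16);
        d[i] = (uint16_t)((a[i] & mask) | (shifted & (uint16_t)~mask));
        prev = b[i];
    }
}
```

For example, with A = $A000, B = $B000, a 4-bit shift and mask $F000, the ascending pass produces $AB00: the top nibble comes from A and the next nibble from B moved four bits to the right. Every result is still a whole 16-bit word, which is the restriction the reply points out.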