Thread: Amiga Vs ST
Old 11 September 2014, 22:47   #483
Registered User
Join Date: Jan 2012
Location: USA
Posts: 281
Originally Posted by Cyprian
OK, actually I know (thanks to you and Toni) how to calculate blitting cycles. In my post, I asked how many blitter registers the CPU has to write in order to initialize such a blitter copy. I'd just like to calculate how many CPU cycles that initialization takes, and where it is worth using the CPU and where the blitter.
The first initialization before beginning the horizontal line drawing loop would take a large number of writes (five longwords and four words), but after that first initialization, many of the registers would stay the same and wouldn't have to be reinitialized. An arbitrary bit block copy isn't necessary either since the horizontal line looks the same whether it's shifted or not.

I count one longword write, and four word writes per copy of a horizontal line, assuming all other registers have been initialized beforehand and that the frame buffer and source lines don't cross 128K boundaries -- one longword write is for the masks on the source A which points to the stored horizontal line, three word writes for the lower 16 bits of each of three pointers, and one word for the size which starts the blit.
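To put rough numbers on those write counts, here is a small Python tally. It is only a sketch: it assumes the 68000's 16-bit data bus (so a longword write is two word-sized bus cycles) and a minimum of 4 CPU clocks per bus cycle with no contention. The function names are made up for illustration, and instruction and operand fetches are not counted, so the clock figures are lower bounds rather than real instruction timings.

```python
# Tally of blitter register writes, using the counts from the post.
# Assumption: 68000 with a 16-bit data bus, so one longword write = 2 bus writes.
# Assumption: each bus cycle takes at least 4 CPU clocks when uncontended.
# Instruction and operand fetches are NOT counted; these are lower bounds.

MIN_CLOCKS_PER_BUS_CYCLE = 4


def bus_writes(longwords, words):
    """Number of word-sized bus writes for the given register writes."""
    return 2 * longwords + words


def min_write_clocks(longwords, words):
    """Lower bound on CPU clocks spent on the data writes alone."""
    return bus_writes(longwords, words) * MIN_CLOCKS_PER_BUS_CYCLE


# First setup before the loop: five longwords and four words.
print(bus_writes(5, 4), min_write_clocks(5, 4))  # 14 56

# Per horizontal line afterwards: one longword and four words.
print(bus_writes(1, 4), min_write_clocks(1, 4))  # 6 24
```

The real per-line cost is higher once the MOVE instructions themselves are fetched, but the ratio between first setup and per-line setup stays the same.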

Such a copy would use DMA channels B, C, and D. The blit would take between 8 CPU cycles per word in the best case (during overscan and blanking) and 16 CPU cycles per word in the worst case (during display, sprite, refresh, disk, or audio DMA).
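As a quick illustration of those bounds, the sketch below turns a line length in pixels into a best/worst clock range. It uses only the 8 and 16 cycles-per-word figures quoted above, and it assumes the line starts on a word boundary (an unaligned line could touch one extra word). The function name is hypothetical.

```python
# Blit time bounds for one horizontal line, from the per-word figures above.
# Assumption: the line is word-aligned, so width is rounded up to 16-pixel words.

BEST_CLOCKS_PER_WORD = 8    # overscan / blanking, no competing DMA
WORST_CLOCKS_PER_WORD = 16  # competing display, sprite, refresh, disk, audio DMA


def blit_clock_range(pixels):
    """Return (best, worst) CPU clocks to blit a line of the given width."""
    words = (pixels + 15) // 16
    return words * BEST_CLOCKS_PER_WORD, words * WORST_CLOCKS_PER_WORD


print(blit_clock_range(320))  # (160, 320)
print(blit_clock_range(20))   # (16, 32)
```

For short lines the setup writes start to dominate the blit itself, which is where doing the fill with the CPU becomes worth considering.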

Originally Posted by Cyprian
what kind of code do you mean?

I know that bitplanes have an impact on the CPU. If my calculations are correct, in 320x256 mode the ST is faster by 24% with 5 bitplanes and by 37% with 6 bitplanes. In 320x200 mode it is 21% and 31% respectively.
Your calculations are only correct if you assume that all instructions need to access memory on absolutely every fourth cycle. That doesn't always happen.

Suppose code includes many MULSs and DIVSs (a 3d game with filled polys, for example). Those instructions spend most of their time processing internally. The extra cycles taken for the display would only modestly slow down those instructions because display DMA fetch can occur at the same time the instruction is running.
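A toy model makes the point concrete. This is not a cycle-accurate simulation: it simply assumes every bus access an instruction makes is delayed by a fixed number of clocks, and the wait value and the instruction figures below are illustrative assumptions (a 4-clock instruction with one bus access, and a multiply on the order of 70 clocks with one bus access), not measurements.

```python
# Toy model: how much does bus contention slow an instruction down?
# Assumption: each bus access is delayed by a fixed number of wait clocks.
# All figures below are illustrative, not measured on real hardware.


def slowdown(total_clocks, bus_accesses, wait_per_access):
    """Fractional slowdown when every bus access waits wait_per_access clocks."""
    return bus_accesses * wait_per_access / total_clocks


WAIT = 2  # hypothetical delay per bus access while display DMA is active

bus_bound = slowdown(total_clocks=4, bus_accesses=1, wait_per_access=WAIT)
compute_bound = slowdown(total_clocks=70, bus_accesses=1, wait_per_access=WAIT)

print(f"{bus_bound:.1%}")      # 50.0%
print(f"{compute_bound:.1%}")  # 2.9%
```

Under these assumptions the same bus delay costs the short instruction half its speed but barely registers on the long one, so a code mix heavy in MULS and DIVS loses far less than a worst-case estimate suggests.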
mc6809e