English Amiga Board


Old 11 August 2020, 18:10   #1
BigT
Registered User
 
Join Date: Apr 2020
Location: Melbourne / Australia
Posts: 23
Fastest method to clear a single bitplane on Amiga OCS - My findings

I was experimenting with clearing a single bitplane in ASM on an A500 (OCS), i.e. 320x256 pixels = 10240 bytes. My findings were as follows:
clr.l (a0)+ in a dbra loop - 178 scanlines
clr.l (a0)+ in a dbra loop, unrolled to 16 clr.l statements - 118 scanlines
move.l d1,(a0)+ unrolled x16 - 73 scanlines
movem.l d1-d6/a2-a3,-(a0) unrolled x2 - 56 scanlines
Blitter D channel clear - 50 scanlines
Blitter D + movem.l combination - 27 scanlines


What surprised me most was how close a movem.l loop came to the blitter in performance. I had always thought the blitter was much faster...
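For reference, the CPU figures in the list can be roughly cross-checked against nominal 68000 instruction timings. The sketch below is only an estimate under stated assumptions: textbook cycle counts (clr.l (a0)+ = 20, move.l d1,(a0)+ = 12, movem.l of n registers to -(a0) = 8+8n, dbra taken = 10), a PAL scanline of 227 colour clocks = 454 CPU cycles, and no DMA contention at all. The function and variable names are made up for illustration and are not from the test code.

```python
# Rough estimate of CPU clear times from nominal 68000 cycle counts.
# Assumptions (not from the original test code): PAL line = 227 colour
# clocks = 454 CPU cycles, no DMA contention, no chip-bus wait states,
# loop setup and the final (not taken) dbra ignored.

PLANE_BYTES = 320 * 256 // 8      # 10240 bytes in one lowres PAL bitplane
CYCLES_PER_LINE = 227 * 2         # CPU cycles per scanline (assumed)
DBRA_TAKEN = 10                   # dbra, branch taken


def scanlines(cycles_per_op, bytes_per_op, ops_per_loop):
    """Estimated scanlines to clear one bitplane with a dbra loop."""
    ops = PLANE_BYTES // bytes_per_op      # store instructions needed
    loops = ops // ops_per_loop            # dbra iterations needed
    total = ops * cycles_per_op + loops * DBRA_TAKEN
    return total / CYCLES_PER_LINE


estimates = {
    "clr.l (a0)+, dbra loop":        scanlines(20, 4, 1),
    "clr.l (a0)+, unrolled x16":     scanlines(20, 4, 16),
    "move.l d1,(a0)+, unrolled x16": scanlines(12, 4, 16),
    "movem.l 8 regs, unrolled x2":   scanlines(8 + 8 * 8, 32, 2),
}

for name, lines in estimates.items():
    print(f"{name:32s} ~{lines:5.1f} scanlines")
```

Under these assumptions the estimates come out at roughly 169, 116, 71 and 54 scanlines, a little below the measured 178, 118, 73 and 56, which is plausible once bus contention and loop setup are added back in.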
Old 11 August 2020, 19:04   #2
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
Quote:
Originally Posted by BigT View Post
I was experimenting with clearing a single bitplane in ASM on an A500 (OCS), i.e. 320x256 pixels = 10240 bytes. My findings were as follows:
clr.l (a0)+ in a dbra loop - 178 scanlines
clr.l (a0)+ in a dbra loop, unrolled to 16 clr.l statements - 118 scanlines
move.l d1,(a0)+ unrolled x16 - 73 scanlines
movem.l d1-d6/a2-a3,-(a0) unrolled x2 - 56 scanlines
Blitter D channel clear - 50 scanlines
Blitter D + movem.l combination - 27 scanlines


What surprised me most was how close a movem.l loop came to the blitter in performance. I had always thought the blitter was much faster...
It is actually MUCH faster. Otherwise, how could you cut the time required for a clear by combining CPU + blitter?
When you activate only the D channel, half of the blitter's cycles are idle (and therefore usable by the CPU or other DMA channels to access memory).
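A back-of-the-envelope model of that split, as a sketch only: it assumes a D-only blit writes one word every two colour clocks (one slot used, one idle), takes the nominal movem.l figure of 72 CPU cycles per 32 bytes for 8 registers, and idealises things so that the CPU fits entirely into the blitter's idle slots. All names and constants are illustrative assumptions, not measurements.

```python
# Idealised model of splitting one bitplane clear between blitter and CPU.
# Assumptions (for illustration): PAL line = 227 colour clocks,
# D-only blit = 2 colour clocks per word (1 write + 1 idle slot),
# movem.l of 8 registers = 72 CPU cycles per 32 bytes, loop overhead
# ignored, and the CPU runs entirely in the blitter's idle slots.

PLANE_BYTES = 320 * 256 // 8           # 10240 bytes
CLOCKS_PER_LINE = 227                  # colour clocks per scanline (assumed)

# Bytes each unit can clear per scanline when working alone.
blit_bytes_per_line = CLOCKS_PER_LINE / 2 * 2        # one 2-byte word per 2 clocks
cpu_bytes_per_line = CLOCKS_PER_LINE * 2 / 72 * 32   # 454 CPU cycles per line

blit_alone = PLANE_BYTES / blit_bytes_per_line
cpu_alone = PLANE_BYTES / cpu_bytes_per_line

# Running in parallel, the two throughputs simply add up.
combined = PLANE_BYTES / (blit_bytes_per_line + cpu_bytes_per_line)

# Share of the buffer to give the blitter so both finish together.
blit_share = blit_bytes_per_line / (blit_bytes_per_line + cpu_bytes_per_line)

print(f"blitter alone : ~{blit_alone:.1f} scanlines")
print(f"CPU alone     : ~{cpu_alone:.1f} scanlines")
print(f"combined      : ~{combined:.1f} scanlines")
print(f"blitter share : ~{blit_share:.0%} of the buffer")
```

With these numbers the blitter alone needs about 45 scanlines, the CPU alone about 51, and the two together about 24 with a little over half the buffer going to the blitter. That is in the same region as the measured 50 and 27 scanlines; the gap is presumably the overhead this model ignores.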
Old 11 August 2020, 19:52   #3
DanScott
Lemon. / Core Design
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,213
Other DMA activity will affect the timings too.

How were these "scanlines" calculated? Was bitplane DMA active at any point down the screen, etc.?
Old 11 August 2020, 21:22   #4
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
Quote:
Originally Posted by DanScott View Post
Other DMA activity will affect the timings too.

How were these "scanlines" calculated? Was bitplane DMA active at any point down the screen, etc.?
Just checked with my buffer clear code: 27 scanlines without other DMA activity.
Attached thumbnail: clear.jpg
Old 11 August 2020, 21:41   #5
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,323
Quote:
Originally Posted by BigT View Post
What surprised me most was how close a movem.l loop came to the blitter in performance. I had always thought the blitter was much faster...
For a task as simple as this, the limiting factor is not so much CPU power as chip memory bandwidth. The blitter's advantage becomes much clearer if you try to implement the minterms, masking and shifts on the CPU as well.
Old 12 August 2020, 02:23   #6
BigT
Registered User
 
Join Date: Apr 2020
Location: Melbourne / Australia
Posts: 23
Quote:
Originally Posted by DanScott View Post
Other DMA activity will affect the timings too.

How were these "scanlines" calculated? Was bitplane DMA active at any point down the screen, etc.?

For my simple test, I had one bitplane active for bitplane DMA and waited until line 54 before executing the clear operation.
Old 12 August 2020, 02:26   #7
BigT
Registered User
 
Join Date: Apr 2020
Location: Melbourne / Australia
Posts: 23
Quote:
Originally Posted by ross View Post
Just checked with my buffer clear code: 27 scanlines without other DMA activity.
Is your code using both blitter and CPU, or just the blitter?
Old 12 August 2020, 05:02   #8
BigT
Registered User
 
Join Date: Apr 2020
Location: Melbourne / Australia
Posts: 23
Quote:
Originally Posted by Thomas Richter View Post
For a task as simple as this, the limiting factor is not so much CPU power as chip memory bandwidth. The blitter's advantage becomes much clearer if you try to implement the minterms, masking and shifts on the CPU as well.
Agreed. When using advanced logic functions like the cookie cutter ($CA), or blitter masks for clipping BOBs, the blitter is far superior to CPU operations on 68000 OCS/ECS platforms.
Old 12 August 2020, 10:26   #9
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
Quote:
Originally Posted by BigT View Post
Is your code using both blitter and CPU, or just the blitter?
CPU + blitter.
And yes, having one bitplane active does not change much (26-27 scanlines or similar, depending on how much I unroll the CPU code).


Quote:
Originally Posted by BigT View Post
Agreed. When using advanced logic functions like the cookie cutter ($CA), or blitter masks for clipping BOBs, the blitter is far superior to CPU operations on 68000 OCS/ECS platforms.
Also, shift operations are free on the blitter, whereas on the bare 68k they are very expensive.

And you don't even need complex operations to see the difference. Try a simple A->D copy (with blitter nasty active).
The CPU is limited to the maximum speed obtainable with movem, while the blitter uses all available cycles.
Then you realize the real speed of the blitter.
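To put rough numbers on the A->D case, here is a sketch under stated assumptions: the blitter moves one word every two colour clocks with no idle slots, and the CPU copies with a hypothetical pair of movem.l (a0)+,regs (12+8n cycles) and movem.l regs,d16(a1) (12+8n cycles) using n = 12 registers. The register count, the addressing modes and all names are illustrative choices, not taken from anyone's actual code.

```python
# Rough comparison of a one-bitplane A->D blitter copy against a movem.l
# copy loop. Assumptions (for illustration): PAL line = 227 colour clocks
# = 454 CPU cycles, A->D blit = 2 colour clocks per word with no idle
# slots, movem.l (a0)+,regs = 12+8n cycles, movem.l regs,d16(a1) = 12+8n
# cycles, n = 12 registers per block, loop overhead and contention ignored.

PLANE_BYTES = 320 * 256 // 8       # 10240 bytes
CLOCKS_PER_LINE = 227              # colour clocks per scanline (assumed)
CPU_CYCLES_PER_LINE = 2 * CLOCKS_PER_LINE
N_REGS = 12                        # hypothetical register count per movem

# Blitter: one read slot (A) plus one write slot (D) per 16-bit word.
blit_lines = (PLANE_BYTES // 2) * 2 / CLOCKS_PER_LINE

# CPU: one movem load plus one movem store moves 4 * N_REGS bytes.
cycles_per_block = (12 + 8 * N_REGS) + (12 + 8 * N_REGS)
cycles_per_byte = cycles_per_block / (4 * N_REGS)
cpu_lines = PLANE_BYTES * cycles_per_byte / CPU_CYCLES_PER_LINE

print(f"blitter A->D : ~{blit_lines:.1f} scanlines")
print(f"movem.l copy : ~{cpu_lines:.1f} scanlines")
print(f"ratio        : ~{cpu_lines / blit_lines:.2f}x")
```

Under these assumptions the blitter copy takes about 45 scanlines against roughly 101 for the movem.l loop, so a little over twice as fast, and that is before any shifting or masking is asked of the CPU.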
Old 12 August 2020, 14:44   #10
kamelito
Zone Friend
 
Join Date: May 2006
Location: France
Posts: 1,861
I guess testing whether part of the screen is already cleared adds too much overhead, right?
Old 12 August 2020, 14:48   #11
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 54
Posts: 4,498
Quote:
Originally Posted by kamelito View Post
I guess testing whether part of the screen is already cleared adds too much overhead, right?
It depends.
If you have a lot of modified parts and few bitplanes, a full clear is better.
The reverse is true if you have only a few objects and many bitplanes.
Old 12 August 2020, 19:51   #12
dodke
Registered User
 
Join Date: Feb 2018
Location: London / UK
Posts: 112
But often, if you use only the blitter to clear, that is a great time to have the CPU do other calculations you probably need to do anyway.