English Amiga Board


31 January 2021, 15:17   #1
Galahad/FLT
Going nowhere
 
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 8,987
Proper cache usage versus unrolled loop

OK, so I've got a game that really needs one particular routine to run as fast as possible.

I've not had much to do with CPU caches on the 68020, but I'm thinking a tight loop is exactly where they are ideal.

The loop I have repeats 934 times; it's a screen conversion routine. The routine easily fits in the 256-byte instruction cache of the 020.

Unrolled, the routine is a massive 44K in size!!!

There are no MULUs or DIVSs involved; it's mostly movem.l with a couple of add.l to address registers, looping over and over until the screen is converted.

So can anyone confirm that a tight loop that repeats 934 times with caches enabled is going to be quicker than unrolling it?

Fast RAM is NOT an option here. I'm loath to be doing an ST conversion that requires a 68020 in the first place, but this particular game, the way it is written, would essentially need a complete rewrite for a 68000 Amiga. It's not worth the effort; this is a curiosity more than anything else. If I were prepared to put in the time to get it working properly on 68000, I'd rather spend it on my own 68000 game instead.
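
As a rough sanity check on the figures above, here is a small Python sketch that estimates the size of one loop iteration. It assumes "44K" means 44 × 1024 bytes and that the unrolled routine is simply the loop body repeated 934 times; neither is stated exactly in the post, so treat the result as a ballpark only.

```python
# Rough per-iteration size, estimated from the figures quoted in the post.
UNROLLED_BYTES = 44 * 1024   # assumption: "44K" means 44 KiB
ITERATIONS = 934             # loop count from the post
ICACHE_BYTES = 256           # 68020 on-chip instruction cache size


def bytes_per_iteration(unrolled_bytes: int, iterations: int) -> float:
    """Average code bytes per iteration in the fully unrolled version."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    return unrolled_bytes / iterations


body = bytes_per_iteration(UNROLLED_BYTES, ITERATIONS)
print(f"about {body:.1f} bytes per iteration")
print(f"about {body / ICACHE_BYTES:.0%} of the instruction cache")
```

On those assumptions one iteration comes out at a little over 48 bytes, which is consistent with the routine fitting comfortably in the 256-byte cache.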
31 January 2021, 15:26   #2
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,410
Usually, keeping a loop small enough to fit in the cache (rather than fully unrolling it) is going to be quicker on the 68020. I had some experience with that when I wrote my audio mixer: the 68000 version had unrolled loops (which got quite big for one of the modes, over 10KB at least), but on the 68020 the much smaller version that fit in the cache was notably faster. This was on a basic A1200 with no fast RAM.

Note that it's possible that for some situations this does not apply, but it generally holds.
31 January 2021, 15:29   #3
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
No doubt, go for the cached loop routine.
Unroll it to fill the 256 bytes (minus a safety margin).
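
To put rough numbers on "fill the 256 bytes minus a safety margin", here is a minimal Python sketch of the arithmetic. Every input is an assumption rather than a figure from the thread: the 48-byte body is only an estimate (44K / 934), the 16-byte margin is an arbitrary pick, and the 4 bytes of loop overhead assume a single `dbf`-style branch closes the loop. `unroll_plan` is a hypothetical helper name.

```python
def unroll_plan(body_bytes: int, iterations: int,
                cache_bytes: int = 256,
                margin_bytes: int = 16,
                loop_overhead_bytes: int = 4) -> tuple[int, int, int]:
    """Return (copies, loop_count, remainder) for a partial unroll.

    copies     : how many copies of the loop body fit in the cache budget
    loop_count : how many times the partially unrolled loop then runs
    remainder  : leftover iterations to handle outside the loop
    """
    if body_bytes <= 0 or iterations <= 0:
        raise ValueError("body_bytes and iterations must be positive")
    # Budget = cache size minus the safety margin and the loop branch.
    budget = cache_bytes - margin_bytes - loop_overhead_bytes
    copies = budget // body_bytes
    if copies < 1:
        raise ValueError("loop body does not fit in the cache budget")
    return copies, iterations // copies, iterations % copies


# Assumed inputs: ~48-byte body (estimate), 934 iterations (from the thread).
copies, loop_count, remainder = unroll_plan(48, 934)
print(copies, loop_count, remainder)
```

With these guesses the loop would hold four copies of the body and run 233 times, leaving two iterations to handle separately. The real unroll factor depends on the actual encoded size of the routine, so it is worth measuring that from the assembled code.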