English Amiga Board


Old 08 February 2020, 22:17   #1
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,722
Question about CACR

So I've been reading up on the cache on 68020+ processors and was wondering about forward/backwards compatibility of the CACR register. The question really is: do any of the existing bits that control the 68020 cache change between the various processors?


And would code that bangs CACR directly fail on higher 68k processors? If so, are certain bits still safe to change?
Old 08 February 2020, 22:29   #2
jotd
This cat is no more
 
Join Date: Dec 2004
Location: FRANCE
Age: 48
Posts: 3,657
no, there are enough bits on CACR to maintain backwards compatibility. 68020 uses 2 bits IIRC (flush, enable/disable instruction cache). Those bits are the same up to 68060.
Old 08 February 2020, 22:32   #3
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,722
Cool, good to know
Old 09 February 2020, 10:15   #4
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 344
Quote:
Originally Posted by roondar
So I've been reading up on the cache on 68020+ processors and was wondering about forward/backwards compatibility of the CACR register. The question really is: do any of the existing bits that control the 68020 cache change between the various processors?

Yes, many. The proper way of handling cache control is through the exec.library functions, e.g. CacheClearU(). For example, the 68030 uses an entirely different protocol to clean the data cache compared to the 68040 and 68060, and the 68020 does not have a data cache in the first place. CACR is part of the supervisor instruction set and as such is not stable across processor versions.
Old 09 February 2020, 12:00   #5
ross
Per aspera ad astra

 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 49
Posts: 2,462
Quote:
Originally Posted by roondar
And would code that bangs CACR directly fail on higher 68k processors? If so, are certain bits still safe to change?
Yes and no.

The assignment of the bits for cache functions differs across the 020+ family; as Thomas advised you, it is better to let exec.library do it for you.

In any case, a fairly generic write to CACR is possible in many cases, because the vast majority of bits have a function on only one processor.
I must have a routine somewhere that does a generic icache activation for the whole 68k family with a single CACR write.

But I advise you to take a look first at the manuals available for the various processors.
Old 09 February 2020, 13:12   #6
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,722
Quote:
Originally Posted by Thomas Richter
Yes, many. The proper way of handling cache control is through the exec.library functions, e.g. CacheClearU(). For example, the 68030 uses an entirely different protocol to clean the data cache compared to the 68040 and 68060, and the 68020 does not have a data cache in the first place. CACR is part of the supervisor instruction set and as such is not stable across processor versions.
I think you're misunderstanding my question. I'm not asking if I can change any bits at random without ill effect, but purely whether the existing bits change meaning. There's only 4 of them on the 68020 and I want to know what happens to the meaning of those specific 4 bits.

Sadly, using the OS is impossible in this case as the extra overhead this would incur due to me having to stack/unstack a number of extra registers many times per frame (plus whatever overhead the cache functions of Exec have over a single write to CACR) would defeat the whole point of what I'm trying to do.
Quote:
Originally Posted by ross
Yes and no.

The assignment of the bits for cache functions differs across the 020+ family; as Thomas advised you, it is better to let exec.library do it for you.
For setting up the cache once, I fully agree. However, I'm changing the status of the instruction cache (freezing/unfreezing) about 40 times per frame (on base A1200). Doing this through the OS would highly likely end up costing more cycles than I'm saving using this method.

BTW, I know - this is micro optimisation at its finest. Perhaps I'll drop the idea. Or generate a "68020 only" binary and a generic 68020+ binary.
Quote:
In any case, a fairly generic write to CACR is possible in many cases, because the vast majority of bits have a function on only one processor.
This was pretty much what I was trying to figure out: bit assignment per processor.
Quote:
But I advise you to take a look first at the manuals available for the various processors.
I did check the 68020 and 68040 manuals. And while I could find the relevant bits in the 68020 manual, I couldn't find them in the 68040 manual. There was a section on the cache and CACR, but no bit assignments anywhere in sight. It's probably in there somewhere, but I couldn't find it.

Hence my turning to the forum for help
Old 09 February 2020, 14:01   #7
ross
Per aspera ad astra

 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 49
Posts: 2,462
Code:
020:
03 C = Clear Cache
02 CE = Clear Entry In Cache
01 F = Freeze Cache
00 E = Enable Cache

030:
13 WA = Write Allocate
12 DBE = Data Burst Enable
11 CD = Clear Data Cache
10 CED = Clear Entry in Data Cache
09 FD = Freeze Data Cache
08 ED = Enable Data Cache
04 IBE = Instruction Burst Enable
03 CI = Clear Instruction Cache
02 CEI = Clear Entry in Instruction Cache
01 FI = Freeze Instruction Cache
00 EI = Enable Instruction Cache 

040:
31 DE = Enable Data Cache
15 IE = Enable Instruction Cache

060:
31 EDC = Enable Data Cache
30 NAD = No Allocate Mode (Data Cache)
29 ESB = Enable Store Buffer
28 DPI = Disable CPUSH Invalidation
27 FOC = 1/2 Cache Operation Mode Enable (Data Cache)
23 EBC = Enable Branch Cache
22 CABC = Clear All Entries in the Branch Cache
21 CUBC = Clear All User Entries in the Branch Cache
15 EIC = Enable Instruction Cache
14 NAI = No Allocate Mode (Instruction Cache)
13 FIC = 1/2 Cache Operation Mode Enable (Instruction Cache)
Unlisted bits are unassigned and can be set without side effects (at least until a new official processor is released, which is unlikely).

Remember that on the 040+ the MMU plays a big role in cache usage, so these are only 'global' settings.

As you can see, the only ambiguous bit is bit 13 (WA on the 030, FIC on the 060), but fortunately the clash has no fundamental effect.
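
As an illustration of the listing above, a "generic" instruction cache enable could look roughly like the following. This is only a sketch under assumptions, not ross's actual routine: the label EnableICacheAny is made up, the code must run in supervisor mode on a 68020 or later, and MC68020 is the assembler-specific CPU directive also seen in the JST code later in this thread. It sets bit 0 (E/EI on 020/030) and bit 15 (IE/EIC on 040/060) in one write and leaves every other bit as it was read.

```asm
; Sketch only. Assumes supervisor mode and a 68020 or later.
; EnableICacheAny is a hypothetical label.
	MC68020				; assembler-specific CPU directive
EnableICacheAny:
	movec	cacr,d0			; read the current cache settings
	ori.w	#$8001,d0		; bit 0 = E/EI (020/030), bit 15 = IE/EIC (040/060)
	movec	d0,cacr			; write back, all other bits unchanged
	rts
```

The read-modify-write keeps the data cache and burst bits intact on the 030 and 060. If the cache was previously disabled, it may also need to be invalidated before it is enabled, which this sketch leaves out.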
Old 09 February 2020, 14:09   #8
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,722
Thanks! This was exactly what I needed to know.

And by the way, I do agree that normally you should use the OS to set this kind of thing up. I have nothing against using the OS (in fact, I really rather like it). It's just that I aim my code at "small" Amigas and sometimes using the OS is not really viable on those systems - unless you accept notably lower performance.
Old 09 February 2020, 14:47   #9
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 344
Quote:
Originally Posted by roondar
I think you're misunderstanding my question. I'm not asking if I can change any bits at random without ill effect, but purely whether the existing bits change meaning.
In the sense of "some of them do not work anymore", yes.


Quote:
Originally Posted by roondar

Sadly, using the OS is impossible in this case as the extra overhead this would incur due to me having to stack/unstack a number of extra registers many times per frame (plus whatever overhead the cache functions of Exec have over a single write to CACR) would defeat the whole point of what I'm trying to do.
Have you made measurements to back this claim up? What makes you believe that you need to modify the cache settings many times a frame? 40 times a frame sounds like "forget about it, minimal overhead".



Honestly, a user program should not have to modify the cache settings at all - it may possibly have to clear the cache when loading or creating code, but CACR is not the answer to this problem. CacheClearU() is the answer, because, depending on the processor, a cache push is not controlled by CACR in the first place.
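
For reference, a minimal sketch of the CacheClearU() call from assembly, assuming exec.library V37 or later and the OS still running. The label FlushViaExec is hypothetical, and the offset is defined inline only so the fragment stands alone; normally it comes from the exec LVO include.

```asm
; Sketch only. Assumes exec.library V37+ and a live OS.
; FlushViaExec is a hypothetical label.
_LVOCacheClearU	EQU	-636		; normally taken from the exec LVO include

FlushViaExec:
	move.l	a6,-(sp)		; preserve A6
	movea.l	4.w,a6			; ExecBase
	jsr	_LVOCacheClearU(a6)	; D0/D1/A0/A1 are scratch across OS calls
	movea.l	(sp)+,a6		; restore A6
	rts
```

The call works from user mode and picks the right flush method for the installed processor.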
Old 09 February 2020, 14:55   #10
jotd
This cat is no more
 
Join Date: Dec 2004
Location: FRANCE
Age: 48
Posts: 3,657
Here's some code from JST. It runs in supervisor mode, without any OS. It flushes both the code and data caches.

Note that AttnFlags was saved before killing the OS.

Code:
FlushCachesSup:
	ori	#$700,SR
	bsr	GetAttnFlags	; calls a function that returns a cached copy of cpu status
	BTST	#AFB_68020,D0
	BEQ.B	.no020			; tested outside the function but better safe than sorry
	MC68020
	MOVEC	CACR,D1		; gets current CACR register
	MC68000
	BSET	#CACRB_ClearI,D1
	BTST	#AFB_68030,D0
	BEQ.B	.no030
	BSET	#CACRB_ClearD,D1
.no030:
	MC68020
	MOVEC	D1,CACR
	MC68000
	BTST	#AFB_68040,D0
	BEQ.B	.no040

	MC68040
	CPUSHA	BC
	MC68000
.no020:
.no040:
	RTS
Old 09 February 2020, 15:28   #11
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,722
Quote:
Originally Posted by Thomas Richter
In the sense of "some of them do not work anymore", yes.
Yes, I can see that now that I actually have a listing of bits per processor.
Quote:
Have you made measurements to back this claim up? What makes you believe that you need to modify the cache settings many times a frame? 40 times a frame sounds like "forget about it, minimal overhead".
It occurs to me you may have read that part of my post as meaning the OS is slow. That was not the intent; the problem here is calling the OS, not the OS routines themselves (which might well be super fast).

Anyway, I have indeed made measurements on a real A1200 without expansions (using the CIA). Which is why I know that the overhead will be too much in my particular case, as stacking the registers needed to call the OS will already add more overhead than what is saved by doing this. DMA is rather busy, so the CPU doesn't get many accesses to memory. Everything the CPU does that accesses memory is slow as a result.

This testing is also why I know that freezing/unfreezing the cache will improve performance of my experimental routine by about 1%. For low end Amigas, even 1% can be the difference between something running at 50Hz, or at 25Hz. For higher end ones, it won't matter. Please note that I never claimed there would be massive changes in performance or that everyone should do this all the time. It was merely an experiment to see how far I could push something on a low end machine, nothing more.
Quote:
Honestly, a user program should not have to modify the cache settings at all - it may possibly have to clear the cache when loading or creating code, but CACR is not the answer to this problem. CacheClearU() is the answer, because, depending on the processor, a cache push is not controlled by CACR in the first place.
I got the idea from reading up on 68020 optimization and found several sources that specifically mentioned that freezing the instruction cache selectively can boost performance by forcing routines to stay in the cache. I decided to try this out and it turns out that for some routines it does indeed offer a rather modest gain, for effectively no cost.

I do agree that normally you don't want to do this. However, coding for the low end Amigas often means either not doing something or going to "the bare metal" to get it to work. I'm interested in that as I want to see how far the A500 and A1200 can be pushed. So, sometimes I try out things that would normally not be done. When I see that some routines can run faster "for free", that makes me interested in seeing how portable such a method is.

As it turns out, it is portable - providing I don't change the other bits in the register. Worst case it won't do anything, which is still ok.
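
To make that concrete, a minimal sketch of such a freeze/unfreeze pair, assuming supervisor mode, a 68020 or later and D0 being free. The labels are hypothetical and MC68020 is an assembler-specific directive. Bit 1 is F on the 68020 and FI on the 68030; it is unassigned in the 040/060 listing in post #7, so on those processors the write should simply change nothing.

```asm
; Sketch only. Assumes supervisor mode, 68020+, D0 free.
; FreezeICache / UnfreezeICache are hypothetical labels.
	MC68020				; assembler-specific CPU directive
FreezeICache:
	movec	cacr,d0			; read the current cache settings
	bset	#1,d0			; bit 1 = F (020) / FI (030)
	movec	d0,cacr			; cached code now stays in place
	rts

UnfreezeICache:
	movec	cacr,d0
	bclr	#1,d0			; allow cache lines to be replaced again
	movec	d0,cacr
	rts
```

In a hot path the three instructions would more likely be inlined than called, since the point is to avoid call overhead.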
Quote:
Originally Posted by jotd
here's some code from JST. It runs in supervisor mode, without any OS. It flushes both code & data caches

note that attnflags have been saved before killing the OS
Thanks!