04 October 2017, 16:30   #1
SpeedGeek
FastCache040+ Released!

FastCache040+ 2.4 ©SpeedGeek 2020

INTRODUCTION:
FastCache040+ is a patch that replaces the CachePreDMA() and
CachePostDMA() functions of most 68040/060 libraries. While
the old functions are adequate, they are far from optimal.
The old functions contain roughly 1.6x to 2.2x as much code
as the new ones provided with this patch (see CODE SIZE
COMPARISONS)!

Also, the new functions implement a much more efficient method
of managing the Copyback cache for DMA. While every system
will have some CPU performance loss under DMA conditions, the
new functions keep this performance loss to a bare minimum.

FEATURES:
- Replaces CachePreDMA() and CachePostDMA() with smaller
and more efficient code
- Replaces complex MMU code with simple and fast DTTR code
- Temporarily changes Copyback mode to Write Through for DMA
(but only when required!).
- Never flushes the ATC!
- Never flushes the DC for Chip RAM DMA!
- Uses 68040/060 library detection code
- Will not patch itself
- 100% Assembler code
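
To make the feature list above concrete, here is a small host-side C model of the checks both functions make before touching the cache. It is only a sketch that mirrors the branch order of the posted assembly. The names classify_dma and enum dma_action are made up for illustration and are not part of the patch. DMAB_READFROMRAM (bit 3) corresponds to the DMA_ReadFromRAM flag bit from exec/execbase.h.

```c
#include <stdint.h>

/* Bit number of DMA_ReadFromRAM (exec/execbase.h). */
#define DMAB_READFROMRAM 3

/* Hypothetical names for the three outcomes seen in the assembly. */
enum dma_action {
    DMA_NOTHING,     /* Chip RAM or ReadFromRAM: return immediately      */
    DMA_FLUSH_ONLY,  /* 16-byte aligned: CPUSHA DC, cache mode untouched */
    DMA_CHANGE_MODE  /* unaligned: program a DTTR, then CPUSHA DC        */
};

/* addr and len are the values passed in A0 and (A1), flags is D0. */
enum dma_action classify_dma(uint32_t addr, uint32_t len, uint32_t flags)
{
    if ((addr & 0xFFE00000u) == 0)          /* below 2 MB: Chip RAM   */
        return DMA_NOTHING;
    if (flags & (1u << DMAB_READFROMRAM))   /* BTST #3,D0 / BNE.B     */
        return DMA_NOTHING;
    if (((addr | len) & 15u) == 0)          /* start and length both  */
        return DMA_FLUSH_ONLY;              /* multiples of 16 bytes  */
    return DMA_CHANGE_MODE;
}
```

In this model a 512-byte transfer at $08000000 only gets a cache push, while the same transfer at $08000004 is the one case that needs the temporary cache mode change.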

CODE SIZE COMPARISONS:
- FastCache040+ 2.4 (NewFunc 190 bytes)
- 68060.library 46.7 (OldFunc 304 bytes)
- 68040.library 44.2 (OldFunc 414 bytes)

REQUIREMENTS:
- Amiga with 68040 or 68060 CPU and MMU
- 68040.library or 68060.library

WARNING:
Do NOT use this patch with GigaMEM, VMM or any similar
virtual memory software! Do NOT use this patch with any
code which uses the MMU to write protect or remap modified
data structures!

NOTES:
Remapping a mirror image of the Kickstart ROM with the MMU
is OK! The new functions still have one thing in common with
the old functions: they do NOT perform the virtual address
translation specified in the Amiga RKRM! For more info on the
old functions see the Enforcer.guide by Michael Sinz.

UPDATE:
FastCache040+ v1.7 has been removed. Phase5 68060.library
can optionally use FixMapP5.

HISTORY:
(Pre 2.0 history deleted)
v2.0 - Added code to enable only one DTTR when the Nest count
is one. Most systems have only one DMA driver and only need to
have 16MB of address space managed for this case.
Removed 1.9BR version which was over-rated due to most DMA
drivers operating at higher priority than typical user tasks.
v2.1 - Reworked the code to fix a problem with Snoopy 2.0
(Aminet). Sorry, this version no longer supports 16 byte aligned
cache enabled MEMF_24BIT transfers. NOTE: The original P5
library functions have problems with Snoopy too.
v2.2 - The Snoopy fix broke MEMF_24BIT transfers. So another
bug fix was required. Let's hope it's the last.
v2.3 - The 16 byte alignment code is back and now avoids the
change of cache mode for this specific case. Removed
Continue case from PreDMA since the expected results are
the same as the Non-Continue case. The cache disable test
code was removed to save the overhead of this very
uncommon case.
v2.4 - Reworked PostDMA code to fix Nested call cache flush bugs.
We really don't want to forget about systems with multiple
DMA drivers, do we?
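
The nest-count handling from v2.0 and v2.4 can be modelled on a host machine in C. The sketch below follows what the posted assembly does for unaligned transfers only. struct dttr_model and both function names are hypothetical, and the two DTTR fields are plain variables standing in for the real registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the Nest word and the two DTTRs. */
struct dttr_model {
    uint16_t nest;
    uint32_t dtt0;
    uint32_t dtt1;
};

/* Models the supervisor part of CachePreDMA for an unaligned transfer. */
void model_pre_unaligned(struct dttr_model *m, uint32_t addr)
{
    uint32_t base = addr & 0xFF000000u;   /* top byte = 16 MB block */

    if (m->nest == 0) {
        /* First DMA in flight: one DTTR covering only this 16 MB block. */
        uint32_t v = base;
        if (base == 0)
            v |= 0x40u;                   /* 24-bit space: NoCache + Serialized */
        v |= 0xC000u;                     /* enable + ignore FC */
        m->dtt0 = v;
    } else {
        /* Nested DMA: DTT0 covers the low 16 MB, DTT1 everything else. */
        m->dtt0 = 0x0000C040u;
        if (base != 0)
            m->dtt1 = 0x00FFC000u;        /* Write Through, all addresses */
    }
    m->nest++;
}

/* Models the supervisor part of CachePostDMA for an unaligned transfer. */
void model_post_unaligned(struct dttr_model *m)
{
    m->nest--;
    if (m->nest == 0) {                   /* last DMA finished */
        m->dtt1 = 0;
        m->dtt0 = 0;
    }
}
```

With two overlapping transfers, the first model_post_unaligned() call leaves both registers alone and only the last one clears them.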
Code:
; In: A0 = address, A1 = pointer to length, D0 = flags, A6 = ExecBase
CachePostDMA:
	MOVE.L  A0,D1
	ANDI.L  #$FFE00000,D1   ;Chip RAM
	BEQ.B	lbC00002A
	BTST	#3,D0		;ReadFromRam	
	BNE.B	lbC00002A
	MOVE.L	A5,-(SP)
	MOVE.L  A0,D1
	OR.L    (A1),D1
	ANDI.B  #15,D1		;16 byte aligned
	BEQ.B	lbC000020				
	LEA	Nest(PC),A1
	SUBQ.W  #1,(A1)
	BEQ.B	lbC000024
lbC000020	
	LEA	(lbC000050,PC),A5
	BRA.B	lbC000028					
lbC000024
	LEA	(lbC00004E,PC),A5
lbC000028	
	JSR	(-$1E,A6)	;Call Supervisor
	MOVE.L	(SP)+,A5

lbC00002A
	RTS

lbC00004E
	MOVEQ   #0,D1
	MOVEC	D1,DTT1		;Disable DTT1	    
	MOVEC   D1,DTT0		;Disable DTT0
lbC000050	
	CPUSHA	DC       
	RTE            

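; In:  A0 = address, A1 = pointer to length, D0 = flags, A6 = ExecBase
; Out: D0 = address (returned unchanged, no translation)
CachePreDMA: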
	MOVEM.L	A0/A5,-(SP)		
	MOVE.L  A0,D1
	ANDI.L  #$FFE00000,D1   ;Chip RAM
	BEQ.B   lbC000068
	BTST	#3,D0		;ReadFromRam
	BNE.B	lbC000068
	MOVE.L  A0,D1
	OR.L    (A1),D1
	ANDI.B  #15,D1		;16 byte aligned
	BEQ.B	lbC000060		
	LEA	(lbC000074,PC),A5
	BRA.B   lbC000064
lbC000060
	LEA	(lbC000084,PC),A5
lbC000064	 
	JSR	(-$1E,A6)	;Call Supervisor
lbC000068
	MOVEM.L	(SP)+,A0/A5
	MOVE.L  A0,D0
	RTS

lbC000074	
	LEA	Nest(PC),A1
	TST.W	(A1)
	BEQ.B   lbC000078
	MOVE.L  #$0000C040,D1	;NoCache mode + Serialized      		
	MOVEC	D1,DTT0		;Enable DTT0
	MOVE.L  A0,D1
	ANDI.L  #$FF000000,D1   ;MEMF_24BIT
	BEQ.B	lbC000082
	MOVE.L  #$00FFC000,D1 	;Cache WT mode + ignore FC
	MOVEC 	D1,DTT1		;Enable DTT1	
	BRA.B   lbC000082
lbC000078
	MOVE.L  A0,D1
	ANDI.L  #$FF000000,D1   ;MEMF_24BIT
	BNE.B   lbC000080
	ORI.B	#$40,D1 	;NoCache mode + Serialized
lbC000080
	ORI.W   #$C000,D1	;Cache WT mode + ignore FC
	MOVEC	D1,DTT0
lbC000082
	ADDQ.W  #1,(A1)	
lbC000084
	CPUSHA 	DC		;Flush dirty cache lines  
	RTE		
Nest:	DC.W	0
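
For readers who want to check the DTTR constants in the listing, here is a hedged C sketch that splits a value into its fields and tests whether an address falls inside the window. The field layout assumed here is the 68040 transparent translation register format (base in bits 31-24, mask in bits 23-16, E in bit 15, S in bits 14-13, CM in bits 6-5). ttr_decode and ttr_matches are illustrative names only, and the match ignores the function code, which is only valid for values with S = 1x like the ones used above.

```c
#include <stdbool.h>
#include <stdint.h>

struct ttr_fields {
    uint8_t base;        /* logical address base, bits 31-24           */
    uint8_t mask;        /* logical address mask, a set bit = ignore   */
    bool    enabled;     /* E bit                                      */
    uint8_t s_field;     /* 0 = user, 1 = supervisor, 2/3 = ignore FC  */
    uint8_t cache_mode;  /* 0 = Write Through, 1 = Copyback,           */
                         /* 2 = NoCache serialized, 3 = NoCache        */
};

struct ttr_fields ttr_decode(uint32_t v)
{
    struct ttr_fields f;
    f.base       = (uint8_t)(v >> 24);
    f.mask       = (uint8_t)(v >> 16);
    f.enabled    = (v & 0x8000u) != 0;
    f.s_field    = (uint8_t)((v >> 13) & 3u);
    f.cache_mode = (uint8_t)((v >> 5) & 3u);
    return f;
}

/* True if addr lies in the window described by ttr (FC not modelled). */
bool ttr_matches(uint32_t ttr, uint32_t addr)
{
    struct ttr_fields f = ttr_decode(ttr);
    uint8_t top = (uint8_t)(addr >> 24);

    if (!f.enabled)
        return false;
    return ((top ^ f.base) & (uint8_t)~f.mask) == 0;
}
```

Decoding $0000C040 this way gives an enabled window over the low 16 MB with cache mode 2, and $00FFC000 gives an enabled Write Through window whose mask of $FF matches every address.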
Attached Files
File Type: lha CACHEDMABENCH11.LHA (2.2 KB, 152 views)
File Type: lha FIXMAPP5_14.LHA (3.4 KB, 111 views)
File Type: lha FASTCACHE040+24.LHA (2.8 KB, 16 views)

Last edited by SpeedGeek; 08 March 2020 at 04:30.