Copymem Quick & Big Released!
CopyMem Quick & Big v1.7
Parts of patch install code by Dirk Busse 1999
Enhanced patch code by SpeedGeek 2021

INTRODUCTION:
CMQ&B is a bigger and faster CopyMem + CopyMemQuick patch. The main goal is to give the fastest possible results with Testit from COPMQR28. In order to obtain these fast results CMQ&B must have the redundant and bloated code needed to handle many "Worst Case" copies.

FEATURES:
- Installs one of the fastest CMQ patches for 68020+ Amigas
- New JMP copy code speeds up small copies
- Safely exits if the patch is already installed (i.e. a good patch program should really avoid patching itself)

REQUIREMENTS:
- Amiga with 68020+

NOTES:
CMQ&B is an extension of CMQ&S. It has some extra code to handle many small and misaligned copies. There are trade-offs in supporting these "Worst Case" copies. Specifically, the Best Case performance has been reduced and the size of the patch has increased to 320 bytes.

HISTORY:
v1.6 first release
v1.7 Updated Big loop code with faster instructions. Increased Big loop copy size to 112 bytes. Replaced Small loop copy code with new JMP copy code for <= 108 bytes.

******************************************************

CopyMem Quick & Big040 v2.3
Parts of patch install code by Dirk Busse 1999
Enhanced patch code by SpeedGeek 2021

INTRODUCTION:
CMQ&B040 is a bigger and faster CopyMem + CopyMemQuick patch. The main goal is to give the fastest possible results with Testit from COPMQR28. In order to obtain these fast results CMQ&B040 must have the redundant and bloated code needed to handle many "Worst Case" copies.

FEATURES:
- Automatically installs one of the fastest CMQ patches for 040+
- The Move16 address is restricted only for performance reasons (See Notes)
- New smart buffer copy code handles MOVE16 alignment restrictions
- User selected 1024-8192 byte Block Size options allow "Tuning" the MoveL vs. Move16 performance of your system. Since v2.1 the default Block size is 4096
- Safely exits if the patch is already installed (i.e. a good patch program should really avoid patching itself)

REQUIREMENTS:
- Amiga with 68040+
- Move16 is only enabled for copies of at least the (minimum) Block Size of the version you installed (larger sizes always qualify).

NOTES:
CMQ&B040 is an extension of CMQ&S. It has some extra code to handle many small and misaligned copies. There are trade-offs in supporting these "Worst Case" copies. Specifically, the Best Case performance has been reduced and the size of the patch has increased to 540 bytes. Since v2.1 stack usage is now 84 bytes per misaligned large block copy.

Move16 does not cause a burst access problem with Chip RAM since it simply is not possible to access Chip RAM in this way. Burst operation is controlled in Hardware (See Transfer Burst Inhibit operation in the 040 manual). The Smart buffer copy loop is address restricted (for performance reasons only) when the destination address is in Chip RAM.

Block size "Tuning" options are application specific. If you want the fastest copy results for Fast RAM use the Block size = Data cache size option. If you want better multitasking performance use the Block size = 1/2 Data cache size option. If a particular Software application targets non-cacheable memory (e.g. Chip RAM or Graphics Board RAM) the Block size = Smallest option may be faster for that particular case.

HISTORY:
v1.7 first release
v1.8 minor change
- Removed obsolete CopyMemQuick source address compare code
v1.9 New smart buffer copy code provides a BIG SPEED UP since the MOVE16 alignment restrictions are well handled!
v2.0 Fixed a seldom occurring but serious bug with internal Smart buffer usage.
- Nested call large block copies (WHEN MISALIGNED!) could corrupt each other's data when sharing the same buffer. This fix uses a stack based buffer solution which results in a private buffer for each call.
v2.1 Many changes
- Fixed a rarely occurring stack size bug when the stack was word aligned and offset by one word from a 16 byte aligned address.
- Added code to test for the Move16 address bug and safely exit upon detection
- Added code to restrict Smart buffer copy usage when the destination address is in Chip RAM.
- Added code to change the default Block size
v2.2 minor change
- Removed "Move16 Bug" detection code. This was a blunder due to Ax = Ay meaning the same registers rather than the same addresses.
v2.3 minor change
- Changed address register longword math to word math for the Smart buffer copy loop. This is a small optimization but we always want the fastest possible results

*************************************************************

CopyMem Quick & Big040 SAFER v2.3
Parts of patch install code by Dirk Busse 1999
Enhanced patch code by SpeedGeek 2024

INTRODUCTION:
CMQ&B040_SAFER is a special version of CMQ&B040 which is intended to be somewhat safer than the standard version. However, it should not ever be considered 100% safe. More specifically, it should provide the ability to crash without a loss of data as described in several of the Motorola Move16 errata cases.

This version has some extra code to test if the source and destination addresses are equal. This is a user program bug, but it's still safer to avoid using Move16 in this particular case. There is also code to test these specified destination addresses:
- $E00000 EXT ROM space (512 KB)
- $F80000 STD ROM space (512 KB)

The EXT ROM space is marked MMU invalid for 512 KB Kickstart ROM systems by most 68040 and 68060 libraries. While this should not be the case for 1 MB ROM systems this address space may be MMU write protected by ROM remapping tools. The STD ROM address space may also be MMU write protected by ROM remapping tools.

If I understand the Motorola documentation correctly, there should be no need to test the source address for Move16 since the MMU invalid address space doesn't have any valid data to become cached and invalidated.

UNSAFE USAGE:
This version does NOT attempt to be safe with any possible reported hardware bugs such as:
- The early mask set 68040 (e.g. "XC" variant CPUs)
- Broken or defective 68040/060 accelerators and turbo boards

This version is neither safe nor recommended for use with the mmu.library (AKA MMUlib by ThoR).

NOTES:
Small block copy performance is not affected by the extra Move16 safety code. But of course, large block copy performance will be reduced. Testit results will not be provided with this special version.

HISTORY:
v2.3 First Safer version
- Added code to test for equal source and destination addresses and avoid using Move16 for this specific case.
- Added code to test specified destination addresses and avoid using Move16 for those cases.

************************************************************* |
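The SAFER version's address tests described above can be sketched in C. This is only an illustration of the checks as the readme states them, not the patch's actual 68k code: the helper names (`in_range`, `move16_allowed`) are made up here, and the sketch assumes only the start address of the destination is tested.

```c
#include <stdbool.h>
#include <stdint.h>

/* Destination ranges named in the SAFER readme, 512 KB each. */
#define EXT_ROM_BASE   0x00E00000u
#define STD_ROM_BASE   0x00F80000u
#define ROM_SPACE_SIZE 0x00080000u   /* 512 KB */

/* True if addr lies in [base, base + size). Unsigned subtraction wraps
   for addr < base, so a single compare covers both bounds. */
static bool in_range(uint32_t addr, uint32_t base, uint32_t size)
{
    return (uint32_t)(addr - base) < size;
}

/* Hypothetical helper: may the Move16 path be used for this copy?
   Assumption: only the destination start address is checked. */
bool move16_allowed(uint32_t src, uint32_t dst)
{
    if (src == dst)                                  /* user program bug case */
        return false;
    if (in_range(dst, EXT_ROM_BASE, ROM_SPACE_SIZE)) /* $E00000 EXT ROM space */
        return false;
    if (in_range(dst, STD_ROM_BASE, ROM_SPACE_SIZE)) /* $F80000 STD ROM space */
        return false;
    return true;
}
```

When the helper returns false, a copy routine would fall back to its ordinary longword loop, which matches the note that only large block copy performance is reduced.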
Some Testit results for CMQ&B 1.7:
Code:
This test will compare the old CopyMem/CopyMemQuick routines with
Code:
This test will compare the old CopyMem/CopyMemQuick routines with |
Thanks for the good patch. What is your next project?
|
** NEWS UPDATE **
CMQ&B040 v1.8 released
v1.8 minor change
- Removed obsolete CopyMemQuick source address compare code

@HanSolo When there's nothing more to do on this project maybe some scsi.device stuff... |
Hey SpeedGeek,
Where can I find version 1.8? Doesn't seem to be on Aminet. |
Shouldn't New CopyMem have shorter times?
|
Quote:
Thanks, I'll give it a try. :great |
|
** 2ND NEWS UPDATE **
CMQ&B040 1.9 released! -v1.9 New smart buffer copy code provides a BIG SPEED UP since the MOVE16 alignment restrictions are well handled! (See new Testit results). |
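For anyone wondering what "MOVE16 alignment restrictions" means in practice: MOVE16 moves whole 16-byte lines at 16-byte aligned addresses, so a copy has to be carved into an unaligned head, a run of full lines, and a tail. A rough C sketch of that arithmetic follows. The struct and function names are made up for illustration, and this is not how the patch's smart buffer code is actually written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: byte counts for the three parts of a copy. */
typedef struct {
    uint32_t head;   /* bytes before dst reaches a 16-byte boundary */
    uint32_t body;   /* bytes that can go as whole 16-byte lines */
    uint32_t tail;   /* leftover bytes after the last full line */
} copy_split;

/* Split a copy of len bytes so the body starts on a line boundary of dst. */
copy_split split_on_lines(uint32_t dst, uint32_t len)
{
    copy_split s;
    uint32_t head = (16u - (dst & 15u)) & 15u;  /* 0 if already aligned */
    if (head > len)
        head = len;                             /* short copy: all head */
    s.head = head;
    s.body = (len - head) & ~15u;               /* round down to full lines */
    s.tail = len - head - s.body;
    return s;
}

/* Source and destination can both be line aligned at the same time only
   if they share the same offset within a 16-byte line. */
bool same_line_offset(uint32_t src, uint32_t dst)
{
    return ((src ^ dst) & 15u) == 0;
}
```

When `same_line_offset` is false the source and destination can never be line aligned together, which is presumably the misaligned case where an intermediate buffer comes in.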
** 3RD NEWS UPDATE **
CMQ&B040 2.0 released! v2.0 Fixed a seldom occurring but serious bug with internal Smart buffer usage. - Nested call large block copies (WHEN MISALIGNED!) could corrupt each other's data when sharing the same buffer. This fix uses a stack based buffer solution which results in a private buffer for each call. |
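The idea behind the v2.0 fix can be shown with a small C sketch: if the intermediate buffer is a local variable, every call gets its own copy in its own stack frame, so nested calls cannot overwrite each other's data the way they could with one shared static buffer. The function name and `BOUNCE_SIZE` are hypothetical and not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 64   /* hypothetical size, not the patch's real value */

/* Copy len bytes through an intermediate buffer that lives in this call's
   own stack frame. A nested or interrupting call gets a separate buffer. */
void copy_via_private_buffer(void *dst, const void *src, size_t len)
{
    unsigned char bounce[BOUNCE_SIZE];   /* private per call, not static */
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len > 0) {
        size_t chunk = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;
        memcpy(bounce, s, chunk);        /* source -> private buffer */
        memcpy(d, bounce, chunk);        /* private buffer -> destination */
        s += chunk;
        d += chunk;
        len -= chunk;
    }
}
```

The price is extra stack per call, which is why the readme lists a stack usage figure for misaligned large block copies.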
Quote:
I have CopyMem060 on mine. |
Quote:
BTW, the so-called "060 Optimized" CMQ patches really don't offer much of a performance difference from the 040 CMQ patches. |
Well I'm not getting a fail code. It just boots up and then that's it. Also the version I got was from 09 and off Aminet, so I defo have an old version I think. No matter, I just followed the guide that said stick it in wherever and then invoke it in your startup after SetPatch somewhere and make sure you type the command Run beforehand. But I dunno what it's doing or what performance enhancement I'm getting.
You have to pardon my ignorance BTW. Well actually you don't but please do :laughing
****edit**** OK so I've just realised your patch is a different thing entirely. Perhaps I should scrap the copymem then and install yours! I downloaded to try! |
** 4TH NEWS UPDATE **
CMQ&B 1.7 released! v1.7 Updated Big loop code with faster instructions. Increased Big loop copy size to 112 bytes. Replaced Small loop copy code with new JMP copy code for <= 108 bytes (See new Testit results for 1.7). |
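A common way to copy small blocks without a loop counter is to jump into the middle of an unrolled run of moves, so exactly the needed number of moves executes. Whether the patch's JMP copy code works exactly like this is an assumption. The C sketch below uses a fall-through switch, handles only up to 8 longwords to stay short (the patch covers up to 108 bytes), and uses a made-up function name.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n longwords (0..8) with a single computed jump and no loop.
   Larger counts are ignored here; a real routine would hand them to a
   big loop instead. */
void copy_longs_jmp(uint32_t *dst, const uint32_t *src, size_t n)
{
    switch (n) {                      /* usually compiled to a jump table */
    case 8: dst[7] = src[7];          /* fall through */
    case 7: dst[6] = src[6];          /* fall through */
    case 6: dst[5] = src[5];          /* fall through */
    case 5: dst[4] = src[4];          /* fall through */
    case 4: dst[3] = src[3];          /* fall through */
    case 3: dst[2] = src[2];          /* fall through */
    case 2: dst[1] = src[1];          /* fall through */
    case 1: dst[0] = src[0];          /* fall through */
    case 0: break;
    default: break;                   /* n > 8: not handled in this sketch */
    }
}
```

The trade-off is code size: every supported length needs its own move instruction, which fits the readme's remark that the patch grew in exchange for faster small copies.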
** 5TH NEWS UPDATE **
CMQ&B040 2.1 released!
v2.1 Many changes
- Fixed a rarely occurring stack size bug when the stack was word aligned and offset by one word from a 16 byte aligned address.
- Added code to test for the Move16 address bug and safely exit upon detection
- Added code to restrict Smart buffer copy usage when the destination address is in Chip RAM.
- Added code to change the default Block size |
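The stack size bug is easier to picture with the alignment arithmetic written out. The 68k stack is only guaranteed to be word aligned, so carving a 16-byte aligned buffer out of it costs a variable amount of padding depending on where the stack pointer happens to sit. The C sketch below shows that calculation in general terms. The function names and the buffer size are hypothetical, and this does not reproduce the actual bug or its fix.

```c
#include <stdint.h>

/* Lowest address of a buf_size byte buffer placed below sp (the stack
   grows downward) and rounded down to a 16-byte boundary. */
uint32_t aligned_buffer_base(uint32_t sp, uint32_t buf_size)
{
    return (sp - buf_size) & ~(uint32_t)15;
}

/* Total stack bytes consumed: the buffer plus the alignment padding.
   For a word aligned sp the padding is 0, 2, 4, ... or 14 bytes. */
uint32_t stack_bytes_needed(uint32_t sp, uint32_t buf_size)
{
    return sp - aligned_buffer_base(sp, buf_size);
}
```

For a 64-byte buffer this gives 64 bytes when sp is already 16-byte aligned, 66 when sp sits one word above a boundary, and 78 in the worst word aligned case, so any fixed reservation has to allow for the worst case.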
** 6TH NEWS UPDATE **
CMQ&B040 2.2 released! v2.2 minor change - Removed "Move16 Bug" detection code. This was a blunder due to Ax = Ay meaning the same registers rather than the same addresses. |
** 7TH NEWS UPDATE **
CMQ&B040 2.3 released! v2.3 minor change - Changed address register longword math to word math for the Smart buffer copy loop. This is a small optimization but we always want the fastest possible results |
Did/does anyone ever notice a real improvement from these CopyMem-improvement patches? Or maybe measure how many calls and what kind of parameters would be generated when using the OS for some ordinary tasks?
There were a lot of these patches, I also did one back in the day and was happy with myself. Whether it made any difference, that's another matter. |
As it says in the description:
Quote:
I doubt you'll find much software, if any, that will be much faster with that patch compared to other similar patches. |