Short:        Patch the utility.library for 68060
Author:       matthey7@gmail.com (Matt Hey)
Uploader:     matthey7 gmail com (Matt Hey)
Type:         util/boot
Version:      1.2
Requires:     68060, Kickstart 3.0+
Architecture: m68k-amigaos >= 3.0

***
This patch no longer patches SDivMod32() & UDivMod32(). These functions
use divsl.l & divul.l, which return 64 bits of results in 2 registers but
are NOT among the 64 bit integer instructions that are missing in the
68060. The previous versions of this patch should not be used anymore, as
the patched SDivMod32() & UDivMod32() turned out to be slower after all.

The SMult64() and UMult64() are faster than what is in the Mu
68060.library in all cases, but few enough programs use these functions
that this patch is barely worthwhile. I tried to get Thomas Richter to
include the code in his Mu 68060.library so this patch would be
unnecessary, but he prefers the slower Motorola code because it is better
tested. Using a variation of the code in FFmpeg was not proof enough, I
guess.
***

Description:

This is a small patch which replaces the UMult64() and SMult64() functions
of utility.library V39+ with faster 68060 optimized versions.

UtilPatch060 tests for a 68060 processor. If it can't find one, it doesn't
install the patch and exits with a return code of 20 (=fail). If it can't
open utility.library V39+, it also does not install the patch and exits
with a return code of 20.

This code is based in part on Harry "Piru" Sintonen's NewPatchMult64.
Thanks Piru!

Features:

  Avoids going through the CPU trap for the unimplemented integer
    instruction exception.
  Slightly faster UMult64() and SMult64() than Piru's NewPatchMult64.
  Doesn't touch the MMU directly, which is unnecessary and is the proper
    behavior when using Thomas Richter's mmu.library and 68060.library.
  Installation is fast, doesn't fragment memory, and doesn't use much
    memory.
  The full source code is included. Assembled with PhxAss.
  Free.

Installation:

Copy UtilPatch060 wherever you like.
It runs from Workbench, but I recommend starting it from the CLI in the
S:startup-sequence after the SetPatch command. It does not detach from the
shell, so the following command is needed...

  Run >NIL: UtilPatch060

As little as 512 bytes of stack should be fine. A Ctrl-C will break
UtilPatch060, causing it to uninstall, which can be dangerous because of
how the exec SetFunction() function works.

Debugging:

I have included a Snoopy 2.0 (Aminet:util/moni/snoopy20.lha) script that
logs the patched functions in the utility.library. Please report the
values logged if there are any errors. The default output goes to
t:snoopy.txt but can be changed in the icon tooltype. Please report all
bugs to the e-mail address at the top.

Speed comparison:

Here are some speed comparisons using Mult64SpeedTest from Piru's
NewPatchMult64. Smaller numbers are better (less time).

The first is the Mu 68060.library vs UtilPatch060...

                                               Mu 68060.library  UtilPatch060  Difference
5000000xUMult64 (positiveXpositive):           207               187           9.6% faster
5000000xUMult64 (positiveXzero):               113               110           2.7% faster
5000000xSMult64 (positiveXpositive):           239               219           8.4% faster
5000000xSMult64 (negativeXpositive):           248               231           6.9% faster
5000000xSMult64 (any)product fits in 32 bits:  239-248           110           54.0%-55.6% faster
5000000xSMult64 (positiveXzero):               142               110           22.5% faster

and NewPatchMult64 vs UtilPatch060...

                                               NewPatchMult64    UtilPatch060  Difference
5000000xUMult64 (positiveXpositive):           191               187           2.1% faster
5000000xUMult64 (positiveXzero):               110               110           same speed
5000000xSMult64 (positiveXpositive):           215               219           1.9% slower
5000000xSMult64 (negativeXpositive):           223               231           3.6% slower
5000000xSMult64 (any)product fits in 32 bits:  215-223           110           48.8%-50.7% faster
5000000xSMult64 (positiveXzero):               110               110           same speed

There is a commented out version of SMult64() in the source that is
slightly faster than Piru's in all cases. My default SMult64() is much
faster if the product of the multiply fits in 32 bits and is a little
slower than Piru's if it doesn't. I thought this was a good tradeoff.
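
For readers who do not want to dig through the assembler source, the
tradeoff above can be modeled in C. This is only a rough sketch of the
idea and NOT the code in the archive: the names Product64, umult64_sketch
and smult64_sketch are made up for illustration, the 16 bit partial
product method is one common technique and may differ from what the patch
really does, and __builtin_mul_overflow (a GCC/Clang extension) merely
stands in for testing the overflow flag after a 32 bit multiply.

```c
#include <stdint.h>

/* Hypothetical helper type: a 64 bit product as two 32 bit halves,
   similar to the register pair a 64 bit multiply hands back. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} Product64;

/* Unsigned 32x32 -> 64 multiply built from four 16x16 -> 32 products,
   so no 64 bit multiply instruction is needed. */
static Product64 umult64_sketch(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t ll = a_lo * b_lo;   /* weight 2^0  */
    uint32_t lh = a_lo * b_hi;   /* weight 2^16 */
    uint32_t hl = a_hi * b_lo;   /* weight 2^16 */
    uint32_t hh = a_hi * b_hi;   /* weight 2^32 */

    /* Cannot overflow: at most 0xFFFE0001 + 0xFFFF + 0xFFFF. */
    uint32_t mid = hl + (ll >> 16) + (lh & 0xFFFFu);

    Product64 p;
    p.lo = (mid << 16) | (ll & 0xFFFFu);
    p.hi = hh + (lh >> 16) + (mid >> 16);
    return p;
}

/* Signed 32x32 -> 64 multiply with a fast path for small products. */
static Product64 smult64_sketch(int32_t a, int32_t b)
{
    Product64 p;
    int32_t low;

    /* Fast path: the product fits in 32 bits, so a single 32 bit
       multiply plus a sign extension is enough. */
    if (!__builtin_mul_overflow(a, b, &low)) {
        p.lo = (uint32_t)low;
        p.hi = (low < 0) ? 0xFFFFFFFFu : 0u;
        return p;
    }

    /* Slow path: multiply the raw bit patterns as unsigned values, then
       correct the high half once for each negative operand. */
    p = umult64_sketch((uint32_t)a, (uint32_t)b);
    if (a < 0) p.hi -= (uint32_t)b;
    if (b < 0) p.hi -= (uint32_t)a;
    return p;
}
```

In this model, a call whose product fits in 32 bits returns after one
multiply, while a larger product pays for the failed fast path attempt on
top of the full multiply. That matches the shape of the numbers above:
much faster in the small case, a little slower otherwise.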

Future:

I may make a CPUpatch that applies all 68060 patches if a 68060 is
detected and all 68040 patches if a 68040 is detected, etc. I may have the
mmu.library protect the patches.

History:

1.0 (23.07.09)  First release
1.1 (25.07.09)  Saved a few bytes
1.2 (12.09.09)  SDivMod32() & UDivMod32() are no longer patched as
                divsl.l & divul.l are in the 68060.
                Added missing Permit() after replying to the WB message.

Thanks:

Harry "Piru" Sintonen and Dirk Busse for NewPatchMult64.
Thomas Richter for pointing out that divsl.l & divul.l are in the 68060
after all.