Short:        XPK libs, smart delta with HA (ASC/HSC)
Author:       Harri Hirvola and chinoclast@softhome.net (Gaelan Griffin)
Uploader:     chinoclast softhome net (Gaelan Griffin)
Type:         util/pack
Version:      1.3 (19.6.1999)
Requires:     XPK installed, util/pack/xpk_User.lha
Architecture: m68k-amigaos


License/Disclaimer
------------------

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.


Installation
------------

You'll need to have the XPK package installed before you can use these
libraries. The latest version of the XPK package is always available on
Aminet as pub/aminet/util/pack/xpk_User.lha via http and ftp. It should
also be available at http://www.amigaworld.com/support/xpkmaster/

If you already have the XPK package installed, you can add the
libraries to your system simply by copying the files from the 68000 or
68020 subdirectory to your LIBS:Compressors directory.


Description
-----------

These are two XPK sublibraries designed to offer maximum compression.
The libs use code taken from the HA archiver by Harri Hirvola
(HA 0.999beta) and implement the ASC/HSC compression methods of HA as
XPK libraries. The code was rewritten to be completely reentrant, and a
sophisticated delta preprocessor was added to form two XPK sublibraries.

The delta routines test the data to determine the best type of delta to
use: 8-bit, 16-bit, or none.
Although the libraries are designed for packing sound data, they are
also quite good at packing other types of data.

Modes     Description
------    --------------
  0- 33   plain ASC/HSC compressor
 34- 66   ASC/HSC compressor + delta preprocessor
 67-100   each chunk is packed twice: with and without delta
          preprocessor, the shorter chunk is used (doubles packing
          time)

          Note: It only makes sense to use this mode if a file
          contains both normal data and samples. Otherwise the
          packing time is doubled but the compression rate will
          probably not increase.

The libraries in detail:

xpkSASC.library
    Compression method using a sliding window dictionary followed by
    an arithmetic coder. Offers quite good compression on a wide
    variety of file types.

xpkSHSC.library
    Compression method based on a finite context model and an
    arithmetic coder. Quite slow for binary data, but offers very good
    compression, especially for longer text files.

The two libraries are basically the same except that they use different
algorithms. The HSC algorithm is very slow, but usually packs better
than the ASC algorithm. The HSC algorithm is probably too slow for
everyday use, except with very small files, or perhaps in a PPC
version. The ASC algorithm is also relatively slow in comparison to
some other sublibraries, but higher-end 680x0 processors should be able
to make use of it.

Since HA is an MS-DOS(tm) program, it does some things only to work
around the 64kB segments of the 8086. Using more efficient data
structures for 32-bit platforms should improve the performance. A PPC
version of these libraries would also improve performance dramatically
for those processors, as would an assembly rewrite
(any volunteers? ;-)).

The source code is included; however, I would appreciate it if you sent
me some email before publicly distributing any derivative libraries
with the same name. This is in order to avoid incompatibilities and
other potential problems.

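The mode table and the delta preprocessor described above can be
illustrated with a small sketch. This is not the code of the libraries:
zlib stands in for the ASC/HSC packers, the delta type is picked by
trial compression (the readme does not say how the real delta routines
test the data), the 16-bit delta assumes little-endian words, and all
function names are made up for the example.

```python
import struct
import zlib


def strategy_for_mode(mode: int) -> str:
    """Map an XPK mode number to the strategy listed in the mode table."""
    if 0 <= mode <= 33:
        return "plain"
    if 34 <= mode <= 66:
        return "delta"
    if 67 <= mode <= 100:
        return "both"
    raise ValueError("mode must be in 0..100")


def delta8(data: bytes) -> bytes:
    """Store each byte as its difference from the previous byte (mod 256)."""
    out = bytearray()
    prev = 0
    for b in data:
        out.append((b - prev) & 0xFF)
        prev = b
    return bytes(out)


def undelta8(data: bytes) -> bytes:
    """Inverse of delta8: running sum of the differences (mod 256)."""
    out = bytearray()
    prev = 0
    for d in data:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return bytes(out)


def delta16(data: bytes) -> bytes:
    """Delta over 16-bit words (little-endian is an assumption here).
    A trailing odd byte is copied unchanged."""
    n = len(data) // 2
    words = struct.unpack("<%dH" % n, data[:2 * n])
    diffs = []
    prev = 0
    for w in words:
        diffs.append((w - prev) & 0xFFFF)
        prev = w
    return struct.pack("<%dH" % n, *diffs) + data[2 * n:]


def undelta16(data: bytes) -> bytes:
    """Inverse of delta16."""
    n = len(data) // 2
    diffs = struct.unpack("<%dH" % n, data[:2 * n])
    words = []
    prev = 0
    for d in diffs:
        prev = (prev + d) & 0xFFFF
        words.append(prev)
    return struct.pack("<%dH" % n, *words) + data[2 * n:]


# filter name -> (encode, decode)
FILTERS = {
    "none": (bytes, bytes),
    "delta8": (delta8, undelta8),
    "delta16": (delta16, undelta16),
}


def pack_chunk(chunk: bytes):
    """Pack the chunk once per filter and keep the shortest result,
    in the spirit of modes 67-100 (zlib is only a stand-in packer)."""
    best_name, best_packed = None, None
    for name, (encode, _) in FILTERS.items():
        packed = zlib.compress(encode(chunk), 9)
        if best_packed is None or len(packed) < len(best_packed):
            best_name, best_packed = name, packed
    return best_name, best_packed


def unpack_chunk(name: str, packed: bytes) -> bytes:
    """Undo pack_chunk; the filter name must be stored with the chunk."""
    return FILTERS[name][1](zlib.decompress(packed))


if __name__ == "__main__":
    sample = bytes((i * 3) % 256 for i in range(1000))  # steadily rising ramp
    name, packed = pack_chunk(sample)
    print(strategy_for_mode(100), name, len(sample), len(packed))
```

The ramp in the usage part turns into a run of identical bytes after
the 8-bit delta, which is why delta filters tend to help on smooth
sample data, while on text they usually only disturb the patterns the
packer relies on.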

Statistics
----------

Here are some benchmarks to give you an impression of the compression
rate compared to other compression libraries. Unless otherwise noted,
maximum efficiency (100%) was used for the tests.

a) the ProTracker module 'Space Debris' (mostly 8-bit sample data)

XPK Lib                         Size
---------                     ------
mod.space_debris              347582
mod.space_debris.sasc.0       262008
mod.space_debris.sasc.50      222184
mod.space_debris.sasc.100     220908
mod.space_debris.shsc.0       241228
mod.space_debris.shsc.50      216764
mod.space_debris.shsc.100     215884
mod.space_debris.bzip         216424
mod.space_debris.crms         224548
mod.space_debris.duke         264768
mod.space_debris.smpl         245276
mod.space_debris.shri         263248
mod.space_debris.dlta+shri    223384
mod.space_debris.sqsh         228328

b) the 22kHz WAV sample 'jingle' (16-bit sample data)

XPK Lib                         Size
---------                     ------
jingle.wav                    799738
jingle.wav.sasc.0             774184
jingle.wav.sasc.50            677008
jingle.wav.sasc.100           677008
jingle.wav.shsc.0             788680
jingle.wav.shsc.50            688612
jingle.wav.shsc.100           688612
jingle.wav.bzip               682704
jingle.wav.crms               799872
jingle.wav.duke               800000
jingle.wav.sdhc.70 (P W1)     726388
jingle.wav.sdhc.75 (P W2)     725672
jingle.wav.smpl               799968
jingle.wav.sqsh               800000
jingle.wav.shri               782472

c) the intuition autodoc (English text)

XPK Lib                         Size
---------                     ------
intuition.doc                 283460
intuition.doc.sasc             92072
intuition.doc.shsc             80280
intuition.doc.bzip             74032
intuition.doc.gzip             85564
intuition.doc.shri             82084

d) the AmigaVision executable (binary data)

XPK Lib                         Size
---------                     ------
AmigaVision                   594712
AmigaVision.sasc              298948
AmigaVision.shsc              288000
AmigaVision.bzip              290316
AmigaVision.gzip              299456
AmigaVision.shri              283636

e) Canterbury Corpus (http://corpus.canterbury.ac.nz)

OS Version V40 no-startup, only SetPatch 43.6
CPU: CyberStorm 68060/50MHz + PPC603e/233MHz

File 'alice29.txt' with a size of 152089 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3      57564    2.86   53177    1.17  129990  62.2
SASC:   66      1.3      71716    2.53   60114    1.43  106355  52.9
SASC:  100      1.3      57564    5.39   28216    1.17  129990  62.2
SHSC:   33      1.3      48472    7.32   20777    7.69   19777  68.2
SHSC:   66      1.3      60332   11.32   13435   11.91   12769  60.4
SHSC:  100      1.3      48472   18.63    8163    7.69   19777  68.2

File 'asyoulik.txt' with a size of 125179 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3      50964    2.36   53041    1.04  120364  59.3
SASC:   66      1.3      64252    2.11   59326    1.29   97037  48.7
SASC:  100      1.3      50964    4.48   27941    1.04  120364  59.3
SHSC:   33      1.3      43220    6.37   19651    6.70   18683  65.5
SHSC:   66      1.3      53868    9.88   12669   10.48   11944  57.0
SHSC:  100      1.3      43220   16.24    7708    6.70   18683  65.5

File 'cp.html' with a size of 24603 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3       7860    0.31   79364    0.16  153768  68.1
SASC:   66      1.3       9868    0.32   76884    0.20  123015  59.9
SASC:  100      1.3       7860    0.63   39052    0.16  153768  68.1
SHSC:   33      1.3       6988    1.20   20502    1.27   19372  71.6
SHSC:   66      1.3       9164    1.87   13156    2.01   12240  62.8
SHSC:  100      1.3       6988    3.08    7987    1.28   19221  71.6

File 'fields.c' with a size of 11150 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3       3092    0.12   92916    0.06  185833  72.3
SASC:   66      1.3       3828    0.13   85769    0.08  139375  65.7
SASC:  100      1.3       3092    0.25   44600    0.06  185833  72.3
SHSC:   33      1.3       2860    0.46   24239    0.49   22755  74.4
SHSC:   66      1.3       3520    0.68   16397    0.73   15273  68.5
SHSC:  100      1.3       2860    1.14    9780    0.49   22755  74.4

File 'grammar.lsp' with a size of 3721 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3       1244    0.04   93025    0.03  124033  66.6
SASC:   66      1.3       1556    0.04   93025    0.03  124033  58.2
SASC:  100      1.3       1244    0.09   41344    0.03  124033  66.6
SHSC:   33      1.3       1136    0.18   20672    0.20   18605  69.5
SHSC:   66      1.3       1476    0.26   14311    0.29   12831  60.4
SHSC:  100      1.3       1136    0.45    8268    0.20   18605  69.5

File 'kennedy.xls' with a size of 1029744 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3     180196  116.72    8822    5.15  199950  82.6
SASC:   66      1.3     251812   46.27   22255    6.32  162934  75.6
SASC:  100      1.3     180196  163.01    6317    5.15  199950  82.6
SHSC:   33      1.3     142024   65.98   15606   77.20   13338  86.3
SHSC:   66      1.3     255292   80.93   12723   90.89   11329  75.3
SHSC:  100      1.3     142024  146.90    7009   77.21   13336  86.3

File 'lcet10.txt' with a size of 426754 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3     155436    7.29   58539    3.17  134622  63.6
SASC:   66      1.3     192008    6.75   63222    3.87  110272  55.1
SASC:  100      1.3     155436   14.04   30395    3.17  134622  63.6
SHSC:   33      1.3     132492   20.26   21063   21.36   19979  69.0
SHSC:   66      1.3     164680   30.70   13900   32.42   13163  61.5
SHSC:  100      1.3     132492   50.95    8375   21.36   19979  69.0

File 'plrabn12.txt' with a size of 481861 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3     205856   11.23   42908    4.20  114728  57.3
SASC:   66      1.3     255748    8.85   54447    5.15   93565  47.0
SASC:  100      1.3     205856   20.09   23985    4.20  114728  57.3
SHSC:   33      1.3     173388   24.89   19359   26.06   18490  64.1
SHSC:   66      1.3     213032   36.20   13311   38.01   12677  55.8
SHSC:  100      1.3     173388   61.08    7889   26.06   18490  64.1

File 'ptt5' with a size of 513216 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3      52068    9.39   54655    1.39  369220  89.9
SASC:   66      1.3      66140    9.34   54948    1.68  305485  87.2
SASC:  100      1.3      52068   18.73   27400    1.39  369220  89.9
SHSC:   33      1.3      52036   12.05   42590   12.99   39508  89.9
SHSC:   66      1.3      66292   17.38   29529   18.54   27681  87.1
SHSC:  100      1.3      52036   29.42   17444   12.99   39508  89.9

File 'sum' with a size of 38240 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3      13204    1.11   34450    0.29  131862  65.5
SASC:   66      1.3      16720    0.84   45523    0.36  106222  56.3
SASC:  100      1.3      13204    1.95   19610    0.29  131862  65.5
SHSC:   33      1.3      13876    3.03   12620    3.36   11380  63.8
SHSC:   66      1.3      17760    4.27    8955    4.73    8084  53.6
SHSC:  100      1.3      13876    7.30    5238    3.36   11380  63.8

File 'xargs.1' with a size of 4227 bytes.
Type   Num  Version P    CSize   CTime    CSpd   UTime    USpd  Rate
SASC:   33      1.3       1744    0.05   84540    0.04  105675  58.8
SASC:   66      1.3       2248    0.06   70450    0.05   84540  46.9
SASC:  100      1.3       1744    0.11   38427    0.04  105675  58.8
SHSC:   33      1.3       1580    0.24   17612    0.26   16257  62.7
SHSC:   66      1.3       2072    0.37   11424    0.41   10309  51.0
SHSC:  100      1.3       1580    0.62    6817    0.26   16257  62.7


"Thank you"s must go to
-----------------------

Bryan Ford, Urban Dominik Müller, Christian Schneider,
Christian von Roques, Dirk Stöcker
    for the XPK standard

Harri Hirvola
    for releasing the sources of HA under the GNU license

Dirk Stöcker
    for making the Canterbury Corpus benchmarks

Jan Krolzig
    for testing the library on his '060


History
-------

V1.3    First public Release