Short:        file compressor (100% bugfixed)
Author:       Wildfire of Decay
Uploader:     aminet aminet net
Type:         util/pack
Version:      1.2
Architecture: m68k-amigaos

MaxPacker version 1.20 final release (100% version for DECAY!)

INTRODUCTION:

Well, this is it, the final 100% version release. Only one major bugfix! It
occurred when users tried packing already-compressed files. Now, when this
happens and nothing can be made smaller within a 16 KB distance, you will see
a message telling you of the problem. This should rarely happen, however,
since gains can often be made even on files compressed with other types of
compression.

This program is written in assembler for use on the Amiga only! The current
version of MaxPacker is fashioned after LHARC, although it is designed to
operate more like TetraPacker. Testing has revealed that this version packs
40% faster than TetraPacker, and depacking is also over 200% faster!!! The
average rates are as follows:

  Pack:   22 kilobytes per minute
  DePack: 67 kilobytes per second

PLEASE NOTE: The following information is derived in part from an early ARJ
document. I have included the applicable information here to enlighten users
a bit more on how file compression has evolved over the years. The following,
then, is a brief synopsis of the state of the art in data compression
research. MaxPacker does not use any of the proprietary algorithms quoted in
the text below. In fact, its algorithm is not patented but, like ARJ, LHARC
and the rest, is based on both Lempel-Ziv and Huffman encoding to achieve
optimum results.

TERMINOLOGY:

COMPRESSION - The process of encoding redundant information into data
requiring less storage space.

COMPRESSION PERCENTAGE/RATIO - The percentage compression reported by
MaxPacker is one of the two standard ways of expressing a compression ratio
in the technical literature. MaxPacker uses the compressed size / original
size ratio; the other method is the inverse ratio. When MaxPacker reports 95%
as the compression ratio, it means the compressed file is 95 percent of the
original size (very little compression). Other archivers use their own
methods; LHARC v1.30, for example, reports the inverse ratio, the opposite of
MaxPacker.

RECENT DATA COMPRESSION RESEARCH:

The following information is explained in more detail in the book TEXT
COMPRESSION by Timothy Bell, John Cleary and Ian Witten, published by
Prentice Hall, 1990, ISBN 0-13-911991-4.

Three of the most useful types of data compression involve context modeling,
Lempel-Ziv 1977 and Lempel-Ziv 1978.

Context models use the preceding few characters to PREDICT (estimate the
probability of) the next character. For example, what is the next character
after "tabl"? There is a high probability that it is "e". Much of today's
research is focused on these compressors because they provide the best TEXT
compression results. The disadvantage of this family is the amount of memory
required: three practical models need from 500,000 to 5,000,000 bytes of
memory.

More familiar is the Lempel-Ziv 1978 (LZ78) family of data compressors. This
family, like Lempel-Ziv 1977, is classified as an adaptive dictionary
encoder. In these methods, text strings are replaced by pointers to previous
occurrences of duplicate strings; for example, the words in this document
could be represented by dictionary page and line numbers. The significant
characteristic of LZ78 algorithms is the parsing of previous text data into
PHRASES that are stored in a dictionary. Pointers are allowed to reference
such phrases, but not substrings of them. Lempel-Ziv-Welch, 1984 (ARC
shrinking, crushing, UNIX COMPRESS) is the most familiar member of this
family, and numerous articles have been published about it. Unknown to many
is the fact that this algorithm has a US patent owned by UNISYS. These
algorithms are FAST. Their disadvantages include the problem of handling a
FULL dictionary once the input text no longer matches its contents, and poor
compression of non-text data.
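To make the idea of a phrase dictionary concrete, here is a minimal LZW-style
encoder sketch in C. It is illustration only: it is not MaxPacker's
algorithm, a real LZW implementation would use variable-width output codes
and an indexed data structure rather than this linear dictionary scan, and
the dictionary and phrase-length limits are arbitrary assumptions for the
example. The point it shows is that each emitted code is a pointer to a whole
stored phrase, never to a substring of one, and that the matched phrase plus
its following character becomes a new dictionary phrase.

/* Minimal LZW-style encoder sketch (illustration only). */
#include <stdio.h>
#include <string.h>

#define MAX_PHRASES    4096   /* illustrative dictionary capacity          */
#define MAX_PHRASE_LEN 64     /* illustrative limit on stored phrase size  */

static unsigned char dict[MAX_PHRASES][MAX_PHRASE_LEN];
static int dict_len[MAX_PHRASES];
static int dict_size;

/* Return the dictionary code of a phrase, or -1 if it is not stored. */
static int find_phrase(const unsigned char *p, int len)
{
    for (int i = 0; i < dict_size; i++)
        if (dict_len[i] == len && memcmp(dict[i], p, len) == 0)
            return i;
    return -1;
}

static void lzw_encode(const unsigned char *in, int n)
{
    /* Seed the dictionary with all single-byte phrases. */
    dict_size = 0;
    for (int c = 0; c < 256; c++) {
        dict[dict_size][0] = (unsigned char)c;
        dict_len[dict_size++] = 1;
    }

    int start = 0;
    while (start < n) {
        /* Greedily extend the current phrase while it stays in the dictionary. */
        int len = 1, code = find_phrase(in + start, 1);
        while (start + len < n && len + 1 <= MAX_PHRASE_LEN) {
            int longer = find_phrase(in + start, len + 1);
            if (longer < 0)
                break;
            code = longer;
            len++;
        }
        printf("code %d (phrase length %d)\n", code, len);

        /* Store the phrase plus its next character as a new dictionary phrase.
         * Handling a FULL dictionary -- the weakness noted above -- is ignored. */
        if (dict_size < MAX_PHRASES && start + len < n && len + 1 <= MAX_PHRASE_LEN) {
            memcpy(dict[dict_size], in + start, len);
            dict[dict_size][len] = in[start + len];
            dict_len[dict_size++] = len + 1;
        }
        start += len;
    }
}

int main(void)
{
    const char *s = "abababababab";
    lzw_encode((const unsigned char *)s, (int)strlen(s));
    return 0;
}

On the string "abababababab" this emits six codes for twelve input
characters; what matters here is the parsing into dictionary phrases, not the
coding efficiency.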
Lempel-Ziv 1977 (LZ77) algorithms have recently come into popular practical
use (LHARC, PKZIP 1.10, ARJ, PAK). In the original LZ77 scheme, pointers are
allowed to reference any phrase in a fixed-size window that precedes the
current phrase. A matched current phrase is replaced by a pointer to the
previously occurring duplicate phrase; in LZ77 this pointer consists of an
offset into the window and the length of the phrase (a small illustrative
sketch of such an encoder appears below). The original implementation was
considered impractical because it used a brute-force string search to find
duplicates, requiring up to N character comparisons in an N-sized window for
each phrase looked for. In 1987, however, Timothy Bell proposed using a
binary tree to index into the window and speed up the lookup of phrases,
which provided an order-of-magnitude speed increase. In the same year, Brent
published an algorithm that used a secondary compression method (Huffman
encoding, 1952) to further encode the output of an LZ77 encoder. Huffman
encoding is the substitution of variable bit-length codes for standard codes
(8-bit ASCII) based upon frequency of occurrence: the most frequent codes get
the shortest bit-lengths.

LHARC is a combination of these two ideas. It uses a binary tree LZ77 encoder
with dynamic Huffman encoding (1978) of the output. The advantage of dynamic
Huffman encoding is adaptivity to rapidly changing data; the disadvantage is
slow decoding. PKZIP 1.0, I believe, uses a combination of binary tree
encoding with static SHANNON-FANO (1949) encoding. Static SHANNON-FANO
encoding has the advantage of fast decoding and the disadvantage of less
optimal compression, especially on rapidly changing data. PKZIP will compress
better than LHARC on many large files because PKZIP uses a larger window (8K)
than LHARC (4K). LHARC will perform best on binary data with rapidly changing
characteristics, such as executables that include a lot of text data.

LH (LHarc 2.0), I believe, uses a digital TRIE LZ77 encoder with a static
Huffman encoder and an 8K window. LH uses a particularly efficient Huffman
encoder which works on small data sets; many static encoders suffer from the
cost of including the table of codes with the encoded data. The digital TRIE
has the advantage of always producing the most recent match, unlike binary
trees, and this recency of match is significant when secondary compression is
used. Much of this work has been derived and improved upon from literature
published in the last two years. It will be interesting to see whether
developers can significantly improve upon LH, since it is based upon the most
recent published research in data compression.
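As a concrete picture of the LZ77 scheme described above, here is a minimal
brute-force encoder sketch in C. It is illustration only, not MaxPacker's or
LHARC's code: it prints (offset, length) pointers and literal bytes instead
of writing a real bitstream, it performs exactly the slow window scan that
made the original LZ77 scheme impractical (a binary tree or trie index is
what makes it fast), and the minimum match length is an assumption chosen for
the example.

/* Minimal LZ77-style encoder sketch (illustration only). */
#include <stdio.h>
#include <string.h>

#define WINDOW_SIZE 4096   /* e.g. LHARC uses a 4K window                    */
#define MIN_MATCH   3      /* assumed: shorter matches are cheaper as literals */

static void lz77_encode(const unsigned char *in, int n)
{
    int pos = 0;
    while (pos < n) {
        int best_len = 0, best_off = 0;
        int start = pos > WINDOW_SIZE ? pos - WINDOW_SIZE : 0;

        /* Brute-force scan of the preceding window: up to WINDOW_SIZE
         * candidate positions are compared against the current phrase. */
        for (int cand = start; cand < pos; cand++) {
            int len = 0;
            while (pos + len < n && in[cand + len] == in[pos + len])
                len++;
            if (len > best_len) {
                best_len = len;
                best_off = pos - cand;   /* offset back into the window */
            }
        }

        if (best_len >= MIN_MATCH) {
            printf("pointer (offset %d, length %d)\n", best_off, best_len);
            pos += best_len;
        } else {
            printf("literal '%c'\n", in[pos]);
            pos++;
        }
    }
}

int main(void)
{
    const char *s = "the table, the table, the table.";
    lz77_encode((const unsigned char *)s, (int)strlen(s));
    return 0;
}

Running it on the short repetitive string above gives a run of literals
followed by one long pointer back into the window; that mix of literals and
pointers is exactly what a secondary Huffman stage then squeezes further.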
One of the most significant new encoders published is that of Fiala and
Greene, 1989. This method uses pointers that point to nodes in the TREE data
structure rather than to phrases in the window, which bypasses the problem of
redundancy of phrases within any given window. Since each node is unique,
fewer pointers are needed, and they are thus more easily compressed by a
secondary compressor. This technique is currently patent pending.

I hope that this synopsis of compression research has been enlightening.

End of document