我试图弄清楚如何证明Lempel ZIV 77压缩算法确实可以提供最佳压缩效果。
我发现以下信息:
So how well does the Lempel-Ziv algorithm work? In these notes, we’ll
calculate two quantities. First, how well it works in the worst case, and
second, how well it works in the random case where each letter of the message
is chosen uniformly and independently from a probability distribution, where
the ith letter appears with probability pi
. In both cases, the compression
is asymptotically optimal. That is, in the worst case, the length of the
encoded string of bits is n + o(n). Since there is no way to compress all
length-n strings to fewer than n bits, this can be counted as asymptotically
optimal. In the second case, the source is compressed to length
α
H(p1, p2, . . . , pα)n + o(n) = n∑(-pi log2 pi) + O(n)
i=1
which is to first order the Shannon bound.
这是什么意思?为什么没有办法将全长度n个字符串压缩到少于n位呢?
谢谢大家
有2 ^ n个长度为n的不同随机字符串。为了解压缩它们,压缩算法必须将它们全部压缩为不同的压缩版本:如果将两个不同的n长字符串压缩为相同的序列,将无法得知要解压缩到哪个。如果将所有压缩到长度为k <n的字符串,则只有2 ^ k <2 ^ n个不同的压缩字符串,因此在某些情况下,两个不同的字符串将压缩为相同的值。
注意,在所有情况下都没有切实可行的最优方案。如果我知道带有密钥的流密码的输出是很明显的随机序列,那么我也可以通过仅给出密码和密钥的设计来描述它,但是压缩可能要花费大量时间该算法可以解决以这种方式可以极大地压缩长的明显随机序列。
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句