combos.html // This script runs combino.py in the process. ?> Calculating the fraction of combinations (This section is taken from an earlier draft of squozen_code.html where I was using nbits() instead of P().)

We can ask, how big is the message after the first gap s is removed. At that point the number of gaps decreases by one, and the total of the gap widths decreases by s. Subtracting the two sizes gives the number of bits it took to encode s. Let's call this function nbitsC() for Combinatorics:

               (1000000 +232-1)!         (999999 +232-1-s)!
nbitsC(s)=log2(-----------------) - log2(-----------------)
                1000000!(232-1)!          999999!(232-1-s)!
Of course this is a division (the inverse of the fraction of messages that start with s):
               (1000000 +232-1)!  999999!(232-1-s)!
nbitsC(s)=log2(-----------------------------------)
                1000000!(232-1)! (999999 +232-1-s)!       
Separating the first factor of the first factorial...
       (1000000 +232-1) (999999 +232-1)!  999999!(232-1-s)!
= log2(---------------------------------------------------)
        1000000!(232-1)!                 (999999 +232-1-s)!       
Canceling and rearranging a bit...
       1000000+232-1         (999999 +232-1  )! (232-1-s)!
= log2(-------------) + log2(----------------------------)
          1000000            (999999 +232-1-s)! (232-1  )!
When s = 0, the right term is zero, and the constant on the left is the number of "overhead" bits per codeword:
                 1000000+232-1
nbitsC(0) = log2(-------------) ~= 12.0687673
                   1000000 
When s is small, the right term is approximately
                            999999 +232-1-(s+1)/2
nbitsC(s)-12.0687673 = log2(---------------------) * s
                               232-1-(s+1)/2
which is fairly linear until s gets close to 232, as the following table (calculated
here) shows:

This page was made with a php script and a Python program. Last change

--Steve Witham

Up to