Calculating the fraction of combinations

combos.html // This script runs combino.py in the process. ?> Calculating the fraction of combinations (This section is taken from an earlier draft of squozen_code.html where I was using nbits() instead of P().)

We can ask, how big is the message after the first gap s is removed. At that point the number of gaps decreases by one, and the total of the gap widths decreases by s. Subtracting the two sizes gives the number of bits it took to encode s. Let's call this function nbitsC() for Combinatorics:

               (1000000 +2³²-1)!         (999999 +2³²-1-s)!
nbitsC(s)=log₂(-----------------) - log₂(-----------------)
                1000000!(2³²-1)!          999999!(2³²-1-s)!

Of course this is a division (the inverse of the fraction of messages that start with s):

               (1000000 +2³²-1)!  999999!(2³²-1-s)!
nbitsC(s)=log₂(-----------------------------------)
                1000000!(2³²-1)! (999999 +2³²-1-s)!

Separating the first factor of the first factorial...

       (1000000 +2³²-1) (999999 +2³²-1)!  999999!(2³²-1-s)!
= log₂(---------------------------------------------------)
        1000000!(2³²-1)!                 (999999 +2³²-1-s)!

Canceling and rearranging a bit...

       1000000+2³²-1         (999999 +2³²-1  )! (2³²-1-s)!
= log₂(-------------) + log₂(----------------------------)
          1000000            (999999 +2³²-1-s)! (2³²-1  )!

When s = 0, the right term is zero, and the constant on the left is the number of "overhead" bits per codeword:

                 1000000+2³²-1
nbitsC(0) = log₂(-------------) ~= 12.0687673
                   1000000

When s is small, the right term is approximately

                            999999 +2³²-1-(s+1)/2
nbitsC(s)-12.0687673 = log₂(---------------------) * s
                               2³²-1-(s+1)/2

which is fairly linear until s gets close to 2³², as the following table (calculated here) shows:

This page was made with a php script and a Python program. Last change

--Steve Witham
Up to

How to Reinvent the Rice Code.
Squozen Sort.
my home page.