Suppose I have an ordered list of bytes (the contents of some object file, as shown by a hexdump), and I wish to calculate the information entropy of this file. My understanding is that I can calculate it as $$ \sum_{n=0}^{255} -p_n \log_{256}(p_n) $$
where $p_n = \frac{(\text{number of n-valued bytes})}{(\text{total number of bytes})}$.
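For concreteness, here is a minimal sketch of that calculation in Python (the base-256 logarithm normalizes the entropy to the range $[0, 1]$, matching the formula above; the library path is just a placeholder, not the actual location on any particular system):

```python
import math
from collections import Counter

def byte_entropy(path):
    """Byte-frequency entropy: sum over n of -p_n * log_256(p_n)."""
    data = open(path, "rb").read()
    counts = Counter(data)          # number of n-valued bytes, for each n that occurs
    total = len(data)               # total number of bytes
    return -sum((c / total) * math.log(c / total, 256) for c in counts.values())

# Placeholder path; substitute wherever your C library actually lives.
print(byte_entropy("/usr/lib/libc.so"))
```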
My understanding is that the information entropy should be a theoretical lower bound on the compression ratio achievable for a file. But when I calculate the entropy of the standard C library this way, I get ~0.8, even though gzip can compress the same file to about 40% of its original size.
What am I misunderstanding here? Perhaps my calculation of $p_n$ is too simplistic: the value of each byte in the stream is not independent of the preceding bytes, just as characters in English text are not independent of one another. Is there a better way to calculate the information entropy of a file?
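To illustrate the dependence I have in mind, here is a rough sketch (not a definitive method) that estimates the entropy of each byte conditioned on the byte immediately before it, using the same base-256 normalization; it only captures adjacent-byte dependence, so it presumably still misses the longer-range structure that gzip exploits. The path is again a placeholder.

```python
import math
from collections import Counter

def conditional_byte_entropy(path):
    """H(current byte | previous byte), normalized so the result lies in [0, 1]."""
    data = open(path, "rb").read()
    pair_counts = Counter(zip(data, data[1:]))   # counts of (previous, current) byte pairs
    prev_counts = Counter(data[:-1])             # counts of the conditioning (previous) byte
    total_pairs = len(data) - 1
    h = 0.0
    for (prev, _), c in pair_counts.items():
        p_pair = c / total_pairs                 # p(previous, current)
        p_cond = c / prev_counts[prev]           # p(current | previous)
        h -= p_pair * math.log(p_cond, 256)
    return h

# Placeholder path, as above.
print(conditional_byte_entropy("/usr/lib/libc.so"))
```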