## Entropy and Information

That probability is closely connected with information should come as no surprise after problems such as Exercise 6.5 (the jailer paradox). Entropy makes this connection precise. We begin with finite-valued random variables. The notion of entropy is quite clear in this case, and it forms the basis for one of the most dramatic applications of the law of large numbers to information theory: the Shannon Coding Theorem. We then consider continuous random variables. Entropy is much harder to define in this case, but the reward is that we can then prove that essentially all the interesting distributions we have seen in probability theory may be defined by entropy considerations.