# information entropy

## English

### Noun

**information entropy** (*uncountable*)

- (information theory) A measure of the uncertainty associated with a random variable; a measure of the average information content one is missing when one does not know the value of the random variable (usually in units such as bits); the amount of information (measured in, say, bits) contained per average instance of a character in a stream of characters.
*A passphrase is similar to a password, except it can be a phrase with a series of words, punctuation, numbers, whitespace, or any string of characters you want. Good passphrases are 10-30 characters long, are not simple sentences or otherwise easily guessable (English prose has only 1-2 bits of **entropy** per character, and provides very bad passphrases), and contain a mix of upper and lowercase letters, numbers, and non-alphanumeric characters.* (BSD General Commands Manual: ssh-keygen(1), October 2, 2010)
*Imagine a full binary tree in which the probability of getting from any parent node to one of its child nodes is 50%. Associate labels of '0' and '1' to the paths to the child nodes of each parent node. Then the probability of getting from the root to a leaf node is a negative power of 2, and the length of the path (from the root to that leaf node) is the negated base-2 logarithm of that probability. Now imagine a random bit stream in which each bit has an equal chance of being either 0 or 1. Think of the full binary tree as describing a Huffman encoding for a script whose characters are located at the leaf nodes of the said full binary tree. Each character decodes a string of bits whose length is the length of the path from the root node to the leaf node corresponding to that character. Now use "Huffman decoding" to convert the bit stream to a character stream (of the given script). The average compression ratio between the bit stream and the character stream can be seen to be equal to the **information entropy** of the character stream.*
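The binary-tree illustration above can be checked numerically. The following is a minimal sketch, not part of the quoted text: the four-character script and its codewords (`CODE`) are hypothetical, chosen so that each character sits at a leaf of a full binary tree, and the helper names `entropy_bits`, `decode` and `bits_per_character` are illustrative. It computes the entropy as the probability-weighted sum of negated base-2 logarithms, then decodes a random bit stream and measures the average number of bits consumed per character.

```python
import math
import random

# Hypothetical prefix code read off a full binary tree: each codeword is the
# path of '0'/'1' labels from the root to that character's leaf.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}

# With 50% branching at every node, a leaf at depth n is reached with
# probability 2 ** -n.
PROBS = {ch: 2.0 ** -len(path) for ch, path in CODE.items()}


def entropy_bits(probs):
    """Entropy in bits per character: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def decode(bits):
    """Emit a character each time the bits read so far form a codeword."""
    inverse = {path: ch for ch, path in CODE.items()}
    chars, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            chars.append(inverse[current])
            current = ""
    return "".join(chars)


def bits_per_character(n_bits, seed=0):
    """Decode a random fair bit stream and return bits read per character."""
    rng = random.Random(seed)
    bits = "".join(rng.choice("01") for _ in range(n_bits))
    return n_bits / len(decode(bits))


if __name__ == "__main__":
    print("entropy (bits per character):", entropy_bits(PROBS))
    print("measured bits per character: ", bits_per_character(200_000))
```

For this particular code the entropy is 0.5·1 + 0.25·2 + 0.125·3 + 0.125·3 = 1.75 bits per character, and the measured ratio for a long random bit stream should come out close to that value.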