Vigenère Cipher Codebreaker
Pronunciation: Pronounced "vidjenair"
Type: Substitution cipher
Description: Each letter in the message is replaced by a letter some fixed
number of positions down the alphabet. However, there is a different shift number
for each letter based on a predetermined key.
Do you know the key that dictates the shift for each letter?
TIP: This codebreaker analyzes the encrypted text to determine
the most probable key length and then tries to guess the key based on known character
frequencies/words in the English language. It may not find the actual key,
so make sure to perform your own human analysis of the results. For example,
the codebreaker may guess the key is "dacrypk". You can then determine the
key is actually "decrypt" and run the codebreaker again with "decrypt" as the key.
Codebreak may take up to a minute to finish.
Information About the Vigenère Cipher Codebreaker Algorithm
A Vigenère cipher shifts each character of a plain text message a number of positions
based on a keyword. Essentially, a Vigenère cipher consists of several Caesar
ciphers in sequence with different shift values.
The shift value for any given character is based on the keyword. The keyword
is repeated so that it is the same length of the message. Then, the corresponding
keyword character determines the shift for it's respective message character.
Codebreaking Without a Key
- Guessing the Key Length
Our first step is to examine repetitions in the encrypted text so we can guess at
the length of the key. In a Vigenère cipher, common words such as "the" or
"that" can be encrypted differently each time. However, if the message is
long enough, repetitions in the code can still be exploited. This codebreaker
analyzes the space between these repetitions to make a guess at the key length.
- Guessing the Most Probable Key
After making an educated guess at the key length, we now need to guess at the
key text itself. Research of the English language has produced a table of
how often each letter of the English alphabet is used in the English language.
For example, "e" is used most often, so for most encryptions we can guess that the
most often used character is probably actually an "e". With a Vigenère cipher,
this is difficult because an "e" could be encrypted in multiple ways based on the
key. Counting the characters won't work.
However, if we know that the key is of length N, then we know that every Nth character
is encrypted the same way. We can split the whole encrypted message into N
messages by lumping the characters that are encrypted with the same shift.
Then we essentially have N different messages that are normal Caesar shifts.
Each is shifted by one value - a single character of the key. We can apply
frequency analysis to each individual message to guess at those single characters.
We combine our best guesses at each character to get the best keyword guess.
- Looking At More Keys
Since the above method makes many assumptions, it's likely that the best guess
is not the key. This codebreak tries to overcome that limitation by making
many best guesses. For example, the 2nd best guess may use the 2nd most likely
letter based on frequency analysis to guess the 1st character of the key.
We find about 1000 best guesses.
Then we further rank the "strength" of these best guesses. A guess is considered
"stronger" if it contains a word from the English dictionary. After all, a
key is most likely going to be an actual word or multiple words. A guess is
considered even stronger if it contains special geocaching terms like "cache."
The best of the best guesses rank higher and are displayed higher on the list.