After some crypto silliness with @feliam, @julianor and @ortegaalfredo on Twitter I cooked up a one-time pad crypto implementation in Python. This speaks volumes, not of my talent as a cryptographer (which is none at all) but of the sad state of my social life these days (which happens to be the same amount).
What is one-time pad encryption?
Feel free to skip this section if you already know the answer. Especially so you don’t have to suffer my layman’s explanations of cryptography.
To put it simply, a one-time pad cipher is one in which the plaintext (i.e. the original message) is encoded using a completely new, random key, at least as long as the message itself, each time it’s sent. When properly used (and I hope I have) this system is provably unbreakable. That means the ciphertext (that is, the encoded message) can never be decoded without the proper key – even if the encoding algorithm is very simple, like a bitwise XOR operation on each byte.
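To make the XOR idea concrete, here is a minimal sketch. The name xor_bytes is made up for this illustration and is not necessarily how otp.py does it; os.urandom stands in for a proper key source.

```python
import os


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR each byte of data with the corresponding byte of pad.

    Hypothetical helper for illustration only.
    """
    if len(pad) < len(data):
        raise ValueError("the pad must be at least as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


plaintext = b"attack at dawn"
pad = os.urandom(len(plaintext))        # a fresh pad, as long as the message

ciphertext = xor_bytes(plaintext, pad)  # encrypt
recovered = xor_bytes(ciphertext, pad)  # decrypt: same operation, same pad
```

Since XOR is its own inverse, encrypting and decrypting are the same operation: applying the pad a second time cancels it out.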
There are a few caveats: first of all, the key can never be reused. If you do, the system not only ceases to be unbreakable, it becomes only as strong as the encoding algorithm you used. So if you chose XOR encoding and were tempted to use the key twice, you might as well have used a “magic ring” from a cereal box.
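Here is a small sketch of why reusing a pad with XOR is so bad. The messages and the xor_bytes helper are invented for the example. XORing two ciphertexts made with the same pad cancels the pad entirely, leaving only the XOR of the two plaintexts.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length (hypothetical helper)."""
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"meet me at noon"
m2 = b"send more money"          # same length as m1
pad = os.urandom(len(m1))

c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)          # the mistake: the same pad used twice

# The pad drops out: c1 ^ c2 == m1 ^ m2, no key required.
leak = xor_bytes(c1, c2)
pad_cancelled = leak == xor_bytes(m1, m2)

# Anyone who knows (or guesses) one message gets the other for free.
recovered_m2 = xor_bytes(leak, m1)
```

An eavesdropper never needs to see the pad; two intercepted ciphertexts and a good guess at part of one message are enough to start reading the other.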
The second caveat: the key must be truly random. For this reason we need a random generator that can guarantee a certain amount of entropy, for example /dev/random on many Unix systems, to get the one-time pads, instead of the random module, which is only a PRNG (pseudo-random number generator). PRNGs can only produce a seemingly random stream of numbers, all derived from a single value (called the seed number) – so its “randomness” is only as good as the seed number from which all others are calculated. (This is a useful property in other contexts, like avoiding having to store the contents of all malformed files produced by a fuzzer in order to reproduce the crashes, but I digress).
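A quick sketch of the difference, assuming nothing beyond the standard library. The prng_bytes helper and the seed value are made up for the example.

```python
import os
import random


def prng_bytes(seed: int, n: int) -> bytes:
    """Produce n bytes from Python's seeded PRNG (hypothetical helper).

    Fine for reproducible fuzzing, useless as a one-time pad.
    """
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(n))


# Same seed, same "random" stream: whoever learns the seed has the whole pad.
first = prng_bytes(1234, 16)
second = prng_bytes(1234, 16)
reproducible = first == second

# os.urandom asks the operating system's entropy-backed generator instead,
# so there is no seed on our side that could reproduce the output.
pad_a = os.urandom(16)
pad_b = os.urandom(16)
```

With the seeded generator the entire pad is effectively as long as the seed, no matter how many bytes you draw from it.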
The third caveat: the key must never be transmitted over an insecure medium. Sounds pretty much like a no-brainer, I know, but it’s worth mentioning that public key crypto doesn’t suffer from this problem. (Now you know why GPG is so much better than this). Real-life uses of one-time pads include storing the keys in codebooks, which the recipient of the message would carry everywhere. Then the encrypted messages could be safely sent in the clear, say on some radio frequency by a numbers station, until the codebook was used up.
How does this code work?
If you weren’t among the lucky ones who skipped over my ramblings in the previous section you can easily guess by now: we’ll be using a bitwise XOR encoding of each byte of the plaintext against the corresponding byte of the one-time pad to produce the ciphertext. This is how we generate a one-time pad of any given size:
$ ./otp.py generate test.key -s 1024
$ ls -l test.key
-rw-r--r-- 1 user group 1024 2010-02-17 01:23 test.key
$
The alternative for the lazy is to pass the name of the file we want to encrypt. A one-time pad of the exact same size will be generated. We’ll use the -f flag this time to force overwriting the previous file.
$ ./otp.py generate test.key conscience.txt -f
$ ls -l test.key conscience.txt
-rw-r--r-- 1 user group 3880 2010-02-17 01:22 conscience.txt
-rw-r--r-- 1 user group 3880 2010-02-17 01:24 test.key
$
And to satisfy all audiences, there’s also an option for the paranoid: the -p flag uses /dev/random for maximum security instead of the much faster /dev/urandom. It takes considerably longer to generate even small one-time pads, which is why this option is disabled by default.
$ ./otp.py generate test.key conscience.txt -f -p
$ ls -l test.key conscience.txt
-rw-r--r-- 1 user group 3880 2010-02-17 01:22 conscience.txt
-rw-r--r-- 1 user group 3880 2010-02-17 01:38 test.key
$
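For the curious, pad generation along these lines could be sketched roughly as follows. This is not the actual otp.py code: generate_pad and its parameters are names invented here, and it assumes a Unix-like system where /dev/urandom exists. Passing source="/dev/random" would be the paranoid variant.

```python
import os
import tempfile


def generate_pad(pad_path, size=None, like=None, force=False,
                 source="/dev/urandom"):
    """Write a pad of `size` bytes, or as large as the file `like`.

    Hypothetical sketch; all names here are assumptions.
    """
    if size is None:
        if like is None:
            raise ValueError("give either a size or a file to match")
        size = os.path.getsize(like)
    if os.path.exists(pad_path) and not force:
        raise FileExistsError(pad_path)
    # Unbuffered read, so we take no more bytes from the device than we need.
    with open(source, "rb", buffering=0) as rnd, open(pad_path, "wb") as out:
        remaining = size
        while remaining > 0:
            chunk = rnd.read(min(remaining, 4096))
            if not chunk:
                raise OSError("random source ran dry")
            out.write(chunk)
            remaining -= len(chunk)
    return size


with tempfile.TemporaryDirectory() as tmp:
    message_path = os.path.join(tmp, "message.txt")
    pad_path = os.path.join(tmp, "test.key")
    with open(message_path, "wb") as f:
        f.write(b"x" * 3880)

    fixed_size = generate_pad(pad_path, size=1024)
    # Without force=True this second call would raise FileExistsError.
    matched_size = generate_pad(pad_path, like=message_path, force=True)
    size_on_disk = os.path.getsize(pad_path)
```

Reading in a loop matters because a device read may return fewer bytes than requested, and that is most likely with a slow source.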
Now that we have our one-time pad we can encrypt the message:
$ ./otp.py encrypt conscience.txt test.key conscience.crypto
$ ls -l conscience.*
-rw-r--r-- 1 user group 3880 2010-02-17 01:38 conscience.crypto
-rw-r--r-- 1 user group 3880 2010-02-17 01:22 conscience.txt
$
Both files are the same size but have different contents. Since it’s no longer ASCII, trying to cat the file only renders a bunch of garbage in the terminal. Finally, this is how you decrypt it:
$ ./otp.py decrypt conscience.crypto test.key conscience2.txt
$ ls -l conscience*
-rw-r--r-- 1 user group 3880 2010-02-17 01:38 conscience2.txt
-rw-r--r-- 1 user group 3880 2010-02-17 01:38 conscience.crypto
-rw-r--r-- 1 user group 3880 2010-02-17 01:22 conscience.txt
$ cmp conscience.txt conscience2.txt
$
After decryption, conscience2.txt is identical to the original file and contains the familiar text of The Conscience of a Hacker.
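The same round trip can be sketched in a few lines of Python. As before, xor_file and the file names are placeholders of mine rather than the real otp.py internals, and filecmp.cmp with shallow=False plays the role of cmp.

```python
import filecmp
import os
import tempfile


def xor_file(in_path, pad_path, out_path):
    """XOR a file against a pad file, byte by byte (hypothetical helper)."""
    with open(in_path, "rb") as f:
        data = f.read()
    with open(pad_path, "rb") as f:
        pad = f.read()
    if len(pad) < len(data):
        raise ValueError("the pad is shorter than the input file")
    with open(out_path, "wb") as f:
        f.write(bytes(d ^ p for d, p in zip(data, pad)))


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "message.txt")
    pad_file = os.path.join(tmp, "test.key")
    encrypted = os.path.join(tmp, "message.crypto")
    decrypted = os.path.join(tmp, "message2.txt")

    with open(original, "wb") as f:
        f.write(b"This is our world now...\n")
    with open(pad_file, "wb") as f:
        f.write(os.urandom(os.path.getsize(original)))

    xor_file(original, pad_file, encrypted)   # encrypt
    xor_file(encrypted, pad_file, decrypted)  # decrypt: same operation

    same_size = os.path.getsize(encrypted) == os.path.getsize(original)
    # Byte-for-byte comparison, like running cmp on the two files.
    round_trip_ok = filecmp.cmp(original, decrypted, shallow=False)
```

This version reads whole files into memory, which is fine for a text file of a few kilobytes; a tool meant for large files would more likely process them in chunks.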
As always, the code is available for download below. Enjoy!
- 24-Jul-2011: Small update to the command line parsing and the documentation.