One Time Pads

Last Updated 7 Sept 06
Copyright 2006 by Stephen Vermeulen



Introduction


One time pads are the only encryption system that has been proven to be theoretically unbreakable.

That said, one time pads suffer from two main flaws:
  1. the contents of the "pad", which is a book of keys must be kept secret, and since the sender and the receiver both need a copy of this book physical security may well be an issue.
  2. the keys that comprise the one time pad must be cryptographically random

One time pads can be quite attractive for encrypting small amounts of data, for example to exchange passwords over an insecure channel, when other more sophisticated methods are not available. For an early version of my PictaTrove web server I implemented a one time pad system to protect passwords that were entered through the remote administration interface.

The one time pad system is comprised of three main components:
  1. a small Python program to generate a one time pad of keys
  2. some Javascript to run in a web browser that will encrypt some text with a key from the one time pad
  3. some Python script on the Python web server to decrypt the cypher text that the browser sent, using its copy of the one time pad

Key Generation Utility

Of these three pieces the only one that is questionable in a cryptographic sense in the key generation utility. Here is how I have approached the problem of producing a "random" set of keys for the pad:
  1. the user specifies a directory on their computer that has a lot of files in it, the program will do a recursive directory listing of this directory tree and make a list of all the files in it that are larger than 8K bytes
  2. the program will check to see that this list contains at least 256 files, if there are less than 256 the program will stop and not produce a one time pad
  3. the program will select 128 of these files at random from the list (the rest will not be used)
  4. the program will then open each of these 128 files, and then seek to a random location within the file and read 2K of data from it
  5. at this point the program has 256K bytes of data
  6. it then shuffles this block of data
  7. it then starts an MD5 digest running (initializing it with some random text the user enters), and loops over the data. It takes blocks of 32 bytes of input and runs it into the MD5 function and then takes the 16 byte hash value and records that as output (so the 256K of input data produces 128K of output)
  8. it encodes 32 bytes of output at a time in hexadecimal format (so it takes 64 characters) and writes these as lines (along with a line number) to the otp.txt output file. So in the end you have 4K lines each containing a 32 byte long key (that is represented as 64 hex characters) for a total of 128K bytes of key data.

The key weakness of this is that the process is completely deterministic.  If you know:
  1. the exact time that the program started running (as this is used as a seed to the random number generator)
  2. the exact directory that was used
  3. the random text that the user entered
  4. and the exact files that were in the directory tree

then the program would always produce exactly the same one time pad file. However, as the exact time of running is not known (though could be reduced to a range of times based on the timestamp of the otp.txt file) and the directory that was used as a source of files is also not known and the random text is not known, then even if an attacker can gain access to the computer where the pad was generated it would take a lot of effort to create a duplicate pad.  In fact, it would be vastly simpler to just look for the otp.txt file and make a copy. Once you consider this, it becomes apparent that any theoretical weakness in the way that the pad file is created is far less significant than the risk of the pad file being discovered and copied.

The Python program to make a one time pad is here: makeotp.py.

Client Side Javascript

To make use of this to protect some data that you enter on a web form you would arrange for the user to have two text entry fields, one to enter the data and a second to enter a key from his copy of the one time pad.  When the server generates the HTML for the form it will include the line number of the key that the user should use from the one time pad. The user can then do a simple copy/paste operation to place the 64 hex characters that make up the key into the form. Note: this limits the amount of data that can be encrypted at once to 32 bytes, though you could always enter more keys to encrypt more data if needed.

You then set up the page so that when the user hits the submit button the a fragment of javascript runs in the browser to actually read the key value, use it to encrypt the data and send the result to the server.

This one-time-pad-example.html page is an example of the HTML and Javascript for such a form. It is public domain so use it if you want. In it I unpack the hex key to binary, combine it with the password to be protected and then re-encode back to hex to send the result to the server (along with a user name that is not protected).



back to vermeulen.ca home