Fathom: The Source for Online Learning  
 
Help About Us Course Directory
Browse Fathom


 
 
 
Unlocking Cryptography
From: Columbia University | By: Deane Yang

EDITOR'S INTRODUCTION | Ever send an encrypted e-mail to a friend? Or wonder why your credit-card number on Amazon.com is actually safe? Deane Yang, professor of mathematics at Polytechnic University and a former postdoctoral associate at Columbia University, investigates the ins and outs of electronic security--the codes that place virtual walls around our most trusted information. The surprise: The success of one of the most sophisticated tools in use today relies on the basic arithmetic you learned in grade school.


ou're an aspiring evil genius looking for a way to take over the world, steal a lot of money, or just create a lot of havoc. What's the easiest way to do this with a minimum of bloodshed?


Figure out how to factor a number really quickly.


You might remember learning about factoring numbers back in elementary school. Factoring a number is like multiplication in reverse. For example, to factor a number--say, 105--just find two smaller numbers that create 105 when you multiply them together.


Here, it's not hard to see that: 105 = 5 x 21. The numbers 5 and 21 are called factors. To factor a number completely, factor each of the factors and keep going until you can't go any further. Here, you can factor 21 as 3 x 7, so 105 = 3 x 5 x 7.


None of the factors--3, 5 and 7--can be factored into smaller whole numbers. They are called primes, numbers that can't be factored into anything other than one and the number itself. The expression 3 x 5 x 7 is called the prime factorization of the number 105.


Now, suppose you found a way to factor, say, 200-digit numbers quickly. What could you do then?


First, you would be able to decrypt virtually any secure message transmitted on the Internet, including credit-card numbers, banking records, tax returns, health care records, and stock market transactions. Further, you could gain access to the secure networks of banks, government agencies and brokerage firms.


Public key cryptography demonstration.
These and similar systems safeguard their transactions through mathematically based systems of cryptography. The most commonly used system is public key cryptography, which works on the principle that although it might be easy to find the factors of a small number such as 105, it is virtually impossible to find the prime factors of very large numbers that have, say, a few hundred digits.

What is cryptography?

Cryptography is the science and art of encrypting a message so that no one can read it except the intended recipient. Cryptanalysis is the science and art of trying to defeat a cryptographic system, so that you can read encrypted messages that were not intended for you. The two fields together comprise the field of cryptology.


At first sight, encrypting a message does not seem very hard to do. You meet with your fellow collaborator and agree on a procedure for jumbling your messages. One of the simplest methods is to replace each letter of the alphabet with another letter or number.


For example, you agree in advance that each letter of the alphabet will be replaced by the letter that follows: "A" will be replaced by the letter "B," the letter "B" by the letter "C" and so on. Using this method, the message "MEET ME TOMORROW" becomes "NFFU NF UPNPSSPX" when encrypted.


Unfortunately, this system is very easy to crack using simple statistical rules. For example, code-breakers' keys can tell you how many times, on average, certain letters of the alphabet appear in different languages. "E" is the most common letter in English, so chances are that it will be the most common symbol in an encrypted message (if encrypted in English), followed closely by letters such as "S" and "T."


There are also statistical rules for the placement of similar letters that tell you, for example, that a certain percentage of the time a "G" is next to an "H" or that a vowel appears next to a certain consonant. You also have a good shot at deciphering small words quickly, because there are only two one-letter words in English (I and a) and very few two-letter words.


Encoders had to find a more complicated approach to stymie these statistical rules. One option requires a two-step process. First, you rearrange the letters of your message according to a system so that you destroy their original position in the message, making it impossible for code-breakers to apply statistical rules regarding the placement of letters to decode the original. For example, you might agree with the recipient of your message to move the letters in your message according to this key:


Original Position
Encrypted Position
Original Position
Encrypted Position
First Letter
Third
Eighth Letter
Fourth
Second Letter
Fifth
Ninth Letter
Fourteenth
Third Letter
Eighth
Tenth Letter
Second
Fourth Letter
Ninth
Eleventh Letter
Tenth
Fifth Letter
Twelfth
Twelfth Letter
Thirteenth
Sixth Letter
First
Thirteenth Letter
Sixth
Seventh Letter
Eleventh
Fourteenth Letter
Seventh


The actual position of the encrypted letters does not matter and can be randomly generated. As long as the recipient of the encrypted message has this key, he can reverse the process and decode the message.


According to the key above, this is what might happen to our message "MEET ME TOMORROW":


Position
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Original
M
E
E
T
M
E
T
O
M
O
R
R
O
W
Encrypted
E
O
M
O
E
O
W
E
T
R
T
M
R
M


The encrypted result is thus: E O M O E O W E T R T M R M.


The second step is substitution, so that the statistical rules, which indicate the average occurrence of letters, cannot be used to decode the message. An easy means of substitution is to generate a list of all possible two-letter combinations, such as AD, CZ or GH, and replace them with numbers. (This list is actually fairly easy to generate by computer.)


This list may have already established that:

EO=55
MO=213
WE=9
TR=44
TM=60
RM=14



Substituting these letters with numbers finishes the encoding process, turning the original message, "MEET ME TOMORROW," into this opaque message "55 213 55 9 44 60 14."


Even this two-step process can be cracked, especially if enough messages are intercepted and analyzed. However, if you follow a process of encoding like this but use a different one for every message you send, you have a secure method for encryption that your enemies would find very difficult to crack. But, of course, if you have the key (the rules for encoding the message), it's a simple matter to decode the message back to its original form.


Until recently, the only known methods for encrypting a message were refinements of this approach, called private key cryptography. It's called "private key" because its success relies on keeping the "key"--the rules for encoding and decoding--only among those actually using it. They say that a secret is no longer a secret once you tell one person; clearly, the number of people who can have access to a private key must be quite small, and you must make sure it is not intercepted by a third party.

Private key cryptography

If you're a big business or spy agency that must make secure transactions with hundreds or millions of customers, it is too much trouble to agree on a new encryption system--a new private key--with every customer or spy. Some organizations get around this problem by using the same encryption system for everyone, but with a different permutation for each person's private key.


As a simple example, say the encryption system requires shifting each letter by a given distance in the alphabet. Each business you deal with would receive a different permutation of this system. For instance, Spy Company A would encode its messages by shifting every letter forward by three letters, turning every "A" into "D" and every "B" into "E." Spy Company B might be instructed to encode its messages by shifting a letter forward by 10 letters, turning "P" into "Z" and "R" into "B."


Clearly there are some basic flaws with this system; you can assign keys only up to 25 different spies and since Spy A knows the general system, it would be relatively easy for A to unlock Spy B's specific permutation of the system.


Real private key cryptographic systems are considerably more complex. The most widely used system in the world is called DES (Digital Encryption Standard), which was developed by IBM in response to a public solicitation by the National Bureau of Standards (NBS), which is now called the National Institute of Standards and Technology (NIST). DES is used by banks for electronic transfers of money and for encryption of passwords on computers.


Although these private key systems have been quite effective, researchers are becoming increasingly successful in attacking DES. In response, NIST has issued a solicitation for a new system, called the Advanced Encryption Standards. NIST is now in the process of evaluating systems that have been submitted by cryptographers worldwide.


There is another issue with private key systems. As good as they are, these systems still require an advance exchange of a key, which is not always possible. For example, if you are buying something from an e-business and want to transmit your credit-card number to them, there is no way for the company to send you a private key to use (which might be time-consuming and expensive) without the possibility of someone intercepting this key. Anytime you send one part of a private key into the world, you run the risk of someone intercepting it and using it to decode all of your messages. And if someone does intercept the key, how do you know you're not receiving a message from a forger?

Public key cryptography

In 1976, Whitfield Diffie and Martin Hellman came up with an innovative idea. What if there were a cryptographic system whereby the key used to encrypt a message is different from the key used to decrypt it? In other words, the sender and the receiver of a message don't work from the same rulebook. The sender can distribute a key to the world that anyone can use to encrypt messages to send. This presents a limited danger to security, because none of these publicly distributed keys can decode the message. The key that can decode the messages is kept--and seen--only by the sender.


This system is now called public key cryptography, so called because anyone can see and use the encryption key, while the sender keeps the decryption key private. There is no need to exchange private keys in advance and to tightly control the distribution of these keys.


It was noticed almost immediately that this approach has an extra benefit. It allows the recipient of a message to know that the message really did come from the sender and not from a forger who has cracked the system.


How companies exchange public keys and secure their cryptographic systems.
This additional level of security against forgers is provided in two ways. First, the two companies exchanging messages must both have public key cryptography systems. Second, in a public key cryptographic system, a company can choose which key to distribute publicly and which to keep for its eyes only. A company can distribute a public key that decodes messages and maintain a private key that encodes messages. As long as the private key performs a different function from the public key, the underlying security of the system is safe.


Public key cryptography was immediately recognized as a brilliant concept. It was so brilliant that the National Security Agency initially tried to keep the idea classified and prevent the researchers who invented it from publicizing their findings. They were unsuccessful.


Yet, trying to implement this great theory was tricky. Mathematicians found it very difficult to create an unbreakable system in which the encryption key was different from the decryption key. Most of the time, the decryption key ended up being easily deduced from the encryption key, or the system was too slow to be practical.


Until three mathematicians--Ron Rivest, Adi Shamir and Leonard Adelman--turned the simple process of finding what numbers multiply into which, or factoring, in their favor. The three mathematicians proposed an algorithm named after them that soon gained wide acceptance: the Rivest, Shamir, Adelman (RSA) algorithm.

Making public encryption work: the RSA algorithm

The RSA algorithm provided the first successful implementation of public key cryptography. It relies, crucially, on the fact that, although we know how to multiply two really big numbers quickly, we do not know how to reverse the process quickly at all.


For example, if you take two 100-digit primes, multiply them together to get a 200-digit number, and give the 200-digit number to someone else, that person, using the current state of knowledge, would not be able to factor the number within our lifetime and, therefore, crack the code. Not even the most powerful computers available today provide enough computing power to get close. The upshot: only the person who created this large number from two primes would be able to decode the message.


So, pay attention, all future hackers: if you want to break the code, learn how to factor. The RSA algorithm for public key encryption is currently secure, because no one has been able to factor large numbers quickly or design a computer fast enough to do so. The best methods for factoring numbers are based on the same methods that you learned in grade school. The simplest approach, when faced with a number such as 105, is to start running through all the primes, starting with 2, 3 and 5, and see which number first divides equally into the number you are trying to factor.


Given the number 105, you find very quickly that 105 is divisible by three 35 times, yielding 3 x 35 = 105. By the same approach, you start then with 35, and see quickly that 5 divides into 35, yielding 5 x 7 = 35. You have quickly reached the prime factorization of 105: 3 x 5 x 7, as we saw in our first example.


The surprising fact is that the best-known methods for factoring, known as sieve methods, are basically sophisticated refinements of this approach. When faced with a number of 200 or more digits, these sieve methods take a very long time, even on today's fastest computers, since there are many primes with 100 digits or less. Therefore, until now the RSA algorithm has been secure from attack.


Rivest, Shamir and Adelman were able to develop a public key cryptography system in which the encryption method uses the tremendously large number, achieved by multiplying two tremendously large primes, but the decryption method requires knowing these two factors. A company, spy agency or e-business distributes the multiplied number (which has hundreds of digits) as part of its public key. When the company receives encrypted messages, it uses the factors of this number to decipher the messages.

The future of encryption

In practice, the RSA method is too slow for encrypting and decrypting entire messages. It is used, instead, merely to encrypt a key for a private key cryptographic system. Once that encrypted key is exchanged, the private key system can be used to exchange messages. This approach is now widely used and, in particular, is used for all secure Web-based communication. This includes the exchange of sensitive information such as credit-card numbers between customers and e-businesses.


Could some enterprising evil genius break this system? Certainly. We do not know for sure that factoring large numbers is necessarily so difficult and time-consuming. We only know that no one has yet found a fast way to factor large numbers or created a new system to do so. Based upon the current state of knowledge in computational theory, it is still possible that a very clever graduate student in either mathematics or computer science could suddenly devise a completely new approach to factoring, one much more effective and quick than the sieve method. Recently, using almost 400 computers working simultaneously, a group was able to factor a 155-digit number in 3.7 months.


Another interesting observation was made by Peter Shor of Bell Labs. He showed that a quantum computer (a computer that harnesses the micropower of quantum mechanics) can factor numbers much faster than digital computers and thereby defeat the RSA encryption method. The catch is that no one has been able to figure out how to build a quantum computer, although people are certainly thinking about how it might be feasible.


But quantum physics might also offer a different kind of light at the end of the cryptographic tunnel: the possibility of developing a system called quantum cryptography, which, although hypothetical, does theoretically promise that grail for code-makers--an unbreakable code.


If the RSA method is defeated, there would have to be a massive effort to replace it with an alternative public key system. There are public key systems that use much more sophisticated mathematical ideas than factoring numbers. Ironically, this makes them potentially less secure. If a system is based upon complicated ideas, it is easier for weaknesses in the system to be masked by its complexity. In other words, it is more difficult to prove that a complicated method is secure than to prove that a simple method is secure. This, in fact, is why the RSA method has gained acceptance over the others and is, until that evil hacker or quantum genius arrives on the scene, one of the best keys to privacy we've got.


Deane Yang would like to thank Jeff Shallit, professor of computer science at University of Waterloo; Dorian Goldfeld, professor of mathematics at Columbia University; and Oisin McGuiness of Sumitomo Bank Capital Markets for their assistance.