Cryptography Or: How I Learned to Stop Worrying, and Love AES
This guest post is by Phillip Gawlowski, who is living in the German wilderness of Oberberg near Cologne. Phillip spends his time writing Ruby as a hobby just for fun. He tries to make life a little easier for himself and for others when he is crazy enough to release his code as open source. He’s neither famous nor rich, but likes it that way (most of the time). He blogs his musings at his blog.
A friend gave you the plans for Dr. Blofeld’s newest Doomsday Device. Over the engine noise of his Aston-Martin, he tells you: “Send this to firstname.lastname@example.org, and make sure it arrives there intact!”
All you have is a laptop, wonky Internet access, and Ruby. What to do?
AES For Safety, SHA2 For Integrity
You now have two goals:
- Make the Doomsday Device plans unreadable, and
- Ensure that the data has arrived at its destination without error.
Fortunately, Ruby provides an API to OpenSSL, a well-tested, widely used library and set of tools used for encryption of all kinds, and includes its own implementations of several cryptographic hashes.
As with many things, Ruby makes creating cryptographic hashes easy:
    require 'digest/sha2'

    sha256 = Digest::SHA2.new(256)
    sha256.digest("Bond, James Bond")
The argument to Digest::SHA2.new is the bit length we want our hash to have. SHA2 is a family of hashes, and the two most commonly used variants are 256, also called SHA256, and 512, called SHA512. A longer hash needs more storage and may take longer to calculate, but it makes collisions less likely and is much more difficult to attack with brute force, a rainbow table or other cryptanalysis.
Once we have our SHA2 object, we pass a String of data into the #digest method to have the hash of this data returned as a String of raw bytes. The #hexdigest method returns the same hash in printable hexadecimal notation.
You can also call the #digest method directly on the class, without creating an object first, for example when you are working with MD5 or SHA1:
    require 'digest/md5'

    Digest::MD5.digest "Bond, James Bond"
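As a small sketch of the points above (the variable names are only illustrative), the following compares the object style and the class style, and shows the hexadecimal form of a hash, which is easier to print or paste into an email than the raw bytes:

```ruby
require 'digest/sha2'

message = "Bond, James Bond"

# Object style: create a SHA2 object with the desired bit length first.
raw = Digest::SHA2.new(256).digest(message)

# Class style: Digest::SHA256 is the 256 bit variant as a class of its own.
raw_from_class = Digest::SHA256.digest(message)

# #hexdigest returns the same hash as printable hexadecimal characters.
hex = Digest::SHA256.hexdigest(message)

# The 512 bit variant produces a hash that is twice as long.
long_raw = Digest::SHA2.new(512).digest(message)

puts raw.bytesize            # 32 bytes, which is 256 bits
puts raw == raw_from_class   # both styles produce the same hash
puts hex.length              # 64 hexadecimal characters
puts long_raw.bytesize       # 64 bytes, which is 512 bits
```

The raw and the hexadecimal form carry the same information, so pick whichever is more convenient for storage or transport.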
The Advanced Encryption Standard
AES is a so-called symmetric-key block cipher: it operates on chunks of data, called blocks, and applies the provided key to each block to create encrypted or decrypted output. The use of the same key for encryption and decryption is what makes the cipher symmetric. Conversely, asymmetric ciphers use different keys for encryption and decryption, usually a public key known to anyone to encrypt, and a private key known only to the recipient to decrypt. RSA is such a cipher, and SSH, SSL/TLS and PGP all rely on this kind of cryptography.
The AES family comes in three key sizes: 128 bit, 192 bit, and 256 bit. Just as with SHA2, you’ll find the names AES-128, AES-192, or AES-256 being used to describe the particular key size. The block size is always 128 bits, no matter which key size is used.
The downside of a block cipher is that the same key is applied to each block of data, so identical blocks of input turn into identical blocks of output, which leaks patterns in the data and weakens the encryption. The solution is to use a so-called “mode of operation”, which feeds additional, changing input into the encryption of each block, so that the output becomes indistinguishable from noise.
A full discussion of modes of operation and their strengths and weaknesses would go well beyond the scope of this article, however.
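To make the problem visible, here is a minimal sketch that encrypts two identical blocks of text, once with ECB (which applies the key to every block independently) and once with the CFB mode of operation. The sample text and the throwaway random key are assumptions made for the demonstration only:

```ruby
require 'openssl'

# Two identical 16 byte blocks of plaintext.
plaintext = "ATTACK AT DAWN! " * 2

key = OpenSSL::Random.random_bytes(32)   # throwaway 256 bit key for the demo

# ECB encrypts every block on its own, using nothing but the key.
ecb = OpenSSL::Cipher.new("AES-256-ECB")
ecb.encrypt
ecb.key = key
ecb_out = ecb.update(plaintext) + ecb.final

# CFB feeds the IV, and then each previous ciphertext block, into the next block.
cfb = OpenSSL::Cipher.new("AES-256-CFB")
cfb.encrypt
cfb.key = key
cfb.iv = OpenSSL::Random.random_bytes(cfb.iv_len)
cfb_out = cfb.update(plaintext) + cfb.final

# Compare the first and the second 16 byte block of each ciphertext.
ecb_repeats = ecb_out[0, 16] == ecb_out[16, 16]
cfb_repeats = cfb_out[0, 16] == cfb_out[16, 16]

puts "ECB blocks identical: #{ecb_repeats}"
puts "CFB blocks identical: #{cfb_repeats}"
```

With ECB the repetition in the plaintext survives encryption, which is exactly the pattern leak a mode of operation is meant to hide.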
Now let’s take a look at Ruby’s encryption API:
    require 'openssl'
    require 'digest/sha2'

    payload = "Plans for Blofeld's newest Doomsday Device. This is top secret!"
    sha256 = Digest::SHA2.new(256)
    aes = OpenSSL::Cipher.new("AES-256-CFB")
    # The IV has to be exactly aes.iv_len (16) bytes of secure random data
    iv = OpenSSL::Random.random_bytes(aes.iv_len)
    key = sha256.digest("Bond, James Bond")

    aes.encrypt
    aes.key = key
    aes.iv = iv
    encrypted_data = aes.update(payload) + aes.final
    puts encrypted_data
Since Ruby’s OpenSSL API is pretty straightforward (and so is the OpenSSL API, if you would like to use OpenSSL in C code), we will only discuss what’s really important.
OpenSSL::Cipher.new("AES-256-CFB") sets up an AES object with a key size of 256 bits and the CFB mode of operation. To find out which ciphers your OpenSSL installation supports, call OpenSSL::Cipher.ciphers, which returns the names of all ciphers it understands.
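As a quick sketch of how to interrogate the API (the filter on names starting with "aes-256" is just an illustrative choice), you can list the available ciphers and ask a Cipher object which key and IV sizes it expects:

```ruby
require 'openssl'

# Every cipher name this OpenSSL build understands.
names = OpenSSL::Cipher.ciphers
aes_names = names.select { |name| name.downcase.start_with?("aes-256") }
puts aes_names.sort.uniq

# A Cipher object reports the key and IV sizes it expects, in bytes.
aes = OpenSSL::Cipher.new("AES-256-CFB")
puts aes.key_len   # 32 bytes, which is a 256 bit key
puts aes.iv_len    # 16 bytes, which is one 128 bit AES block
```

The exact list depends on the OpenSSL version Ruby was built against, so it is worth checking on the machine you deploy to.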
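The iv variable stores our Initialization Vector: random data that seeds the mode of operation, to ensure that each block is encrypted uniquely, and thus (hopefully) indistinguishable from noise. For AES it has to be exactly one block (128 bits, or 16 bytes) long, and it should come from a cryptographically secure random number generator.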
We also take advantage of SHA2’s 256 bit variant to generate a 256 bit key from a simpler password. AES-256 expects the encryption key to be exactly 256 bits (32 bytes) long, and since creating such a key by hand is pretty difficult, we let the computer do the job. When used in production, you most likely want to add a salt and use a dedicated key derivation function such as PBKDF2 instead of a single hash, or use a user’s already hashed password.
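A salted key derivation could be sketched as follows. Note that this swaps the plain SHA2 hash used in this article for PBKDF2, a standard key derivation function that ships with Ruby's OpenSSL bindings; the iteration count and the salt length are assumed values, not recommendations:

```ruby
require 'openssl'

password   = "Bond, James Bond"
salt       = OpenSSL::Random.random_bytes(16)  # not secret, stored next to the ciphertext
iterations = 100_000                           # assumed value, tune it for your hardware
key_length = 32                                # 32 bytes are the 256 bits AES-256 expects

# PBKDF2 runs the hash many times over password and salt to slow down guessing.
key = OpenSSL::PKCS5.pbkdf2_hmac(
  password, salt, iterations, key_length, OpenSSL::Digest.new("SHA256")
)

# The same password and salt always lead to the same key ...
same_key = OpenSSL::PKCS5.pbkdf2_hmac(
  password, salt, iterations, key_length, OpenSSL::Digest.new("SHA256")
)

# ... while a different salt leads to a completely different key.
other_salt = OpenSSL::Random.random_bytes(16)
other_key = OpenSSL::PKCS5.pbkdf2_hmac(
  password, other_salt, iterations, key_length, OpenSSL::Digest.new("SHA256")
)

puts key.bytesize
puts key == same_key
puts key == other_key
```

The recipient needs the salt and the iteration count to derive the same key, so both travel with the message, just like the IV.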
With the #encrypt and #decrypt methods, we put our AES object into the proper state. Behind the scenes, this initializes OpenSSL’s encryption engine. One of these two methods has to be called before any other method call, including setting the key and the IV!
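Last but definitely not least, the #update and #final methods are where the encryption actually happens. The more data you have, and the more complex the cipher, the longer this will take. The #update method encrypts the data it is given, and can be called repeatedly, chunk by chunk. The #final method finishes the job: it encrypts whatever data is still buffered, and adds padding where the mode of operation needs it to bring the last chunk up to the required block size.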
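The split between #update and #final matters most when the data does not fit into memory at once. The following sketch, in which a StringIO stands in for a large file and the chunk size of 64 bytes is an arbitrary assumption, feeds the cipher piece by piece and checks that the result matches encrypting everything in one go:

```ruby
require 'openssl'
require 'stringio'

key = OpenSSL::Random.random_bytes(32)
iv  = OpenSSL::Random.random_bytes(16)

# Stand-in for a large file; File.open would be used the same way.
source = StringIO.new("Plans for Blofeld's newest Doomsday Device. " * 100)

aes = OpenSSL::Cipher.new("AES-256-CFB")
aes.encrypt
aes.key = key
aes.iv  = iv

encrypted = String.new              # empty binary String to collect the output
while (chunk = source.read(64))     # read returns nil at the end of the data
  encrypted << aes.update(chunk)
end
encrypted << aes.final              # flush whatever is still buffered

# A second Cipher object encrypts the same data in one go for comparison.
one_shot = OpenSSL::Cipher.new("AES-256-CFB")
one_shot.encrypt
one_shot.key = key
one_shot.iv  = iv
expected = one_shot.update(source.string) + one_shot.final

puts encrypted == expected
```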
In case you make a mistake, or want to do another round of encryption or decryption, the #reset method can reset a Cipher object.
Decryption works pretty much the same as encryption, except that we call #decrypt instead of #encrypt, and pass the encrypted data to the #update method:
    aes.decrypt
    aes.key = key
    aes.iv = iv
    puts aes.update(encrypted_data) + aes.final
Note, however, that both the key and the IV must be the same, and thus have to be stored or transmitted to the recipient of the encrypted data!
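One common way to get the IV to the recipient is to send it along with the encrypted data, since the IV, unlike the key, does not have to be kept secret. The sketch below assumes this convention; the helper names encrypt_message and decrypt_message are made up for this example and are not part of Ruby's API:

```ruby
require 'openssl'
require 'digest/sha2'

# Hypothetical helper: returns the IV followed by the encrypted payload.
def encrypt_message(payload, key, cipher_name = "AES-256-CFB")
  aes = OpenSSL::Cipher.new(cipher_name)
  aes.encrypt
  aes.key = key
  iv = aes.random_iv                # creates a fresh IV and sets it on the cipher
  iv + aes.update(payload) + aes.final
end

# Hypothetical helper: splits off the IV again, then decrypts the rest.
def decrypt_message(message, key, cipher_name = "AES-256-CFB")
  aes = OpenSSL::Cipher.new(cipher_name)
  aes.decrypt
  aes.key = key
  aes.iv = message[0, aes.iv_len]
  plain = aes.update(message[aes.iv_len..-1]) + aes.final
  plain.force_encoding(Encoding::UTF_8)   # decrypted bytes come back as binary
end

key = Digest::SHA2.new(256).digest("Bond, James Bond")
payload = "Plans for Blofeld's newest Doomsday Device. This is top secret!"

message = encrypt_message(payload, key)
puts decrypt_message(message, key) == payload
```

Because every call creates a new IV, encrypting the same payload twice yields two different messages, while the key still has to reach the recipient over a separate, trusted channel.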
As we’ve already seen, a hashing algorithm can turn data of arbitrary length into a fixed-length stream of bytes that is, for all practical purposes, unique. This can function as password storage, generate more secure keys for encryption, or, since the output of a hash algorithm is deterministic (it’s always the same for the same input), serve as an integrity check.
If you’ve downloaded a Linux distribution or other software, you have already seen this, in the form of MD5 digests, with which you can verify that a download is complete and error free, like on Ruby’s homepage.
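We will do the same with our encrypted data, as a poor man’s message authentication code (MAC), a technique in cryptography to ensure that a message has not been tampered with. A plain hash only catches accidental corruption, though, since an attacker who alters the message can simply recompute the hash; a real MAC, such as an HMAC, also mixes in a secret key. Our poor man’s version is a one-liner:
    poor_mans_mac = sha256.digest(encrypted_data)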
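On the receiving end, the check boils down to recomputing the hash over what actually arrived and comparing it with the hash that was sent. The sketch below does this for the plain hash, and also for an HMAC from Ruby's OpenSSL bindings, which goes beyond the poor man's approach used in this article. The random stand-in data and the separate mac_key are assumptions for the demonstration:

```ruby
require 'openssl'
require 'digest/sha2'

encrypted_data = OpenSSL::Random.random_bytes(64)  # stand-in for the real ciphertext
mac_key = OpenSSL::Random.random_bytes(32)         # hypothetical shared secret for the HMAC

# Sender: a plain hash catches transmission errors, an HMAC also catches tampering.
checksum = Digest::SHA256.digest(encrypted_data)
hmac = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), mac_key, encrypted_data)

# Recipient: recompute both over the received data and compare.
received = encrypted_data.dup
checksum_ok = Digest::SHA256.digest(received) == checksum
hmac_ok = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), mac_key, received) == hmac

# Flip a single bit to simulate a damaged or manipulated message.
tampered = encrypted_data.dup
tampered.setbyte(0, tampered.getbyte(0) ^ 1)
tampered_checksum_ok = Digest::SHA256.digest(tampered) == checksum
tampered_hmac_ok = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), mac_key, tampered) == hmac

puts "intact:   checksum #{checksum_ok}, hmac #{hmac_ok}"
puts "tampered: checksum #{tampered_checksum_ok}, hmac #{tampered_hmac_ok}"
```

Both checks notice the flipped bit, but only the HMAC stays trustworthy when an attacker can replace the hash as well, because recomputing it requires the secret key.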
Now all that’s left is to send an email to James’ employer with the Doomsday Device plans, and to call them to pass on the IV and key.
Think of the Future
Security is not a state, it is a process. You should write your security-aware code in such a way that you don’t depend on a particular cryptographic algorithm. Ruby’s API (and OpenSSL’s own API) wrap encryption abstractly, so that you can swap out the algorithm you use at any time. This is also necessary for hashing algorithms: While there are no feasible attacks against SHA2 yet, the cryptanalysis only gets better over time, as the histories of MD5 and DES show.
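One way to keep your code independent of a particular algorithm is to treat the cipher name as data rather than hard-coding it. This is only a sketch: the helpers encrypt_with and decrypt_with and the envelope Hash are made-up names, and a real system would never keep the key next to the encrypted data the way this demonstration does:

```ruby
require 'openssl'

# Hypothetical helper: the cipher name is a parameter, not baked into the code.
def encrypt_with(cipher_name, payload)
  cipher = OpenSSL::Cipher.new(cipher_name)
  cipher.encrypt
  key = cipher.random_key   # random key of the right length for this cipher
  iv  = cipher.random_iv    # random IV of the right length for this cipher
  { cipher: cipher_name, key: key, iv: iv, data: cipher.update(payload) + cipher.final }
end

# Hypothetical helper: reads the cipher name back from the envelope.
def decrypt_with(envelope)
  cipher = OpenSSL::Cipher.new(envelope[:cipher])
  cipher.decrypt
  cipher.key = envelope[:key]
  cipher.iv  = envelope[:iv]
  cipher.update(envelope[:data]) + cipher.final
end

payload = "Plans for Blofeld's newest Doomsday Device."

# Swapping the algorithm is a matter of changing a String.
results = %w[AES-256-CFB AES-128-CBC].map do |name|
  decrypt_with(encrypt_with(name, payload)) == payload
end
puts results.inspect
```

Recording the cipher name alongside the data means old messages stay readable after you move new ones to a stronger algorithm.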
Schneier’s Law states that “any person can invent a security system so clever that she or he can’t think of how to break it.” This is why Ruby’s developers use OpenSSL to do encryption, a widely tested and certified (in some variants!) cryptographic library, instead of writing their own library.
A mistake in your implementation can compromise your and your customers’ data, since so-called “side channel attacks” are used as a matter of course to attack cryptography.
Encryption Does Not Mean You Are Safe
It is important, and I cannot stress this enough, that you do not store encrypted data and the keys to access it on the same machine (ideally, you don’t store these things on the same network!), or do your encryption and decryption on the same machine that you store your encrypted data on. Whole libraries have been filled with books on how to design a secure system, from hardware to software. Above all, security is a mindset, and you have to be properly paranoid to secure your data and access to this data. If you deploy, or are about to deploy, security-relevant code, have your code tested by outsiders. Penetration testing is worth your while.
Asymmetric encryption was invented to solve one problem with symmetric encryption: with an asymmetric cipher, it is not necessary to transmit a secret key. However, asymmetric ciphers have their own set of trade-offs (key trust and computational efficiency, among others).
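As a minimal sketch of the idea, here is RSA from Ruby's OpenSSL bindings being used to protect a short secret, such as an AES key. The key size of 2048 bits and the OAEP padding are assumptions chosen for the example, and the question of whether the public key really belongs to the recipient (key trust) is left out entirely:

```ruby
require 'openssl'

# Recipient: generate a key pair once, and hand out only the public half.
private_key = OpenSSL::PKey::RSA.new(2048)
public_key  = private_key.public_key

# Sender: encrypt a short secret, for example an AES key, with the public key.
aes_key = OpenSSL::Random.random_bytes(32)
wrapped = public_key.public_encrypt(aes_key, OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING)

# Recipient: only the private key can turn this back into the secret.
unwrapped = private_key.private_decrypt(wrapped, OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING)

puts wrapped.bytesize
puts unwrapped == aes_key
```

Generating the key pair and encrypting with it is noticeably slower than AES, which is why asymmetric ciphers are typically used to exchange a symmetric key rather than to encrypt the data itself.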
The Safest Data is No Data
Just as the fastest code is no code at all, the safest data is data you never store: if you don’t absolutely, positively have to store it, don’t even bother with it. What you don’t have can’t be compromised.
This article is nothing but a superficial introduction to encryption in Ruby. There are dozens of standards and regulations that govern this vast topic. However, I have tried my best to give you, fellow Rubyists, enough knowledge about this topic for you to know which questions you should ask, which is, in the end, much more important than the code itself. Now go forth, and hash and encrypt and decrypt, and, above all, have fun doing it!
I hope you found this article valuable. Feel free to ask questions and give feedback in the comments section of this post. Thanks!