Wednesday, March 24, 2010

CryptoStream - direction, flushing, and length

I was tooling around with some encryption code, trying to learn a bit more about the .NET encryption APIs, and it occurred to me that some of the documentation for the general usage of CryptoStream was a little bit confusing. I am referring to the "direction" of the stream, and what effect the CryptoStreamMode value you pass to the constructor actually has.

The reason for my initial confusion was the EncryptTextToFile and DecryptTextFromFile methods the documentation uses in its sample. These methods kind of make it look as if you should use CryptoStreamMode.Read if you are doing decryption, and CryptoStreamMode.Write if you are doing encryption. This is not the case - you are free to use either direction no matter what kind of transformation you are applying.

My confusion probably originated from my one-track mind when it comes to streams: in all my experience with BizTalk pipeline components, I don't do a whole lot of writing to streams, and so I default to thinking of streams either as objects that access a resource, like the file system or the network, or as machines that apply a transform as you pull data through them. You can layer these machines like Matryoshka dolls, with the data source at the center, and pull the data all the way out. What you get out is the sum of all the transformations applied in order, without ever having to hold the full result of each intermediate transformation. This is exactly how CryptoStream works in Read mode - the first argument to the constructor is the stream you are wrapping. The second argument, the ICryptoTransform, determines the transformation that is applied as you pull the original stream through. You can layer a CryptoStream over another stream and return it, and the consumer can use it just like any other stream. Keep in mind, though, that a CryptoStream in Read mode is read-only, and CryptoStreams are never seekable.
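To make the layering concrete, here is a minimal sketch of two CryptoStreams in Read mode stacked on a MemoryStream. The class name, the sample text, and the choice of AES with a throwaway random key are my own assumptions for illustration; note that the inner layer encrypts in Read mode, which the documentation sample never does.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

class ReadModeSketch
{
    static void Main()
    {
        // Aes.Create() generates a random key and IV; fine for a demo, not for real use.
        using (Aes aes = Aes.Create())
        {
            byte[] plain = Encoding.UTF8.GetBytes("pull me through the transforms");

            // Innermost doll: the data source.
            Stream source = new MemoryStream(plain);

            // Middle doll: encrypts the data as it is pulled out of the source.
            Stream encrypting = new CryptoStream(
                source, aes.CreateEncryptor(), CryptoStreamMode.Read);

            // Outer doll: decrypts what the middle doll hands it.
            Stream decrypting = new CryptoStream(
                encrypting, aes.CreateDecryptor(), CryptoStreamMode.Read);

            // The consumer only sees a plain, read-only, non-seekable Stream.
            Console.WriteLine("CanRead: " + decrypting.CanRead);
            Console.WriteLine("CanWrite: " + decrypting.CanWrite);
            Console.WriteLine("CanSeek: " + decrypting.CanSeek);

            // Disposing the reader disposes every layer beneath it.
            using (var reader = new StreamReader(decrypting, Encoding.UTF8))
            {
                // Encrypt followed by decrypt should hand back the original text.
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }
}
```

Encrypting and immediately decrypting is pointless in practice, of course; it is only here to show that both transforms are happy to run in Read mode and that no intermediate buffer holding the full ciphertext is ever needed.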

A CryptoStream in Write mode works the other way: the first argument to the constructor is the stream that receives the output. When you write to the CryptoStream, the result of the transform gets pushed to the underlying stream.
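Here is a rough sketch of Write mode used for both encryption and decryption. The Push helper and the class name are hypothetical names of mine, and AES with a random key is assumed purely for the demo.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

class WriteModeSketch
{
    // Hypothetical helper: pushes the input through the transform
    // into a MemoryStream and returns whatever landed there.
    static byte[] Push(byte[] input, ICryptoTransform transform)
    {
        using (var destination = new MemoryStream())
        {
            using (var crypto = new CryptoStream(
                destination, transform, CryptoStreamMode.Write))
            {
                crypto.Write(input, 0, input.Length);
            } // Disposing the CryptoStream writes out the final block.

            // ToArray still works after the MemoryStream has been closed.
            return destination.ToArray();
        }
    }

    static void Main()
    {
        // Random key and IV, for illustration only.
        using (Aes aes = Aes.Create())
        {
            byte[] plain = Encoding.UTF8.GetBytes("push me through the transform");

            // Same direction (Write) for both kinds of transformation.
            byte[] cipher = Push(plain, aes.CreateEncryptor());
            byte[] roundTrip = Push(cipher, aes.CreateDecryptor());

            Console.WriteLine(Encoding.UTF8.GetString(roundTrip));
        }
    }
}
```

Because the input here is a byte array of known length, a single Write call is all it takes, which is why Write mode tends to feel natural for strings and arrays.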

So which direction is preferable? It depends on the shape of the information you're putting in and getting out. If your input is not a stream, and is instead something like a string or a byte array that you know the length of, Write is probably a simpler option, because you can just write your data directly to it. If you're dealing with streams, Read is likely easier.

As with every other kind of stream, don't forget to flush your CryptoStream. CryptoStream does have one twiddly bit about it, though - Flush is a no-op, and in particular it never writes out the final, padded block of the transform. For whatever reason, the important flushing method in CryptoStream is FlushFinalBlock. The documentation does a good job of hammering home that you can call Close to make sure this gets done, but calling Dispose does it as well, which is good to know because "using" statements are generally preferable where applicable.
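A small sketch shows the difference between the two flush methods. I am assuming AES with its default settings here (16-byte blocks, PKCS7 padding), and the five-byte input is chosen deliberately so that it is smaller than one block.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

class FlushSketch
{
    static void Main()
    {
        // Random key and IV, for illustration only.
        using (Aes aes = Aes.Create())
        using (var destination = new MemoryStream())
        using (var crypto = new CryptoStream(
            destination, aes.CreateEncryptor(), CryptoStreamMode.Write))
        {
            // Five bytes: less than one 16-byte AES block.
            byte[] plain = { 1, 2, 3, 4, 5 };
            crypto.Write(plain, 0, plain.Length);

            // Flush cannot emit a partial block, so nothing has arrived yet.
            crypto.Flush();
            Console.WriteLine("After Flush: " + destination.Length);           // 0

            // FlushFinalBlock pads the pending bytes and writes the last block.
            crypto.FlushFinalBlock();
            Console.WriteLine("After FlushFinalBlock: " + destination.Length); // 16
        } // Dispose sees the final block is already flushed and does not repeat it.
    }
}
```

If you drop the explicit FlushFinalBlock call, the "using" block still takes care of it on the way out; the catch is only that you must not inspect the destination before the CryptoStream has been disposed.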

One more important bit that's easy to overlook - a CryptoStream can't tell you its length. Because it isn't seekable, its Length property throws a NotSupportedException. And even if you can get the length of the underlying stream at runtime, that is not necessarily the size of the data you will get out of the CryptoStream, because it's transforming the data as it zips through. This goes for streams in general - don't be lazy, and don't use the length of a stream to set the size of a byte array you're going to store the data in!
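As a sketch of the non-lazy approach, the following reads a decrypting CryptoStream to the end with a fixed-size buffer and compares the result with the length of the stream underneath. ReadToEnd is a hypothetical helper of mine, and the 20-byte payload and AES defaults (16-byte blocks, PKCS7 padding) are assumptions for the demo.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

class LengthSketch
{
    // Hypothetical helper: drains a stream without ever asking for its Length.
    static byte[] ReadToEnd(Stream stream)
    {
        using (var collected = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            // Read returns 0 only when the stream is exhausted.
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                collected.Write(buffer, 0, read);
            }
            return collected.ToArray();
        }
    }

    static void Main()
    {
        // Random key and IV, for illustration only.
        using (Aes aes = Aes.Create())
        {
            // Encrypt 20 bytes; padding rounds the ciphertext up to 32.
            byte[] plain = new byte[20];
            byte[] cipher;
            using (var destination = new MemoryStream())
            {
                using (var crypto = new CryptoStream(
                    destination, aes.CreateEncryptor(), CryptoStreamMode.Write))
                {
                    crypto.Write(plain, 0, plain.Length);
                }
                cipher = destination.ToArray();
            }

            using (var source = new MemoryStream(cipher))
            using (var crypto = new CryptoStream(
                source, aes.CreateDecryptor(), CryptoStreamMode.Read))
            {
                byte[] decrypted = ReadToEnd(crypto);
                Console.WriteLine("Underlying stream length: " + source.Length); // 32
                Console.WriteLine("Bytes actually read: " + decrypted.Length);   // 20
            }
        }
    }
}
```

Sizing a buffer from the underlying stream would have left twelve bytes of garbage at the end of the array here, and with other transforms the mismatch can just as easily go the other way.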
