C# Hashing multiple byte array blocks

Question

I am trying to hash a file by reading 1024 bytes at a time from a FileStream in a loop and passing each chunk to the TransformBlock function. I want to understand the mechanics of hashing multiple byte arrays into one hash, which would let me hash not only files but also folders. I used this Stack Overflow question: Hashing multiple byte[]'s together into a single hash with C#? and this MSDN example: http://msdn.microsoft.com/en-us/library/system.security.cryptography.hashalgorithm.transformblock.aspx

Here is the code I have now:

public static byte[] createFileMD5(string path){
    MD5 md5 = MD5.Create();
    FileStream fs = File.OpenRead(path);
    byte[] buf = new byte[1024];
    byte[] newbuf = new byte[1024];

    int num; int newnum;

    num = fs.Read(buf,0,buf.Length);
    while ((newnum = fs.Read(newbuf, 0, newbuf.Length))>0)
    {
        md5.TransformBlock(buf, 0, buf.Length, buf, 0);
        num = newnum;
        buf = newbuf;
    }

    md5.TransformFinalBlock(buf, 0, num);

    return md5.Hash;
}

Unfortunately, the hash it calculates doesn't match the one I calculated using fciv.

Just to be sure, here is the hex-encoding routine I use on the returned byte array:

    public static string byteArrayToString(byte[] ba)
    {
        StringBuilder hex = new StringBuilder(ba.Length * 2);
        foreach (byte b in ba)
            hex.AppendFormat("{0:x2}", b);
        return hex.ToString();
    }

Thomas Levesque · Accepted Answer · 2013-11-18 13:12:03Z

The length you pass to TransformBlock is wrong for the last block (unless the file size is a multiple of the buffer size). You need to pass the actual number of bytes read from the file:

md5.TransformBlock(buf, 0, newnum, buf, 0);

Also, I'm not sure why you use newbuf... The original buffer is used only for the first block; after buf = newbuf, both variables refer to the same array, so every subsequent Read overwrites a block that has not been hashed yet. There is no reason to use a second buffer here. For reference, here's the code I use to compute the hash of a file:

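using (var stream = File.OpenRead(path))
using (var md5 = MD5.Create())
{
    var buffer = new byte[8192];
    int read;
    // Hash only the bytes actually read; the last read is usually shorter than the buffer.
    while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        md5.TransformBlock(buffer, 0, read, buffer, 0);
    }
    // A zero-length final block finalises the hash; the result is then available in md5.Hash.
    md5.TransformFinalBlock(buffer, 0, 0);

    ...
}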
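Since the question is really about combining several byte arrays into one hash, here is a minimal sketch of the same pattern applied to a sequence of arrays rather than a file. The class name MultiBlockHash and the helper HashBlocks are hypothetical names chosen for illustration; only TransformBlock, TransformFinalBlock, Hash and ComputeHash come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class MultiBlockHash
{
    // Hypothetical helper: feeds each array to MD5 in order, then
    // finalises with a zero-length block, as in the answer above.
    public static byte[] HashBlocks(IEnumerable<byte[]> blocks)
    {
        using (var md5 = MD5.Create())
        {
            foreach (var block in blocks)
            {
                // Input and output buffer are the same array, as in the answer.
                md5.TransformBlock(block, 0, block.Length, block, 0);
            }
            md5.TransformFinalBlock(new byte[0], 0, 0);
            return md5.Hash;
        }
    }

    public static void Main()
    {
        var parts = new[]
        {
            Encoding.UTF8.GetBytes("hello "),
            Encoding.UTF8.GetBytes("world")
        };
        byte[] chunked = HashBlocks(parts);

        // Hash of the concatenated data in a single call, for comparison.
        byte[] whole;
        using (var md5 = MD5.Create())
        {
            whole = md5.ComputeHash(Encoding.UTF8.GetBytes("hello world"));
        }

        // Both lines should show the same digest.
        Console.WriteLine(BitConverter.ToString(chunked));
        Console.WriteLine(BitConverter.ToString(whole));
    }
}
```

The result depends only on the concatenated bytes, not on where the block boundaries fall. For a folder, that means you would have to pick a stable file order yourself (and decide whether file names are part of the input); that choice is an assumption left open here.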

3 Comments

black Over a year ago

I thought that I had to use TransformBlock on every block EXCEPT the last one, and use TransformFinalBlock on the last one. It was unclear in the other stackoverflow question:

// For each block:
md5.TransformBlock(block, 0, block.Length, block, 0);
// For last block:
md5.TransformFinalBlock(block, 0, block.Length);

Thomas Levesque Over a year ago

@black, actually I'm not sure the way I do it would work with all hash algorithms... I know it works for MD5 and SHA1, but perhaps other algorithms require that all blocks passed to TransformBlock have the same size.
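One way to check the concern about block sizes is to hash the same data in deliberately uneven chunks and compare against ComputeHash. The sketch below does that for a few of the built-in algorithms; the class UnevenBlocks, the helper HashInChunks and the chosen chunk sizes are hypothetical, and the check only covers the algorithms listed, not every HashAlgorithm implementation.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class UnevenBlocks
{
    // Hypothetical helper: hashes data in chunks whose sizes cycle through `sizes`.
    public static byte[] HashInChunks(HashAlgorithm algo, byte[] data, int[] sizes)
    {
        int offset = 0, i = 0;
        while (offset < data.Length)
        {
            int count = Math.Min(sizes[i++ % sizes.Length], data.Length - offset);
            algo.TransformBlock(data, offset, count, data, offset);
            offset += count;
        }
        // Zero-length final block, as in the answer.
        algo.TransformFinalBlock(data, 0, 0);
        return algo.Hash;
    }

    public static void Main()
    {
        var data = new byte[10000];
        new Random(42).NextBytes(data);
        var sizes = new[] { 1, 63, 64, 65, 1000 };

        var factories = new Func<HashAlgorithm>[]
        {
            () => MD5.Create(),
            () => SHA1.Create(),
            () => SHA256.Create(),
            () => SHA512.Create()
        };

        foreach (var create in factories)
        {
            // Separate instances for the chunked and the one-shot computation.
            using (var chunkedAlgo = create())
            using (var wholeAlgo = create())
            {
                byte[] chunked = HashInChunks(chunkedAlgo, data, sizes);
                byte[] whole = wholeAlgo.ComputeHash(data);
                Console.WriteLine("{0}: match = {1}",
                    chunkedAlgo.GetType().Name, chunked.SequenceEqual(whole));
            }
        }
    }
}
```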

Alex Essilfie Over a year ago

@black: Generally you do TransformBlock on every block except the last one. Then you call TransformFinalBlock on the last block. However, in a stream of an unknown length, you may not know if the last block has been processed until it is too late. For this reason a properly implemented algorithm is expected to finalise the hash when TransformFinalBlock is called with an empty array as in the answer. In an ideal case, however, TransformFinalBlock is expected to be called with the last block of data from the stream. PS: I tested both ways with MD5 & SHA256 and can confirm it works.
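To make the two finalisation styles from this comment concrete, here is a rough sketch of both over a stream of unknown length. The class FinalBlockStyles and both method names are hypothetical. The read-ahead variant keeps two buffers and swaps them, so the block waiting to be hashed is never overwritten by the next Read.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class FinalBlockStyles
{
    // Style 1: every block goes through TransformBlock, then a zero-length final block.
    public static byte[] HashWithEmptyFinalBlock(Stream stream, HashAlgorithm algo)
    {
        var buffer = new byte[4096];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            algo.TransformBlock(buffer, 0, read, buffer, 0);
        }
        algo.TransformFinalBlock(buffer, 0, 0);
        return algo.Hash;
    }

    // Style 2: read one block ahead so the last block of data can be
    // passed to TransformFinalBlock.
    public static byte[] HashWithDataInFinalBlock(Stream stream, HashAlgorithm algo)
    {
        var current = new byte[4096];
        var next = new byte[4096];
        int currentLen = stream.Read(current, 0, current.Length);
        int nextLen;
        while ((nextLen = stream.Read(next, 0, next.Length)) > 0)
        {
            algo.TransformBlock(current, 0, currentLen, current, 0);
            // Swap the buffers so the next Read does not overwrite unhashed data.
            var tmp = current;
            current = next;
            next = tmp;
            currentLen = nextLen;
        }
        algo.TransformFinalBlock(current, 0, currentLen);
        return algo.Hash;
    }

    public static void Main()
    {
        var data = new byte[10000];
        new Random(1).NextBytes(data);

        byte[] a, b;
        using (var sha = SHA256.Create())
        using (var ms = new MemoryStream(data))
        {
            a = HashWithEmptyFinalBlock(ms, sha);
        }
        using (var sha = SHA256.Create())
        using (var ms = new MemoryStream(data))
        {
            b = HashWithDataInFinalBlock(ms, sha);
        }
        Console.WriteLine("Same hash: " + a.SequenceEqual(b));
    }
}
```

The first style is shorter and needs only one buffer, which is probably why the answer uses it; the second matches the "TransformFinalBlock gets the last block" convention at the cost of the extra buffer.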
