I'm trying to get a SecureString into a byte[], encoded as UTF-8, that I can keep pinned against the GC. I have managed to do this with UTF-16 (the default encoding), but I can't figure out how to convert the encoding without the risk of the GC creating a managed copy of the data somewhere (the data needs to be kept secure).

Here's what I have so far (context: an algorithm to calculate the hash of a SecureString):

public static byte[] Hash(this SecureString secureString, HashAlgorithm hashAlgorithm)
{
  IntPtr bstr = Marshal.SecureStringToBSTR(secureString);
  int length = Marshal.ReadInt32(bstr, -4);
  var utf16Bytes = new byte[length];
  GCHandle utf16BytesPin = GCHandle.Alloc(utf16Bytes, GCHandleType.Pinned);
  byte[] utf8Bytes = null;

  try
  {
    Marshal.Copy(bstr, utf16Bytes, 0, length);
    Marshal.ZeroFreeBSTR(bstr);
    // At this point I have the UTF-16 byte[] perfectly.
    // The next line works at converting the encoding, but it does nothing
    // to protect the data from being spread throughout memory.
    utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);
    return hashAlgorithm.ComputeHash(utf8Bytes);
  }
  finally
  {
    if (utf8Bytes != null)
    {
      for (var i = 0; i < utf8Bytes.Length; i++)
      { 
        utf8Bytes[i] = 0;
      }
    }
    for (var i = 0; i < utf16Bytes.Length; i++)
    { 
      utf16Bytes[i] = 0;
    }
    utf16BytesPin.Free();
  }
}

What's the best way to do this conversion? Am I doing it in the right place, or should it happen earlier? And could this be made more memory efficient by skipping the UTF-16 byte[] step entirely?

  • It should be noted that SecureString is designed to be awkward to copy to a memory string, which is what utf8Bytes seems to be. Maybe XY problem here? Commented May 29, 2018 at 17:26
  • Yes, the utf8Bytes is hopefully going to be a managed array which I can pin and zero out after use. Commented May 29, 2018 at 17:28

2 Answers

I've found a way to do this the way I wanted. The code I have here isn't finished (needs better exception handling and memory management in the case of failure), but here it is:

[DllImport("kernel32.dll")]
static extern void RtlZeroMemory(IntPtr dst, int length);

public unsafe static byte[] HashNew(this SecureString secureString, HashAlgorithm hashAlgorithm)
{
  IntPtr bstr = Marshal.SecureStringToBSTR(secureString);
  int maxUtf8BytesCount = Encoding.UTF8.GetMaxByteCount(secureString.Length);
  IntPtr utf8Buffer = Marshal.AllocHGlobal(maxUtf8BytesCount);

  // Here's the magic:
  char* utf16CharsPtr = (char*)bstr.ToPointer();
  byte* utf8BytesPtr  = (byte*)utf8Buffer.ToPointer();
  int utf8BytesCount = Encoding.UTF8.GetBytes(utf16CharsPtr, secureString.Length, utf8BytesPtr, maxUtf8BytesCount);

  Marshal.ZeroFreeBSTR(bstr);
  var utf8Bytes = new byte[utf8BytesCount];
  GCHandle utf8BytesPin = GCHandle.Alloc(utf8Bytes, GCHandleType.Pinned);
  Marshal.Copy(utf8Buffer, utf8Bytes, 0, utf8BytesCount);
  RtlZeroMemory(utf8Buffer, utf8BytesCount);
  Marshal.FreeHGlobal(utf8Buffer);
  try
  {
    return hashAlgorithm.ComputeHash(utf8Bytes);
  }
  finally
  {
    for (int i = 0; i < utf8Bytes.Length; i++)
    {
      utf8Bytes[i] = 0;
    }
    utf8BytesPin.Free();
  }
}

It relies on obtaining pointers to both the original UTF-16 string and a UTF-8 buffer, then using Encoding.UTF8.GetBytes(Char*, Int32, Byte*, Int32) to keep the conversion within unmanaged memory.
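
Since the code above is explicitly unfinished, here is a rough sketch of how the same steps might look with the cleanup moved into a finally block, so the BSTR and the unmanaged buffer are wiped and freed even if something throws partway through. The class name SecureStringHashSketch and the method name HashUtf8 are made up for illustration, and the unmanaged buffer is zeroed with Marshal.WriteByte rather than the RtlZeroMemory import purely to keep the sketch self-contained. It needs to be compiled with unsafe code enabled. It does not address whatever the HashAlgorithm implementation buffers internally.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;
using System.Security.Cryptography;
using System.Text;

public static class SecureStringHashSketch
{
    // Hypothetical variant of HashNew: same conversion, cleanup in finally.
    public static unsafe byte[] HashUtf8(this SecureString secureString, HashAlgorithm hashAlgorithm)
    {
        if (secureString == null) throw new ArgumentNullException(nameof(secureString));
        if (hashAlgorithm == null) throw new ArgumentNullException(nameof(hashAlgorithm));

        int maxUtf8BytesCount = Encoding.UTF8.GetMaxByteCount(secureString.Length);
        IntPtr bstr = IntPtr.Zero;
        IntPtr utf8Buffer = IntPtr.Zero;
        byte[] utf8Bytes = null;
        GCHandle utf8BytesPin = default(GCHandle);

        try
        {
            bstr = Marshal.SecureStringToBSTR(secureString);
            utf8Buffer = Marshal.AllocHGlobal(maxUtf8BytesCount);

            // UTF-16 -> UTF-8 conversion between two unmanaged buffers.
            int utf8BytesCount = Encoding.UTF8.GetBytes(
                (char*)bstr.ToPointer(), secureString.Length,
                (byte*)utf8Buffer.ToPointer(), maxUtf8BytesCount);

            // The array is pinned while it still holds only zeros, so a move
            // between allocation and pinning would not expose anything.
            utf8Bytes = new byte[utf8BytesCount];
            utf8BytesPin = GCHandle.Alloc(utf8Bytes, GCHandleType.Pinned);
            Marshal.Copy(utf8Buffer, utf8Bytes, 0, utf8BytesCount);

            return hashAlgorithm.ComputeHash(utf8Bytes);
        }
        finally
        {
            // Wipe the managed copy before releasing the pin.
            if (utf8Bytes != null) Array.Clear(utf8Bytes, 0, utf8Bytes.Length);
            if (utf8BytesPin.IsAllocated) utf8BytesPin.Free();

            // Wipe and free the unmanaged UTF-8 buffer.
            if (utf8Buffer != IntPtr.Zero)
            {
                for (int i = 0; i < maxUtf8BytesCount; i++)
                {
                    Marshal.WriteByte(utf8Buffer, i, 0);
                }
                Marshal.FreeHGlobal(utf8Buffer);
            }

            // Wipe and free the UTF-16 BSTR.
            if (bstr != IntPtr.Zero) Marshal.ZeroFreeBSTR(bstr);
        }
    }
}
```

Called as secureString.HashUtf8(sha256), this should give the same bytes as hashing Encoding.UTF8.GetBytes of the equivalent plain string, which makes it easy to check against a throwaway test value.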

1 Comment

For anyone interested, here is the finished SecureString hashing code. By taking an Encoding parameter it can support any encoding (rather than just UTF-8).

Have you considered calling GC.Collect() after obtaining the hash?

According to the MSDN documentation for GC.Collect:

Forces an immediate garbage collection of all generations. Use this method to try to reclaim all memory that is inaccessible. It performs a blocking garbage collection of all generations.

All objects, regardless of how long they have been in memory, are considered for collection; however, objects that are referenced in managed code are not collected. Use this method to force the system to try to reclaim the maximum amount of available memory.

From what I see in your code, it shouldn't keep any references to the objects used in the conversion, so they should all be collected by the GC.

1 Comment

I'm quite new to this area of C#, but from what I have found so far, I can't assume that the GC won't copy the array to a new location before I zero it out and it gets collected. I can't take that risk. Would someone be able to confirm?
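
As far as I understand it, that concern is justified: a compacting collection can relocate an unpinned array, and nothing guarantees the bytes at the old address are wiped. GC.Collect() reclaims memory, it does not zero it. The usual way around this is to pin the array while it is still empty, fill it, use it, and clear it before releasing the pin. A minimal sketch of that pattern follows; PinnedBufferSketch and WithPinnedBuffer are hypothetical names, not framework APIs.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinnedBufferSketch
{
    // Hypothetical helper: hands a pinned, zero-initialized buffer to `use`
    // and wipes it afterwards, even if `use` throws.
    public static T WithPinnedBuffer<T>(int size, Func<byte[], T> use)
    {
        var buffer = new byte[size];   // contains only zeros at this point
        GCHandle pin = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            // While pinned, the GC will not move the array.
            return use(buffer);
        }
        finally
        {
            Array.Clear(buffer, 0, buffer.Length);   // wipe before unpinning
            pin.Free();
        }
    }
}
```

The callback must not copy the sensitive bytes anywhere else (for example into a string or a second array), otherwise the pinning buys nothing.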
