
I have a generic method for serializing an array of any struct type into an array of bytes using Marshal.StructureToPtr and Marshal.Copy. The full code is:

    internal static byte[] SerializeArray<T>(T[] array) where T : struct
    {
        if (array == null)
            return null;
        if (array.Length == 0)
            return null;

        int position = 0;
        int structSize = Marshal.SizeOf(typeof(T));

        byte[] rawData = new byte[structSize * array.Length];

        IntPtr buffer = Marshal.AllocHGlobal(structSize);
        foreach (T item in array)
        {
            Marshal.StructureToPtr(item, buffer, false);
            Marshal.Copy(buffer, rawData, position, structSize );
            position += structSize;
        }
        Marshal.FreeHGlobal(buffer);

        return rawData;
    }

It works flawlessly 99.99% of the time. However, for one of my Windows 7 users, with certain input data this code will predictably cause the following non-.NET exception:

The data area passed to a system call is too small. (Exception from HRESULT: 0x8007007A).
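
For reference, the HRESULT in that message can be split into its parts to see which underlying Win32 error it wraps. The following is a minimal sketch; the class and method names are hypothetical and are not part of the original code.

```csharp
using System;

public static class HResultDecoder
{
    // Splits an HRESULT into its failure bit, facility and 16-bit error code.
    public static void Decode(int hresult, out bool isFailure, out int facility, out int code)
    {
        isFailure = hresult < 0;              // the top bit is set for failures
        facility  = (hresult >> 16) & 0x7FF;  // 11-bit facility field; 7 is FACILITY_WIN32
        code      = hresult & 0xFFFF;         // low 16 bits carry the wrapped error code
    }
}
```

Decoding `unchecked((int)0x8007007A)` this way gives facility 7 and code 122 (0x7A), which is the Win32 error ERROR_INSUFFICIENT_BUFFER. That matches the message text, but by itself it does not say which call produced it.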

Unfortunately I do not have access to the user's machine in order to attach a debugger, and I have not been able to replicate the issue even when dealing with the exact same input data as my user. This occurs only on the one user's machine and only with certain input data, but on her machine it happens every time with that same input data, so it's definitely not random.

The application targets .NET 4.5.

Can anyone see anything wrong with this code? My only guess is there is some mismatch occurring between what Marshal.SizeOf is reporting and the actual size of the data structure, thus leading to insufficient memory being allocated for the structure.

If it matters, here is the structure being serialized when the error occurs (it's a representation of character positions resulting from OCR):

public struct CharBox
{
    internal char Character;
    internal float Left;
    internal float Top;
    internal float Right;
    internal float Bottom;
}

As you can see all the fields should be constant size all the time, so my initial allocation of a single fixed-length segment of unmanaged memory into which to serialize each struct shouldn't be a problem (should it?).

While I would welcome alternative or improved methods of doing the serialization, I'm far more interested in nailing down this particular bug. Thanks!

Update: Thanks to TnTinMn pointing out that char is not a blittable type, I looked for Unicode characters in the input to see whether they were being marshaled correctly. It turns out they are NOT.

For the CharBox { 0x2022, .15782328, .266239136, .164901689, .271627158 }, the serialization (in hex) should be:

22 20 00 00 (Character*)

6D 9C 21 3E (Left)

7F 50 88 3E (Top)

FD DB 28 3E (Right)

B7 12 8B 3E (Bottom)

(* Since I wasn't using explicit layout, it padded to four bytes; I'm now frustrated with myself for needlessly increasing the data size by 11%...)

Instead, it is serializing as:

95 00 00 00 (Character)

6D 9C 21 3E (Left)

7F 50 88 3E (Top)

FD DB 28 3E (Right)

B7 12 8B 3E (Bottom)

So it is marshaling char 0x2022 as 0x95 instead. As it happens, 0x2022 Unicode and 0x95 ANSI are both the bullet character. So this is not random: it is marshaling every char to ANSI, which, as I now recall, is the default when you don't specify a CharSet.
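
The ANSI-versus-Unicode behavior described above can be observed with a small sketch along these lines. The two structs are hypothetical two-field stand-ins for CharBox, and the helper name is an assumption. It is written with .NET Framework on Windows in mind, and the ANSI result depends on the machine's ANSI code page.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-ins for CharBox; names are illustrative only.
[StructLayout(LayoutKind.Sequential)]                            // no CharSet: char is marshaled as ANSI
public struct AnsiCharBox { public char Character; public float Left; }

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)] // char is marshaled as UTF-16
public struct UnicodeCharBox { public char Character; public float Left; }

public static class CharSetDemo
{
    // Marshals one struct into a zeroed unmanaged buffer and returns the raw bytes.
    public static byte[] DumpBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] bytes = new byte[size];
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(bytes, 0, buffer, size);          // zero the buffer so padding is predictable
            Marshal.StructureToPtr(value, buffer, false);  // same call as in SerializeArray
            Marshal.Copy(buffer, bytes, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
        return bytes;
    }
}
```

Printing `BitConverter.ToString(CharSetDemo.DumpBytes(new UnicodeCharBox { Character = '\u2022', Left = 1f }))` should show the character as the two bytes 22-20. With AnsiCharBox, the first byte is whatever the current ANSI code page maps the character to (0x95 in the case reported here).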

OK, so this at least confirms there is some unintended behavior going on, and further gives us a good working theory as to what conditions (namely, a Unicode character in the struct) might be leading to the error.

What it does not explain is why this would raise an exception at all, let alone why it is raised only on this one user's machine. As to the former, a discrepancy in byte size between Unicode and ANSI would, I suppose, be consistent with the error message ("The data area passed to a system call is too small"). But the unmanaged buffer, which is sized to accommodate 4 full bytes for the char, would be larger than necessary, not smaller. Why would the CLR or the OS be upset about writing only 1 byte to an area intended for 2 and large enough for 4?

As to the latter, I thought perhaps the user might be on a lower version of .NET than everyone else, which could be the case if she's not getting all the Windows 7 updates. But I just tried it out on a VM with a fresh Windows 7 install and .NET 4.5 (the lowest version the application supports) and still can't reproduce the error. I'm trying to find out exactly what .NET version she's got in case it's 4.5.1 or something. Still, this seems like a long shot.

It seems the only way to know for sure will be to change the Character member to an int (to keep the padding the same for existing data) and only cast it to char when necessary, and then see if that changes the result on the user's machine. This'll also be a good opportunity to wrap each distinct Marshal call in an exception handler as John suggested to see which, exactly, is causing the error.
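
A minimal sketch of that plan, assuming the same field order as CharBox: the struct name, the backing field name and the property are hypothetical, chosen only to illustrate storing the character in a blittable int and casting on access.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical variant of CharBox: only the int field is marshaled,
// and the char is exposed through a property instead of a field.
[StructLayout(LayoutKind.Sequential)]
public struct CharBoxSketch
{
    private int characterCode;   // blittable 4-byte slot, same size as the old padded char
    internal float Left;
    internal float Top;
    internal float Right;
    internal float Bottom;

    internal char Character
    {
        get { return (char)characterCode; }
        set { characterCode = value; }
    }
}
```

With five 4-byte fields, `Marshal.SizeOf(typeof(CharBoxSketch))` is 20, so previously serialized data keeps the same record size.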

The good news is this is a pretty low priority feature, so I can let it fail safely even if it continues to occur.

Will report back. Thanks all.

  • Is the CharBox in the same DLL or EXE where the SerializeArray() function is getting called? If not, then internal is your problem I would think. Commented Jan 26, 2017 at 0:45
  • It's in the same module. Commented Jan 26, 2017 at 1:37
  • OK, the next step would be to pinpoint which of the system calls is throwing that exception. When we use Marshal, we religiously put try..catch statements around each call. Then you'll know where to look next. There could be all kinds of reasons for this generic error. Commented Jan 26, 2017 at 1:57
  • There is one thing in your code that has been nagging at the back of my mind: passing false for the fDeleteOld argument in StructureToPtr. This is due to the warning about a possible memory leak if false is used. The docs for DestroyStructure (msdn.microsoft.com/en-us/library/df3k5fh1(v=vs.110).aspx) imply no problem if all structure types are blittable. But the issue is that System.Char is not blittable; see: Blittable and Non-Blittable Types. Commented Jan 26, 2017 at 5:20
  • I didn't know that Char was not blittable either until I tried my first attempt at an alternative, getting a pinned GCHandle to the array. That would have reduced this down to a single Marshal.Copy call. I also don't know if what you are currently doing would cause a leak, but my intent was to point out that there could be some alignment issue going on, which you appear to be leaning towards now as well. Commented Jan 26, 2017 at 16:06

2 Answers

Well I found a solution that worked, though I still don't know why.

Here's what I changed. CharBox is now:

[StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode)]
public struct CharBox
{
    [FieldOffset(0)]
    internal int Character;

    [FieldOffset(4)]
    internal float Left;

    [FieldOffset(8)]
    internal float Top;

    [FieldOffset(12)]
    internal float Right;

    [FieldOffset(16)]
    internal float Bottom;

    // Assists with error reporting
    public override string ToString()
    {
        return $"CharBox (Character = {this.Character}, Left = {this.Left}, Top = {this.Top}, Right = {this.Right}, Bottom = {this.Bottom})";
    }
}

And the actual method is now:

    internal static byte[] SerializeArray<T>(T[] array) where T : struct
    {
        if ( array.IsNullOrEmpty() )
            return null;            

        int position = 0;
        int structSize = Marshal.SizeOf(typeof(T));

        if (structSize < 1)
        {
            throw new Exception($"SerializeArray: invalid structSize ({structSize})");
        }

        byte[] rawData = new byte[structSize * array.Length];
        IntPtr buffer = IntPtr.Zero;

        try
        {
            buffer = Marshal.AllocHGlobal(structSize);
        }
        catch (Exception ex)
        {
            throw new Exception($"SerializeArray: Marshal.AllocHGlobal(structSize={structSize}) failed. Message: {ex.Message}");
        }

        try
        {
            int i = 0;
            int total = array.Length;
            foreach (T item in array)
            {
                try
                {
                    Marshal.StructureToPtr(item, buffer, false);
                }
                catch (Exception ex)
                {
                    throw new Exception($"SerializeArray: Marshal.StructureToPtr failed. item={item.ToString()}, index={i}/{total}. Message: {ex.Message}");
                }

                try
                {
                    Marshal.Copy(buffer, rawData, position, structSize);
                }
                catch (Exception ex)
                {
                    throw new Exception($"SerializeArray: Marshal.Copy failed. item={item.ToString()}, index={i}/{total}. Message: {ex.Message}");
                }

                i++;
                position += structSize;
            }
        }
        catch
        {
            throw;
        }
        finally
        {
            try
            {
                Marshal.FreeHGlobal(buffer);
            }
            catch (Exception ex)
            {
                throw new Exception($"SerializeArray: Marshal.FreeHGlobal failed (buffer={buffer}). Message: {ex.Message}");
            }
        }

        return rawData;
    }

I was expecting just to get more detail on the error, but instead the user reported that it worked without any warning.

All the changes to SerializeArray were just for more detailed reporting, so the substantive changes, one or more of which were the winners, were:

  • Changing the char to an int (I would have used short but I wanted to stay compatible with existing data since this struct is used elsewhere, and previously it was using 4-byte padding).

  • Setting the struct layout to LayoutKind.Explicit and setting the explicit FieldOffsets; and

  • Specifying CharSet.Unicode in StructLayout, which admittedly probably did nothing since there are no more chars in the struct.

My guess is that setting the layout to Explicit and the CharSet to Unicode would have been enough to allow Character to be a char again, but I'd rather not waste my customer's time with more trial and error since it is working. Hopefully someone else can opine as to what happened, but I'll probably post this to MSDN too in the hopes that one of the CLR gods might have some insight.

Thanks all, especially TnTinMn, because highlighting the issue with chars and blitting definitely motivated me to try these changes.


2 Comments

For future reference, now that your structure only contains blittable types, you can use gchnd = GCHandle.Alloc(array, GCHandleType.Pinned) to obtain a GCHandle instance. Then you can do a single Marshal.Copy(gchnd.AddrOfPinnedObject(), rawData, 0, rawData.Length) to load the byte array. No need to copy each item. You can use the same technique to recreate the array from a byte array.
Oh, that's a great idea, thank you. If you want, feel free to post that as an answer, because that is probably the best approach of all.
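
The pinned-handle approach suggested in the comment above might look roughly like the following. This is a sketch, not code posted in the thread. The class and method names are hypothetical, and it assumes T contains only blittable fields, so that the managed layout matches what Marshal.SizeOf reports.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinnedArraySerializer
{
    // Copies a blittable struct array to bytes with a single Marshal.Copy.
    public static byte[] Serialize<T>(T[] array) where T : struct
    {
        if (array == null || array.Length == 0)
            return null;

        byte[] rawData = new byte[Marshal.SizeOf(typeof(T)) * array.Length];
        // GCHandle.Alloc throws ArgumentException if T has non-blittable members.
        GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
        try
        {
            Marshal.Copy(handle.AddrOfPinnedObject(), rawData, 0, rawData.Length);
        }
        finally
        {
            handle.Free();   // always unpin, even if the copy throws
        }
        return rawData;
    }

    // Rebuilds the array from bytes produced by Serialize.
    public static T[] Deserialize<T>(byte[] rawData) where T : struct
    {
        if (rawData == null || rawData.Length == 0)
            return null;

        int structSize = Marshal.SizeOf(typeof(T));
        T[] array = new T[rawData.Length / structSize];
        GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
        try
        {
            // Copy only whole structs so a trailing partial record cannot overrun the array.
            Marshal.Copy(rawData, 0, handle.AddrOfPinnedObject(), array.Length * structSize);
        }
        finally
        {
            handle.Free();
        }
        return array;
    }
}
```

Because pinning fails for non-blittable types, this version would have rejected the original CharBox (with its char field) up front instead of silently converting characters.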

I do not see any obvious error in your existing methodology, so I have nothing to offer on that front. However since you stated:

I would welcome alternative or improved methods of doing the serialization

I would like to throw this out for your consideration. Use a MemoryMappedViewAccessor to perform the transformation from array of structures to byte array. This of course requires creating a MemoryMappedFile.

internal static byte[] SerializeArray<T>(T[] array) where T : struct
    {
    int unmanagedSize = Marshal.SizeOf(typeof(T));

    int numBytes = array.Length * unmanagedSize;
    byte[] bytes = new byte[numBytes];

    using (MemoryMappedFile mmf = MemoryMappedFile.CreateNew("fred", bytes.Length))
        {
        using (MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor(0, bytes.Length, MemoryMappedFileAccess.ReadWrite))
            {

            accessor.WriteArray<T>(0, array, 0, array.Length);
            accessor.ReadArray<byte>(0, bytes, 0, bytes.Length);

            }
        }

    return bytes;
    }

internal static T[] DeSerializeArray<T>(byte[] bytes) where T : struct
    {
    int unmanagedSize = Marshal.SizeOf(typeof(T));

    int numItems = bytes.Length / unmanagedSize;
    T[] newArray = new T[numItems];

    using (MemoryMappedFile mmf = MemoryMappedFile.CreateNew("fred", bytes.Length))
        {
        using (MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor(0, bytes.Length, MemoryMappedFileAccess.ReadWrite))
            {

            accessor.WriteArray<byte>(0, bytes, 0, bytes.Length);
            accessor.ReadArray<T>(0, newArray, 0, newArray.Length);

            }
        }
    return newArray;
    }

Depending on your usage, you may need to provide a mechanism for a unique name (where I used "fred") for the MemoryMappedFile.
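
One possible way to get a unique name is to generate it per call, for example from a GUID. The sketch below is an assumption about how that could be done and is not part of the answer's tested code. The class name, method name and name prefix are hypothetical, and it assumes Windows, where named maps are supported, and a non-empty input.

```csharp
using System;
using System.IO.MemoryMappedFiles;

public static class MappedCopy
{
    // Round-trips a non-empty byte array through a uniquely named memory-mapped file.
    public static byte[] RoundTrip(byte[] source)
    {
        // A GUID-based name avoids two concurrent callers colliding on a fixed name such as "fred".
        string mapName = "SerializeArray-" + Guid.NewGuid().ToString("N");
        byte[] result = new byte[source.Length];

        using (MemoryMappedFile mmf = MemoryMappedFile.CreateNew(mapName, source.Length))
        using (MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor(0, source.Length, MemoryMappedFileAccess.ReadWrite))
        {
            accessor.WriteArray<byte>(0, source, 0, source.Length);
            accessor.ReadArray<byte>(0, result, 0, result.Length);
        }
        return result;
    }
}
```

As far as I know, passing null as the map name to MemoryMappedFile.CreateNew also works when the map is not meant to be shared with other processes, which sidesteps naming altogether.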

5 Comments

Well that gets an upvote for originality! I'll try it out, thanks. Still hoping to nail down what I'm doing wrong here though (or if there's a bug in the framework).
That's a shared memory segment! Overkill... and I'm curious (and doubtful) about performance.
@john, I ran a few tests on Win 10 /64. Compiled as a 32 bit assembly, the MMF technique took about twice as long to run as the OP's version, at 3.32 ms vs 1.75 ms to process 10000 items. When compiled as a 64 bit assembly, the MMF technique very slightly outperformed the original version, 1.48 ms vs 1.56 ms. Now if the Char field is changed to a property that maps to a blittable integer, the OP's method takes about half the time it previously did and the MMF times stay the same.
@TnTinMn you just have too much time on your hands ;) Next problem: not multi-thread friendly ;). Interesting approach though.
Would have been interesting to try your commented suggestion of gchnd.AddrOfPinnedObject() too, which seems to work way faster than the original loop solution in the question, or the memory map.
