I am trying to use the WriteConsoleOutput function from kernel32.dll, but I cannot get Unicode characters to display correctly; they always show up as the wrong characters.

I have attempted to use:

Console.OutputEncoding = System.Text.Encoding.UTF8;

Changing this to Encoding.Unicode does not work either.

[DllImport("kernel32.dll", SetLastError = true)]
private static extern bool SetConsoleOutputCP(uint wCodePageID);

public void SetCP(){
   SetConsoleOutputCP(65001);
}

I have tried both of the above together, each one on its own, and neither, with just about every combination of values.

I have also switched between all the available fonts (including the TrueType ones), but none of them display the characters correctly.

Here is the code I am using to call WriteConsoleOutput:

[DllImport("kernel32.dll", SetLastError = true, EntryPoint = "WriteConsoleOutputW", CharSet = CharSet.Unicode)]
static extern bool WriteConsoleOutputW(SafeFileHandle hConsoleOutput, CharInfo[] lpBuffer, Coord dwBufferSize, Coord dwBufferCoord, ref SmallRect lpWriteRegion);

[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
static extern SafeFileHandle CreateFile(string fileName, [MarshalAs(UnmanagedType.U4)] uint fileAccess, [MarshalAs(UnmanagedType.U4)] uint fileShare, IntPtr securityAttributes, [MarshalAs(UnmanagedType.U4)] FileMode creationDisposition, [MarshalAs(UnmanagedType.U4)] int flags, IntPtr template);

private static readonly SafeFileHandle h = CreateFile("CONOUT$", 0x40000000, 2, IntPtr.Zero, FileMode.Open, 0, IntPtr.Zero);

public static void RegionWrite(string s, int x, int y, int width, int height)
{           
    if (!h.IsInvalid)
    {
        int length = width * height;

        // Pad any extra space we have
        string fill = s + new string(' ', length - s.Length);

        // Grab the background and foreground as integers
        int bg = (int) Console.BackgroundColor;
        int fg = (int) Console.ForegroundColor;

        // Make background and foreground into attribute value
        short attr = (short)(fg | (bg << 4));

        CharInfo[] buf = fill.Select(c => 
        {
            CharInfo info = new CharInfo();

            // Give it our character to write
            info.Char.UnicodeChar = c;

            // Use our attributes
            info.Attributes = attr;

            // Return info for this character
            return info;

        }).ToArray();

        // Make everything short so we don't have to cast all the time
        short sx = (short) x;
        short sy = (short) y;
        short swidth = (short) width;
        short sheight = (short) height;

        // Make a buffer size out of our dimensions
        Coord bufferSize = new Coord(swidth, sheight);

        // Not really sure what this is, but it's probably important
        Coord pos = new Coord(0, 0);

        // Where do we place this?
        SmallRect rect = new SmallRect() { Left = sx, Top = sy, Right = (short) (sx + swidth), Bottom = (short) (sy + sheight) };

        bool b = WriteConsoleOutputW(h, buf, bufferSize, pos, ref rect);
    }
    else
    {
        throw new Exception("Console handle is invalid.");
    }

}

Using this with standard ASCII characters works perfectly:

RegionWrite("Hello world", 4, 4, 10, 10);

However, when I use anything above the standard ASCII range, it fails to display correctly:

RegionWrite("┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬┬", 4, 4, 10, 10);

This outputs two lines of ',' characters. That makes some sense: the "┬" character has a value of 9516, and 9516 % 128 is 44, which is the ASCII code for ','.
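The arithmetic can be checked with a minimal sketch. TruncationDemo and Low7Bits are hypothetical names used only for this illustration; they are not part of the program above, and the sketch only reproduces the modulo calculation, not whatever the marshaler does internally.

```csharp
using System;

public static class TruncationDemo
{
    // Keeps only the low 7 bits of a UTF-16 code unit,
    // mirroring the "% 128" arithmetic described in the text.
    public static char Low7Bits(char c) => (char)(c % 128);

    public static void Main()
    {
        char box = '\u252C'; // '┬', decimal 9516

        Console.WriteLine((int)box);           // 9516
        Console.WriteLine((int)Low7Bits(box)); // 44
        Console.WriteLine(Low7Bits(box));      // ,
    }
}
```

The low byte of 0x252C is also 0x2C (44), which is consistent with the character being narrowed to a single byte somewhere between the managed array and the console.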

I know it is possible to output these characters, because Console.Write("┬┬┬┬") works correctly. I am switching from Console.Write to WriteConsoleOutput because there is a significant performance increase.

Here is the code I'm using to set code pages:

public void Setup()
{
    Console.BufferHeight = Console.WindowHeight;
    Console.BufferWidth = Console.WindowWidth;

    Console.OutputEncoding = System.Text.Encoding.UTF8;

    SetConsoleOutputCP(65001);

    DefaultColor();
    Console.Clear();

    Console.ReadLine();

    RegionWrite("┬┬┬┬", 4, 4, 10, 10);

    Console.WriteLine("┬┬┬┬");

    Console.ReadLine();
}

Here are my structures:

[StructLayout(LayoutKind.Sequential)]
public struct Coord
{
    public short X;
    public short Y;

    public Coord(short X, short Y)
    {
        this.X = X;
        this.Y = Y;
    }
}

[StructLayout(LayoutKind.Explicit)]
public struct CharUnion
{
    [FieldOffset(0)] public char UnicodeChar;
    [FieldOffset(0)] public byte AsciiChar;
}

[StructLayout(LayoutKind.Explicit)]
public struct CharInfo
{
    [FieldOffset(0)] public CharUnion Char;
    [FieldOffset(2)] public short Attributes;
}

[StructLayout(LayoutKind.Sequential)]
public struct SmallRect
{
    public short Left;
    public short Top;
    public short Right;
    public short Bottom;
}

I assume I have gotten one of the arguments to WriteConsoleOutput wrong, but after hours of searching for answers I'm really not sure where I've gone wrong. Is there some internal set-encoding function I need to use?

nvm fixed it

  • You should be using Console.OutputEncoding = System.Text.Encoding.Unicode; and forget the code page. Commented Feb 13, 2020 at 8:52
  • @MatthewWatson I have tried this, it makes no difference. Commented Feb 13, 2020 at 8:55
  • Well, you also need to specify CharSet.Unicode in the declaration for CharInfo, otherwise it will default to ANSI: [StructLayout(LayoutKind.Explicit, CharSet=CharSet.Unicode)] (and maybe in some of your other structs as well, not sure about those). Commented Feb 13, 2020 at 9:00
  • Ugh facepalm, thank you so much. Commented Feb 13, 2020 at 9:05
  • Don't post answers inside questions. If you have an answer to your question, post it as an answer. Commented Feb 13, 2020 at 9:26

1 Answer

Simple solution: change

[StructLayout(LayoutKind.Explicit)]
public struct CharUnion
{
    [FieldOffset(0)] public char UnicodeChar;
    [FieldOffset(0)] public byte AsciiChar;
}

to

[StructLayout(LayoutKind.Explicit, CharSet=CharSet.Unicode)]
public struct CharUnion
{
    [FieldOffset(0)] public char UnicodeChar;
    [FieldOffset(0)] public byte AsciiChar;
}

This is because struct marshaling defaults to ANSI, so your Unicode characters are automatically converted to single-byte ANSI characters on the way to the native call, hence ┬ turning into ','.
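One rough way to see that default in action is to declare the same union twice and ask the marshaler for each native size. This is only a sketch: CharUnionAnsi, CharUnionUnicode and MarshalSizeDemo are hypothetical names used for the comparison, and it assumes the runtime's default marshaling behavior.

```csharp
using System;
using System.Runtime.InteropServices;

// Same union as in the question, with the default CharSet (ANSI).
[StructLayout(LayoutKind.Explicit)]
public struct CharUnionAnsi
{
    [FieldOffset(0)] public char UnicodeChar;
    [FieldOffset(0)] public byte AsciiChar;
}

// Same union, but the char field is marshaled as a 2-byte UTF-16 code unit.
[StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode)]
public struct CharUnionUnicode
{
    [FieldOffset(0)] public char UnicodeChar;
    [FieldOffset(0)] public byte AsciiChar;
}

public static class MarshalSizeDemo
{
    public static void Main()
    {
        // Native (marshaled) sizes, not managed sizes.
        Console.WriteLine(Marshal.SizeOf<CharUnionAnsi>());
        Console.WriteLine(Marshal.SizeOf<CharUnionUnicode>());
    }
}
```

The ANSI variant should report a smaller native size than the Unicode one, which needs two bytes to hold a full UTF-16 code unit. A character such as ┬ cannot fit in the narrower form, so it cannot reach WriteConsoleOutputW intact.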

1 Comment

I had to add the CharSet attribute to other declarations as well (such as WriteConsoleOutput) to make it work.
