I have a byte array that I got back from a FileStream.Read, and I would like to turn it into a string. I'm not 100% sure of the encoding (it's just a file I saved to disk), so how do I do the conversion? Is there a .NET class that reads the byte order mark and can figure out the encoding for me?
6 Answers
See the related question "How to guess the encoding of a file with no BOM in .NET".
Since strings are Unicode, you must specify an encoding on conversion. Text streams (even ReadAllText()) have an active encoding inside, usually some sensible default; File.ReadAllText, for example, assumes UTF-8 unless it finds a byte order mark.
Try something like this:
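// Re-encode the first len bytes from ISO-8859-1 to UTF-8, then decode the result.
buffer = Encoding.Convert(Encoding.GetEncoding("iso-8859-1"), Encoding.UTF8, buffer, 0, len);
// The converted array can be longer than len, so decode all of it.
newString = Encoding.UTF8.GetString(buffer, 0, buffer.Length);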
If File.ReadAllText will read the file correctly, then you have a couple of options.
Instead of calling BeginRead, you could just call File.ReadAllText asynchronously:
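// Note: delegate BeginInvoke/EndInvoke works on .NET Framework only;
// on .NET Core and later it throws PlatformNotSupportedException.
delegate string AsyncMethodCaller(string fname);

static void Main(string[] args)
{
    string InputFilename = "testo.txt";
    // Bind the delegate to File.ReadAllText (needs using System.IO).
    AsyncMethodCaller caller = File.ReadAllText;
    // Start the read on a thread-pool thread.
    IAsyncResult rslt = caller.BeginInvoke(InputFilename, null, null);
    // do other work ...
    // Block until the read has finished and collect the result.
    string fileContents = caller.EndInvoke(rslt);
}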
Or you can create a MemoryStream from the byte array, and then use a StreamReader on that.
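A minimal sketch of the MemoryStream approach, assuming the buffer came from FileStream.Read and only the first count bytes are valid. The class and method names (ReaderDecoding, DecodeWithReader) are made up for illustration:

```csharp
using System.IO;
using System.Text;

static class ReaderDecoding
{
    // Hypothetical helper: decodes the first `count` bytes of a buffer.
    // StreamReader(Stream) assumes UTF-8 and switches encoding if it
    // finds a byte order mark at the start of the data.
    public static string DecodeWithReader(byte[] bytes, int count)
    {
        using (var stream = new MemoryStream(bytes, 0, count))
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Limiting the MemoryStream to count bytes keeps any unused tail of the read buffer out of the resulting string.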
How much do you know about the file? Could it really be any encoding? If so, you'd need to use heuristics to guess the encoding. If it's going to be UTF-8, UTF-16 or UTF-32 then
new StreamReader(new MemoryStream(bytes), true)
will detect the encoding for you from the byte order mark; when there is no BOM, it falls back to UTF-8. Text is pretty nasty if you really don't know the encoding, though. There are plenty of cases where you really would just be guessing.
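To see which encoding the reader settled on, something along these lines should work. It is only a sketch; BomDecoding and DecodeDetectingBom are hypothetical names, and UTF-8 as the fallback is an assumption you may want to change:

```csharp
using System.IO;
using System.Text;

static class BomDecoding
{
    // Hypothetical helper: decodes bytes using BOM detection and reports
    // the encoding the reader ended up with.
    public static string DecodeDetectingBom(byte[] bytes, out Encoding detected)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8, true))
        {
            string text = reader.ReadToEnd();
            // CurrentEncoding is only meaningful after the first read,
            // because that is when the BOM is inspected.
            detected = reader.CurrentEncoding;
            return text;
        }
    }
}
```

If the bytes carry no BOM, detected simply reports the fallback passed to the constructor, so it says nothing about what the file really contains.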
There is no simple way to get the encoding, but as mentioned above use
string str = System.Text.Encoding.Default.GetString(mybytearray);
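if you have no clue what the encoding is. Note that Encoding.Default is the system's ANSI code page on .NET Framework, but always UTF-8 on .NET Core and later. If you are in Europe, ISO-8859-1 is probably the encoding you have: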
string str = System.Text.Encoding.GetEncoding("ISO-8859-1").GetString(mybytearray);
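One common guess, sketched here under the assumption that the file is either UTF-8 or ISO-8859-1, is to try a strict UTF-8 decode first and fall back only when the bytes are not valid UTF-8. GuessDecoding and DecodeUtf8OrLatin1 are illustrative names, not framework APIs:

```csharp
using System.Text;

static class GuessDecoding
{
    // Hypothetical helper: strict UTF-8 first, ISO-8859-1 as the fallback.
    public static string DecodeUtf8OrLatin1(byte[] bytes)
    {
        // Second argument (throwOnInvalidBytes) makes invalid sequences
        // throw instead of being replaced with U+FFFD.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            return strictUtf8.GetString(bytes);
        }
        catch (DecoderFallbackException)
        {
            // ISO-8859-1 maps every byte to a character, so this never fails,
            // but the result is still only a guess.
            return Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
        }
    }
}
```

This is a heuristic, not detection: text in another single-byte code page will decode without error and still come out wrong.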
System.IO.File.ReadAllText does what you want.