Question: Is there a more efficient way to parse bits out of a byte array to integer values? If so, what would that be?
Data is currently read off a stream as packets, each a series of bytes held in a byte array. The data is packed into these bytes so that a value may span multiple consecutive bytes. The bit widths of the values vary with the packet "type", which is held in the first 5 bits (starting with the MSB) of the first byte. E.g.,
byte[] bytes = {0x73, 0xa4};
// yields: 0111001110100100
// vals:   [ 1 ][  2  ]3[4]
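To make the layout concrete, here is a minimal sketch of pulling the type field straight out of the first byte. It assumes only what is stated above (the type is the top 5 bits of the first byte); the `PacketType` class and `FromFirstByte` method are hypothetical names used for illustration.

```csharp
using System;

public static class PacketType
{
    // The type occupies the top 5 bits of the first byte, so shifting
    // right by 3 discards the 3 low bits that belong to the next value.
    public static int FromFirstByte(byte first)
    {
        return first >> 3;
    }
}
```

For the sample bytes, `PacketType.FromFirstByte(0x73)` gives 14, i.e. the bits 01110.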
Currently I use an extension method:
public static string ConvertToBinaryString(this byte[] bytes)
{
    return string.Join("", bytes.Select(x => Convert.ToString(x, 2).PadLeft(8, '0')));
}
to convert the byte array to a binary string. Then I use one of these two extension methods to convert the binary string into an int or array of int:
static readonly Regex IsBinary = new Regex("^[01]{1,32}$", RegexOptions.Compiled);
public static bool TryParseBits(this string toParse, int start, int length, out int intVal)
{
    intVal = -1;
    if (!IsBinary.IsMatch(toParse)) return false;
    // The range [start, start + length) must fit inside the string.
    if ((start + length) > toParse.Length) return false;
    intVal = Convert.ToInt32(toParse.Substring(start, length), 2);
    return true;
}
public static bool TryParseBits(this string toParse, Queue<int> lengths, out List<int> vals)
{
    vals = new List<int>();
    if (!IsBinary.IsMatch(toParse)) return false;
    var idx = 0;
    while (lengths.Count > 0)
    {
        var l = lengths.Dequeue();
        if ((idx + l) > toParse.Length) return false;
        vals.Add(Convert.ToInt32(toParse.Substring(idx, l), 2));
        idx += l;
    }
    return true;
}
Example Use:
int type;
var success = "0111001110100100".TryParseBits(0, 5, out type);
results in type = 14
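For comparison, the same lookup can be done directly on the byte array with shifts and masks, without building an intermediate string. This is only a sketch under my own assumptions: `BitReader` and `TryReadBits` are hypothetical names, bit 0 is taken to be the MSB of the first byte (matching the string indexing above), and the width is capped at 31 bits so the result always fits in a non-negative int.

```csharp
using System;

public static class BitReader
{
    // Reads `length` bits (1..31) starting at bit offset `start`,
    // where bit 0 is the MSB of bytes[0]. Returns false when the
    // arguments are invalid or the range runs past the end of the array.
    public static bool TryReadBits(byte[] bytes, int start, int length, out int value)
    {
        value = -1;
        if (bytes == null || start < 0 || length < 1 || length > 31) return false;
        if ((long)start + length > (long)bytes.Length * 8) return false;

        int result = 0;
        for (int n = 0; n < length; n++)
        {
            int i = start + n;
            // Byte index is i / 8; the bit position is counted from the MSB.
            int bit = (bytes[i >> 3] >> (7 - (i & 7))) & 1;
            result = (result << 1) | bit;
        }
        value = result;
        return true;
    }
}
```

With `bytes = {0x73, 0xa4}`, `BitReader.TryReadBits(bytes, 0, 5, out type)` returns true and sets `type` to 14, the same result as the string-based call.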
Background: The stream being read can deliver up to 12 packets/second. Some preprocessing occurs before the bytes are parsed, and significant post-processing occurs on the parsed values. The data packets are split across four threads using Parallel.ForEach. The values are never wider than 28 bits, so I don't worry about a sign bit when converting to int.
The first five bits are "01110". That's equal to 0x0E (14), not 0x07.