Question: Is there a more efficient way to parse bits out of a byte array to integer values? If so, what would that be?

Data is currently read off a stream, with each packet consisting of a series of bytes (held in a byte array). The data is packed into these bytes such that a value may spread across multiple consecutive bytes. The number of bits making up each value varies depending on the "type" of packet, which is held in the first 5 bits (starting with the MSB) of the first byte. E.g.,

byte[] bytes = {0x73, 0xa4};
// yields: 0111001110100100
// vals:   [ 1 ][  2  ]3[4]

Currently I use an extension method:

public static string ConvertToBinaryString(this byte[] bytes)
{
    return string.Join("", bytes.Select(x => Convert.ToString(x, 2).PadLeft(8, '0')));
}

to convert the byte array to a binary string. Then I use one of these two extension methods to convert the binary string into an int or array of int:

static readonly Regex IsBinary = new Regex("^[01]{1,32}$", RegexOptions.Compiled);
public static bool TryParseBits(this string toParse, int start, int length, out int intVal)
{
    intVal = -1;
    if (!IsBinary.IsMatch(toParse)) return false;
    // Substring(start, length) is valid whenever start + length <= toParse.Length
    if ((start + length) > toParse.Length) return false;
    intVal = Convert.ToInt32(toParse.Substring(start, length), 2);
    return true;
}
public static bool TryParseBits(this string toParse, Queue<int> lengths, out List<int> vals)
{
    vals = new List<int>();
    if (!IsBinary.IsMatch(toParse)) return false;
    var idx = 0;
    while (lengths.Count > 0)
    {
        var l = lengths.Dequeue();
        if ((idx + l) > toParse.Length) return false;
        vals.Add(Convert.ToInt32(toParse.Substring(idx, l), 2));
        idx += l;
    }
    return true;
}

Example Use:

int type;
var success = "0111001110100100".TryParseBits(0, 5, out type);

results in type = 14

Background: The stream being read can deliver up to 12 packets/second. Some preprocessing occurs before the bytes have to be parsed, and significant post-processing occurs on the values. The data packets are split across four threads using Parallel.ForEach. The values are never greater than 28 bits, so I don't worry about a sign bit when converting to int.

3
  • Are your byte arrays always 2/4 bytes? Are they always in Big Endian format? (Byte 0 is the MSB in your example) Commented Nov 19, 2014 at 16:40
  • They are anywhere from 56 to 112 bytes in Big Endian format. Though I know how many bytes there are prior to having to parse those bytes. There are a few cases where the byte arrays are larger, but I can easily parse them to this range. Commented Nov 19, 2014 at 21:16
  • In your example, the first five bits are "01110". That's equal to 0x0E, not 0x07. Commented Nov 25, 2014 at 5:09

3 Answers

2

For completeness, this is what I came up with based on an answer that was apparently deleted.

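public static int ParseBits(this byte[] bytes, int start, int length)
{
    // BitArray stores bits LSB-first, so reverse a copy of the array
    // (leaving the caller's data untouched) to make it usable with BitArray
    var reversed = (byte[])bytes.Clone();
    Array.Reverse(reversed);
    var ba = new BitArray(reversed);
    var idx = 0;
    var shft = length - 1;
    // Note: after the reversal, start counts from the least significant bit of the last byte.
    // Iterate backwards through the bits of the field only and perform bitwise operations
    for (var i = start + length - 1; i >= start; i--)
    {
        idx |= (Convert.ToInt32(ba.Get(i)) << shft);
        shft--;
    }
    return idx;
}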
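
For comparison, here is a minimal sketch that skips BitArray entirely and indexes bits the way the question does, with offset 0 being the MSB of the first byte. The class name `BitParsing` and the method `ReadBits` are made up for illustration and are not from the deleted answer; the 31-bit limit is an assumption that comfortably covers the 28-bit maximum mentioned in the question.

```csharp
using System;

public static class BitParsing
{
    // Hypothetical helper: reads `length` bits starting at bit offset `start`,
    // where offset 0 is the most significant bit of bytes[0].
    public static int ReadBits(byte[] bytes, int start, int length)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        if (length < 1 || length > 31) throw new ArgumentOutOfRangeException(nameof(length));
        if (start < 0 || start + length > bytes.Length * 8)
            throw new ArgumentOutOfRangeException(nameof(start));

        var value = 0;
        for (var bit = start; bit < start + length; bit++)
        {
            // bit >> 3 selects the byte; 7 - (bit & 7) is the bit position inside it (MSB-first).
            var b = (bytes[bit >> 3] >> (7 - (bit & 7))) & 1;
            value = (value << 1) | b;
        }
        return value;
    }
}
```

With the two-byte example, `BitParsing.ReadBits(new byte[] { 0x73, 0xa4 }, 0, 5)` reads the 5-bit type field without allocating a string or mutating the input.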

1

Have you tried something like a bitmask?
Knowing that, e.g., the last 4 bits of the first byte are our first value, we can do:

byte[] bytes = { 0x73, 0xa4 };
int v1 = bytes[0] & 0x0F;
//second val (index 1 is the last element of this two-byte array):
int v2 = bytes[1] & 0xF0;

Or, before applying a mask, just store everything in a larger number:

 int total = 0;
 total = total | bytes[0];
 total = total << 8;
 total = total | bytes[1];
 //now the 2-byte array is stored in one number; in our case total will be: 0111001110100100
 //after storing, apply bit masks as described above:
 // to get the LAST 3 bits
 int var1 = total & 0x7; //mask with last 3 bits
 total = total >> 3; //shift out last 3 bits
 int var2 = total & 0x1; //last bit
 total = total >>1;
 //and so on
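
The accumulate-then-mask idea above can be generalized so that fields are read front to back and may cross byte boundaries. This is only a sketch under assumptions: the class name `BitReader` and its `Read` method are hypothetical, fields are taken MSB-first as in the question, and widths are limited to 31 bits so the result fits in a non-negative int.

```csharp
using System;

public sealed class BitReader
{
    private readonly byte[] _bytes;
    private int _bitPos; // offset from the MSB of _bytes[0]

    public BitReader(byte[] bytes)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        _bytes = bytes;
    }

    // Reads the next `length` bits as a non-negative int and advances the position.
    public int Read(int length)
    {
        if (length < 1 || length > 31) throw new ArgumentOutOfRangeException(nameof(length));
        if (_bitPos + length > _bytes.Length * 8)
            throw new InvalidOperationException("Not enough bits left in the packet.");

        int firstByte = _bitPos >> 3;
        int lastByte = (_bitPos + length - 1) >> 3;

        // Gather only the bytes the field touches (at most 5 for a 31-bit field).
        ulong acc = 0;
        for (int i = firstByte; i <= lastByte; i++)
            acc = (acc << 8) | _bytes[i];

        // Drop the bits to the right of the field, then mask off the bits to its left.
        int trailing = (lastByte + 1) * 8 - (_bitPos + length);
        int value = (int)((acc >> trailing) & ((1UL << length) - 1));

        _bitPos += length;
        return value;
    }
}
```

Calling `Read(5)`, `Read(7)`, `Read(1)` and `Read(3)` in sequence on a reader over `{ 0x73, 0xa4 }` walks the four fields from the question's example, working a byte at a time rather than a bit at a time.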

3 Comments

This will not work because the bits specifying the values are not necessarily 8 bits each.
Actually, my first pass was just parsing the involved bytes per value and using masks, ORs and SHIFTs. But it was tedious and error-prone given all the "type" possibilities... though it may be the most speed-efficient.
I added a parsing mechanism to extract the bits as you wanted. Let me know if it worked.
0

You just need to learn about bitwise operations.

byte[] bytes = {0x73, 0xa4};
// yields: 0111001110100100
// vals:   [ 1 ][  2  ]3[4]

int bits = bytes[1] | bytes [0] << 8;
int v1 = (bits & 0xf800) >> 11;
int v2 = (bits & 0x07f0) >> 4; // the 7-bit field occupies bits 4..10, so shift by 4
int v3 = (bits & 0x0008) >> 3;
int v4 = bits & 0x0007;
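
Since hand-written masks get tedious once there are many packet types, one option is to keep the field widths in a table keyed by type and let a loop do the shifting. The sketch below assumes this; `PacketLayouts`, `WidthsByType` and `Parse` are hypothetical names, and the only layout filled in is the 5/7/1/3 split from the two-byte example, because the real layouts are not given in the question.

```csharp
using System;
using System.Collections.Generic;

public static class PacketLayouts
{
    // Hypothetical table: packet type -> field widths in bits, MSB-first.
    // Only the example layout from the question is listed here.
    private static readonly Dictionary<int, int[]> WidthsByType = new Dictionary<int, int[]>
    {
        { 14, new[] { 5, 7, 1, 3 } },
    };

    public static int[] Parse(byte[] bytes)
    {
        if (bytes == null || bytes.Length == 0) throw new ArgumentException("Empty packet.", nameof(bytes));

        // The type lives in the top 5 bits of the first byte.
        int type = bytes[0] >> 3;
        int[] widths;
        if (!WidthsByType.TryGetValue(type, out widths))
            throw new NotSupportedException("Unknown packet type " + type);

        var values = new int[widths.Length];
        int pos = 0; // bit offset from the MSB of bytes[0]
        for (int f = 0; f < widths.Length; f++)
        {
            if (pos + widths[f] > bytes.Length * 8)
                throw new ArgumentException("Packet too short for its layout.", nameof(bytes));

            int v = 0;
            for (int end = pos + widths[f]; pos < end; pos++)
                v = (v << 1) | ((bytes[pos >> 3] >> (7 - (pos & 7))) & 1);
            values[f] = v;
        }
        return values;
    }
}
```

Adding a packet type then means adding one entry to the table rather than a new set of masks and shifts.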
