I am developing a simple virus scanner and am looking for ways to speed up the following function:

Public Function FindAInB(ByRef byteArrayA() As Byte, ByRef byteArrayB() As Byte) As Integer
    ' Returns the index in byteArrayB where byteArrayA first occurs, or -1 if it does not occur.
    If byteArrayA.Length = 0 OrElse byteArrayA.Length > byteArrayB.Length Then
        Return -1
    End If

    ' Try every start position that leaves room for the whole of byteArrayA.
    For startmatch As Integer = 0 To byteArrayB.Length - byteArrayA.Length
        Dim offsetA As Integer = 0
        While offsetA < byteArrayA.Length AndAlso byteArrayA(offsetA) = byteArrayB(startmatch + offsetA)
            offsetA += 1
        End While
        ' Report a match only when every byte of byteArrayA was matched.
        If offsetA = byteArrayA.Length Then
            Return startmatch
        End If
    Next
    Return -1
End Function

It needs to be as fast as possible because it searches a selected file's bytes for about 7800 byte arrays. Is there an alternative to the code above, or a way to speed it up?

Thanks In Advance!

  • It all depends on what the data looks like. Is it sorted or in random order? Commented Feb 9, 2011 at 17:33
  • How large are the arrays: the array you are searching through and the array you are searching for, respectively? Commented Feb 9, 2011 at 17:36
  • Well, strings() is an array of 7281 byte arrays (which are virus signatures); the array I'm searching through is the result of readAllBytes() on a selected file. The searching is done in a Do While loop. Commented Feb 9, 2011 at 17:45

1 Answer

You should check out string search algorithms like Boyer-Moore.

Although you're not actually searching text, you are searching for strings of bytes within a larger string of bytes, so these types of algorithms could help considerably.
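As a rough illustration of how such an algorithm carries over to byte arrays, here is a minimal sketch of Boyer-Moore-Horspool, a simplified variant of Boyer-Moore that keeps only the bad-character shift table. The module name `ByteSearch`, the function name `FindHorspool`, and the parameter names are made up for this example and do not come from the question; treat it as a starting point under those assumptions rather than a tuned implementation.

```vbnet
Public Module ByteSearch
    ' Sketch of Boyer-Moore-Horspool (a simplification of Boyer-Moore).
    ' FindHorspool, needle and haystack are illustrative names, not from the question.
    ' Returns the index of the first occurrence of needle in haystack, or -1.
    Public Function FindHorspool(ByVal needle() As Byte, ByVal haystack() As Byte) As Integer
        Dim m As Integer = needle.Length
        Dim n As Integer = haystack.Length
        If m = 0 OrElse m > n Then Return -1

        ' shift(b) = how far the window may slide when its last byte is b.
        ' Bytes that do not occur in needle (ignoring its final byte) allow a full jump of m.
        Dim shift(255) As Integer
        For i As Integer = 0 To 255
            shift(i) = m
        Next
        For i As Integer = 0 To m - 2
            shift(needle(i)) = m - 1 - i
        Next

        Dim pos As Integer = 0
        While pos <= n - m
            ' Compare the current window against needle from right to left.
            Dim j As Integer = m - 1
            While j >= 0 AndAlso needle(j) = haystack(pos + j)
                j -= 1
            End While
            If j < 0 Then Return pos

            ' Slide the window according to the byte aligned with needle's last position.
            pos += shift(haystack(pos + m - 1))
        End While

        Return -1
    End Function
End Module
```

A call might look like `Dim index As Integer = ByteSearch.FindHorspool(signature, fileBytes)`, where `signature` and `fileBytes` are placeholder variable names. Because the shift table depends only on the pattern, with thousands of signatures it could be built once per signature and reused for every file scanned, instead of being rebuilt on each call as in this sketch.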

5 Comments

The problem is that I have to search for a sequence of bytes (stored in an array) within another array of bytes.
@Seif: Yes, I know. These kinds of algorithms can still help. Instead of searching for a small sequence of characters inside a big sequence of characters, you're searching for a small sequence of bytes inside a big sequence of bytes. It's exactly the same thing in principle.
Plus, I'm trying to avoid loops that waste time.
@Seif: Well, the Wikipedia page describes the algorithm and has an example implementation in C. I suppose you'd need to translate that into VB.
@luke: Yes, but how could that be implemented for byte array searching?
