I have a long string of random letters and I need to remove a couple of the front letters a few at a time. By using the replace function, if I replace a piece of string that then repeats later on, it removes the piece of string entirely from the long string instead of just the beginning.

Is there a way to remove a piece of string without using the replace function? The code below might clear up some of the confusion.

    Dim protein As String
    protein = "GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVLTALGGILKKKEGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKHRPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG"

    Dim IndexPosition

    For Each index In protein
        If index = "K" Or index = "R" Then

            IndexPosition = InStr(protein, index)
            Dim NextPosition = IndexPosition + 1
            Dim NextLetter = Mid(protein, NextPosition, 0)

            If NextLetter <> "P" Then

                Dim PortionToCutOut = Mid(protein, 1, IndexPosition)
                protein = Replace(protein, PortionToCutOut, "")
                Console.WriteLine(PortionToCutOut)

            End If

        End If
    Next index
  • The String.Substring method will be your best bet: figure out what you want to keep or remove, then concatenate what you need. Commented Mar 23, 2021 at 14:18
  • @JayV you're a legend, thank you. How can I mark this as answered? Commented Mar 23, 2021 at 14:29

2 Answers

1

Regex might be a simpler way to solve this:

    protein = Regex.Replace(protein, "^(.*?)[KR][^P]", "$1")

It means: "from the start of the string, lazily capture zero or more characters up to the first occurrence of K or R followed by anything other than P, then replace the whole match with the captured string."

    GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETL
    ^^^^^^^^^^^^^^^^^
    captured string||
                   xx

Everything underlined with ^^^ is matched and then replaced by the captured string, so only the two characters marked xx are removed.

It makes a single replacement, because that's what I understood you to need when you said:

"By using the replace function, if I replace a piece of string that then repeats later on, it removes the piece of string entirely from the long string instead of just the beginning"

However, if you do want to replace all occurrences of "K or R followed by not P", it gets simpler:

    protein = Regex.Replace(protein, "[KR][^P]", "")

This means: replace every "K or R followed by anything other than P" with nothing.
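To see the difference between the two patterns side by side, here is a minimal sketch. The module name, the function names RemoveFirstSite and RemoveAllSites, and the short test string are made up for illustration; only the two Regex.Replace calls come from the answer above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexDemo
    ' Hypothetical helper: removes only the first "K or R followed by non-P" pair.
    ' The ^ anchor means the pattern can match at most once, at the start.
    Function RemoveFirstSite(protein As String) As String
        Return Regex.Replace(protein, "^(.*?)[KR][^P]", "$1")
    End Function

    ' Hypothetical helper: removes every "K or R followed by non-P" pair.
    Function RemoveAllSites(protein As String) As String
        Return Regex.Replace(protein, "[KR][^P]", "")
    End Function

    Sub Main()
        ' Short made-up input: KB and KE are removable pairs, RP is not.
        Dim sample = "AAKBCRPDKE"
        Console.WriteLine(RemoveFirstSite(sample)) ' AACRPDKE
        Console.WriteLine(RemoveAllSites(sample))  ' AACRPD
    End Sub
End Module
```

Note that both patterns need a character after the K or R, so a K or R at the very end of the string is left alone.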

0

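There are several issues with your code. The first issue is that you're reassigning protein inside the For Each loop that iterates over it. The loop keeps enumerating the characters of the original string, so the characters it visits no longer line up with the shortened value of protein.

The second issue, which is less severe in immediate impact but just as important in my opinion, is that you're using almost exclusively legacy Visual Basic functions (InStr, Mid, Replace) instead of the .NET String methods such as IndexOf and Substring.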

The third issue is that you're not using the short-circuit operator OrElse in your conditional statement. Or evaluates the right-hand side of the condition regardless of whether the left-hand side is true, whereas OrElse skips the right-hand side when the left-hand side is already true.
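The difference is easiest to see when the right-hand side has a visible side effect. The following is only an illustrative sketch: the module, the IsR helper, and the call counter are all hypothetical and not part of the original code.

```vbnet
Imports System

Module ShortCircuitDemo
    ' Hypothetical counter: records how often the right-hand side runs.
    Private isRCalls As Integer

    ' Hypothetical helper standing in for the right-hand side of the condition.
    Function IsR(letter As Char) As Boolean
        isRCalls += 1
        Return letter = "R"c
    End Function

    ' Returns how many times IsR ran with Or when the left-hand side is True.
    Function CallsWithOr() As Integer
        isRCalls = 0
        Dim letter = "K"c
        Dim matched = (letter = "K"c) Or IsR(letter)
        Return isRCalls
    End Function

    ' Returns how many times IsR ran with OrElse when the left-hand side is True.
    Function CallsWithOrElse() As Integer
        isRCalls = 0
        Dim letter = "K"c
        Dim matched = (letter = "K"c) OrElse IsR(letter)
        Return isRCalls
    End Function

    Sub Main()
        Console.WriteLine(CallsWithOr())     ' 1: Or still called IsR
        Console.WriteLine(CallsWithOrElse()) ' 0: OrElse skipped IsR
    End Sub
End Module
```

For two plain character comparisons like the ones in the question, both operators give the same result; the short-circuit form matters once the right-hand side is expensive or could fail.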

As for removing a piece of the String without using Replace, you can use Substring.

Consider this example:

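    Dim protein = "GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVLTALGGILKKKEGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKHRPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG"
    Dim counter = 0
    ' Stop early enough that protein(counter + 1) is always a valid index.
    Do While counter < protein.Length - 2
        ' counter is incremented before it is used, so scanning starts at index 1.
        counter += 1
        Dim currentLetter = protein(counter)
        Dim nextLetter = protein(counter + 1)
        ' Look for a K or R that is not followed by P.
        If (currentLetter = "K"c OrElse currentLetter = "R"c) AndAlso nextLetter <> "P"c Then
            ' Drop the character at counter by joining the text before and after it.
            protein = protein.Substring(0, counter) & protein.Substring(counter + 1)
        End If
    Loop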

Example: https://dotnetfiddle.net/vrhRdO
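If the goal of the loop in the question is to cut the string after each K or R that is not followed by P and print every piece (this is one reading of the question, not something it states outright), Substring can do that without modifying protein while scanning it. The sketch below is an assumption-laden illustration: the module name, the SplitAfterSites function, and the short test string are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic

Module CleaveDemo
    ' Hypothetical helper: splits after every K or R that is not followed by P.
    Function SplitAfterSites(protein As String) As List(Of String)
        Dim pieces As New List(Of String)
        Dim pieceStart = 0 ' index where the current piece begins

        For i = 0 To protein.Length - 1
            Dim isSite = (protein(i) = "K"c) OrElse (protein(i) = "R"c)
            ' Guard the look-ahead so the last character never reads past the end.
            Dim nextIsP = (i + 1 < protein.Length) AndAlso (protein(i + 1) = "P"c)

            If isSite AndAlso Not nextIsP Then
                ' Take everything from pieceStart up to and including index i.
                pieces.Add(protein.Substring(pieceStart, i - pieceStart + 1))
                pieceStart = i + 1
            End If
        Next

        ' Keep whatever is left after the last cut.
        If pieceStart < protein.Length Then
            pieces.Add(protein.Substring(pieceStart))
        End If

        Return pieces
    End Function

    Sub Main()
        ' Short made-up input: cuts after the first and last K, but not after R (followed by P).
        For Each piece In SplitAfterSites("AAKBCRPDKE")
            Console.WriteLine(piece) ' AAK, then BCRPDK, then E
        Next
    End Sub
End Module
```

Because the original string is never changed, the indexes stay valid for the whole scan, and a piece that happens to repeat later in the string is not affected.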
