I'm hoping for help with regex replacement code in VBScript where the replacement matches the length of the string found. There are quite a few variations on the web, but my skill level is not extensive, so making sense of them has been long and frustrating. Perhaps I'm using the wrong tool; I'm not sure anymore.

The challenge is adjusting a field in a fixed-length file that is not formatted correctly for the recipient. The field contains text of variable length, a dash, and more text. Each character of the variable-length text, and the dash, needs to be replaced with a space. This maintains the positioning in the fixed-width file.

The regex I put together is: strOriginal = "\b([A-Z0-9]{2,4}[-])"

This is working well for the data file matches. Can an expert guide me in the correct replacement?

Thank you, Drew

  • Show us some possible input values, matches and outputs. Commented Jun 5, 2014 at 0:40
  • It's hard to say without input/output..but try escaping - like \-. In a character class - means "range", however most languages are smart enough to know that [-] is not a range. You can also try just taking it out of a character class: \b([A-Z0-9]{2,4}-). Commented Jun 5, 2014 at 0:46
  • Thank you for your replies. Below is an input/match/output example. Input:" W 121ST ST 3A8-2B " Matches: "3A8-" Output: " W 121ST ST 2B " It's not clear in the edit window, but the position of the 2B needs to remain static. Commented Jun 5, 2014 at 8:47

1 Answer

You need to use a replace function with your regular expression:

Option Explicit

' Called once per match; returns as many spaces as the match is long.
Function replaceFunction(matchString, position, fullString)
    replaceFunction = Space(Len(matchString))
End Function

Dim originalString
originalString = "W 121ST ST 3A8-2B"

Dim changedString

With New RegExp
    .Global = True
    .Pattern = "\b[A-Z0-9]{2,4}-"
    changedString = .Replace(originalString, GetRef("replaceFunction"))
End With

' Brackets make the preserved width visible.
WScript.Echo "[" & originalString & "]"
WScript.Echo "[" & changedString & "]"

The function arguments are the string section that matches the pattern in the regular expression, the position in the input string where the match has been found and the full string being processed.

In your case, the function is called for each match and returns a string of spaces the same length as the matched string.
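As a rough check of the behaviour described above, the sketch below echoes what the callback receives and then compares the lengths and the position of "2B" before and after the replacement. The names blankOut, source, result and rx are made up for illustration; the pattern and sample string are the ones from the answer.

```vbscript
Option Explicit

' Hypothetical callback: logs its arguments, then blanks out the match.
Function blankOut(matchString, position, fullString)
    WScript.Echo "matched [" & matchString & "] at offset " & position
    blankOut = Space(Len(matchString))
End Function

Dim source, result, rx
source = "W 121ST ST 3A8-2B"

Set rx = New RegExp
rx.Global = True
rx.Pattern = "\b[A-Z0-9]{2,4}-"
result = rx.Replace(source, GetRef("blankOut"))

' Both comparisons should hold if the fixed-width layout is preserved.
WScript.Echo Len(source) = Len(result)
WScript.Echo InStr(source, "2B") = InStr(result, "2B")
```

If either comparison comes out false for a real record, the replacement has shifted a column and the pattern probably needs tightening.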

EDITED to adapt to comments

Option Explicit

' Receives the whole match, both capture groups, the match position and the full input.
Function replaceFunction(matchString, prefix, toSpaces, position, fullString)
    ' Keep the prefix as is and blank out only the second group.
    replaceFunction = prefix & Space(Len(toSpaces))
End Function

Dim originalString
originalString = "W 121ST ST 3A8-2B" & vbCrLf & "W 12-ST ST 3A8-2B"

Dim changedString

With New RegExp
    .Global = True
    .Multiline = True    ' let ^ match at the start of every line
    .Pattern = "^(.{10,}\b)([A-Z0-9]{2,4}-)"
    changedString = .Replace(originalString, GetRef("replaceFunction"))
End With

WScript.Echo originalString
WScript.Echo changedString

Now it handles multiline input. For each line, it matches the first 10 (or more) characters as a prefix, followed by the string to replace with spaces. The regular expression defines two capture groups, which are passed as arguments to the replace function. The function returns the first capture group (the prefix) followed by enough spaces to replace the unwanted string.
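An alternative sketch, in case you prefer to keep the simpler pattern: split the input into rows first and let the callback use its position argument, which is then relative to the current row. The constant FIRST_COLUMN, the function name blankFromColumn and the threshold of 11 are assumptions made for this example, not values taken from the real file layout, and the offset is assumed to be zero-based.

```vbscript
Option Explicit

' Assumed threshold: matches starting before this offset are left untouched.
Const FIRST_COLUMN = 11

' Hypothetical callback: blanks a match only when it starts at or after FIRST_COLUMN.
Function blankFromColumn(matchString, position, fullString)
    If position < FIRST_COLUMN Then
        blankFromColumn = matchString
    Else
        blankFromColumn = Space(Len(matchString))
    End If
End Function

Dim rx, rows, i
Set rx = New RegExp
rx.Global = True
rx.Pattern = "\b[A-Z0-9]{2,4}-"

' Process each row separately so positions restart at every line.
rows = Split("W 121ST ST 3A8-2B" & vbCrLf & "W 12-ST ST 3A8-2B", vbCrLf)
For i = 0 To UBound(rows)
    rows(i) = rx.Replace(rows(i), GetRef("blankFromColumn"))
Next

WScript.Echo Join(rows, vbCrLf)
```

With the sample rows, the "12-" in the street name starts well before the threshold and is kept, while "3A8-" is blanked in both rows. This trades the longer pattern for an explicit column check, which may be easier to adjust if the field moves.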

8 Comments

@user3709192, answer updated/corrected. There was a typo in the function name.
Thank you for your expertise; this solution is working very well for most scenarios. I am working on logic to prevent the replacement from matching column positions before 12. For instance, if the input is "W 12-ST ST 3A8-2B", the regex replaces the street name in addition to the unit code segment. Can you suggest an adjustment to the expression to accomplish this?
@user3709192, as you can see in the code, one of the parameters of the replace function is the position inside the string where the match happens. Check it, and if the position is lower than 12, return the matchString; otherwise return the spaces.
I agree, for one row it makes sense. For multiple rows, every later match will always be at a position greater than 12, so the same input on row 2 would also be replaced. This does not solve the problem.
@user3709192, until now we were talking about a string, not multiple rows. So, not knowing the content of the rows: will there be more than one match in each row, only one, or none?