
I've started a new job that turns out to have a giant pile of completely unorganized, non-standardized file names spread across many directories (too many to fix manually). My initial plan was a simple VBA script that compares the first 13 characters of each name against the ideal date format and, if they don't match, places a string in that format (using the document's creation date) at the front. But then I noticed that several date patterns already exist, so my original plan would just create another problem in the future (by leaving incorrect date codes behind my ideal string). Therefore, after some research, I realised regex patterns should be the way to go.

My ideal starting format is this: "yyyy.mm.dd - " (i.e. "2014.11.20 - "). I tried creating my first expression to match this but have had no luck so far:

^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+

Can someone please tell me where I am going wrong? My search through online tutorials has left me more confused than when I started.

The plan from there is to match the other common date formats (below) found in the directories and replace them with the "ideal" one. Any help with regex patterns that would identify them would be greatly appreciated.

"yymmdd " "yyyy mm dd - " "yyyymmdd " "yyyymmdd - "

My plan would be to use a simple VBA If block to find which pattern the name matches and then do the necessary VBA string manipulation to create the correct, standard format.

For example, if the current name of the file is "141003 xxxxxx", it would be replaced with "2014.10.03 - xxxxxx", etc.

Thanks very much for your help in advance.

Comment: Use ([1-2][0-9])([0-9][0-9])\.(0[1-9]|1[0-2])\.(0[1-9]|[1-2][0-9]|3[0-1]) if you want to exclude false positives such as 2017.13.32. (Commented Apr 18, 2017 at 12:11)

3 Answers

Your expression has four digit groups delimited by three dots, but a date has only three digit groups with two dots. So the regex for the first date pattern is:

^[0-9]{4}\.[0-9]{2}\.[0-9]{2}

Demo: https://regex101.com/r/vUigcj/1

Note the {4} and {2} quantifiers, which require exactly four and two digits respectively, as opposed to the more relaxed "one or more digits" condition provided by the + quantifier.
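
To see the difference between the two quantifiers in practice, here is a minimal sketch. It assumes late binding through CreateObject("VBScript.RegExp"), so no library reference is needed; the helper name MatchesPattern and the sample file names are made up for illustration.

```vba
' Hypothetical helper: True when fileName matches the given regex pattern.
Function MatchesPattern(ByVal fileName As String, ByVal pattern As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference required
    re.Pattern = pattern
    MatchesPattern = re.Test(fileName)
End Function

Sub DemoQuantifiers()
    ' Exact counts: four digits, dot, two digits, dot, two digits
    Debug.Print MatchesPattern("2014.11.20 - report", "^[0-9]{4}\.[0-9]{2}\.[0-9]{2}")   ' True
    ' + accepts any number of digits, so a malformed date also passes
    Debug.Print MatchesPattern("20145.1.2 - report", "^[0-9]+\.[0-9]+\.[0-9]+")          ' True
    Debug.Print MatchesPattern("20145.1.2 - report", "^[0-9]{4}\.[0-9]{2}\.[0-9]{2}")    ' False
    ' The four-group pattern from the question expects a third dot and a fourth digit group
    Debug.Print MatchesPattern("2014.11.20 - report", "^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+") ' False
End Sub
```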

A more generic regex covering all the patterns you've listed is

^(?:[0-9]{2})?[0-9]{2}[ .]?[0-9]{2}[ .]?[0-9]{2} (?:- )?

Demo: https://regex101.com/r/vUigcj/2

Explanation:

  • ^ - start of the string anchor
  • (?: - non-capturing group start
    • [0-9]{2} - the first two digits of year
  • ) - end of the non-capturing group
  • ? - make this group optional (allows the century digits to be omitted)
  • [0-9]{2} - the last two digits of a year
  • [ .] - a space or a dot - date delimiter
  • ? - make this delimiter optional
  • [0-9]{2} - two digits of month
  • [ .]? - another optional date delimiter
  • [0-9]{2} - two digits of day
  • ` ` - a space (literally)
  • (?:- )? - optionally followed by a dash and space
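
The breakdown above can be turned into a small normalizing function. This is only a sketch under a few assumptions: the function name NormalizeDatePrefix is hypothetical, capturing groups have been added to the generic pattern so the date parts can be reused, and two-digit years are assumed to belong to the 2000s.

```vba
' Hypothetical helper built on the generic pattern, with capturing groups added.
' Assumption: a two-digit year means 20yy.
Function NormalizeDatePrefix(ByVal fileName As String) As String
    Dim re As Object, m As Object, century As String
    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference required
    re.Pattern = "^((?:[0-9]{2})?)([0-9]{2})[ .]?([0-9]{2})[ .]?([0-9]{2}) (?:- )?"

    NormalizeDatePrefix = fileName             ' unchanged when no date prefix is found
    If Not re.Test(fileName) Then Exit Function

    Set m = re.Execute(fileName)(0)
    century = m.SubMatches(0)                  ' empty when the name starts with yymmdd
    If Len(century) = 0 Then century = "20"

    ' Rebuild the prefix as "yyyy.mm.dd - " and append the rest of the name
    NormalizeDatePrefix = century & m.SubMatches(1) & "." & m.SubMatches(2) & "." & _
                          m.SubMatches(3) & " - " & Mid$(fileName, m.Length + 1)
End Function

Sub DemoNormalize()
    Debug.Print NormalizeDatePrefix("141003 xxxxxx")         ' 2014.10.03 - xxxxxx
    Debug.Print NormalizeDatePrefix("2014 11 20 - xxxxxx")   ' 2014.11.20 - xxxxxx
    Debug.Print NormalizeDatePrefix("20141120 - xxxxxx")     ' 2014.11.20 - xxxxxx
    Debug.Print NormalizeDatePrefix("notes.txt")             ' notes.txt
End Sub
```

Because the pattern only checks digit counts, any name that starts with six or eight digits and a space (an invoice number, say) would also be rewritten, so it is worth reviewing the results before renaming anything.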

6 Comments

Wow, thank you so much for that, I've tried to make one just to represent the "yymmdd - " that exists. ^[0-9]{6}( - ) Would that work?
@MSalty: You're welcome! In the ^[0-9]{6}[ - ] regex you'd better remove the brackets (^[0-9]{6} - ), otherwise it would fail. The dash in square brackets (unless it's the first or the last char in the brackets) has a special meaning. It defines a range (just like in [0-9] it defines a range of digits from 0 through 9). In the [ - ] the dash defines a range from ` ` (space) to ` ` (space), which is simply ` ` (space).
Edit: I can't believe I included one too many sets of dots :| too much screen time for me today. You're a great teacher, thank you!
Would it be good practice to use ^[0-9]{6}( \- ) or use the brackets like you suggested?
I'd personally go for a literal representation whenever you need a literal match (i.e. write a space, a dash and a space literally) without any extra stuff. Since a regex is normally surrounded by some delimiters, the space characters are clearly distinguishable and don't introduce much confusion (i.e. in VB you're most likely to have it in double quotes: "^[0-9]{6} - "). Also, I don't recommend escaping a dash when it's not in square brackets, since it may confuse some regex engines.

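The pattern for yyyy.mm.dd, for example 2014.11.20, is:

(^[0-9]{4})(\.)([0-9]{2})(\.)([0-9]{2})

The dots are escaped as \. so that they match a literal dot; an unescaped . matches any character.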
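
Whether the dot is escaped makes a visible difference, because an unescaped . matches any character. A minimal sketch, assuming late binding through CreateObject("VBScript.RegExp"); the sample names are made up:

```vba
Sub DemoEscapedDot()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference required

    re.Pattern = "^[0-9]{4}.[0-9]{2}.[0-9]{2}"      ' unescaped: . matches any character
    Debug.Print re.Test("2014-11-20 notes")         ' True
    re.Pattern = "^[0-9]{4}\.[0-9]{2}\.[0-9]{2}"    ' escaped: only a literal dot
    Debug.Print re.Test("2014-11-20 notes")         ' False
    Debug.Print re.Test("2014.11.20 notes")         ' True
End Sub
```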

Note: great site for RegEx training and testing: RegEx101

1 Comment

Thank you very much for the site, it has proved to be very very handy.

Here is a sample VBA function that handles the formats you listed:

' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
' (Tools > References in the VBA editor).
Dim regEx As New RegExp

Function ReplaceDates(text As String, pattern As String, Optional centuryPrefix As String) As String
    Dim matches As MatchCollection
    Dim fullMatch As String

    With regEx
        .Global = False
        .MultiLine = True
        .IgnoreCase = False
        .pattern = pattern
    End With

    ' Leave the name unchanged when the pattern does not match
    ReplaceDates = text

    If regEx.test(text) Then
        Set matches = regEx.Execute(text)
        fullMatch = matches(0).Value
        ' Rebuild the prefix as "yyyy.mm.dd - " and replace only the first occurrence
        ReplaceDates = Replace(text, fullMatch, centuryPrefix & matches(0).SubMatches(0) & "." & matches(0).SubMatches(1) & "." & matches(0).SubMatches(2) & " - ", 1, 1)
    End If
End Function

Sub test()
    Dim pattern1 As String
    Dim pattern2 As String
    Dim pattern3 As String

    ' will match "140324 xxx"
    pattern1 = "^(\d{2})(\d{2})(\d{2})\s"
    ' will match "2014 03 24 - xxx"
    pattern2 = "^(\d{4})\s(\d{2})\s(\d{2})\s-\s"
    ' will match "20140324 xxx"
    pattern3 = "^(\d{4})(\d{2})(\d{2})\s"

    Debug.Print ReplaceDates("141024 xxxxxx ", pattern1, "20")
    Debug.Print ReplaceDates("2014 03 24 - xxxxxx ", pattern2)
    Debug.Print ReplaceDates("20140324 xxxxxx ", pattern3)
End Sub
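
The "If" plan from the question, where each name is checked against the known formats in turn, could be sketched as a single loop. This is only an illustration: StandardizeName is a hypothetical name, the block uses late binding so it stands on its own, the optional dash in the compact patterns is an addition to cover "yyyymmdd - ", and two-digit years are assumed to be 20yy.

```vba
' Hypothetical wrapper: tries each known prefix format in turn.
' Assumption: a two-digit year means 20yy.
Function StandardizeName(ByVal fileName As String) As String
    Dim patterns As Variant, prefixes As Variant, i As Long
    Dim re As Object, m As Object

    patterns = Array("^(\d{4})\.(\d{2})\.(\d{2}) - ", _
                     "^(\d{4})\s(\d{2})\s(\d{2})\s-\s", _
                     "^(\d{4})(\d{2})(\d{2})\s(?:-\s)?", _
                     "^(\d{2})(\d{2})(\d{2})\s(?:-\s)?")
    prefixes = Array("", "", "", "20")         ' century to prepend for each pattern

    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference required
    StandardizeName = fileName                 ' unchanged when nothing matches
    For i = LBound(patterns) To UBound(patterns)
        re.Pattern = patterns(i)
        If re.Test(fileName) Then
            Set m = re.Execute(fileName)(0)
            StandardizeName = prefixes(i) & m.SubMatches(0) & "." & m.SubMatches(1) & "." & _
                              m.SubMatches(2) & " - " & Mid$(fileName, m.Length + 1)
            Exit Function
        End If
    Next i
End Function

Sub DemoStandardize()
    Debug.Print StandardizeName("141003 xxxxxx")         ' 2014.10.03 - xxxxxx
    Debug.Print StandardizeName("20140324 - xxxxxx")     ' 2014.03.24 - xxxxxx
    Debug.Print StandardizeName("2014.11.20 - xxxxxx")   ' 2014.11.20 - xxxxxx
End Sub
```

The returned string could then be passed to VBA's Name ... As statement to rename the file, ideally after checking the proposed names on a copy of the directory first.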

2 Comments

Updated solution
That is awesome, I wasn't sure how the replacement function worked, but you have sorted that out perfectly. I'll post my complete solution once I get to work tomorrow.
