I have strings such as test01, abcd02, and xyz05. The last 2 characters of each string are always digits. I want a regular expression that captures test, abcd, and xyz from those strings. How can I capture them?

  • Something like (test|abcd|xyz)\d\d. What stopped you from writing this yourself? Commented Sep 6, 2023 at 17:54
  • @Luatic I only gave those strings as examples, sorry if I was not clear. What I need is to capture the whole word except its last 2 characters. Commented Sep 6, 2023 at 17:55
  • So (.*)..$ will capture everything before the last 2 characters. Commented Sep 6, 2023 at 17:56
  • @Barmar that works, regex101.com/r/MF7S5A/1. Thanks Commented Sep 6, 2023 at 18:02
  • Obviously it works, it's trivial. What problem were you having coming up with this? Commented Sep 6, 2023 at 19:16

2 Answers

A few questions:

  • Could your string have more or fewer than 2 digits?
  • If it's fixed at two digits, why not just drop the last 2 characters instead of using a regular expression?
  • Is it because you have to validate the input? For example, what about "# @123"?

If you have to check that it's ending with digits, then don't use the solution (.*)..$ proposed in the comments as . matches any character and you'll get, for example, "Hel" out of "Hello". It has the same effect as just truncating your string.
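To make the truncation point concrete, here is a minimal sketch that compares the pattern from the comments with a plain slice. The helper names dropLastTwoRegex and dropLastTwoSlice are hypothetical, chosen only for this illustration, and the comparison assumes single-line strings.

```javascript
// Hypothetical helpers, for illustration only.
// Uses the pattern proposed in the comments: any characters, then any two characters.
const dropLastTwoRegex = (s) => {
  const m = /(.*)..$/.exec(s);
  return m ? m[1] : null;
};

// Plain truncation, no regular expression involved.
const dropLastTwoSlice = (s) => (s.length >= 2 ? s.slice(0, -2) : null);

for (const s of ['test01', 'Hello', '!#@123']) {
  // Both helpers return the same value, whether or not the string ends in digits.
  console.log(`${s} -> ${dropLastTwoRegex(s)} / ${dropLastTwoSlice(s)}`);
}
```

Neither helper checks that the removed characters are digits, which is why "Hello" is accepted by both.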

I would personally be more precise and accept only words made of letters, to avoid matching something like "12345" or "!#@123".

I would suggest this:

/^(\p{L}+)\d+$/u

Explanation:

  • The u flag at the end enables Unicode mode. In JavaScript it is required for \p{...} property escapes, and it makes the pattern work on whole code points, so characters such as emojis are treated as single characters. This helps when you don't know what your input text contains.
  • With PCRE (and with JavaScript in u mode), you can use Unicode character classes. \p{L} matches a letter in any language (L stands for Letter). Unlike \w, which in JavaScript is limited to [A-Za-z0-9_], it does not match digits or the underscore, and it does match accented letters such as é.
  • If the end of your string must be digits, then you can use \d+. If it really has to be exactly 2 digits, then replace it with \d{2}.
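As a small sketch of the exactly-two-digits variant mentioned in the last point, assuming that is the real requirement: the names strictRegex and extractWord are hypothetical and not part of the original answer. The last two lines also show why \w would be a poor substitute for \p{L} here.

```javascript
// Hypothetical names, for illustration only.
// Letters only, followed by exactly two digits.
const strictRegex = /^(\p{L}+)\d{2}$/u;

const extractWord = (s) => {
  const m = strictRegex.exec(s);
  return m ? m[1] : null; // null when the string does not fit the pattern
};

console.log(extractWord('test01'));  // two trailing digits
console.log(extractWord('test123')); // three trailing digits, rejected by \d{2}

// \w also matches digits, so a \w-based pattern accepts an all-digit string,
// while the \p{L}-based pattern rejects it.
console.log(/^(\w+)\d{2}$/.test('123456'));
console.log(strictRegex.test('123456'));
```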

const strings = [
  'test01',  // Ok
  'abcd02',  // Ok
  'test123', // More than 2 digits, perhaps ok also?
  'vidéo05', // Accented chars in the word, ok or not?
  '123456',  // Only digits => should it match? maybe not!
  '####03',  // Not word chars before the digits... hmm, no match.
  'Hello'    // No digits at all... no match.
];

const regex = /^(\p{L}+)\d+$/u;

strings.forEach(string => {
  const match = regex.exec(string);
  if (match) {
    console.log(`Word found in "${string}" is "${match[1]}"`);
  }
  else {
    console.log(`Does NOT match "${string}"`);
  }
});

With PCRE you'll get the same: https://regex101.com/r/bvY3dg/1


4 Comments

An exceptionally good answer for an exceptionally lazy question. I don't have your tolerance, but this is good stuff - well done.
@halfer Thanks a lot! Yes, you are totally right, I probably shouldn't have replied to the rather lazy question. He probably even didn't see my answer, haha. I'll be less tolerant in the future, as you say!
It's actually fine - since you have some upvotes I think the question won't be deleted. But some questions are so terrible that folks will vote to delete, and any good answers will vanish with them.
It perhaps is not a requirement that any given question author responds to or appreciates an answer, since good ones will be appreciated by future readers anyway.

This regex worked for me:

(.*)..$

https://regex101.com/r/MF7S5A/1

1 Comment

That pattern is too generous - the two dots at the end anchor will match any characters, not just digits.
