Here is a reproducible example of what I need. Let's say we have the word HIV/AIDS. How do I write a regular expression that finds a string like this and replaces it with HIV_AIDS?

This is the search pattern I have been able to write. Is this good practice?

import re

txt = 'DDD/VCD'  # Python 3.x

re1 = '((?:[a-z][a-z0-9_]*))'  # Variable name 1
re2 = '(\\/)'                  # A literal forward slash
re3 = '((?:[a-z][a-z0-9_]*))'  # Variable name 2

rg = re.compile(re1 + re2 + re3, re.IGNORECASE | re.DOTALL)
m = rg.search(txt)

if m:
    var1 = m.group(1)
    c1 = m.group(2)
    var2 = m.group(3)
    print("(" + var1 + ")" + "(" + c1 + ")" + "(" + var2 + ")" + "\n")

If the code above is good enough, please help me write the code that performs the replacement (for the sample string mentioned above).

I am still a beginner with regular expressions and want to write a simple one for this using Python 3.5 and above. I found the re library in Python, but I am trying to write it without using the library. Any help will be appreciated. Thank you.

  • This might be better suited for codereview.stackexchange.com. Commented Feb 24, 2017 at 11:19
  • "[I] want to write a simple regular expression ... I am trying to write it without using the library" How is this supposed to work? Do you want to write your own regex engine? Or do you want to do it without regex? Commented Feb 24, 2017 at 11:20
  • @tobias_k I meant writing it directly, by matching and replacing with if/else conditions. Commented Feb 24, 2017 at 11:22
  • What I want to ask is: is that a better way to do it, or is using a library like "re" a good way of doing it? Commented Feb 24, 2017 at 11:23
  • You can't use a regex without the re library. Commented Feb 24, 2017 at 11:26

1 Answer

You can use the re.sub function. The replacement happens only where the regex engine finds a match; otherwise re.sub returns the input string unchanged.

>>> import re
>>> re.sub(r'(?i)\b([a-z][a-z0-9_]*)/([a-z][a-z0-9_]*)\b', r'\1_\2', 'DDD/VCD')
'DDD_VCD'

Alternatively, compile the regex first if necessary:

import re

reg = re.compile(r'(?i)\b([a-z][a-z0-9_]*)/([a-z][a-z0-9_]*)\b')
reg.sub(r'\1_\2', 'DDD/VCD')  # returns 'DDD_VCD'

\b is a word boundary: it matches at the position between a word character and a non-word character (or vice versa), so the pattern cannot start or end in the middle of a word.
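To see what the \b anchors change in practice, here is a small sketch that runs the same substitution with and without them. The names `with_boundary` and `without_boundary` and the sample input '9DDD/VCD' are only illustrative and not part of the question.

```python
import re

# Same pattern as in the answer above, with \b on both sides.
with_boundary = re.compile(r'(?i)\b([a-z][a-z0-9_]*)/([a-z][a-z0-9_]*)\b')
# Illustrative variant without the word boundaries.
without_boundary = re.compile(r'(?i)([a-z][a-z0-9_]*)/([a-z][a-z0-9_]*)')

# A plain identifier/identifier pair is rewritten by both patterns.
print(with_boundary.sub(r'\1_\2', 'HIV/AIDS'))        # HIV_AIDS

# The left-hand token starts with a digit, so it is not a valid "variable name".
# With \b the match cannot begin in the middle of '9DDD', so nothing is replaced.
print(with_boundary.sub(r'\1_\2', '9DDD/VCD'))        # 9DDD/VCD
# Without \b the match simply starts at the first 'D'.
print(without_boundary.sub(r'\1_\2', '9DDD/VCD'))     # 9DDD_VCD

# When nothing matches, sub() returns the input unchanged.
print(with_boundary.sub(r'\1_\2', 'rate is 3/4'))     # rate is 3/4
```

Whether '9DDD/VCD' should be left alone or rewritten depends on the data; if every slash between alphanumeric tokens should become an underscore, the stricter pattern may be more than is needed.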

4 Comments

Is it necessary to compile the regex?
And why is there a \b at the end of the pattern in the reg variable? Please help me understand it. I understand the rest.
If you are going to use the same regex in many places, it's better to compile it. If you want to do it on the fly, use the first option.
Compiling the regex is not necessary in ordinary use cases: while compiling is expensive, the re module itself keeps a cache of recently used regexps, so it won't recompile regexps used in a loop or in a frequently called function. (For a single use, it does not make any difference either.) Only code using hundreds of dynamically generated regexps that might be reused would benefit from a custom mechanism to cache pre-compiled regexps.
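As a rough illustration of the two options discussed in these comments, the sketch below applies the same pattern in a loop, once through a pre-compiled object and once through the module-level re.sub. The names `PATTERN`, `words`, `via_compiled` and `via_module` are made up for this example. It only shows that both forms give the same results; it does not measure speed.

```python
import re

# Pattern from the answer; the sample inputs are assumptions for illustration.
PATTERN = r'(?i)\b([a-z][a-z0-9_]*)/([a-z][a-z0-9_]*)\b'
words = ['HIV/AIDS', 'DDD/VCD', 'plain']

# Option 1: compile once and reuse the pattern object.
compiled = re.compile(PATTERN)
via_compiled = [compiled.sub(r'\1_\2', w) for w in words]

# Option 2: call re.sub with the pattern string each time;
# the re module looks the pattern up in its internal cache.
via_module = [re.sub(PATTERN, r'\1_\2', w) for w in words]

print(via_compiled)                # ['HIV_AIDS', 'DDD_VCD', 'plain']
print(via_compiled == via_module)  # True
```

Keeping a compiled object around is mostly a readability choice here: it gives the pattern a name and keeps the call sites short.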
