How can i search and replace using python regex

Question

I want to make the function which find for string in the array and then replace the corres[ponding element from the dictionary. so far i have tried this but i am not able to figure out few things like

How can escape special characters
I can i replace with match found. i tried \1 but it didn't work

dsds

def myfunc(h):
        myarray = {
                "#":"\\#",
                "$":"\\$",
                "%":"\\%",
                "&":"\\&",
                "~":"\\~{}",
                "_":"\\_",
                "^":"\\^{}",
                "\\":"\\textbackslash{}",
                "{":"\\{",
                "}":"\\}"                
                    }
        pattern = "[#\$\%\&\~\_\^\\\\\{\}]"
        pattern_obj = re.compile(pattern, re.MULTILINE)
        new = re.sub(pattern_obj,myarray[\1],h)

        return new

georg · Accepted Answer · 2013-04-28 07:57:00Z

3

You're looking for re.sub callbacks:

def myfunc(h):
    rules = {
            "#":r"\#",
            "$":r"\$",
            "%":r"\%",
            "&":r"\&",
            "~":r"\~{}",
            "_":r"\_",
            "^":r"\^{}",
            "\\":r"\textbackslash{}",
            "{":r"\{",
            "}":r"\}"                
    }
    pattern = '[%s]' % re.escape(''.join(rules.keys()))
    new = re.sub(pattern, lambda m: rules[m.group()], h)
    return new

This way you avoid 1) loops, 2) replacing already processed content.

answered Apr 28, 2013 at 7:57

georg

216k57 gold badges324 silver badges401 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

user196264097 Over a year ago

i am not able to figure out what will be input parameter to lambda , and how will that gets passed

georg Over a year ago

@user196264097: the parameter is a Match object (docs.python.org/dev/library/re.html#re.sub)

user196264097 Over a year ago

thanks man , i wonder you guys read every line of the documentation. i have usually gone through the docs but not line by line. I am usually stuck where i skip the lines :)

Anil Vaitla · Accepted Answer · 2013-04-28 07:34:18Z

1

You can try to use re.sub inside a loop that iterates over myarray.items(). However, you'll have to do backslash first since otherwise that might replace things incorrectly. You also need to make sure that "{" and "}" happen first, so that you don't mix up the matching. Since dictionaries are unordered I suggest you use list of tuples instead:

def myfunc(h):
    myarray = [
            ("\\","\\textbackslash")
            ("{","\\{"),
            ("}","\\}"),
            ("#","\\#"),
            ("$","\\$"),
            ("%","\\%"),
            ("&","\\&"),
            ("~","\\~{}"),
            ("_","\\_"),
            ("^","\\^{}")]

    for (val, replacement) in myarray:
        h = re.sub(val, replacement, h)
    h = re.sub("\\textbackslash", "\\textbackslash{}", h)

    return h

edited Apr 28, 2013 at 7:34

answered Apr 28, 2013 at 7:27

Anil Vaitla

2,97824 silver badges32 bronze badges

3 Comments

Aleksei Zyrianov Over a year ago

I feel that h.replace(val, replacement) could be faster than re.sub(val, replacement, h) and will fit our question's case.

Anil Vaitla Over a year ago

yes, I am not too familiar with the performance differences, but I think yours is more readable.

Aleksei Zyrianov Over a year ago

Thanks. Just was curious: yes, str.replace is faster than re.sub. 6.7x faster according to tests on my PC :)

Aleksei Zyrianov · Accepted Answer · 2013-04-28 08:05:39Z

1

I'd suggest you to use raw literal syntax (r"") for better readability of the code.
For the case of your array you may want just to use str.replace function instead of re.sub.

def myfunc(h):
    myarray = [
            ("\\", r"\textbackslash"),
            ("{", r"\{"),
            ("}", r"\}"),
            ("#", r"\#"),
            ("$", r"\$"),
            ("%", r"\%"),
            ("&", r"\&"),
            ("~", r"\~{}"),
            ("_", r"\_"),
            ("^", r"\^{}")]

    for (val, replacement) in myarray:
        h = h.replace(val, replacement)
    h = h.replace(r"\textbackslash", r"\textbackslash{}", h)

    return h

The code is a modification of @tigger's answer.

edited Apr 28, 2013 at 8:05

answered Apr 28, 2013 at 7:43

Aleksei Zyrianov

2,3821 gold badge25 silver badges33 bronze badges

Comments

Elazar · Accepted Answer · 2013-04-28 07:32:01Z

0

to escape metacharacters, use raw string and backslashes

r"regexp with a \* in it"

answered Apr 28, 2013 at 7:32

Elazar

22k4 gold badges51 silver badges68 bronze badges

Collectives™ on Stack Overflow

How can i search and replace using python regex

4 Answers 4

3 Comments

3 Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

3 Comments

3 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related