Just for learning, I am trying to replace all the special characters on the keyboard with an underscore '_'.

List of characters= ~!@#$%^&*()+|}{:"?><-=[]\;',./

string I created:

table = """123~!@#$%^&*()+|}{:"?><-=[]\;',./"""

import re

table1= re.sub(r'!~@#$%^&*()-+={}[]:;<.>?/\'"', '_', table)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/re.py", line 151, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/usr/lib64/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: unexpected end of regular expression

I am unable to do this; I get the error above.

How can I replace the special characters in the string using a regex?

  • Is that your exact string? It does not have any quotes. Try table = """...""" Also, what's with the 123 at the beginning? Commented Nov 14, 2017 at 21:02
  • Note that )-+ inside a character class is parsed as a range (from ) to +), so the - itself is not matched. Always put the - at the start or end of the character class. Yeah, and use a character class :) Commented Nov 14, 2017 at 21:02
  • @tobias_k yes it is the exact string and nothing in front of 123 Commented Nov 14, 2017 at 21:04
  • There's also \W, which matches all non-word characters: re.sub(r'\W', '_', some_string). docs.python.org/3/library/re.html#regular-expression-syntax Commented Nov 14, 2017 at 21:17
  • @Blurp Good point, but that would also replace e.g. whitespace. You might use something like [^\w\s] instead, depending on what exactly OP wants to replace. Also, there seem to be chars like | that should not be replaced (might be a mistake in the question, though). Commented Nov 14, 2017 at 21:23
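
The difference between the two suggestions in the comments above can be seen in a small sketch. The sample string is made up for illustration and does not come from the question.

```python
import re

# Hypothetical sample: contains a space, "-", "|" and an underscore.
sample = "a b-c|d_e"

# \W matches every non-word character, so the space is replaced too.
with_non_word = re.sub(r'\W', '_', sample)

# [^\w\s] matches anything that is neither a word character nor whitespace,
# so the space survives.
with_negated_class = re.sub(r'[^\w\s]', '_', sample)

print(with_non_word)       # a_b_c_d_e
print(with_negated_class)  # a b_c_d_e
```

Note that both patterns treat the underscore as a word character and that neither lets you exempt individual symbols such as |, which is why an explicit character list may still be the better fit here.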

2 Answers

You could use re.escape to escape all the special regex characters in the string, and then enclose the escaped string in [...] so it matches any one of them.

>>> re.sub("[%s]" % re.escape('!~@#$%^&*()-+={}[]:;<.>?/\''), '_', table)
'123____________|___"_______\\__,__'
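
If the same replacement is needed in several places, the one-liner above could be wrapped roughly as follows. This is only a sketch: the names SPECIALS, SPECIALS_RE and replace_specials are hypothetical, and the table string is written as a raw string so that the backslash is kept literally without an escape warning on newer Python versions.

```python
import re

# Same data as in the question; the raw string keeps the backslash literal.
table = r"""123~!@#$%^&*()+|}{:"?><-=[]\;',./"""

# Characters to replace, copied from the answer above (hypothetical name).
SPECIALS = "!~@#$%^&*()-+={}[]:;<.>?/'"

# Compile once; re.escape neutralises characters such as "-", "]" and "^"
# that would otherwise have a special meaning inside [...].
SPECIALS_RE = re.compile("[%s]" % re.escape(SPECIALS))


def replace_specials(text, repl="_"):
    """Replace every character listed in SPECIALS with repl."""
    return SPECIALS_RE.sub(repl, text)


cleaned = replace_specials(table)
print(cleaned)
```

Because the list does not contain |, ", \ or the comma, those four characters are left untouched, just as in the output shown above.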

However, as you are not really using that regex as a regex, you might instead just check whether each character is in that string:

>>> ''.join("_" if c in '!~@#$%^&*()-+={}[]:;<.>?/\'' else c for c in table)
'123____________|___"_______\\__,__'

Or to make the lookup a bit faster, create a set from the chars in that string first:

>>> bad_chars = set('!~@#$%^&*()-+={}[]:;<.>?/\'')
>>> ''.join("_" if c in bad_chars else c for c in table)
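
A related option that the answer does not mention is str.translate, which performs the same per-character lookup through a prebuilt table. The sketch below assumes Python 3 (str.maketrans does not exist in Python 2.7, which the traceback in the question suggests is in use); the variable names are illustrative.

```python
# Same data as in the question; the raw string keeps the backslash literal.
table = r"""123~!@#$%^&*()+|}{:"?><-=[]\;',./"""

# Same character list as in the answer above.
bad_chars = "!~@#$%^&*()-+={}[]:;<.>?/'"

# Build a translation table that maps each unwanted character to "_".
mapping = str.maketrans({c: "_" for c in bad_chars})

cleaned = table.translate(mapping)
print(cleaned)
```

The result should be the same as the join-based versions, since each character is still either swapped for an underscore or copied unchanged.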

Just put the characters in a character class and move the - to the front so it is not read as a range; [ and ] must be escaped inside the class (escaping + is harmless but not required):

import re
table = """123~!@#$%^&*()+|}{:"?><-=[]\;',./"""

table1 = re.sub(r'[-\+!~@#$%^&*()={}\[\]:;<.>?/\'"]', '_', table)
print(table1)
# 123____________|___________\__,__
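
Two details explain why this rewrite is needed, sketched below under the assumption of Python 3, where the error message differs from the 2.7 traceback in the question. Without the outer brackets, the [] in the original pattern opens a character class that is never closed, and inside a class a - between two characters denotes a range. The short test strings are made up for illustration.

```python
import re

# The pattern from the question: "[]" starts a character class that never ends,
# so compiling it fails.
original_pattern = r'!~@#$%^&*()-+={}[]:;<.>?/\'"'
try:
    re.compile(original_pattern)
    compiled_ok = True
except re.error:
    compiled_ok = False
print(compiled_ok)  # False

# ")-+" inside a class is the range ")" to "+", i.e. ")", "*" and "+".
# The "-" itself is not matched.
as_range = re.sub(r'[)-+]', '_', 'a-b*c')
print(as_range)  # a-b_c

# With "-" first in the class it is a literal, and "*" is no longer included.
as_literal = re.sub(r'[-)+]', '_', 'a-b*c')
print(as_literal)  # a_b*c
```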
