I have to remove any punctuation marks from the start and the end of a word.
I am using re.sub to do it.
re.sub(r'(\w.+)(?=[^\w]$)','\1',text)
The grouping isn't working: all I get is ☺. for Mihir4. on the command line.
If you have a string with multiple words, such as
text = ".adfdf. 'df' !3423? ld! :sdsd"
this will do the trick (it will also work for single words, of course):
>>> re.sub(r'[^\w\s]*(\w+)[^\w\s]*', r'\1', text)
'adfdf df 3423 ld sdsd'
Notice the r in r'\1'. This is equivalent to '\\1'.
>>> re.sub(r'[^\w\s]*(\w+)[^\w\s]*', '\\1', text)
'adfdf df 3423 ld sdsd'
Further reading: the backslash plague
The string literal '\1' is equivalent to '\x01'. You need to escape the backslash or use a raw string literal to mean a backreference to group 1.
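To make the difference concrete, here is a small sketch that compares the two literals and runs the pattern from the question with each one. The variable names (pattern, broken, fixed) are only illustrative. One thing it also shows, as far as I can tell: with the raw string the backreference works, but the question's pattern still leaves the trailing dot in place, because the lookahead never consumes it.

```python
import re

# A plain '\1' is an octal escape: one control character, U+0001.
assert '\1' == '\x01'
assert len('\1') == 1

# A raw string keeps the backslash, so re.sub sees "group 1".
assert r'\1' == '\\1'
assert len(r'\1') == 2

text = 'Mihir4.'
pattern = r'(\w.+)(?=[^\w]$)'  # pattern from the question

# Replacement is the control character (shown as a smiley in some consoles).
broken = re.sub(pattern, '\1', text)

# Replacement is group 1, but the '.' matched only by the lookahead is kept.
fixed = re.sub(pattern, r'\1', text)

print(repr(broken))  # '\x01.'
print(repr(fixed))   # 'Mihir4.'
```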
BTW, you don't need to use the capturing group.
>>> re.sub(r'^[^-\w]+|[^-\w]+$', '', 'Mihir4.')
'Mihir4'
Replace 'Mihir4.' with text, as you did in the question.
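If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the names strip_edges and _EDGE_PUNCT are made up for illustration, and it assumes you want + on both alternatives so that runs of punctuation at either end are removed, not just a single character.

```python
import re

# Anything that is not a word character or a hyphen, anchored at the
# start or at the end of the string. No capturing group is needed.
_EDGE_PUNCT = re.compile(r'^[^-\w]+|[^-\w]+$')

def strip_edges(word):
    """Remove leading and trailing punctuation from a single word."""
    return _EDGE_PUNCT.sub('', word)

print(strip_edges('Mihir4.'))     # Mihir4
print(strip_edges('..Mihir4!?'))  # Mihir4
print(strip_edges('a.b'))         # a.b (punctuation inside the word is kept)
```

Hyphens are excluded from the character class, so a word like '-a-' comes back unchanged.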
[\w+\-]+ which will give me words only separated by a -. But my question is how to replace a string with any pattern.
str.strip (as long as text is just that one word)
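For reference, a minimal sketch of the str.strip approach mentioned above. The comment does not say which characters to pass, so using string.punctuation here is my assumption, and strip_punctuation is a hypothetical helper name. Note that string.punctuation is ASCII only and includes - and _, so the result can differ from the regex versions for words that start or end with those characters.

```python
import string

def strip_punctuation(word):
    """Strip ASCII punctuation from both ends of a single word."""
    # str.strip treats its argument as a set of characters to remove
    # from each end; characters inside the word are left alone.
    return word.strip(string.punctuation)

print(strip_punctuation('Mihir4.'))  # Mihir4

# For several words, split on whitespace and strip each piece.
text = ".adfdf. 'df' !3423? ld! :sdsd"
cleaned = ' '.join(strip_punctuation(w) for w in text.split())
print(cleaned)  # adfdf df 3423 ld sdsd
```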