Find and remove some substrings from a long list of strings in Python

Question

I need to read a list of strings and remove some special characters from them. I wrote code that works, but I am looking for a more efficient way to write it, because I need to run this process on 1 million long lists (e.g. each list has 100000 words).

I wrote an example to clarify my question.

input:
 str= ['short', 'club', 'edit', 'post\C2', 'le\C3', 'lundi', 'janvier', '2008'] 
 specialSubString=['\C2','\C3','\E2'] 

output:
 str= ['short', 'club', 'edit', 'post', 'le', 'lundi', 'janvier', '2008']

My code:

ml=len(str)
for w in range(0,ml):
   for i in range(0, len(specialSubString)):
       token=specialSubString[i]
       if token not in str[w]: 
          continue
       else:
          l= len(token)
          t= str[w]
          end= len(t)-l
          str[w]=t[:end]
          break

for w in str:
    print(w)

TigerhawkT3 · Accepted Answer · 2017-03-10 04:28:44Z

3

Create a string with all the special characters you'd like to remove, and strip them off the right side:

strings = ['short', 'club', 'edit', 'post\C2', 'le\C3', 'lundi', 'janvier', '2008']
special = ''.join(['\C2','\C3','\E2']) # see note

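Note at this point that \ is the escape character in Python string literals, so you should escape it (write \\) whenever you mean a literal backslash, to avoid ambiguity. '\C2' only works here because \C is not a recognized escape sequence. You can also simply write a single string literal rather than using str.join.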

special = '\\C2\\C3\\E2' # that's better

strings[:] = [item.rstrip(special) for item in strings]
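One caveat worth spelling out: str.rstrip treats its argument as a set of characters, not as a list of substrings, so the line above removes any trailing run of \, C, E, 2 or 3. That is fine for the sample data, but a word such as 'H2' would lose its final 2. If only the exact substrings should be removed, a suffix check is one possible alternative. The sketch below is an illustration rather than part of the original answer; strip_suffixes is a hypothetical helper name, and it assumes each word carries at most one special substring, at its end.

```python
# Hypothetical helper (not from the original answer): remove at most one
# exact suffix per word instead of stripping a set of characters.
def strip_suffixes(words, suffixes):
    """Return a new list with a matching suffix removed from each word."""
    # Try longer suffixes first so overlapping suffixes resolve predictably.
    ordered = sorted(suffixes, key=len, reverse=True)
    cleaned = []
    for word in words:
        for suffix in ordered:
            # Skip empty suffixes: word[:-0] would wrongly empty the word.
            if suffix and word.endswith(suffix):
                word = word[:-len(suffix)]
                break
        cleaned.append(word)
    return cleaned


strings = ['short', 'club', 'edit', 'post\\C2', 'le\\C3', 'lundi',
           'janvier', '2008', 'H2']
special = ['\\C2', '\\C3', '\\E2']

# Exact-suffix removal leaves 'H2' untouched.
print(strip_suffixes(strings, special))

# Character-set stripping shortens 'H2' to 'H'.
print([item.rstrip('\\C2\\C3\\E2') for item in strings])
```

For the original sample list both approaches give the same result; they differ only when an ordinary word happens to end in one of the stripped characters.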
