
I need to read a list of strings and remove some special characters from them. I wrote code that works, but I am looking for a more efficient way to write it, because I need to run this process on 1 million long lists (e.g. each list has 100000 words).

I wrote an example to clarify my question.

input:
 str= ['short', 'club', 'edit', 'post\C2', 'le\C3', 'lundi', 'janvier', '2008'] 
 specialSubString=['\C2','\C3','\E2'] 

output:
 str= ['short', 'club', 'edit', 'post', 'le', 'lundi', 'janvier', '2008'] 

My code:

ml = len(str)
for w in range(0, ml):
    for i in range(0, len(specialSubString)):
        token = specialSubString[i]
        if token not in str[w]:
            continue
        else:
            # drop len(token) characters from the end of the word
            l = len(token)
            t = str[w]
            end = len(t) - l
            str[w] = t[:end]
            break

for w in str:
    print(w)

1 Answer


Create a string with all the special characters you'd like to remove, and strip them off the right side:

strings = ['short', 'club', 'edit', 'post\C2', 'le\C3', 'lundi', 'janvier', '2008']
special = ''.join(['\C2','\C3','\E2']) # see note

Note that \ is a special character in string literals, so you should escape it (or use a raw string such as r'\C2') whenever you mean a literal backslash, to avoid ambiguity. You can also simply write a single string literal rather than using str.join.

special = '\\C2\\C3\\E2' # that's better

strings[:] = [item.rstrip(special) for item in strings]
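One caveat worth keeping in mind: str.rstrip treats its argument as a set of individual characters, not as whole substrings, so a word that merely ends in 2, 3, C or E (for example '2002') would also be trimmed. If that matters for your data, a suffix-based variant along these lines may be safer. This is only a sketch: strip_suffixes is a hypothetical helper name, and it assumes each word carries at most one special suffix at its end.

```python
def strip_suffixes(words, suffixes):
    """Return a new list with at most one trailing suffix removed per word."""
    # Ignore empty suffixes: word[:-0] would wrongly produce an empty string.
    suffixes = tuple(s for s in suffixes if s)
    result = []
    for word in words:
        # str.endswith accepts a tuple, so most words are rejected in one call.
        if word.endswith(suffixes):
            for suffix in suffixes:
                if word.endswith(suffix):
                    word = word[:-len(suffix)]
                    break
        result.append(word)
    return result


# Sample data from the question, with '2002' added to show the difference.
strings = ['short', 'club', 'edit', 'post\\C2', 'le\\C3', 'lundi', 'janvier', '2002']
special = ['\\C2', '\\C3', '\\E2']

cleaned = strip_suffixes(strings, special)
print(cleaned)
```

Here '2002' is left untouched, whereas '2002'.rstrip('\\C2\\C3\\E2') would shorten it to '200'.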