Python 3.6
I'd like to remove a list of strings from a string. Here is my first poor attempt:
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
result = list(filter(lambda x: x not in items_to_remove, string.split(' ')))
print(result)
output:
['test']
But this doesn't work if the words in the string aren't neatly separated by single spaces. I feel there must be a built-in solution. There must be a better way!
I've had a look at this discussion on Stack Overflow, which asks exactly the same question as mine...
So as not to waste my efforts, I timed all the solutions. I believe the easiest, fastest and most Pythonic is the simple for loop, which was not the conclusion in the other post...
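To make the problem concrete, here is a small sketch of the failure. The input string `messy` is made up for illustration (a comma and a hyphen in place of spaces is just my guess at a typical "not nicely spaced" case):

```python
items_to_remove = ['this', 'is', 'a', 'string']

# Hypothetical input where some words are joined by punctuation instead of spaces.
messy = 'this,is a test-string'

# Same idea as the first attempt: split on single spaces, then drop exact matches.
result = [word for word in messy.split(' ') if word not in items_to_remove]
print(result)  # ['this,is', 'test-string']
```

Only 'a' is removed, because 'this,is' and 'test-string' are never compared as separate words.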
result = string
for i in items_to_remove:
    result = result.replace(i, '')
Test Code:
import timeit
t1 = timeit.timeit('''
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
result = list(filter(lambda x: x not in items_to_remove, string.split(' ')))
''', number=1000000)
print(t1)
t2 = timeit.timeit('''
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
def sub(m):
    return '' if m.group() in items_to_remove else m.group()
result = re.sub(r'\w+', sub, string)
''',setup= 'import re', number=1000000)
print(t2)
t3 = timeit.timeit('''
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
result = re.sub(r'|'.join(items_to_remove), '', string)
''',setup= 'import re', number=1000000)
print(t3)
t4 = timeit.timeit('''
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
result = string
for i in items_to_remove:
    result = result.replace(i, '')
''', number=1000000)
print(t4)
outputs:
1.9832003884248448
4.408749988641971
2.124719851741177
1.085117268194475
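One caveat about these numbers: each timed statement also rebuilds the string and the list, and the second one redefines `sub` on every iteration. A possible way to time only the removal itself is sketched below. The function names `with_loop` and `with_regex` are my own, the precompiled `pattern` is an assumption about how the regex would be used in practice, and I have not re-run the comparison, so no timings are claimed here.

```python
import re
import timeit

string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']

def with_loop():
    # Chained str.replace calls, one per item.
    result = string
    for item in items_to_remove:
        result = result.replace(item, '')
    return result

# Compile once, outside the timed code; re.escape guards against regex metacharacters.
pattern = re.compile('|'.join(re.escape(item) for item in items_to_remove))

def with_regex():
    # Single pass over the string with an alternation of all items.
    return pattern.sub('', string)

for func in (with_loop, with_regex):
    # timeit accepts a callable, so only the function body is measured.
    elapsed = timeit.timeit(func, number=100000)
    print(func.__name__, elapsed)
```

Both functions return the same string for this input, so the comparison is like for like.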
The for loop will replace substrings as well. Try changing the order of your items_to_remove to ['is', 'this', 'a', 'string'] and you'll see what I'm talking about.
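The substring problem can be reproduced as follows, together with one possible workaround: a regex that only matches whole words. The helper name `remove_words` is hypothetical, and using `\b` word boundaries is my assumption about what "remove a word" should mean; it is a sketch, not the accepted answer from the other discussion.

```python
import re

string = 'this is a test string'
items_to_remove = ['is', 'this', 'a', 'string']

# Plain loop: 'is' is removed from inside 'this' before 'this' is ever tried.
looped = string
for item in items_to_remove:
    looped = looped.replace(item, '')
print(repr(looped))  # 'th   test '

def remove_words(text, words):
    # \b anchors each alternative to word boundaries, so only whole words match.
    # re.escape keeps any regex metacharacters in the words literal.
    pattern = r'\b(?:' + '|'.join(re.escape(w) for w in words) + r')\b'
    return re.sub(pattern, '', text)

cleaned = remove_words(string, items_to_remove)
print(repr(cleaned))  # '   test '
```

Note that the surrounding spaces are left behind in both cases; `' '.join(cleaned.split())` collapses them if that matters.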