I initially had a working script like this, which goes over the CSV files in a folder and replaces a substring:
import fileinput
import os
import glob
#### Directory and file mask
this = r"C:\work\PythonScripts\Replacer\*.csv"
output_folder = "C:\\work\\PythonScripts\\Replacer\\"
#### Get files
files = glob.glob(this)
#### Section to replace
text_to_search = 'z'
replacement_text = 'ZZ_Top'
#### Loop through files and lines:
for f in files:
    head, tail = os.path.split(f)
    targetFileName = os.path.join(head, output_folder, tail)
    with fileinput.FileInput(targetFileName, inplace=True, backup='.bak') as file:
        for line in file:
            print(line.replace(text_to_search, replacement_text), end='')
Then I needed to replace several Word-style curly quotes and the long hyphen (en dash) as well. So I thought of using something like this in the loop above:
s = '’ ‘ ’ ‘ ’ – “ ” “ – ’'
print(s)
print(s.replace('’', '\'').replace('‘', '\'').replace('–','-').replace('“','"').replace('”','"'))
==>
’ ‘ ’ ‘ ’ – “ ” “ – ’
' ' ' ' ' - " " " - '
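Since every character being replaced here is a single code point, the same chained replacement can also be sketched with `str.maketrans` and `str.translate`. This is only an alternative illustration, not the approach used in the script; the name `QUOTE_MAP` is hypothetical.

```python
# Hypothetical name: QUOTE_MAP is introduced only for this illustration.
# str.maketrans with a dict requires every key to be a single character.
QUOTE_MAP = str.maketrans({
    '\u2019': "'",  # right single quotation mark
    '\u2018': "'",  # left single quotation mark
    '\u2013': '-',  # en dash
    '\u201c': '"',  # left double quotation mark
    '\u201d': '"',  # right double quotation mark
})

s = '\u2019 \u2018 \u2019 \u2018 \u2019 \u2013 \u201c \u201d \u201c \u2013 \u2019'
# Same result as the chained .replace() calls above.
print(s.translate(QUOTE_MAP))
```

This only works for one-character keys; multi-character substrings would still need `replace` or a regex.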
But then I came across the following answer, which suggests using the regex sub function: https://stackoverflow.com/a/765835
So I tried it, and it worked fine on its own:
import re
def multisub(subs, subject):
    # "Simultaneously perform all substitutions on the subject string."
    pattern = '|'.join('(%s)' % re.escape(p) for p, s in subs)
    substs = [s for p, s in subs]
    replace = lambda m: substs[m.lastindex - 1]
    return re.sub(pattern, replace, subject)
print(multisub([('’', '\''), ('‘', '\''), ('–','-'), ('“','"'), ('”','"')], '1’ 2‘ 1’ 2‘ 1’ 3– 4“ 5” 4“ 3– 2’'))
==>
1' 2' 1' 2' 1' 3- 4" 5" 4" 3- 2'
But as soon as I plugged it into the original script, the script runs but doesn't modify the file:
import fileinput
import os
import glob
import re
#### Directory and file mask
this = r"C:\work\PythonScripts\Replacer\*.csv"
output_folder = "C:\\work\\PythonScripts\\Replacer\\"
#### RegEx substitution func
def multisub(subs, subject):
    # "Simultaneously perform all substitutions on the subject string."
    pattern = '|'.join('(%s)' % re.escape(p) for p, s in subs)
    substs = [s for p, s in subs]
    replace = lambda m: substs[m.lastindex - 1]
    return re.sub(pattern, replace, subject)
#### Get files
files = glob.glob(this)
#### Loop through files and lines:
for f in files:
    head, tail = os.path.split(f)
    targetFileName = os.path.join(head, output_folder, tail)
    with fileinput.FileInput(targetFileName, inplace=True, backup='.bak') as file:
        for line in file:
            print(multisub([('’', '\''), ('‘', '\''), ('–','-'), ('“','"'), ('”','"')], line), end='')
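One possibility not ruled out above is an encoding mismatch between the files and the way they are read. The sketch below assumes, purely for illustration, that the CSV files are saved as UTF-8 while the script reads them with a Windows default such as cp1252; neither is confirmed by anything in the script. It shows that under that assumption the curly quote never appears in the decoded line, so no pattern could match it.

```python
# Assumption for illustration only: the file is UTF-8, but is decoded as cp1252.
raw = '\u2019'.encode('utf-8')      # bytes a UTF-8 file would hold for the right single quote

as_utf8 = raw.decode('utf-8')       # decoded with the matching encoding
as_cp1252 = raw.decode('cp1252')    # decoded with the mismatched encoding

print(as_utf8 == '\u2019')          # the quote survives a matching decode
print('\u2019' in as_cp1252)        # the quote is absent after a mismatched decode
print(ascii(as_cp1252))             # shows the three characters it turned into
```

Printing `ascii(line)` for a few lines inside the loop would be a quick way to see what the script is really reading.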
What could be wrong here?