Python multiple sub regex

Question

I started with a working script like this, which goes over the CSV files in a folder and replaces a substring:

import fileinput
import os
import glob

#### Directory and file mask
this = r"C:\work\PythonScripts\Replacer\*.csv"
output_folder = "C:\\work\\PythonScripts\\Replacer\\"

#### Get files
files = glob.glob(this)

#### Section to replace
text_to_search = 'z'
replacement_text = 'ZZ_Top'

#### Loop through files and lines:
for f in files:
    head, tail = os.path.split(f)
    targetFileName = os.path.join(head, output_folder, tail)

    with fileinput.FileInput(targetFileName, inplace=True, backup='.bak') as file:
        for line in file:
            print(line.replace(text_to_search, replacement_text), end='')

Then I needed to replace several Word-style curly quotes and a long hyphen (en dash). So I thought of using something like this in the loop above:

s = '’ ‘ ’ ‘ ’ – “ ” “ – ’'
print(s)
print(s.replace('’', '\'').replace('‘', '\'').replace('–','-').replace('“','"').replace('”','"'))

==>

’ ‘ ’ ‘ ’ – “ ” “ – ’
' ' ' ' ' - " " " - '
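
Since every target here is a single character, str.translate is another way to do the same replacement in one pass. A minimal sketch, where the names SMART_TO_ASCII and normalize_punctuation are made up for illustration and the characters are written as escapes so the snippet does not depend on how the script file itself is saved:

```python
# Map each "smart" punctuation character to a plain ASCII equivalent.
# str.maketrans accepts a dict whose keys are single characters.
SMART_TO_ASCII = str.maketrans({
    '\u2019': "'",   # right single quotation mark
    '\u2018': "'",   # left single quotation mark
    '\u2013': '-',   # en dash
    '\u201c': '"',   # left double quotation mark
    '\u201d': '"',   # right double quotation mark
})


def normalize_punctuation(text):
    """Replace curly quotes and en dashes in one pass over the string."""
    return text.translate(SMART_TO_ASCII)


sample = '\u2019 \u2018 \u2013 \u201c \u201d'
print(normalize_punctuation(sample))
```

This only covers one-character targets; for multi-character search strings a chain of replace calls or a regex is still needed.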

But then I came across the following answer about using the regex sub function: https://stackoverflow.com/a/765835

So I tried it and it worked fine on its own:

import re

def multisub(subs, subject):
    """Simultaneously perform all substitutions on the subject string."""
    pattern = '|'.join('(%s)' % re.escape(p) for p, s in subs)
    substs = [s for p, s in subs]
    replace = lambda m: substs[m.lastindex - 1]
    return re.sub(pattern, replace, subject)

print(multisub([('’', '\''), ('‘', '\''), ('–','-'), ('“','"'), ('”','"')], '1’ 2‘ 1’ 2‘ 1’ 3– 4“ 5” 4“ 3– 2’'))

==>

1' 2' 1' 2' 1' 3- 4" 5" 4" 3- 2'

But as soon as I plugged it into the original script, it runs but doesn't modify the file:

import fileinput
import os
import glob
import re

#### Directory and file mask
this = r"C:\work\PythonScripts\Replacer\*.csv"
output_folder = "C:\\work\\PythonScripts\\Replacer\\"

#### RegEx substitution func
def multisub(subs, subject):
    """Simultaneously perform all substitutions on the subject string."""
    pattern = '|'.join('(%s)' % re.escape(p) for p, s in subs)
    substs = [s for p, s in subs]
    replace = lambda m: substs[m.lastindex - 1]
    return re.sub(pattern, replace, subject)

#### Get files
files = glob.glob(this)

#### Loop through files and lines:
for f in files:
    head, tail = os.path.split(f)
    targetFileName = os.path.join(head, output_folder, tail)

    with fileinput.FileInput(targetFileName, inplace=True, backup='.bak') as file:
        for line in file:
            print(multisub([('’', '\''), ('‘', '\''), ('–','-'), ('“','"'), ('”','"')], line), end='')

What could be wrong here?

jdaz · Accepted Answer · 2020-07-08 21:44:59Z

Your code actually works for me as is when I test it, but it does a lot of unnecessary processing that may be introducing errors. The big advantage of fileinput over a regular open() is that it can loop through the lines of multiple files without another loop to open each file individually. So try this and see if it works:

#### Get files
files = glob.glob(this)

#### Loop through files and lines:
for line in fileinput.input(files, inplace=True, backup='.bak'):
    print(multisub([('’', '\''), ('‘', '\''), ('–','-'), ('“','"'), ('”','"')], line), end='')
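
To see the single-loop behaviour in isolation, here is a small self-contained sketch. The file names and contents are hypothetical and live in a throwaway temp directory, and the text is plain ASCII so the result does not depend on the platform's default encoding:

```python
import fileinput
import os
import tempfile

# Hypothetical demo files in a throwaway directory (not the asker's folder).
demo_dir = tempfile.mkdtemp()
paths = []
for name, content in [('a.csv', 'x,z\n'), ('b.csv', 'z,y\n')]:
    path = os.path.join(demo_dir, name)
    with open(path, 'w', encoding='ascii') as fh:
        fh.write(content)
    paths.append(path)

# One loop covers every line of every file. With inplace=True,
# print() writes into the file that is currently being processed,
# and the original is kept next to it with a .bak suffix.
with fileinput.input(paths, inplace=True, backup='.bak') as lines:
    for line in lines:
        print(line.replace('z', 'ZZ_Top'), end='')

# Show the rewritten contents.
for path in paths:
    with open(path, encoding='ascii') as fh:
        print(os.path.basename(path), repr(fh.read()))
```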

1 Comment

RandyMcKay Over a year ago

For some reason, when the folder has a CSV file containing those characters, I end up with an empty file using this code...
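
One plausible explanation, which the thread does not confirm, is an encoding mismatch: if the CSV is saved as UTF-8 but Python reads it with the Windows default code page (assumed here to be cp1252), some characters turn into unrelated text and others fail to decode at all. With inplace=True the output file has already been created empty by the time the read fails, and the original sits in the .bak copy. A rough sketch of the mismatch itself:

```python
# Assumption: the file is UTF-8 on disk but is decoded as cp1252.
right_single = '\u2019'.encode('utf-8')   # bytes E2 80 99
right_double = '\u201d'.encode('utf-8')   # bytes E2 80 9D

# Each of these bytes maps to some cp1252 character, so decoding
# "succeeds" but yields three unrelated characters that the
# substitution pattern will never match.
mojibake = right_single.decode('cp1252')
print(repr(mojibake))

# Byte 0x9D is undefined in cp1252, so this one raises instead.
try:
    right_double.decode('cp1252')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)
```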

RandyMcKay · Accepted Answer · 2020-09-04 17:29:59Z

It turns out the code itself was fine. The missing piece was that it was running on Windows, so I had to add a PYTHONUTF8 system variable with a value of 1 to the Environment Variables. After that, the original code worked fine.
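
Setting PYTHONUTF8=1 turns on Python's UTF-8 mode, so open() stops defaulting to the Windows code page. If changing the environment is not an option, an alternative is to name the encoding in the code. A minimal sketch, assuming the CSV files are UTF-8; replace_in_file and SUBS are illustrative names, and this version sidesteps fileinput rather than configuring it:

```python
import re
import shutil
import tempfile
from pathlib import Path

# Substitution table from the question, written as escapes.
SUBS = [('\u2019', "'"), ('\u2018', "'"), ('\u2013', '-'),
        ('\u201c', '"'), ('\u201d', '"')]


def multisub(subs, subject):
    """Simultaneously perform all substitutions on the subject string."""
    pattern = '|'.join('(%s)' % re.escape(p) for p, s in subs)
    substs = [s for p, s in subs]
    return re.sub(pattern, lambda m: substs[m.lastindex - 1], subject)


def replace_in_file(path, subs, encoding='utf-8'):
    """Rewrite one file in place, keeping a .bak copy of the original."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + '.bak'))
    # newline='' leaves the file's existing line endings untouched.
    with open(path, encoding=encoding, newline='') as fh:
        text = fh.read()
    with open(path, 'w', encoding=encoding, newline='') as fh:
        fh.write(multisub(subs, text))


# Demo on a throwaway file with hypothetical content.
demo_path = Path(tempfile.mkdtemp()) / 'demo.csv'
with open(demo_path, 'w', encoding='utf-8', newline='') as fh:
    fh.write('1\u2019,4\u201c5\u201d,3\u2013\n')
replace_in_file(demo_path, SUBS)
print(demo_path.read_text(encoding='utf-8'))
```

To apply it to a folder, loop over glob.glob(this) and call replace_in_file on each path.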
