I'm using PDFMiner to convert a PDF file to an HTML file.
Code that doesn't work:
import subprocess

def pdf2html(filename, path):
    # Derive the output name from the input name, e.g. foo.pdf -> foo.html
    outfile_name = filename.split('.')[0] + '.html'
    cmd = ['pdf2txt.py', '-o', path + outfile_name, path + filename]
    print(' '.join(cmd))  # show the command that is supposed to run
    subprocess.call(cmd, shell=True)

filename = "040214_MOOCs.pdf"
path = "/Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/"
pdf2html(filename, path)
The above code is supposed to run "pdf2txt.py -o /Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/040214_MOOCs.html /Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/040214_MOOCs.pdf" in the shell.
But the code above produces no output file (040214_MOOCs.html). If I run the same command directly in the shell, it generates the output with no problem.
Then I tried the following script, and it works. The only difference is that it uses os.system instead of subprocess.call:
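For reference, the argument list can be checked without running anything. This is only a minimal sketch, assuming the same filename and path as above; build_cmd is a hypothetical helper name (not part of PDFMiner), and it swaps the string concatenation for os.path.splitext and os.path.join, which should give the same result for these inputs:

```python
import os.path

def build_cmd(filename, path):
    # Hypothetical helper: builds the argument list only, runs nothing.
    # os.path.splitext drops just the final extension (an assumption about
    # the intent; the original uses filename.split('.')[0]).
    outfile_name = os.path.splitext(filename)[0] + '.html'
    return ['pdf2txt.py', '-o',
            os.path.join(path, outfile_name),
            os.path.join(path, filename)]

filename = "040214_MOOCs.pdf"
path = "/Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/"
cmd = build_cmd(filename, path)
print(' '.join(cmd))
```

The printed string matches the command quoted above, so the argument list itself looks correct.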
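import os

def pdf2html(filename, path):
    # Derive the output name from the input name, e.g. foo.pdf -> foo.html
    outfile_name = filename.split('.')[0] + '.html'
    cmd = ['pdf2txt.py', '-o', path + outfile_name, path + filename]
    print(' '.join(cmd))  # show the command that is supposed to run
    os.system(' '.join(cmd))  # pass the whole command as one string

filename = "040214_MOOCs.pdf"
path = "/Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/"
pdf2html(filename, path)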
Also, if I set shell=False in the first (non-working) version, it works as well. Why is that? Why does subprocess.call with shell=True fail here while os.system works? I find this confusing and would appreciate an explanation.
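The difference seems to be reproducible with a harmless command. The sketch below assumes a POSIX system (the paths above suggest macOS) and uses echo as a stand-in for pdf2txt.py; run_three_ways is a hypothetical helper name. According to the subprocess documentation, when a list is combined with shell=True on POSIX, only the first item is the command string and the remaining items become arguments to the shell itself, so the program would start without its arguments:

```python
import subprocess

def run_three_ways(args):
    """Run the same argument list three ways; return each call's stdout."""
    outputs = (
        # List + shell=True: roughly `sh -c args[0] args[1] ...`, so the
        # extra items go to the shell, not to the program.
        subprocess.check_output(args, shell=True),
        # List + shell=False (the default): the program receives every item.
        subprocess.check_output(args),
        # Single string + shell=True: comparable to os.system(' '.join(args)).
        subprocess.check_output(' '.join(args), shell=True),
    )
    return [out.decode().strip() for out in outputs]

results = run_three_ways(['echo', 'hello'])
print(results)
```

If this reading is right, the first call runs echo with no arguments and prints only an empty line, while the other two print hello. That would match what I see: pdf2txt.py probably runs without the -o option and the input file, so no HTML file is written.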