I'm using PDFMiner to convert a PDF file to an HTML file.

Wrong Code:

import subprocess

def pdf2html(filename, path):
    outfile_name = filename.split('.')[0] + '.html'
    cmd = ['pdf2txt.py', '-o', path + outfile_name, path + filename]
    print ' '.join(cmd)
    subprocess.call(cmd, shell=True)

filename = "040214_MOOCs.pdf"
path = "/Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/"
pdf2html(filename, path)

The above code is supposed to run "pdf2txt.py -o /Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/040214_MOOCs.html /Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/040214_MOOCs.pdf" in the shell.

But the code above produces no output file (040214_MOOCs.html). If I run the same command directly in the shell, it generates the output with no problem.

Then I tried the following script, and it works. The only difference is that it uses os.system instead of subprocess.call:

import os

def pdf2html(filename, path):
    outfile_name = filename.split('.')[0] + '.html'
    cmd = ['pdf2txt.py', '-o', path + outfile_name, path + filename]
    print ' '.join(cmd)
    os.system(' '.join(cmd))

filename = "040214_MOOCs.pdf"
path = "/Users/andy/GoogleDrive/Debate/intelligencesquaredus/data/"
pdf2html(filename, path)

Also, if I set shell=False in the wrong code, it works too. Why is that the case? Why does subprocess.call fail here while os.system works? This is very confusing, and I'd appreciate an explanation.

1 Answer

This is likely caused by a shell mismatch. Try running your subprocess call without shell=True.
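
One documented detail may also be relevant here: on POSIX, when a list is passed together with shell=True, only the first item is used as the command string, and the remaining items become arguments to the shell itself rather than to the command. If that applies, pdf2txt.py would have been started with no arguments at all. The sketch below illustrates this with echo standing in for pdf2txt.py, so it runs without PDFMiner installed. The helper names run and build_cmd are made up for illustration, and the behaviour shown assumes a POSIX system where shell=True uses /bin/sh.

```python
import os
import subprocess


def run(cmd, shell):
    """Run cmd and return its standard output as stripped text."""
    return subprocess.check_output(cmd, shell=shell).decode().strip()


def build_cmd(filename, path):
    """Build the pdf2txt.py argument list (hypothetical helper, not from the question)."""
    base, _ext = os.path.splitext(filename)
    return ['pdf2txt.py', '-o',
            os.path.join(path, base + '.html'),
            os.path.join(path, filename)]


# 'echo' stands in for pdf2txt.py so the effect is visible without PDFMiner.
args = ['echo', 'one', 'two']

# List without a shell: echo receives both arguments.
list_no_shell = run(args, shell=False)

# List with a shell (POSIX): this runs sh -c 'echo' one two, so echo itself
# gets no arguments; 'one' and 'two' go to the shell instead.
list_with_shell = run(args, shell=True)

# Single string with a shell: comparable to os.system(' '.join(args)).
string_with_shell = run(' '.join(args), shell=True)
```

With this reading, the two forms that should behave as intended are subprocess.call(build_cmd(filename, path)) with the default shell=False, or a single joined string with shell=True. The list form also avoids quoting problems if the path ever contains spaces.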

1 Comment

With shell=True, the shell that gets invoked may differ from the one you are currently using, so it may not have the same PATH and other settings.
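
A quick way to check whether a PATH difference is actually involved is to ask the spawned shell what it sees. This is a minimal sketch, assuming a POSIX system where shell=True starts /bin/sh; the function names shell_path and same_path are made up for illustration.

```python
import os
import subprocess


def shell_path():
    """Return the PATH seen by the shell that subprocess starts with shell=True."""
    out = subprocess.check_output('echo "$PATH"', shell=True)
    return out.decode().strip()


def same_path():
    """Return True if that shell sees the same PATH as this Python process."""
    return shell_path() == os.environ.get('PATH', '')
```

If same_path() returns True, the spawned shell inherited the PATH of the Python process, and a PATH mismatch is probably not the cause of the missing output.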
