Wrapping bash scripts in python

Question

I just found this great wget wrapper and I'd like to rewrite it as a Python script using the subprocess module. However, it is turning out to be quite tricky, and I keep getting all sorts of errors.

download()
{
    local url=$1
    echo -n "    "
    wget --progress=dot $url 2>&1 | grep --line-buffered "%" | \
    sed -u -e "s,\.,,g" | awk '{printf("\b\b\b\b%4s", $2)}'

    echo -ne "\b\b\b\b"
    echo " DONE"
}

Then it can be called like this:

file="patch-2.6.37.gz"
echo -n "Downloading $file:"
download "http://www.kernel.org/pub/linux/kernel/v2.6/$file"

Any ideas?

Source: http://fitnr.com/showing-file-download-progress-using-wget.html

You'll need to show us what you have tried in Python so that we'll be able to help you.
– UltraInstinct, Commented Dec 5, 2013 at 7:06
Basically nothing yet! I am currently lost in the subprocess documentation. Ideally I'd like an insightful explanation of a proposed solution, so that I can properly grasp the subprocess module and build on it.
– stratis, Commented Dec 5, 2013 at 7:12
Alright, so far I did this:
wgetExecutable = '/usr/bin/wget'
grepExecutable = '/usr/grep'
wgetParameters = ['--progress=dot', "link_to_file"]
grepParameters = ['--line-buffered', "%"]
wgetPopen = subprocess.Popen([wgetExecutable] + wgetParameters, stdout=subprocess.PIPE)
– stratis, Commented Dec 5, 2013 at 10:22
grepPopen = subprocess.Popen([grepExecutable] + grepParameters, stdin=wgetPopen.stdout)
However, I get an error at stdin=wgetPopen.stdout: OSError: [Errno 2] No such file or directory
– stratis, Commented Dec 5, 2013 at 10:29
Note that there is also an sh module (with that name) that can take care of the bridge between bash and Python!
– PascalVKooten, Commented Dec 9, 2013 at 7:50
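
The `OSError: [Errno 2]` reported in the comments above is what `Popen` raises when the executable itself cannot be found, so the `'/usr/grep'` path is the likely culprit; passing the bare name `grep` would let `PATH` resolve it. Below is a minimal sketch of chaining two processes the way a shell pipe does. `pipe_through` is a hypothetical helper name, and the child commands in the usage section are stand-ins written in Python, not wget and grep themselves.

```python
import subprocess
import sys


def pipe_through(producer_argv, consumer_argv):
    """Run `producer | consumer` and return the consumer's stdout as bytes.

    Hypothetical helper for illustration; both arguments are argv lists.
    """
    producer = subprocess.Popen(producer_argv, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
    consumer = subprocess.Popen(consumer_argv, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Close our copy of the read end so the producer gets SIGPIPE
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


if __name__ == "__main__":
    # Stand-in children so the sketch runs without wget or grep; the real
    # pipeline would use ["wget", "--progress=dot", url] and
    # ["grep", "--line-buffered", "%"].
    emit = [sys.executable, "-c",
            "print('10%'); print('noise'); print('20%')"]
    keep = [sys.executable, "-c",
            "import sys; "
            "sys.stdout.writelines(l for l in sys.stdin if '%' in l)"]
    print(pipe_through(emit, keep))
```

Each extra stage of the pipeline follows the same pattern: the next `Popen` takes the previous process's `stdout` as its `stdin`.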

dfarrell07 · Accepted Answer · 2014-10-18 18:02:51Z

5

+100

I think you're not far off. Mainly I'm wondering, why bother with running pipes into grep and sed and awk when you can do all that internally in Python?

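#! /usr/bin/env python

import re
import subprocess
import sys

TARGET_FILE = "linux-2.6.0.tar.xz"
TARGET_LINK = "http://www.kernel.org/pub/linux/kernel/v2.6/%s" % TARGET_FILE

wgetExecutable = '/usr/bin/wget'
wgetParameters = ['--progress=dot', TARGET_LINK]

# wget writes its progress to stderr, so merge stderr into the stdout pipe.
wgetPopen = subprocess.Popen([wgetExecutable] + wgetParameters,
                             stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

# Read line by line as output arrives; readline() returns b'' at EOF.
for line in iter(wgetPopen.stdout.readline, b''):
    match = re.search(br'\d+%', line)  # bytes pattern, since the pipe yields bytes
    if match:
        # Flush after each write so the percentage is not held in stdout's buffer.
        sys.stdout.write('\b\b\b\b' + match.group(0).decode())
        sys.stdout.flush()

wgetPopen.stdout.close()
wgetPopen.wait()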
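
To check the parsing step without a network connection, the regular expression can be exercised on canned input. This is only a sketch: `extract_percent` is a hypothetical helper name, and the sample line is an assumption about what `wget --progress=dot` prints, so real output may differ.

```python
import re

PERCENT = re.compile(br"(\d+)%")


def extract_percent(line):
    """Return the percentage found in a line of wget output, or None."""
    match = PERCENT.search(line)
    return int(match.group(1)) if match else None


# Assumed shape of a `wget --progress=dot` line; real output may differ.
sample = b"    50K .......... .......... .......... 25%  101K 9s\n"
print(extract_percent(sample))
print(extract_percent(b"Connecting to www.kernel.org... connected.\n"))
```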

edited Oct 18, 2014 at 18:02

dfarrell07

answered Dec 9, 2013 at 4:53

Tim Pierce


10 Comments

Tim Pierce Over a year ago

It does. Try on a smaller file. Or wait a little longer. :-)

stratis Over a year ago

Your code seems to update on some sort of intervals and in this file for example the first progress indication is only after 25%. However I need the progress to be instantaneous from the start just like the bash script..!

Tim Pierce Over a year ago

On my machine the behavior of this script is identical to the behavior of the bash script you posted. They both produce line-buffered output at the same rate. I'd be happy to adjust the script to do something different but I'm not able to reproduce the behavior you're talking about. I suspect that you're just seeing different response times for different files.

Tim Pierce Over a year ago

Ah: I get results closer to what you describe if I use awk -W interactive in the bash script. I'll poke at this some more later and see if I need to do something special to force line-buffered output in subprocess.

jfs Over a year ago

+1. wgetPopen.stdout might be destroyed (I expect so, but I don't know). As with ordinary files, it is better to close pipes explicitly (a with-statement is used for files) rather than rely on garbage collection, which is complex and hard to reason about. if not obj says "if obj is empty or zero" (a test for None should be written as if obj is None) without regard to type; e.g., in Python 3 pipe.readline() may return b'' or '', which are different types, and if not line works for both. It also supports both Python 2 and 3 from the same source.


jfs · Accepted Answer · 2013-12-09 18:54:32Z

If you are rewriting the script in Python, you could replace wget with urllib.urlretrieve() in this case:

#!/usr/bin/env python
import os
import posixpath
import sys
import urllib
import urlparse

def url2filename(url):
    """Return basename corresponding to url.

    >>> url2filename('http://example.com/path/to/file?opt=1')
    'file'
    """
    urlpath = urlparse.urlsplit(url).path  # pylint: disable=E1103
    basename = posixpath.basename(urllib.unquote(urlpath))
    if os.path.basename(basename) != basename:
        raise ValueError  # refuse 'dir%5Cbasename.ext' on Windows
    return basename

def reporthook(blocknum, blocksize, totalsize):
    """Report download progress on stderr."""
    readsofar = blocknum * blocksize
    if totalsize > 0:
        percent = readsofar * 1e2 / totalsize
        s = "\r%5.1f%% %*d / %d" % (
            percent, len(str(totalsize)), readsofar, totalsize)
        sys.stderr.write(s)
        if readsofar >= totalsize: # near the end
            sys.stderr.write("\n")
    else: # total size is unknown
        sys.stderr.write("read %d\n" % (readsofar,))

url = sys.argv[1]
filename = sys.argv[2] if len(sys.argv) > 2 else url2filename(url)
urllib.urlretrieve(url, filename, reporthook)

Example:

$ python download-file.py http://example.com/path/to/file

It downloads the URL to a file. If no filename is given, it uses the basename from the URL.
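
The script above targets Python 2. On Python 3 the same functions live in `urllib.request` and `urllib.parse`. Here is a sketch of the port, with the progress text split into a hypothetical `format_progress` helper so that it can be checked without downloading anything:

```python
import os
import posixpath
import sys
from urllib.parse import unquote, urlsplit
from urllib.request import urlretrieve


def url2filename(url):
    """Return the basename corresponding to url."""
    basename = posixpath.basename(unquote(urlsplit(url).path))
    if os.path.basename(basename) != basename:
        raise ValueError(url)  # refuse 'dir%5Cbasename.ext' on Windows
    return basename


def format_progress(blocknum, blocksize, totalsize):
    """Build the progress text for one reporthook call."""
    readsofar = blocknum * blocksize
    if totalsize <= 0:  # total size is unknown
        return "read %d\n" % readsofar
    percent = readsofar * 1e2 / totalsize
    text = "\r%5.1f%% %*d / %d" % (
        percent, len(str(totalsize)), readsofar, totalsize)
    # Finish the line once everything has been read.
    return text + ("\n" if readsofar >= totalsize else "")


def reporthook(blocknum, blocksize, totalsize):
    """Report download progress on stderr."""
    sys.stderr.write(format_progress(blocknum, blocksize, totalsize))


if __name__ == "__main__" and len(sys.argv) > 1:
    url = sys.argv[1]
    filename = sys.argv[2] if len(sys.argv) > 2 else url2filename(url)
    urlretrieve(url, filename, reporthook)
```

It is invoked the same way as the Python 2 version, with the URL as the first argument and an optional filename as the second.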

You could also run wget if you need it:

#!/usr/bin/env python
import sys
from subprocess import Popen, PIPE, STDOUT

def urlretrieve(url, filename=None, width=4):
    destination = ["-O", filename] if filename is not None else []
    p = Popen(["wget"] + destination + ["--progress=dot", url],
              stdout=PIPE, stderr=STDOUT, bufsize=1) # line-buffered (out side)
    for line in iter(p.stdout.readline, b''):
        if b'%' in line: # grep "%"
            line = line.replace(b'.', b'') # sed -u -e "s,\.,,g"
            percents = line.split(None, 2)[1].decode() # awk $2
            sys.stderr.write("\b"*width + percents.rjust(width))
    p.communicate() # close stdout, wait for child's exit
    print("\b"*width + "DONE")

url = sys.argv[1]
filename = sys.argv[2] if len(sys.argv) > 2 else None
urlretrieve(url, filename)

I have not noticed any buffering issues with this code.

sunus · Accepted Answer · 2013-12-10 07:32:10Z

2

I've done something like this before, and I'd love to share my code with you :)

#!/usr/bin/python2.7
# encoding=utf-8

import sys
import os
import datetime

SHEBANG = "#!/bin/bash\n\n"

def get_cmd(editor='vim', initial_cmd=""):
    from subprocess import call
    from tempfile import NamedTemporaryFile
    # Create the initial temporary file.
    with NamedTemporaryFile(delete=False) as tf:
        tfName = tf.name
        tf.write(initial_cmd)
    # Fire up the editor.
    if call([editor, tfName], shell=False) != 0:
        # Editor died or was killed.
        return None
    # Get the modified content.
    fd = open(tfName)
    res = fd.read()
    fd.close()
    os.remove(tfName)
    return res

def main():
    initial_cmd = "wget " + sys.argv[1]
    cmd  = get_cmd(editor='vim', initial_cmd=initial_cmd)
    if len(sys.argv) > 1 and sys.argv[1] == 's':
        #keep the download infomation.
        t = datetime.datetime.now()
        filename = "swget_%02d%02d%02d%02d%02d" %\
                (t.month, t.day, t.hour, t.minute, t.second)
        with open(filename, 'w') as f:
            f.write(SHEBANG)
            f.write(cmd)
            f.close()
            os.chmod(filename, 0777)
    os.system(cmd)

main()


# Run this script with the optional argument 's'.
# Copy the command into the editor, then save and quit; the download
# will begin. If you passed the argument 's', this script also creates
# another executable script, which you can use to resume an
# interrupted download (if the server supports it).

So, basically, you just need to change the value of initial_cmd; in your case, it's:

wget --progress=dot $url 2>&1 | grep --line-buffered "%" | \
    sed -u -e "s,\.,,g" | awk '{printf("\b\b\b\b%4s", $2)}'

This script first creates a temp file, puts the shell commands in it, and gives it execute permissions. Finally, it runs the temp file with the commands in it.

answered Dec 10, 2013 at 7:32

sunus

2 Comments

jfs Over a year ago

I'd love to give you some feedback :) You could call(filename) instead of os.system(cmd). To format the datetime, you could use the .strftime() method. The with-statement closes files automatically (that is the point of using it in the first place), so there is no need to call f.close() by hand (unindent chmod in this case). If you want to make the script executable by your user: os.chmod(filename, os.stat(filename).st_mode | stat.S_IEXEC) (or | 0111 for +x). To avoid leaking files, move the code inside with Named..File() as tf:, call tf.flush() before call([editor..), then tf.seek(0); res=tf.read()

sunus Over a year ago

@J.F.Sebastian wow, thank you, man! It's a script I wrote a long time ago. I was a bad Python programmer back then :) Thank you for pointing that out!
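
The suggestions in the feedback comment above can be sketched as follows. `save_script` is a hypothetical helper name, the `swget_` prefix is taken from the answer's code, and the command in the usage section is only a placeholder.

```python
import datetime
import os
import stat
import tempfile

SHEBANG = "#!/bin/bash\n\n"


def save_script(cmd, directory="."):
    """Write cmd to a timestamped, user-executable script; return its path."""
    stamp = datetime.datetime.now().strftime("%m%d%H%M%S")
    filename = os.path.join(directory, "swget_" + stamp)
    with open(filename, "w") as f:  # closed automatically on leaving the block
        f.write(SHEBANG)
        f.write(cmd)
    # Add execute permission for the owner only, keeping the other mode bits.
    os.chmod(filename, os.stat(filename).st_mode | stat.S_IEXEC)
    return filename


if __name__ == "__main__":
    # Placeholder command; nothing is downloaded, the file is only written.
    path = save_script("wget --continue http://example.com/file\n",
                       directory=tempfile.mkdtemp())
    print(path)
```

The saved file can then be run with `subprocess.call([path])` rather than `os.system(cmd)`.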

Henk Langeveld · Accepted Answer · 2013-12-10 09:29:15Z

1

vim download.py

#!/usr/bin/env python

import subprocess
import os

sh_cmd = r"""
download()
{
    local url=$1
    echo -n "    "
    wget --progress=dot $url 2>&1 |
        grep --line-buffered "%"  |
        sed -u -e "s,\.,,g"       |
        awk '{printf("\b\b\b\b%4s", $2)}'

    echo -ne "\b\b\b\b"
    echo " DONE"
}
download "http://www.kernel.org/pub/linux/kernel/v2.6/$file"
"""

cmd = 'sh'
p = subprocess.Popen(cmd, 
    shell=True,
    stdin=subprocess.PIPE,
    env=os.environ
)
p.communicate(input=sh_cmd.encode())  # the pipe expects bytes on Python 3

# or:
# p = subprocess.Popen(cmd,
#    shell=True,
#    stdin=subprocess.PIPE,
#    env={'file':'xx'})
# 
# p.communicate(input=sh_cmd)

# or:
# p = subprocess.Popen(cmd, shell=True,
#    stdin=subprocess.PIPE,
#    stdout=subprocess.PIPE,
#    stderr=subprocess.PIPE,
#    env=os.environ)
# stdout, stderr = p.communicate(input=sh_cmd)

Then you can call it like this:

file="xxx" python download.py

edited Dec 10, 2013 at 9:29

Henk Langeveld

answered Dec 9, 2013 at 6:26

atupal

3 Comments

Martijn Pieters Over a year ago

Why use sh as the command, and use shell=True? Why not run sh_cmd directly?

atupal Over a year ago

@MartijnPieters Because sh_cmd is not a "shell command", so we use sh to run it. In a Linux shell we can use sh script.sh, and we can also use a pipe or stdin to run commands, such as: cat some_file | sh or curl http://xxx.xx | sh and so on. As for shell=True, the docs say: The shell argument (which defaults to False) specifies whether to use the shell as the program to execute. If shell is True, it is recommended to pass args as a string rather than as a sequence.

Martijn Pieters Over a year ago

If you set shell=True a shell is used to run the command you pass in. You quoted the documentation yourself there.
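
The point made in these comments can be sketched like this: a multi-line script can be fed to `sh` on stdin without `shell=True`, since `shell=True` would only start one more shell in order to launch `sh`. The sketch assumes a POSIX `sh` on `PATH` and Python 3.7 or later (for `capture_output` and `text`); `run_script` and the one-line script are illustrative only.

```python
import os
import subprocess


def run_script(script, **variables):
    """Feed script to `sh` on stdin; return (exit status, stdout text)."""
    env = dict(os.environ, **variables)  # keep PATH, add the extra variables
    result = subprocess.run(["sh"], input=script, env=env,
                            capture_output=True, text=True)
    return result.returncode, result.stdout


if __name__ == "__main__":
    # `file` plays the role of $file in the answer's download() call.
    print(run_script('echo "Downloading $file"\n', file="patch-2.6.37.gz"))
```

Passing the variable through `env` mirrors the `file="xxx" python download.py` invocation, but keeps the value inside the Python program.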

securecurve · Accepted Answer · 2013-12-09 07:45:50Z

0

In very simple terms: given a script.sh file, you can execute it and print its exit status:

import subprocess
process = subprocess.Popen('/path/to/script.sh', shell=True, stdout=subprocess.PIPE)
# communicate() drains stdout while waiting; wait() alone can block if the pipe fills up.
output, _ = process.communicate()
print(process.returncode)

answered Dec 9, 2013 at 7:45

securecurve

2 Comments

atupal Over a year ago

And ensure script.sh has execute permission (chmod +x script.sh), or use Popen('sh /path/to/script.sh', shell=True ...)

securecurve Over a year ago

Sure, it must have execute permission (+x); otherwise it will give you an error. With that in place, the above Python code should work like a charm!
