
I have a basic sourcing function:

import os
import subprocess

def source(
    fileName = None,
    update   = True
    ):
    # Source the script in a subshell, then print the resulting environment.
    pipe = subprocess.Popen(". {fileName}; env".format(
        fileName = fileName
    ), stdout = subprocess.PIPE, shell = True)
    data = pipe.communicate()[0]
    # Naive parse: assumes every line of the env output is NAME=value.
    env = dict(line.split("=", 1) for line in data.splitlines())
    if update:
        os.environ.update(env)
    return env

When I try to use it to source a particular script, I get the following error:

>>> source("/afs/cern.ch/sw/lcg/contrib/gcc/4.8/x86_64-slc6/setup.sh")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in source
ValueError: dictionary update sequence element #51 has length 1; 2 is required

This arises from the following lines returned by the executable env:

BASH_FUNC_module()=() {  eval `/usr/bin/modulecmd bash $*`
}

The closing curly brace is on line 51 of the env output.
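
The failure is easy to reproduce in isolation: any line of the env output that contains no = sign (such as that lone closing brace) makes line.split("=", 1) yield a one-element sequence, which dict() rejects. A minimal sketch with made-up data:

# Made-up data: the third physical line, "}", contains no "=".
data = "HOME=/home/user\nBASH_FUNC_module()=() {  eval `/usr/bin/modulecmd bash $*`\n}"
# split("=", 1) turns "}" into a one-element list, so dict() raises:
# ValueError: dictionary update sequence element #2 has length 1; 2 is required
env = dict(line.split("=", 1) for line in data.splitlines())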

How should one source a Bash script from within Python in a robust, sensible way such that errors like this (and any other likely ones you can think of) are avoided?

  • Why are you sourcing a shell script in python like this? What are you trying to do? Make shell variables into python variables? Commented Feb 12, 2015 at 0:05
  • What do you expect to happen when these lines are encountered? What you're running into is Bash shell script code, not just environment variables. Commented Feb 12, 2015 at 0:05
  • @Etan Reisner: I'm actually trying to run a Python script that sets up a certain environment, which then makes Python modules (bound to a larger infrastructure) available for import. @duskwuff: I'm trying to make the environment that sourcing the shell script would create available in the Python process that ran the source procedure. In what way should I be doing this? The basic approach I have currently is not reliable enough at all. Commented Feb 12, 2015 at 0:14
  • But isn't a "python module" simply a folder with python files in it (with at least one file called __init__.py)? I will try to answer your question, but your stated goal doesn't make sense to me. I don't know how a python script creates a python module, unless you are dynamically making files in the filesystem in the script. Commented Feb 12, 2015 at 0:32
  • Also (and someone can correct me if I'm wrong), you simply can't change the python process's ENV vars by creating a subprocess. When the subprocess sources the bash script, only the subprocess has its ENV vars changed. Thus, after it exits, your script's process will have no changes. Commented Feb 12, 2015 at 0:35

2 Answers


The line you are seeing is the result of the script doing the following:

module() { eval `/usr/bin/modulecmd bash $*`; }
export -f module

That is, it is explicitly exporting the bash function module so that sub(bash)shells can use it.
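
You can observe this mechanism yourself with a quick sketch (the function name demo is arbitrary; the entry's name will vary with the bash patch level, e.g. BASH_FUNC_demo%% on a post-shellshock bash):

import subprocess

# Define and export a shell function, then dump the environment:
# the function body appears as a multi-line environment entry.
print(subprocess.check_output(
    ["bash", "-c", "demo() { echo hi; }; export -f demo; env | grep -A 1 demo"]
).decode())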

We can tell from the format of the environment variable that you upgraded your bash in the middle of the shellshock patches. I don't think there is a current patch which would generate BASH_FUNC_module()= instead of BASH_FUNC_module%%()=, but iirc there was such a patch distributed during the flurry of fixes. You might want to upgrade your bash again now that things have settled down. (If that was a cut-and-paste error, ignore this paragraph.)

And we can also tell that /bin/sh on your system is bash, assuming that the module function was introduced by sourcing the shell script.

Probably you should decide whether you care about exported bash functions. Do you want to export module into the environment you are creating, or just ignore it? The solution below just returns what it finds in the environment, so it will include module.

In short, if you're going to parse the output of some shell command which tries to print the environment, you're going to have three possible issues:

  1. Exported functions (bash only), which look different pre- and post-shellshock patch, but always contain at least one newline. (Their value always starts with () { so they are easy to identify. Post-shellshock, their names will be BASH_FUNC_funcname%%, but as long as both pre- and post-patched bashes are found in the wild, you might not want to rely on that.)

  2. Exported variables which contain a newline.

  3. In some cases, exported variables with no value at all. These actually have the value of an empty string, but it is possible for them to be in the environment list without an = sign, and some utilities will print them out without an =. (A heuristic parsing sketch that tolerates these issues follows this list.)
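
For illustration, here is a minimal, heuristic sketch of a line-based parser that tolerates issues 1 and 3 (the "() {" and closing-brace detection are heuristics of mine, not guarantees). Issue 2, arbitrary newlines inside ordinary values, is fundamentally ambiguous in line-based output, which is exactly why the NUL-delimited approach below is preferable:

def parse_env_lines(data):
    # Heuristic parser for `env` output. Exported bash functions are
    # recognised by a value starting with "() {" and consumed until a
    # line ending in "}"; lines with no "=" are treated as variables
    # with an empty value. Newlines inside ordinary values cannot be
    # recovered here, so this is best-effort only.
    env = {}
    lines = iter(data.splitlines())
    for line in lines:
        if "=" not in line:
            env[line] = ""                 # issue 3: no "=" printed
            continue
        name, _, value = line.partition("=")
        if value.startswith("() {"):       # issue 1: exported function
            parts = [value]
            while not parts[-1].rstrip().endswith("}"):
                parts.append(next(lines, "}"))  # stop gracefully if truncated
            value = "\n".join(parts)
        env[name] = value
    return env

(Recent GNU coreutils env also offers a -0/--null flag that produces NUL-delimited output directly, sidestepping issue 2, though that is not portable to all systems.)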

As always, the most robust (and possibly even simplest) solution would be to avoid parsing, but we can fall back on the strategy of parsing a formatted string we create ourselves, which is carefully designed to be parsed.

We can use any programming language with access to the environment to produce this output; for simplicity, we can use python itself. We'll output the environment variables in a very simple format: the variable name (which must be alphanumeric), followed by an equal sign, followed by the value, followed by a NUL (0) byte (which cannot appear in the value). Something like the following:

from subprocess import Popen, PIPE

# The commented-out line really should not be necessary; it's impossible
# for an environment variable name to contain an =. However, it could
# be replaced with a more stringent check.
prog = ( r'''from os import environ;'''
       + r'''from sys import stdout;'''
       + r'''stdout.write("\0".join("{0}={1}".format(*kv)'''
       + r'''                       for kv in environ.iteritems()'''
      #+ r'''                       if "=" not in kv[0]'''
       + r'''            ))'''
       )

# Lots of error checking omitted.    
def getenv_after_sourcing(fn):
  argv = [ "bash"
         , "-c"
         , '''. "{fn}"; python -c '{prog}' '''.format(fn=fn, prog=prog)]
  data = Popen(argv, stdout=PIPE).communicate()[0]
  return dict(kv.split('=', 1) for kv in data.split('\0'))
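
Hypothetical usage, reusing the path from the question (this assumes getenv_after_sourcing above is in scope):

import os

env = getenv_after_sourcing("/afs/cern.ch/sw/lcg/contrib/gcc/4.8/x86_64-slc6/setup.sh")
os.environ.update(env)  # adopt the sourced environment, module entry and all

The NUL delimiter is the crucial design choice here: it is the one byte that can never occur inside an environment value, so no quoting or escaping is needed when splitting.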

I think it is generally better to use bash directly to set the environment and then invoke the python script in the already-set environment. This takes advantage of a core unix/linux principle: a child process inherits a copy of its parent's environment.

If I understood your situation correctly, you have some bash scripts that set up an environment which you want available in your python scripts. Those python scripts then use that prepared environment to set up some more environment for further tools.

I suggest the following setup:

  1. a bash wrapper

    • set the environment using bash scripts
    • invoke your python setup script (the python script inherits the environment from the bash script)
  2. your current python scripts, minus the subprocess call and environment parsing

    • start in the environment prepared by the bash wrapper above
    • continue preparing the environment for the next tools

This way you can use each script in its "native environment".
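
As a concrete sketch, with all file and variable names being illustrative assumptions rather than anything from the question: the wrapper can be a two-line bash script that runs . /path/to/setup.sh and then exec python my_setup.py "$@". The Python script it launches finds the sourced variables directly in its inherited environment, with no parsing needed:

# my_setup.py -- hypothetical child script started by the bash wrapper.
import os
import subprocess

# Everything the wrapper sourced is simply inherited:
print(os.environ.get("MODULEPATH"))           # illustrative variable name

# Continue preparing the environment for the next tools...
os.environ["EXTRA_TOOL_HOME"] = "/opt/extra"  # hypothetical path
# ...and launch them; each child inherits the extended environment.
subprocess.call(["env"])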

An alternative would be to translate the bash scripts to python manually.


  • This may or may not be possible in the general case. In my case, I need to load modules from python, depending on some programmatically defined cases, so your approach will not work for that.
