I'm working on a Nextflow pipeline that uses a custom module. The module includes a Python script (script_1.py) located in a nested folder, <module-dir>/resources/usr/bin. I have made script_1.py executable and set nextflow.enable.moduleBinaries to true in ./nextflow.config. However, when I run the pipeline, I get an error saying the Python script cannot be found.

Module directory structure

modules/
└── local/
    └── mymodule/
        ├── environment.yml
        ├── main.nf
        ├── resources/
        │   └── usr/
        │       └── bin/
        │           └── script_1.py
        └── work/

Error message

Here's the error I get when running the pipeline:

Caused by:
  Process `MyProcess (1)` terminated with an error exit status (2)

Command executed:

  python script_1.py

  cat <<-END_VERSIONS > versions.yml
      "MyProcess":
          python: $(python --version 2>&1 | sed 's/Python //g')
  END_VERSIONS

Command exit status:
  2

Command output:
  (empty)

Command error:
  python: can't open file 'script_1.py': [Errno 2] No such file or directory

What I tried

In my main.nf, I had the following:

#!/usr/bin/env nextflow

include { MyProcess } from './modules/local/mymodule/main.nf'

And in my ./modules/local/mymodule/main.nf, I had the following:

#!/usr/bin/env nextflow

process MyProcess{
    conda "${moduleDir}/environment.yml"

    input:
    path(input_folder)
    
    output:
    path("data.csv")
    path "versions.yml"                , emit: versions

    script:
    """
    python script_1.py ${input_folder}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        python: \$(python --version 2>&1 | sed 's/Python //g')
    END_VERSIONS
    """ 
    
}

But script_1.py is never found, and the process fails.

My question

Is this the correct way to reference such scripts in a module in Nextflow pipelines?

1 Answer

I suspect this is because you are not treating the Python script as a binary, as the wording in the docs suggests.

You run python script_1.py, which tells Python to look for script_1.py in the task's working directory rather than invoking the script as a binary found on the PATH. Instead, call the script directly as script_1.py, and make sure the shebang on its first line points to the correct interpreter.
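As a rough illustration of the script side, here is a minimal sketch of a script_1.py that starts with a shebang. The body is an assumption, since the question does not show what the script does: the list_files and write_csv helpers, the CSV columns, and the "." default folder are all hypothetical and only there so the example runs.

```python
#!/usr/bin/env python
"""Hypothetical stand-in for script_1.py; the real script's logic is not shown in the question."""
import csv
import sys
from pathlib import Path


def list_files(input_folder):
    """Return sorted (name, size) pairs for regular files in input_folder (illustrative only)."""
    folder = Path(input_folder)
    return sorted((p.name, p.stat().st_size) for p in folder.iterdir() if p.is_file())


def write_csv(rows, out_path="data.csv"):
    """Write the rows to out_path, preceded by a header line."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "bytes"])
        writer.writerows(rows)


def main(args):
    """Take the input folder as the first argument; fall back to '.' (an assumed default)."""
    folder = args[0] if args else "."
    write_csv(list_files(folder))


if __name__ == "__main__":
    main(sys.argv[1:])
```

With the shebang on the first line and the executable bit set, the process script would call script_1.py ${input_folder} directly instead of python script_1.py ${input_folder}.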

I usually just pass scripts in as value channels since it's easier and doesn't require Wave containers on GCP/AWS, so this answer is only my interpretation of the docs. Hope it works.

1 Comment

Making the shebang in the script point to the Python interpreter solved the problem! Thank you! I also moved the script from <module-dir>/resources/usr/bin to the ./bin directory, and the process worked fine there as well.
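For anyone hitting the same error, the two conditions discussed in this thread (a shebang line and the executable bit) can be checked before running the pipeline. The helper below is only a sketch: check_module_binary is a hypothetical name, not part of Nextflow, and it checks nothing beyond those two conditions.

```python
import stat
from pathlib import Path


def check_module_binary(path):
    """Return a list of problems that would stop `path` from running as a bare command."""
    script = Path(path)
    if not script.is_file():
        return [f"{script} does not exist"]

    problems = []
    # A script invoked by name needs a shebang so the OS knows which interpreter to use.
    with open(script, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("missing shebang line")

    # At least one execute bit (user, group, or other) must be set.
    mode = script.stat().st_mode
    if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        problems.append("not executable (run chmod +x)")
    return problems
```

Calling check_module_binary("modules/local/mymodule/resources/usr/bin/script_1.py") from the pipeline root would return an empty list when both conditions hold.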
