I'm running a Python script inside a Conda-based Docker container that processes geospatial data. The script runs a two-step GDAL workflow: it uses gdaldem color-relief to create a colorized GeoTIFF, then gdal2tiles.py to generate map tiles from that result.

gdaldem completes successfully every time. However, the script hangs indefinitely as soon as it calls gdal2tiles.py. It produces no error output and, surprisingly, even the timeout argument in subprocess.run does not trigger an exception. The whole process just freezes with this log:

2025-08-02 09:53:22,602 - INFO - Successfully created GeoTIFF: /app/geotiffs/skjav/reflectivity/reflectivity_20250802T094500Z.tif
2025-08-02 09:53:22,636 - INFO - Step 1: Colorizing /app/geotiffs/skjav/reflectivity/reflectivity_20250802T094500Z.tif with gdaldem.
2025-08-02 09:53:22,672 - INFO - Successfully colorized GeoTIFF to /app/static/tiles/skjav/reflectivity/20250802T094500Z/colorized.tif
2025-08-02 09:53:22,672 - INFO - Step 2: Generating tiles from /app/static/tiles/skjav/reflectivity/20250802T094500Z/colorized.tif with gdal2tiles.py.
<-- HANGS HERE -->

The code snippet in question:

    try:
        logging.info("coloring with gdaldem.")
        color_map_content = create_color_map_file(product_config['cmap'], product_config['vmin'],
                                                  product_config['vmax'])
        with open(color_file_path, 'w') as f:
            f.write(color_map_content)

        cmd_colorize = ['gdaldem', 'color-relief', geotiff_path, color_file_path, colorized_tiff_path, '-alpha']
        subprocess.run(cmd_colorize, check=True, capture_output=True, text=True, timeout=60)
        logging.info(f"colored geotiff to {colorized_tiff_path}")

  
        logging.info(f"generating tiles {colorized_tiff_path} with gdal2tiles.py.")
        cmd_gdal2tiles = [
            'gdal2tiles.py',
            '--profile=raster',
            '--zoom=5-12',
            '--webp-quality=90',
            colorized_tiff_path,
            output_tile_dir
        ]
        subprocess.run(cmd_gdal2tiles, check=True, capture_output=True, text=True, timeout=180)

        logging.info(f"success - {output_tile_dir}")
        return output_tile_dir

What could cause gdal2tiles.py to hang so completely that it ignores the timeout from Python's subprocess module? Is there a known issue with running gdal2tiles.py non-interactively from a Python script inside a Docker container that could lead to this kind of deadlock?

Ruled out Environment Path Issues: I added a diagnostic log (shutil.which('gdal2tiles.py')) which confirmed the script is correctly finding the modern version of gdal2tiles.py inside the conda environment (/opt/conda/envs/radar-env/bin/gdal2tiles.py).

Ruled out Multiprocessing: The hang occurs even with the --processes flag removed from the command.

Ruled out output format: The hang persists whether I use --webp-quality=90 or remove it to default to png tiles.

I also tried replacing subprocess.run with the lower-level subprocess.Popen and proc.communicate(timeout=). This also hung and failed to raise the TimeoutExpired exception.

  • you show try: but where is except: ? Maybe you have except: pass and it hides some errors. Commented Aug 2 at 11:06
  • @furas yeah sorry, I messed up and didn't include the catch part:

        except Exception as e:
            logging.error(f"An unexpected error occurred during tile generation: {e}", exc_info=True)
            return None
        finally:
            # TODO: for now this is manual clutter cleanup, look into better solutions
            if os.path.exists(color_file_path):
                os.remove(color_file_path)
            if os.path.exists(colorized_tiff_path):
                os.remove(colorized_tiff_path)

    Commented Aug 2 at 11:17
  • Always put it in the question, not in a comment. It will be more readable (because comments don't allow formatting) and more people may see it, so more people may help. Commented Aug 2 at 12:03
  • @furas Thank you for the tips and guidance, this is my first time asking here, and I didn't think you could edit your question after it was approved. But next time I will pay more attention to these Commented Aug 2 at 14:50

1 Answer

The issue is likely caused by gdal2tiles.py producing a lot of output, which fills up the output buffers that capture_output=True sets up for the subprocess and causes it to hang silently. Even with a timeout set, the call won't return while those buffers are full. To fix this, remove capture_output=True from your subprocess.run call so the output flows directly to the terminal and doesn't get stuck. This usually resolves the deadlock when running GDAL tools inside a Docker container.
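
The change described above can be sketched roughly as follows. This is only an illustration, not the exact code from the question: `run_tool` and `log_path` are hypothetical names, and the noisy child process is a stand-in for the real gdal2tiles.py command. The optional log file is an assumption on my part for cases where you still want to keep the tool's output somewhere.

```python
import subprocess
import sys


def run_tool(cmd, timeout, log_path=None):
    """Run cmd without piping its output back into this process.

    run_tool and log_path are hypothetical names used for illustration.
    With log_path=None the child inherits this process's stdout/stderr
    (inside Docker, that usually ends up in the container log).
    With a path, output is appended to that file instead.
    """
    if log_path is None:
        # No capture_output=True: nothing is buffered in Python.
        return subprocess.run(cmd, check=True, timeout=timeout)
    with open(log_path, "a") as log_file:
        return subprocess.run(
            cmd,
            check=True,
            timeout=timeout,
            stdout=log_file,
            stderr=subprocess.STDOUT,  # merge stderr into the same file
        )


if __name__ == "__main__":
    # Stand-in for the gdal2tiles.py command: a child that prints a lot.
    noisy = [sys.executable, "-c", "print('x' * 200_000)"]
    result = run_tool(noisy, timeout=30)
    print("exit code:", result.returncode)
```

In the question's code, the second call would then look something like `run_tool(cmd_gdal2tiles, timeout=180)`, with check=True still raising CalledProcessError on a non-zero exit.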

3 Comments

This actually gives me something in the log, so it's a step at least, now my app hangs at the log: "radar_backend | Generating Base Tiles:", which I assume comes from gdal2tiles.py, so at least it's showing a little life.
Never mind, this actually solved my issue, thank you so much.
if this solved your problem then you could mark it as accepted answer - it will be information for others that problem was solved. And few minutes later you can upvote it.
