
I am building a scanning service using FastAPI + Celery + PostgreSQL + SQLAlchemy + Nmap.

I have a Celery task that performs WHOIS, DNS lookups, IP lookups, and then port scanning using python-nmap. The issue is that when I add the port scanning part inside my Celery task, my entire API becomes very slow (requests take 10–20 seconds).

If I restart the application without triggering the task, the API works fine and responds in milliseconds.

Here’s a simplified version of my code:


@tasks_router.post(
    "/create",
    response_model=schemas.TaskRead,
    summary="Create a new scan task",
    description=(
            "Create a task for scanning an entity (domain or IP). The task is queued to the background worker.\n\n"
            "- For domains: performs WHOIS and DNS lookups.\n"
            "- For IPs: performs geolocation lookup via ipinfo.io."
    ),
)
def create_task(payload: schemas.TaskCreate, db: Session = Depends(get_db)):
    entity = payload.entity
    is_domain = not entity.replace(".", "").isdigit()  # crude check (schema regex already validated)

    task = models.IpDomainScanTask(
        user_id=payload.user_id,
        entity=entity,
        is_domain=is_domain,
        status=models.TaskStatus.pending,
        progress_message="Task created"
    )
    db.add(task)
    db.commit()
    db.refresh(task)

    # Send to Celery worker
    try:
        scan_entity.delay(str(task.id))
    except Exception as e:
        task.progress_message = f"Queueing failed: {e}"
        task.status = models.TaskStatus.failed
        db.commit()
        raise HTTPException(status_code=status.HTTP_503_SERVICE_UNAVAILABLE, detail="Task queue unavailable")

    return task

import nmap

def scan_ip(ip: str, mode: str = "fast") -> dict:
    scanner = nmap.PortScanner(nmap_search_path=[r"C:\Program Files (x86)\Nmap\nmap.exe"])
    arguments = "--top-ports 100 -sV --version-light -T4" if mode == "fast" else "-A"
    scanner.scan(ip, arguments=arguments)
    results = {"host": ip, "ports": [], "os": []}
    for host in scanner.all_hosts():
        # OS matches are only populated when nmap runs with OS detection (-A / -O)
        for osmatch in scanner[host].get("osmatch", []):
            results["os"].append({"name": osmatch["name"], "accuracy": osmatch["accuracy"]})
        for proto in scanner[host].all_protocols():
            for port in scanner[host][proto]:
                port_info = scanner[host][proto][port]
                results["ports"].append({
                    "port": port,
                    "protocol": proto,
                    "service": port_info.get("name", ""),
                    "state": port_info.get("state", ""),
                    "product": port_info.get("product", ""),
                    "version": port_info.get("version", ""),
                    "extra_info": port_info.get("extrainfo", "")
                })
    return results

And the Celery task:



@celery.task(name="tasks.scan_entity")
def scan_entity(task_id: str):
    db = CelerySessionLocal()
    try:
        try:
            task_uuid = uuid.UUID(task_id)
        except Exception:
            return {"error": "Invalid task id"}
        task = (
            db.query(models.IpDomainScanTask)
            .filter(models.IpDomainScanTask.id == task_uuid)
            .first()
        )
        # ... other scanning steps (WHOIS, DNS, IP lookup) ...

        # Whenever I add this block, all my API calls become very slow; if I remove it, they are fast again
        update_progress(db, task, "Starting port scan")
        db.close()
        os_records = []
        port_records = []
        for idx, ip in enumerate(['1.1.1.1'], start=1):

            # IP lookup

            # Port scanning
            scan_results = scan_ip(ip)

            # Store OS detection results
            for os_info in scan_results.get("os", []):
                os_record = models.OperatingSystem(
                    task_id=task.id,
                    ip=ip,
                    name=os_info["name"],
                    accuracy=os_info["accuracy"]
                )
                os_records.append(os_record)

            # Store port scan results
            for port_info in scan_results.get("ports", []):
                port_record = models.PortScan(
                    task_id=task.id,
                    ip=ip,
                    port=port_info["port"],
                    protocol=port_info["protocol"],
                    state=port_info["state"],
                    service=port_info["service"],
                    product=port_info["product"],
                    version=port_info["version"],
                    extra_info=port_info["extra_info"]
                )
                port_records.append(port_record)
            # update_progress(db, task, f"Port scan {idx}/{len(checkip)} completed")
        db = CelerySessionLocal()  # Reconnect to the database after closing it
        db.add_all(os_records)
        db.add_all(port_records)
        db.commit()

        update_progress(db, task, "Scan completed successfully", models.TaskStatus.completed)
        db.close()
        return {"task_id": str(task.id), "status": "completed"}
    except Exception as e:
        task = (
            db.query(models.IpDomainScanTask)
            .filter(models.IpDomainScanTask.id == task_id)
            .first()
        )
        update_progress(db, task, f"Scan failed: {str(e)}", models.TaskStatus.failed)
        return {"task_id": str(task.id), "status": "failed", "error": str(e)}
    finally:
        db.close()

Problem:

  • Without the port scanning block, everything works fast.
  • With port scanning inside Celery, my FastAPI endpoints become very slow (10–20s).
  • Even requests unrelated to the task are affected.

My setup:

  • FastAPI (running with Uvicorn)
  • Celery (with Redis as broker)
  • PostgreSQL + SQLAlchemy
  • python-nmap

Question:

  • Why does running Nmap scans inside a Celery worker cause my API to slow down?
  • Is this a DB session/connection pool issue (since I close/reopen SessionLocal)?
  • Or is it because Nmap blocks the event loop or consumes too many resources?
  • What’s the best practice to run heavy I/O tasks (like port scanning) in Celery without affecting FastAPI performance?
  • Can you confirm how Celery workers are being deployed - specifically, how many workers and concurrency level you're using (--concurrency)? Also, are FastAPI and Celery running on the same machine or container, and what are the system specs (CPU/memory)? Commented Aug 21 at 11:26
  • Maybe you could use logging to record which function (or part of a function) is executed and how long it takes. Commented Aug 21 at 12:58
  • Currently, I’m running the application in development mode on Windows with the following command: celery -A app.celery_app.celery worker --loglevel=info --pool=solo. Both FastAPI and Celery are running on the same machine. Commented Aug 21 at 12:58
  • I actually tried adding logging, and it confirmed that the slowdown only happens when the port scanning part runs inside the Celery task. If I remove the port scanning block, everything else executes quickly and the API stays responsive. Commented Aug 21 at 13:02
  • It's an expected behavior when using the solo option. See Concurrency. Commented Aug 26 at 5:13

1 Answer


The most critical piece of information is the one you revealed in the comments: you are running with the solo concurrency pool. With solo, a single process handles all Celery communication AND the execution of tasks. Most importantly, this means that while a task is executing, NOTHING else runs: any other tasks sent to the queue wait until the current one finishes, and the worker will not even respond to Celery control commands.

This is even worse when you have only a SINGLE Celery worker running in solo mode; with many workers running like this it can be acceptable. In general, avoid running workers in solo mode in production; the default prefork pool is the best choice.
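Concretely, the fix is to start the worker with a pool that provides real concurrency. A sketch of the relevant commands (the module path app.celery_app.celery comes from your comment; the pool and concurrency values are illustrative, not prescriptive):

```shell
# solo pool: one process does both broker I/O and task execution,
# so everything stalls while a long nmap scan runs
celery -A app.celery_app.celery worker --loglevel=info --pool=solo

# prefork (the default on Linux): N worker processes, so a long scan
# only occupies one of them
celery -A app.celery_app.celery worker --loglevel=info --concurrency=4

# on Windows, where prefork is not officially supported since Celery 4,
# the threads pool is a workable alternative for I/O-bound tasks like scans
celery -A app.celery_app.celery worker --loglevel=info --pool=threads --concurrency=4
```

With any of the concurrent pools, a slow nmap scan ties up only one process or thread instead of the whole worker, so other queued tasks keep flowing.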
