
I have a pandas dataframe agent deployed in an Azure FastAPI app service.

    agent = create_pandas_dataframe_agent(
        llm,
        df,
        verbose=True,
        prefix=prefix,
        agent_type=AgentType.OPENAI_FUNCTIONS,
        agent_executor_kwargs={"memory": conversation_memory},
        return_intermediate_steps=True,
        allow_dangerous_code=True,
    )

The agent sometimes suggests generating a file from the data retrieved from the dataframe, especially when the retrieved data is quite large.

How and where do I access the generated file to make it available for download?

I've implemented a workaround: from the agent's intermediate steps, I extract the query that the agent generates from the given prompt. I then execute that query on the original dataframe to retrieve the same data and make it available for download as a CSV file.
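For reference, the extraction step of that workaround can be sketched roughly as follows. This is a minimal sketch, not the exact code from my app: it assumes each intermediate step is an `(action, observation)` pair whose action exposes `.tool` and `.tool_input`, that the pandas agent's Python tool is named `python_repl_ast`, and that function-calling agents pass the code under a `query` key. The helper name `extract_generated_code` is made up for illustration.

```python
from typing import Any, Iterable, List, Tuple


def extract_generated_code(
    intermediate_steps: Iterable[Tuple[Any, Any]],
    tool_name: str = "python_repl_ast",  # assumed name of the agent's Python tool
) -> List[str]:
    """Collect the code strings the agent sent to its Python tool."""
    snippets: List[str] = []
    for action, _observation in intermediate_steps:
        # Skip steps that called some other tool.
        if getattr(action, "tool", None) != tool_name:
            continue
        tool_input = getattr(action, "tool_input", None)
        # Function-calling agents tend to pass a dict; others pass a plain string.
        if isinstance(tool_input, dict):
            tool_input = tool_input.get("query")  # assumed argument key
        if isinstance(tool_input, str) and tool_input.strip():
            snippets.append(tool_input.strip())
    return snippets
```

With `return_intermediate_steps=True`, the steps would typically be read from `result["intermediate_steps"]` after invoking the agent.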

I suspect there's a more straightforward way to access files the agent generates than my current workaround.

Thanks.

1 Answer

Here are two solid approaches, from the most robust to a more direct workaround.

Approach 1: The "Custom Tool" Method (Most Robust)

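This is the most robust and scalable solution. The pandas agent has no dedicated file-saving tool; any file it produces is written by code it runs in its Python REPL tool, to wherever the process's working directory happens to be. Instead of relying on that, you create your own custom tool and give it to the agent. This gives you complete control over where the file goes and what the agent reports back.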

Your custom tool will do two things:

  1. Save the DataFrame to a CSV in a known, web-accessible directory on your server.

  2. Return the publicly accessible URL for that file as its final output.

Here’s how that would look in a FastAPI context:

1. Create a custom tool for your agent:

import pandas as pd
import uuid
from langchain.tools import tool

# Assume a 'static/downloads' directory exists and FastAPI serves 'static' at '/static'.
DOWNLOAD_DIR = "/app/static/downloads/"
BASE_URL = "https://your-azure-app-service.com" # Or your server's base URL

@tool
def save_dataframe_and_get_link(df: pd.DataFrame, filename_prefix: str = "export") -> str:
    """
    Saves a pandas DataFrame to a CSV file in a web-accessible directory
    and returns a public download link. Use this tool whenever you need to
    provide a file for the user to download.
    """
    try:
        # Generate a unique filename to avoid conflicts
        unique_id = uuid.uuid4()
        filename = f"{filename_prefix}_{unique_id}.csv"
        full_path = f"{DOWNLOAD_DIR}{filename}"

        # Save the dataframe
        df.to_csv(full_path, index=False)

        # Generate the public URL; it must include the "/static" mount path
        # that FastAPI serves the files from
        download_url = f"{BASE_URL}/static/downloads/{filename}"

        print(f"DataFrame saved. Download link: {download_url}")
        return f"Successfully saved the data. The user can download it from this link: {download_url}"
    except Exception as e:
        return f"Error saving file: {str(e)}"

# When you initialize your agent, pass this tool in via `extra_tools`.
# agent_executor = create_pandas_dataframe_agent(..., extra_tools=[save_dataframe_and_get_link])
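One caveat with a `df: pd.DataFrame` parameter: a function-calling model can only pass JSON-serializable arguments such as strings, so the tool never receives an actual DataFrame. A minimal sketch of one way around this, assuming the tool closes over the dataframe and the model supplies only a pandas query expression and a filename prefix. The names `make_export_function`, `export_rows`, and `public_url_prefix` are hypothetical, not part of LangChain.

```python
import re
import uuid
from pathlib import Path
from typing import Callable

import pandas as pd


def make_export_function(
    df: pd.DataFrame, download_dir: str, public_url_prefix: str
) -> Callable[..., str]:
    """Build an export function bound to `df`, so the model only passes strings."""

    def export_rows(query: str = "", filename_prefix: str = "export") -> str:
        """Save rows matching a pandas query expression to CSV and return a download link.
        An empty query exports the whole dataframe."""
        try:
            subset = df.query(query) if query.strip() else df
            # Keep model-supplied text out of the path: letters, digits, _ and - only.
            safe_prefix = re.sub(r"[^A-Za-z0-9_-]", "_", filename_prefix) or "export"
            filename = f"{safe_prefix}_{uuid.uuid4().hex}.csv"
            target = Path(download_dir) / filename
            target.parent.mkdir(parents=True, exist_ok=True)
            subset.to_csv(target, index=False)
            return f"{public_url_prefix.rstrip('/')}/{filename}"
        except Exception as exc:
            # Return the error as text so the agent can report or retry.
            return f"Error saving file: {exc}"

    return export_rows
```

The returned function could then be wrapped as a tool (for example with `StructuredTool.from_function` from `langchain_core.tools`) and handed to the agent; treat that wiring as an assumption to check against the LangChain version you have installed.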

2. Update your FastAPI app to serve these static files:

from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

app = FastAPI()

# This tells FastAPI to make the 'static' directory available to the public
app.mount("/static", StaticFiles(directory="static"), name="static")

# Your existing agent endpoint...
@app.post("/chat")
def handle_chat(...):
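    # ... your agent runs and uses the custom tool ...
    # Use invoke() rather than run(): with return_intermediate_steps=True the
    # executor returns several keys, which run() does not support.
    result = agent.invoke({"input": ...})
    # result["output"] will now contain the download URL
    return {"response": result["output"]}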
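The download link only resolves if the URL the tool returns agrees with the `app.mount` call: a file saved under the served directory is reachable at the mount path plus the file's path relative to that directory. A small sketch of that mapping, assuming a POSIX filesystem; `public_url_for` and its parameters are hypothetical names used only for illustration.

```python
from pathlib import Path, PurePosixPath


def public_url_for(file_path: str, static_dir: str, mount_path: str, base_url: str) -> str:
    """Map a file saved under `static_dir` to the URL it is served from.

    Raises ValueError if the file is not inside `static_dir`.
    """
    relative = Path(file_path).resolve().relative_to(Path(static_dir).resolve())
    url_path = PurePosixPath(mount_path) / relative.as_posix()
    return f"{base_url.rstrip('/')}{url_path}"
```

For example, with the directory `/app/static` mounted at `/static`, the file `/app/static/downloads/export_1.csv` maps to `<base URL>/static/downloads/export_1.csv`.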

I would strongly recommend Approach 1 for any production application. It gives you much more control and reliability.


2 Comments

Thank you @Muhammad Mudassir. Your answer pointed me in the right direction. However, when the dataframe agent invokes the custom tool, the following error is thrown:

2 validation errors for dataframe_to_csv_and_link
df
  Input should be an instance of DataFrame [type=is_instance_of, input_value='top_5_clients.csv', input_type=str]
    For further information visit errors.pydantic.dev/2.11/v/is_instance_of
filename
  Field required [type=missing, input_value={'df': 'top_5.csv'}, input_type=dict]
Corrected and more robust approach:

import pandas as pd
import uuid
from langchain.tools import tool
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain_openai import OpenAI

# --- Assume these are defined in your environment ---
# This is the base URL where your files will be accessible
BASE_URL = "your-app-service.com"
# This is the local directory your static files are served from
DOWNLOAD
