I'm running a minimal Python Flask app with one API endpoint that makes a simple call to retrieve data from a Cassandra (DataStax) database inside a for loop.

# Day-2-Day Power
@app.route("/d2d_new_2/power")
def d2d_power():
    data = request.args
    result = get_data_op_power_d2d(data)
    return JsonResponse(response=result, status=HTTP_200_OK)

The problem is that the process memory keeps increasing whenever I trigger this request, and it does not decrease after the function has returned its results.

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   104 134.4766 MiB 134.4766 MiB           1   @profile(stream=log_file)
   105                                         def get_data_op_power_d2d(data: dict) -> pd.DataFrame:
   106 134.4766 MiB   0.0000 MiB          14       my_list = [string.strip() for string in data.get("books").split(",")]
   107 134.4766 MiB   0.0000 MiB           1       if data.get("as_of_op_old") == 'null' or data.get("as_of_op_new") == 'null':
   108                                                 abort(abort(HTTP_404_NOT_FOUND,
   109                                                             f'No data previous for as_of_op_old = {data.get("as_of_op_old")} '))
   110                                         
   111 134.4766 MiB   0.0000 MiB           1       as_of_op_old = parser.parse(data.get("as_of_op_old"))
   112 134.4766 MiB   0.0000 MiB           1       as_of_op_old = as_of_op_old.strftime('%Y-%m-%d')
   113 134.4766 MiB   0.0000 MiB           1       as_of_op_new = parser.parse(data.get("as_of_op_new"))
   114 134.4766 MiB   0.0000 MiB           1       as_of_op_new = as_of_op_new.strftime('%Y-%m-%d')
   115 134.4766 MiB   0.0000 MiB           1       ts_start = parser.parse(data.get("ts_start"))
   116 134.4766 MiB   0.0000 MiB           1       ts_end = parser.parse(data.get("ts_end"))
   117                                         
   118 143.6016 MiB   0.0000 MiB          12       for book_name in my_list:
   119 143.6016 MiB   6.7500 MiB          22           op_df_old = pd.DataFrame(get_op_power_data(
   120 143.6016 MiB   0.0000 MiB          11               as_of=as_of_op_old, book_op=book_name, ts_start=ts_start, ts_end=ts_end
   121                                                 ))
   122                                         
   123 143.6016 MiB   2.3750 MiB          22           op_df_new = pd.DataFrame(get_op_power_data(
   124 143.6016 MiB   0.0000 MiB          11               as_of=as_of_op_new, book_op=book_name, ts_start=ts_start, ts_end=ts_end
   125                                                 ))
   126                                         
   127 143.6016 MiB   0.0000 MiB          11           del op_df_old
   128 143.6016 MiB   0.0000 MiB          11           del op_df_new        
   129 143.6016 MiB   0.0000 MiB          11           gc.collect()
   130                                         
   131 143.6016 MiB   0.0000 MiB           1       return pd.DataFrame()

This is the db call:

def get_op_power_data(as_of, book_op, ts_start, ts_end):
    try:
        execution = session.execute("SELECT time, value, is_peak FROM series_op_power WHERE as_of=%s AND name = %s AND time >= %s AND time < %s", 
        (as_of, book_op, ts_start, ts_end))
        return list(execution)
    except Exception as e:
        return []

I have a comparable example with a for loop where I store data in local variables and delete them afterwards. There the memory does decrease again (visible for at least one variable, b).

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   120 109.5000 MiB 109.5000 MiB           1   @profile(stream=log_file)
   121                                         def get_data_op_power_d2d(data: dict):
   122 109.5000 MiB   0.0000 MiB          14       my_list = [string.strip() for string in data.get("books").split(",")]
   123 109.5000 MiB   0.0000 MiB           1       if data.get("as_of_op_old") == 'null' or data.get("as_of_op_new") == 'null':
   124                                                 abort(HTTP_404_NOT_FOUND, f'No data previous for as_of_op_old = {data.get("as_of_op_old")} ')
   125                                         
   134 109.5000 MiB   0.0000 MiB           1       results_list = []
   135                                         
   136 125.0117 MiB  -0.4062 MiB          12       for book_name in my_list:
   
   148 117.4180 MiB   7.2188 MiB          11            a = [1] * (10 ** 6)
   149 269.9414 MiB 1677.5938 MiB         11            b = [2] * (2 * 10 ** 7)
   150 117.4180 MiB -1678.0820 MiB        11            del b
   151 125.0117 MiB   7.1875 MiB          11            del a
   152                                             
   153 125.0117 MiB   0.0000 MiB           1       log_file.flush()
   154                                         
   155 125.0117 MiB   0.0000 MiB           1       del my_list
   156 125.0117 MiB   0.0000 MiB           1       gc.collect()
   157 125.0117 MiB   0.0000 MiB           1       return results_list

I have multiple operations inside the original function, and Grafana shows a memory leak when I monitor the app. To narrow it down, I removed the database connection and read the data from CSV files instead, but RAM usage still keeps going up and does not return to its previous level.

Logs showing the memory increase:

System Memory: 64.0% used (9.16 GB / 15.44 GB)
App Process: 0.7% used (109.11 MB RSS, 836.41 MB VMS)

[2025-09-18 11:10:34] Iteration 29878
----------------------------------------
System Memory: 64.3% used (9.20 GB / 15.44 GB)
App Process: 0.8% used (122.17 MB RSS, 915.45 MB VMS)

[2025-09-18 11:10:39] Iteration 29879
----------------------------------------
System Memory: 64.1% used (9.17 GB / 15.44 GB)
App Process: 0.8% used (122.17 MB RSS, 915.45 MB VMS)
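Read as numbers, these log lines show the process RSS stepping up by about 13 MB around the request and then staying flat. A small sketch of that arithmetic, assuming the log format shown above; the name rss_deltas_mb is only illustrative and is not part of the app:

```python
import re

# Matches lines such as: "App Process: 0.8% used (122.17 MB RSS, 915.45 MB VMS)"
RSS_PATTERN = re.compile(r"App Process: .*?\(([\d.]+) MB RSS")


def rss_deltas_mb(log_text):
    """Return the change in RSS (MB) between consecutive 'App Process' lines."""
    values = [float(m.group(1)) for m in RSS_PATTERN.finditer(log_text)]
    return [round(b - a, 2) for a, b in zip(values, values[1:])]


# The three 'App Process' lines copied from the log excerpt above.
sample = """\
App Process: 0.7% used (109.11 MB RSS, 836.41 MB VMS)
App Process: 0.8% used (122.17 MB RSS, 915.45 MB VMS)
App Process: 0.8% used (122.17 MB RSS, 915.45 MB VMS)
"""

print(rss_deltas_mb(sample))
```

The first delta corresponds to the request at 11:10:32; the second shows that the memory is not given back by the time of the next sample five seconds later.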

And the logs to show the endpoint call:

127.0.0.1 - - [18/Sep/2025 11:10:32] "GET /d2d_new_2/power?books=DE_STC&as_of_op_old=2025-08-06&as_of_op_new=2025-08-15&ts_start=2025-01-01T00:00:00.000%2B01:00&ts_end=2026-01-01T00:00:00.000%2B01:00&aggr=quarter HTTP/1.1" 200 -
127.0.0.1 - - [18/Sep/2025 11:10:34] "GET /info/memory HTTP/1.1" 200 -
127.0.0.1 - - [18/Sep/2025 11:10:39] "GET /info/memory HTTP/1.1" 200 -
127.0.0.1 - - [18/Sep/2025 11:10:44] "GET /info/memory HTTP/1.1" 200 -
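
One way to tell whether the growth comes from Python objects that are still alive, or from memory that was freed but not handed back to the operating system, would be to compare the size traced by tracemalloc before and after a call. A minimal sketch using only the standard library; measure_retained and allocate_and_free are illustrative names, and the workload is a stand-in for the real function, not the app's code:

```python
import gc
import tracemalloc


def measure_retained(func, *args, **kwargs):
    """Call func once and return (result, retained_bytes, peak_bytes).

    retained_bytes: traced Python memory still allocated after the call.
    peak_bytes: highest traced size reached while the call was running.
    """
    gc.collect()
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    result = func(*args, **kwargs)
    gc.collect()
    after, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, after - before, peak - before


def allocate_and_free():
    # Stand-in workload: build two large lists and delete them again.
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 6)
    del b
    del a


_, retained, peak = measure_retained(allocate_and_free)
print(f"retained: {retained / 1e6:.2f} MB, peak: {peak / 1e6:.2f} MB")
```

If the retained figure stays near zero while RSS still grows, the growth is probably not caused by live Python objects. Note that tracemalloc only sees allocations made through Python's allocators, so memory allocated directly by C libraries may not be counted.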
