I am using np.delete() to drop a specific band from my ndarray. However, while profiling the memory usage with memory_profiler, I noticed that after calling np.delete the memory usage doubles, even though I would expect it to decrease slightly.
Here is the full example:
import numpy as np

def clean_data(raster_np):
    # Build column names
    scl_index = 0
    scl = raster_np[:, scl_index]
    # Create mask for invalid SCL values
    invalid_scl_mask = np.isin(scl, [0, 1, 2, 3, 6, 7, 8, 9, 10, 11, 12])
    # Set rows to NaN where SCL is invalid
    raster_np[invalid_scl_mask, :] = np.nan
    # Drop SCL column
    raster_np = np.delete(raster_np, scl_index, axis=1)
    # Replace 0s with NaN
    raster_np[raster_np == 0] = np.nan
    return raster_np

# Function call
raster, meta = load_s2_tile(...)
raster = clean_data(raster)
Here is the profiling output (see line 33):
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    20   5647.9 MiB   5647.9 MiB           1   @profile
    21                                         def clean_data(raster_np):
    22                                             # Build column names
    23   5647.9 MiB      0.0 MiB           1       scl_index = 0
    24   5647.9 MiB      0.0 MiB           1       scl = raster_np[:, scl_index]
    25
    26                                             # Create mask for invalid SCL values
    27   5762.9 MiB    115.0 MiB           1       invalid_scl_mask = np.isin(scl, [0, 1, 2, 3, 6, 7, 8, 9, 10, 11, 12])
    28
    29                                             # Set rows to NaN where SCL is invalid
    30   5762.9 MiB      0.0 MiB           1       raster_np[invalid_scl_mask, :] = np.nan
    31
    32                                             # Drop SCL column
    33  10821.6 MiB   5058.8 MiB           1       raster_np = np.delete(raster_np, scl_index, axis=1)
    34
    35                                             # Replace 0s with NaN
    36  10821.8 MiB      0.2 MiB           1       raster_np[raster_np == 0] = np.nan
    37
    38
    39  10821.8 MiB      0.0 MiB           1       return raster
If someone could point out why this is the case and how to avoid it, that would be great! I would not expect this behaviour, as I do not have any other references to raster.
scl is a view of the old raster_np, so it holds a reference to it. np.delete returns a new array rather than modifying its input, so both the old and the new raster_np have to exist in memory. Once scl is no longer used (or is overwritten, for example with a slice of the new raster_np), the old raster_np can be garbage collected, assuming there are no other references to it outside that partial code.
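To illustrate the point about views and copies, here is a minimal sketch on a tiny stand-in array. The shape and values are made up for the demonstration, and the names scl_is_view, delete_made_copy, sliced and slice_is_view are hypothetical helpers, not part of the question's code. It only relies on np.shares_memory and np.delete.

```python
import numpy as np

# Small stand-in for the raster; the shape and values are made up.
raster_np = np.arange(12, dtype=np.float64).reshape(4, 3)

# Basic slicing returns a view, so scl keeps the original buffer alive.
scl = raster_np[:, 0]
scl_is_view = np.shares_memory(scl, raster_np)

# np.delete returns a new array, so old and new data coexist for a while.
trimmed = np.delete(raster_np, 0, axis=1)
delete_made_copy = not np.shares_memory(trimmed, raster_np)

# Dropping the first column with a slice gives a view instead of a copy.
# This avoids the second allocation, but the SCL column stays in memory.
sliced = raster_np[:, 1:]
slice_is_view = np.shares_memory(sliced, raster_np)

print(scl_is_view, delete_made_copy, slice_is_view)
```

Whether the slice-based variant is acceptable depends on the use case: it only works this simply because the SCL band is the first column, and the full original buffer is retained. Also note that in the question's code the caller's raster name still refers to the original array while clean_data runs, so the old data cannot be freed before the function returns, regardless of scl.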