0

I'd like to average my minute data to 10-min averages of those 10 timepoints, however, I have important data in columns.

This is the first thing I tried df1.resample('10min', on='Date_Time').mean(d).dropna()

I used dropna() to remove the added rows of 10-min data by the resample function. Using the dropna() globally however, results in this results in lots of missing data.

I have tried:

numerics = (df1.select_dtypes(include=['datetime','number']).resample('10min', on='Date_Time')).mean(d).dropna()

strings = df1.select_dtypes(include=['datetime','object']).resample("10min", on='Date_Time').first().dropna()

df_10m = numerics.join(strings)

this is the top 5 rows of the df

{'Date_Time': {0: Timestamp('2020-02-04 10:00:00'), 1: Timestamp('2020-02-04 10:01:00'), 2: Timestamp('2020-02-04 10:02:00'), 3: Timestamp('2020-02-04 10:03:00'), 4: Timestamp('2020-02-04 10:04:00')}, 'Tg (ºC)': {0: 35.58856153846154, 1: 35.72681666666667, 2: 35.94865000000001, 3: 36.24185000000001, 4: 36.51074166666667}, 'Tg (C) HSRI': {0: 46.03922307692308, 1: 46.179199999999994, 2: 46.330241666666666, 3: 46.40623333333333, 4: 46.40038333333332}, 'Ta (C)': {0: 31.03922307692308, 1: 31.179199999999998, 2: 31.330241666666666, 3: 31.406233333333333, 4: 31.400383333333327}, 'RH (%)': {0: 35.54615384615385, 1: 35.074999999999996, 2: 35.083333333333336, 3: 35.208333333333336, 4: 35.25833333333333}, 'WS': {0: 2.3515384615384614, 1: 2.3275, 2: 2.454166666666666, 3: 2.3666666666666667, 4: 2.5333333333333328}, 'Tr (C)': {0: 86.03059100103826, 1: 85.92113760389992, 2: 87.10761095323471, 3: 86.41824958826783, 4: 87.81529421945606}, 'm': {0: 80, 1: 80, 2: 80, 3: 80, 4: 80}, 'Ht': {0: 1.8, 1: 1.8, 2: 1.8, 3: 1.8, 4: 1.8}, 'PB': {0: 101.9, 1: 101.9, 2: 101.9, 3: 101.9, 4: 101.9}, 'vcomb': {0: 2.651538461538461, 1: 2.6275, 2: 2.754166666666666, 3: 2.6666666666666665, 4: 2.8333333333333326}, 'vself': {0: 0.3, 1: 0.3, 2: 0.3, 3: 0.3, 4: 0.3}, 'VO2': {0: 14, 1: 14, 2: 14, 3: 14, 4: 14}, 'VO2.1': {0: 1.12, 1: 1.12, 2: 1.12, 3: 1.12, 4: 1.12}, 'RER': {0: 0.8, 1: 0.8, 2: 0.8, 3: 0.8, 4: 0.8}, 'Tcl': {0: 36, 1: 36, 2: 36, 3: 36, 4: 36}, 'Tsk': {0: 36, 1: 36, 2: 36, 3: 36, 4: 36}, 'ε': {0: 0.95, 1: 0.95, 2: 0.95, 3: 0.95, 4: 0.95}, 'Ar/AD': {0: 0.35, 1: 0.35, 2: 0.35, 3: 0.35, 4: 0.35}, 'Icl': {0: 1.03, 1: 1.03, 2: 1.03, 3: 1.03, 4: 1.03}, 'Re,cl': {0: 0.023, 1: 0.023, 2: 0.023, 3: 0.023, 4: 0.023}, 'AD': {0: 1.9917607971689137, 1: 1.9917607971689137, 2: 1.9917607971689137, 3: 1.9917607971689137, 4: 1.9917607971689137}, 'AD:m': {0: 248.9700996461142, 1: 248.9700996461142, 2: 248.9700996461142, 3: 248.9700996461142, 4: 248.9700996461142}, 'Ta': {0: 304.18922307692304, 1: 304.32919999999996, 2: 304.48024166666664, 3: 304.5562333333333, 4: 304.5503833333333}, 'Pa': {0: 1.600406759479592, 1: 1.5918245414256496, 2: 1.60593338043295, 3: 1.6186268793904253, 4: 1.6203871372827314}, 'Pa.1': {0: 12.00305069609694, 1: 11.938684060692372, 2: 12.044500353247125, 3: 12.139701595428189, 4: 12.152903529620486}, 'Psa': {0: 4.5023345321866906, 1: 4.538345093159372, 2: 4.577482319523848, 3: 4.5972834444225095, 4: 4.595756475394181}, 'PB.1': {0: 764.25, 1: 764.25, 2: 764.25, 3: 764.25, 4: 764.25}, 'Abs Humid': {0: 0.01141684979152761, 1: 0.011350403625066736, 2: 0.011445325373048706, 3: 0.011532912296143744, 4: 0.011545676119064242}, 'v': {0: 2.651538461538461, 1: 2.6275, 2: 2.754166666666666, 3: 2.6666666666666665, 4: 2.8333333333333326}, 'M (W)': {0: 376.959744, 1: 376.959744, 2: 376.959744, 3: 376.959744, 4: 376.959744}, 'M(w/m2)': {0: 189.25954589316655, 1: 189.25954589316655, 2: 189.25954589316655, 3: 189.25954589316655, 4: 189.25954589316655}, 'H (W)': {0: 376.959744, 1: 376.959744, 2: 376.959744, 3: 376.959744, 4: 376.959744}, 'H (W/m2)': {0: 189.25954589316655, 1: 189.25954589316655, 2: 189.25954589316655, 3: 189.25954589316655, 4: 189.25954589316655}, 'fcl': {0: 1.3193000000000001, 1: 1.3193000000000001, 2: 1.3193000000000001, 3: 1.3193000000000001, 4: 1.3193000000000001}, 'hc': {0: 10.429773835114272, 1: 10.372937644718489, 2: 10.670143570681152, 3: 10.465437098406353, 4: 10.853123250589423}, 'hr': {0: 2.8152285937257235, 1: 2.8138458658081724, 2: 2.8288587842794275, 3: 2.820129539046607, 4: 2.8378387547428283}, 'h': {0: 13.245002428839996, 1: 13.186783510526661, 2: 13.499002354960579, 3: 13.285566637452959, 4: 13.690962005332251}, 'to': {0: 42.727652144400366, 1: 42.86024221094555, 2: 43.01897961346187, 3: 43.083643584372176, 4: 43.09396778675954}, 'σ': {0: 5.67e-08, 1: 5.67e-08, 2: 5.67e-08, 3: 5.67e-08, 4: 5.67e-08}, 'Rcl': {0: 0.15965000000000001, 1: 0.15965000000000001, 2: 0.15965000000000001, 3: 0.15965000000000001, 4: 0.15965000000000001}, 'C + R (W)': {0: -61.7854635996826, 1: -62.92983381965766, 2: -64.78260666774776, 3: -65.10727499145821, 4: -65.71445913620975}, 'W/m2)': {0: -31.020523994399518, 1: -31.595076029765245, 2: -32.52529458348094, 3: -32.688300263767424, 4: -32.99314818808373}, 'Ps,sk': {0: 5.94114568997784, 1: 5.94114568997784, 2: 5.94114568997784, 3: 5.94114568997784, 4: 5.94114568997784}, 'he': {0: 172.09126827938547, 1: 171.15347113785506, 2: 176.057368916239, 3: 172.67971212370483, 4: 179.07653363472548}, 'LR': {0: 16.5, 1: 16.5, 2: 16.5, 3: 16.5, 4: 16.5}, 'λ': {0: 2427, 1: 2427, 2: 2427, 3: 2427, 4: 2427}, 'Ereq': {0: 409.36381964806634, 1: 410.5260936917001, 2: 412.55058725207795, 3: 412.99813912411855, 4: 413.6137153206719}, 'Emax': {0: 283.93655875039144, 1: 284.24761905934344, 2: 284.6055046911687, 3: 282.899686582697, 4: 284.41267301639743}, 'Ereq.1': {0: 205.5286057592536, 1: 206.1121467373097, 2: 207.12858082078773, 3: 207.353282437908, 4: 207.66234374558525}, 'Emax.1': {0: 142.55555142664647, 1: 142.71172495380603, 2: 142.89140799221803, 3: 142.03497075793953, 4: 142.79459331695918}, 'ωreq': {0: 1.4417439636856977, 1: 1.4442551710731937, 2: 1.4495523819883425, 3: 1.4598748556880825, 4: 1.4542731550391408}, 'H.S.I.': {0: 144.17439636856977, 1: 144.42551710731937, 2: 144.95523819883425, 3: 145.98748556880824, 4: 145.4273155039141}, 'r': {0: 0.5, 1: 0.5, 2: 0.5, 3: 0.5, 4: 0.5}, 'Sreq': {0: 411.0572115185072, 1: 412.2242934746194, 2: 414.25716164157546, 3: 414.706564875816, 4: 415.3246874911705}, 'Sreq.1': {0: 1214.4291312179967, 1: 1217.8771629914465, 2: 1223.8830771384264, 3: 1225.210795918275, 4: 1227.0369799377163}, 'Slimit': {0: 1214.4291312179967, 1: 1217.8771629914465, 2: 1223.8830771384264, 3: 1225.210795918275, 4: 1227.0369799377163}, 'Cres + Eres': {0: 29.381387951616322, 1: 29.36348412795755, 2: 29.19176341566985, 3: 29.06887986733963, 4: 29.060487815537876}, 'Cres + Eres.1': {0: 14.75146412831249, 1: 14.742475185622073, 2: 14.656259655859772, 3: 14.594563719025947, 4: 14.590350335665013}, 'Heat': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}, 'Sweat Rate': {0: 2.1442913121799667, 1: 2.1787716299144644, 2: 2.2388307713842637, 3: 2.2521079591827498, 4: 2.270369799377163}, 'Black Globe Temp': {0: 6.539223076923079, 1: 6.6791999999999945, 2: 6.830241666666666, 3: 6.906233333333333, 4: 6.900383333333323}, 'HSS Score': {0: 8.683514389103046, 1: 8.85797162991446, 2: 9.069072438050929, 3: 9.158341292516083, 4: 9.170753132710487}, 'Sunny Factor': {0: 4, 1: 4, 2: 4, 3: 4, 4: 4}, 'Td (C)': {0: 18.14845384615385, 1: 18.194199999999995, 2: 18.346908333333335, 3: 18.4479, 4: 18.452049999999993}, 'WBGT (C)': {0: 26.6959487578496, 1: 26.780697622069766, 2: 26.9482556650566, 3: 27.051211098622655, 4: 27.052240739589937}, 'e (hPa)': {0: 13.121397641817339, 1: 13.135092168116461, 2: 13.343533435258518, 3: 13.495869716597092, 4: 13.506929744503669}, 'Category': {0: 'High Risk', 1: 'High Risk', 2: 'High Risk', 3: 'High Risk', 4: 'High Risk'}, 'Stadium': {0: 'WACA', 1: 'WACA', 2: 'WACA', 3: 'WACA', 4: 'WACA'}, 'City': {0: 'Perth', 1: 'Perth', 2: 'Perth', 3: 'Perth', 4: 'Perth'}, 'State': {0: 'WA', 1: 'WA', 2: 'WA', 3: 'WA', 4: 'WA'}, 'BOM_station_id': {0: 9225, 1: 9225, 2: 9225, 3: 9225, 4: 9225}, 'BOM_station': {0: 'Perth Metro', 1: 'Perth Metro', 2: 'Perth Metro', 3: 'Perth Metro', 4: 'Perth Metro'}, 'BOM2_station_id': {0: 9021, 1: 9021, 2: 9021, 3: 9021, 4: 9021}, 'BOM2_station': {0: 'Perth Airport', 1: 'Perth Airport', 2: 'Perth Airport', 3: 'Perth Airport', 4: 'Perth Airport'}, 'Measurement_device': {0: 'EMU', 1: 'EMU', 2: 'EMU', 3: 'EMU', 4: 'EMU'}, 'Longitude': {0: 115.880637, 1: 115.880637, 2: 115.880637, 3: 115.880637, 4: 115.880637}, 'Latitude': {0: -31.959454, 1: -31.959454, 2: -31.959454, 3: -31.959454, 4: -31.959454}, 'Day': {0: 'Day 1', 1: 'Day 1', 2: 'Day 1', 3: 'Day 1', 4: 'Day 1'}, 'measurement_type': {0: 'EMU', 1: 'EMU', 2: 'EMU', 3: 'EMU', 4: 'EMU'}, 'WS_conv': {0: 2.3515384615384614, 1: 2.3275, 2: 2.454166666666666, 3: 2.3666666666666667, 4: 2.5333333333333328}, 'WS_unconverted': {0: 2.3515384615384614, 1: 2.3275, 2: 2.454166666666666, 3: 2.3666666666666667, 4: 2.5333333333333328}, 'Numeric_Category': {0: 3, 1: 3, 2: 3, 3: 3, 4: 3}, 'venue_type': {0: 'Major', 1: 'Major', 2: 'Major', 3: 'Major', 4: 'Major'}, 'time': {0: datetime.time(10, 0), 1: datetime.time(10, 1), 2: datetime.time(10, 2), 3: datetime.time(10, 3), 4: datetime.time(10, 4)}}

However, this too seems to result in lost data.

5
  • 1
    If change .dropna() to .dropna(how='all') it working like need? Commented Oct 16, 2023 at 6:59
  • Are missing values in original DataFrame? Commented Oct 16, 2023 at 6:59
  • thanks, @jezrael, this still results in many missing rows. I receive 1000 rows from a 27,674 row df, which should result in around 2,767 rows of data. So missing around 1700 rows still. Missing values are in the original df, yes Commented Oct 16, 2023 at 7:13
  • So if remove dropna and check output problem is in some rows are missing values instead numbers/strings/times? Commented Oct 16, 2023 at 7:17
  • yes data still seems to be missing from the resampled df, without using dropna. Still waiting on that custom code to run. Will get back to you later Commented Oct 16, 2023 at 7:28

1 Answer 1

1

You can use this alternative with custom aggregate function mean for numeric else first value if exist. Last remove only rows with all missing values by DataFrame.dropna with how='all' parameter:

f = lambda x: x.mean() if np.issubdtype(x.dtype, np.number) else next(iter(x), None)
out = df1.resample('10min', on='Date_Time').agg(f).dropna(how='all')
Sign up to request clarification or add additional context in comments.

1 Comment

running this now, it's taking some time... will update once it runs. Thanks

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.