I am currently working on a function that assigns values from a two-dimensional numpy array to a given column of a dataframe, using the polars library in Python.
I have a dataframe df with the following columns: ['HZ', 'FL', 'Q']. The column 'HZ' takes values in [0, EC + H - 1] and the column 'FL' takes values in [1, F].
I also have a numpy array q of shape (EC + H, F), and I want to assign its values to the column 'Q' as follows:
if df['HZ'] >= EC, then df['Q'] = q[df['HZ']][df['FL'] - 1].
You can find below the pandas instruction that does exactly what I want to do.
df.loc[df['HZ'] >= EC, 'Q'] = q[df.loc[df['HZ'] >= EC, 'HZ'], df.loc[df['HZ'] >= EC, 'FL'] - 1]
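For reference, here is a small self-contained version of that pandas assignment. The values of EC, H and F, the contents of q and the rows of df are toy data made up for illustration, and the column is named 'FL' as in the column list above.

```python
import numpy as np
import pandas as pd

# Toy sizes and data, assumed only for this illustration.
EC, H, F = 2, 3, 4
q = np.arange((EC + H) * F, dtype=float).reshape(EC + H, F)  # q[i, j] == 4*i + j

df = pd.DataFrame({
    "HZ": [0, 1, 2, 3, 4],   # values in [0, EC + H - 1]
    "FL": [1, 2, 3, 4, 1],   # values in [1, F]
    "Q": [-1.0] * 5,         # placeholder values to be overwritten where HZ >= EC
})

# Rows where the assignment applies.
mask = df["HZ"] >= EC

# Pair-wise (row, column) lookup in q via numpy integer-array indexing.
rows = df.loc[mask, "HZ"].to_numpy()
cols = df.loc[mask, "FL"].to_numpy() - 1
df.loc[mask, "Q"] = q[rows, cols]

print(df)
```

With this toy data, the rows with HZ < EC keep their placeholder and the others receive q[HZ, FL - 1].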
Now I want to do it using polars, and I tried to do it this way:
df = df.with_columns(pl.when(pl.col('HZ') >= EC).then(q[pl.col('HZ')][pl.col('FL') - 1]).otherwise(pl.col('Q')).alias('Q'))
And I get the following error :
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
I understand that I am not giving numpy the right kind of index to get the corresponding value in the array, but I don't know what to replace it with to get the desired behavior.
Thanks in advance.
hz, fl = df.filter(pl.col("HZ") >= EC).select(pl.col("HZ"), pl.col("FL") - 1)
then use q[hz, fl]

TypeError: did not expect value [...] of type <class 'numpy.ndarray'>, maybe disambiguate with pl.lit or pl.col
where [...] seems to be a line of my initial numpy array