84 questions
6
votes
1
answer
132
views
Split a column of string into list of list
How could I split a column of string into list of list?
Minimum example:
import polars as pl
pl.Config(fmt_table_cell_list_len=6, fmt_str_lengths=100)
df = pl.DataFrame({'test': "A,B,C,1\nD,E,F,...
4
votes
1
answer
100
views
How do I write a query like (A or B) and C in Polars?
I expected either a or b would be 0.0 (not NaN) and c would always be 0.0. The Polars documentation said to use | as "or" and & as "and". I believe I have the logic right:
(((...
3
votes
3
answers
129
views
Order of columns in a plotnine bar plot using a polars dataframe
I'm quite new to the packages polars and plotnine and have the following code:
import polars as pl
import polars.selectors as cs
from plotnine import *
df = pl.read_csv('https://raw.githubusercontent....
3
votes
1
answer
115
views
Tidypolars - strange error with inequality join
I get a strange error with an inequality inner_join on datetime columns:
library(polars)
library(tidypolars)
library(dplyr)
library(lubridate)
x <- tibble(id = c(1, 1, 2, 2),
t = as....
3
votes
1
answer
108
views
Modify list of arrays in place
I have a df like:
# /// script
# requires-python = ">=3.13"
# dependencies = [
# "polars",
# ]
# ///
import polars as pl
df = pl.DataFrame(
    {
        "points"...
2
votes
1
answer
112
views
Horizontal cumulative sum + unnest bug in polars
When I use horizontal cumulative sum followed by unnest, a "literal" column is formed that stays in the schema even when dropped.
Here is an example:
import polars as pl
def ...
2
votes
1
answer
69
views
Polars `concat_arr` no longer takes in `pl.col` as parameter?
The following used to work:
pl.concat_arr(pl.col("X[m]", "Y[m]", "Z[m]")).alias("Antenna_position[m]"),
but now (polars 1.31.0) I get an error:
Traceback (most ...
2
votes
1
answer
119
views
How to format an expr in Polars with Rust?
In Python Polars, I can format a column like this:
df = pl.DataFrame({
    "a": [0.15, 0.25]
})
result = df.with_columns(
    pl.format("{}%", (pl.col("a") * 100).round(1))
)
print(...
2
votes
1
answer
167
views
Polars Dataframe from nested dictionaries as columns
I have a dictionary of nested columns with the index as the key in each one. When I try to convert it to a Polars DataFrame, it gets the column names and the values right, but each column has just one ...
2
votes
1
answer
326
views
Memory issues with Polars streaming when computing outer products on large dataset
I'm working with a large dataset (~14M rows) using Polars and encountering memory issues despite using the streaming engine. Here's my code:
import polars as pl
# Read CSV in streaming mode
df = pl....
2
votes
2
answers
234
views
How to add a group-specific index to a polars dataframe with an expression instead of a map_groups user-defined function?
I am curious whether I am missing something in the Polars Expression library in how this could be done more efficiently. I have a dataframe of protein sequences, where I would like to create k-long ...
2
votes
0
answers
102
views
Best way to trigger lazy evaluation in PySpark and Polars for benchmarking
I'm currently benchmarking PySpark vs the growing alternative Polars.
Basically I'm writing various queries (aggregations, filtering, sorting etc.) and measuring the execution time, RAM and CPU. I ...
2
votes
0
answers
219
views
python polars numerous joins crashing
This is for a POC to see if polars can do some things faster/better/cheaper than a current SQL solution. The first test case involves a count(*) over an eight-table join. The eight tables are ...
1
vote
2
answers
169
views
How to specify relevant columns with read_excel
As far as I can tell, the following MRE conforms to the relevant documentation:
import polars
df = polars.read_excel(
    "/Volumes/Spare/foo.xlsx",
    engine="calamine",
    ...
1
vote
3
answers
287
views
How to add columns from one Polars lazyframe into another?
I have a Polars LazyFrame and would like to add to it columns from another LazyFrame. The two LazyFrames have the same number of rows and different columns.
I have tried the following, which doesn't ...
1
vote
1
answer
83
views
How to join/map a polars dataframe to a dict? [duplicate]
I have a polars dataframe, and a dictionary. I want to map a column in the dataframe to the keys of the dictionary, and then add the corresponding values as a new column.
import polars as pl
my_dict =...
1
vote
1
answer
57
views
Can you create multiple columns based on the same set of conditions in Polars?
Is it possible to do something like this in Polars? Do you need a separate when.then.otherwise for each of the 4 new variables, or can you use struct to create multiple new variables from one ...
1
vote
1
answer
157
views
Using `is_in` in rust-polars
I am trying to subset a rust-polars dataframe by the names contained in a different frame:
use polars::prelude::*;
fn main() {
    let mut df = df! [
        "names" => ["a", &...
1
vote
1
answer
175
views
Polars Rust equivalent to pl.lit() (repeated value in df)
In python I can construct a dataframe with a repeated value like this:
import polars as pl
df = pl.DataFrame({"foo": [1,2]}).with_columns(bar=pl.lit("baz"))
Can this be done in ...
1
vote
1
answer
186
views
Why does polars keep killing the python kernel when joining two lazy frames and collecting them?
I have one dataframe, bos_df_3, with 30k+ rows and another, taxon_ranked_only, with 6 million. I tried to join them using:
matching_df = (
    pl.LazyFrame(bos_df_3)
    .join(
        other=...
1
vote
1
answer
117
views
Want to broadcast a NumPy array using `pl.lit()` in Polars
Goal
I have a NumPy array
true_direction = np.array([1,2,3]).reshape(1,3)
which I want to insert into a Polars DataFrame;
that is, repeat this array in every row of the DataFrame.
What I have tried
...
1
vote
1
answer
145
views
Making long link clickable in Polars output
I'm displaying a data frame in Colab but finding that whilst the URL link is showing and being formatted, the underline and link don't go across the entire URL, which goes across a couple of lines, ...
1
vote
1
answer
123
views
Mutate polars column and keep original column name on custom expression
I am trying to implement a custom expression in Rust polars to calculate the geomean of different columns, essentially replicating the behavior of the .mean() expression where it will apply the ...
1
vote
1
answer
93
views
Converting a Rust `futures::TryStream` to a `polars::LazyFrame`
I have an application where I have a futures::TryStream. Still in a streaming fashion, I want to convert this into a polars::LazyFrame. It is important to note that the TryStream comes from the ...
1
vote
0
answers
82
views
How to read delta table and get empty columns in df?
In my file I have:
{
    "Car": {
        "Model": null,
        "Color": null,
    }
}
I use read_delta to read the file:
df = df.read_delta(path)
At the end, I have an empty df. ...