3,477 questions
3
votes
1
answer
75
views
How to pass argument to func in `pandas.resampler.agg()` when using dict input?
I am trying to resample a pandas DataFrame, and I would like to sum some of the columns. Additionally, I want to get None/NaN as the result when there are no rows in a resampling period. For aggregation on ...
1
vote
1
answer
88
views
How to prevent duplicate transaction calculations in a ClickHouse materialized view
I’m planning to use ClickHouse to calculate wallet balances based on transactions in my base table. However, there’s an issue: if something goes wrong and I end up inserting the same transactions into ...
2
votes
1
answer
134
views
Pyspark aggregations optimization
I have a huge dataframe with 3B rows. I'm running the PySpark code below with the Spark config.
spark = SparkSession\
.builder\
.appName("App")\
.config("spark....
5
votes
2
answers
166
views
Simpler forwarding of contained object
I have a proprietary file format definition that contains a header format:
class Header
{
public:
uint32_t checksum;
uint16_t impedance;
uint16_t type_of_data;
uint32_t ...
2
votes
1
answer
79
views
Problems refactoring pandas.DataFrame.groupby.aggregate to dask.dataframe.groupby.aggregate with custom aggregation
I would like to run groupby and aggregation over a dataframe where the aggregation joins strings with the same id.
The df looks like this:
In [1]: df = pd.DataFrame.from_dict({'id':[1,1,2,2,2,3], '...
0
votes
2
answers
119
views
Using StringAgg after filter & distinct
I'm using StringAgg and order as follows:
# Get order column & annotate with list of credits
if request.POST.get('order[0][name]'):
    order = request.POST['order[0][name]']
...
0
votes
0
answers
62
views
Running OpenSearch term aggregations in parallel
We have a query calculating number of terms on multiple fields.
{
"query": {
"bool": {
"filter": [
{
"term": {
"...
0
votes
1
answer
101
views
Masked aggregations in pytorch
Given data and mask tensors, is there a PyTorch way to obtain masked aggregations of the data (mean, max, min, etc.)?
x = torch.tensor([
[1, 2, -1, -1],
[10, 20, 30, -1]
])
mask = torch.tensor([
...
2
votes
2
answers
76
views
Pandas groupby multiple columns, aggregate some columns, add a count column of each group [duplicate]
The data I am working with:
data (140631115432592), ndim: 2, size: 3947910, shape: (232230, 17)
VIN (1-10) object
County ...
-3
votes
1
answer
75
views
Monthly data aggregate to years in excel pivot table
I have a dataset of usage data per account, with multiple usage metrics for four years represented monthly as 'Jan.21'. I need yearly aggregates in columns in pivot table. Columns: 'Account name', '...
0
votes
1
answer
64
views
Kibana - Attempting to use Nested Aggregation in a condition in a Watcher
In Kibana Watcher, I'm trying to use the average results from a nested aggregation of a bucket in a condition, but I'm getting a null reference error when running a simulate on the ...
0
votes
0
answers
61
views
How to parse a 'COMPLEX<serializablePairLongString>' in order to get only the String result from the query in Apache Druid?
I have a table definition in Apache Druid with a column as a complex type COMPLEX<serializablePairLongString> created by an ingestion aggregation.
So the column data is displayed like:
column
...
1
vote
1
answer
40
views
Sales data aggregation in R
I have daily sales data for multiple products in three stores. It looks something like this:
item_id store_id category_id dept_id date event_name daiy_price
a tx_1 food 1 2012/12/24 6
a tx_1 food 1 ...
1
vote
2
answers
56
views
Why does this INNER JOIN query return all rows instead of just the one matching?
CREATE TABLE EMPLOYEE (
empId INTEGER AUTO_INCREMENT PRIMARY KEY,
name TEXT NOT NULL,
dept TEXT NOT NULL
);
INSERT INTO EMPLOYEE(name, dept) VALUES ('Clark', 'Sales');
INSERT INTO EMPLOYEE(name,...
0
votes
1
answer
18
views
How to get unique field values based on recent and highest priority documents in Elasticsearch?
I have an Elasticsearch index that stores documents with the following fields:
timestamp (date)
priority (integer)
user_id (string)
I need to find 5 unique user_id values from the most recent and ...
1
vote
0
answers
59
views
MongoDB Aggregations: Why is $group slow with indexed string fields but not on date?
Setup
Imagine this document shape:
{
dateString: "2024-10-31", // type string
realDate: ISODate("2024-10-31"), // type date
}
Assume both fields are indexed separately.
...
0
votes
1
answer
120
views
How to search embeddedDocuments fields and Root Document fields using Atlas Search
I have the following document in my collection
{
"_id": "101",
"PublisherName": "Big book publishers",
"Books": [
{
...
-1
votes
1
answer
76
views
Why does a SQL HAVING SUM(column) comparison with a number not work? [closed]
Trying to find the continents where all countries have a population <= 25000000 from the world table below, as in Difficult Questions That Utilize Techniques Not Covered In Prior Sections.
name
...
0
votes
1
answer
519
views
Is it possible to aggregate grouped rows in AG Grid?
Some context:
I'm working on a React.js frontend app, using the AgGridReact component. I've managed to get various custom aggregations working, default aggs working, etc. In a typical scenario like ...
0
votes
0
answers
83
views
MongoDB Aggregation to calculate total revenue by each store for each month
I have a collection named sales with documents structured as follows:-
{
"_id": ObjectId("..."),
"date": ISODate("2024-06-15T00:00:00Z"),
"store": ...
0
votes
1
answer
61
views
How to sum by date for a list of items
I use MongoDB version 7. I have a few payments in a collection, stored with order and buyer information, which I group by order id into the following JSON ...
0
votes
0
answers
39
views
Populate an ObjectId deeply nested inside arrays and objects while preserving document shape
I'm using MongoDB v5 with NodeJS (no Mongoose).
Let's say I have a collection structured the following way. This specific setup (an ObjectId nested inside an object contained inside an array contained ...
-2
votes
1
answer
52
views
Count rows from combining two tables [closed]
What Postgres query can I use for the below scenario?
Parent table:
Child table:
Expected result
Explanation:
Parent1 --> 4 equipments. Parent1 has ...
Child1 holding 3 equipment
Child2 holding ...
0
votes
0
answers
52
views
MongoDB server side pagination with lookup, inner join and limit
Data:
I have MongoDB setup to hold information about different devices. Each device can contain multiple nodes which have parent-child relationship. Each device can be applicable to multiple platforms ...
2
votes
0
answers
171
views
Complex groupby aggregation with cartesian product of multi-dimensional data over ManyToMany field
I have the following problem with a complex aggregation in postgres (16). The datamodel (CREATE Statements and ER-diagram) and an example of the required result set are to be found under the question.
...