3,941 questions
Advice
0
votes
3
replies
47
views
String manipulation: extract words under brackets
I'm not yet very familiar with the patterns in Lua's string.gsub function.
If I have a string like this:
Fishing Lure(+100 Fishing Skill)(1 hour)
and I want to extract only the string "1 hour"...
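A minimal sketch of one way to pull out the last bracketed group, written in Python rather than Lua and assuming the brackets are never nested. The helper name `last_bracketed` is hypothetical, not something from the question.

```python
import re
from typing import Optional


def last_bracketed(text: str) -> Optional[str]:
    """Return the contents of the last (...) group in text, or None if there is none."""
    # Capture every run of non-parenthesis characters enclosed in ( and ).
    groups = re.findall(r"\(([^()]*)\)", text)
    return groups[-1] if groups else None


print(last_bracketed("Fishing Lure(+100 Fishing Skill)(1 hour)"))  # 1 hour
```

Taking the last match instead of anchoring on the end of the string keeps the sketch working when other text follows the final bracket.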
3
votes
4
answers
237
views
Fast unnest complex column with data.table
I have a dataset where the column to unnest contains data with unequal rows and columns rather than data with equal dimensions. I'm looking for a fast approach to unnest this dataset using data.table.
...
3
votes
3
answers
214
views
Mutating detection data into binary
Currently I have a dataframe of bear detections that I want to convert into a binary detection history (14 columns of day1, day2, day3, etc. where:
actual_date_out = the date the camera was deployed, ...
0
votes
0
answers
20
views
Find conditions from multiple databases to have in a single database
I am currently working on a project where multiple databases are available to check for specific conditions of a patient.
Specifically, I have a "master" database in wide format, with one row ...
0
votes
1
answer
77
views
Setting a row number for each row in PySpark Dataframe
Currently I'm working with a large database using PySpark and am stuck on how to correctly set row numbers depending on a condition.
My dataframe is:
id_company id_client id_loan date
c1 ...
0
votes
3
answers
245
views
Update Object_construct nested in an Array_construct in Snowflake
Can anyone please help me with this scenario where I might have multiple OBJECT_CONSTRUCT values nested within an ARRAY_CONSTRUCT? I am not able to update one value of an element within it. I am using ...
0
votes
2
answers
170
views
Merge dataframes with conditions using PySpark
Currently I'm making calculations using PySpark and trying to match data from multiple dataframes on specific conditions.
I'm new to PySpark and decided to ask for help.
My first dataframe ...
0
votes
0
answers
78
views
Dropping rows whose row sum = zero keeping the original structure same
I have a dataframe containing a very large number of rows and columns. The df is structured such that up to the 6th row and 2nd column the entries are strings, and the rest are numbers (floating point). I want to ...
0
votes
1
answer
98
views
Wrong variable comparison result when performing data.table merge of two tables with duplicated keys
A colleague trying to do an analysis came up with code from ChatGPT that does something wrong, in a way I don't understand.
Here is the example:
Let's consider a first table (drugs: patients have an id, ...
0
votes
0
answers
92
views
How to prevent Burp Suite from altering input dropdown values in Java
We have an application which was tested with Burp Suite, by intercepting and altering the values of the dropdown data in our application. Those fields are disabled when viewed through the browser, but able ...
0
votes
2
answers
58
views
Correlation based on multiple columns and rows
I have a data frame arranged along these lines:
theta   Rater   Case1   Case2   ...  CaseN
theta1  rater1  score1  score2  ...  scoreN
theta2  rater1  score1  score4  ...  scoreN
theta1  rater2  score1  score2  ...  scoreN
.....
0
votes
2
answers
82
views
Return a list of positions of all possible occurrences of a character in a string
In Python how do you get a list of all possible occurrences of a character or a substring in a given string
Input: "This is a string" , "s"
Output: [3,6,10]
Explanation: Returns ...
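A minimal sketch of one possible approach, assuming 0-based positions (which matches the example output above) and that overlapping matches should be counted. The function name `find_all` is hypothetical.

```python
from typing import List


def find_all(text: str, sub: str) -> List[int]:
    """Return the start index of every occurrence of sub in text, overlaps included."""
    if not sub:
        raise ValueError("sub must be a non-empty string")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Advance by one character so overlapping matches are also found.
        start = text.find(sub, start + 1)
    return positions


print(find_all("This is a string", "s"))  # [3, 6, 10]
```

To skip overlapping matches instead, the search could resume at `start + len(sub)`.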
0
votes
1
answer
66
views
I need advice with data manipulation R: large data set
I have two data sets of bird detections. One by a human at randomly selected intervals of 2 minutes and one by a machine. I want to compare how well the machine did by checking if the detections ...
0
votes
1
answer
41
views
How can I format the entire tree recursively of PHP Nested Category array output?
I have a nested model category tree in array format as follows ....
$data = [
[
'categoryId' => '08adf337-a577-4038-86a6-a5cd16676dff',
'name' => 'ELECTRONICS',
'parentId' => 0,
...
0
votes
1
answer
75
views
Separating one mysql row into n different ones
My company's client has a table to store inventory data, this inventory separates products using a column called product_code, and it has another column called Qtd that stores how many of the same ...
0
votes
2
answers
76
views
How to filter rows in R in a dataframe with multiple columns based on names in a column from another dataframe?
I have two dataframes of names, where dataframe one contains a single column of names whereas dataframe two contains multiple columns of names. How can I filter the second dataframe to only contain ...
0
votes
1
answer
67
views
How to consolidate two rows based on data source?
Using Snowflake for this:
I have a query that produces a very simple table union from 5 different data sources:
WITH personal_info_workday AS (
SELECT
'Workday' AS source,
CAST(w....
1
vote
0
answers
230
views
PyArrow Table manipulation: Unnest float-array column to individual columns
I have nested data stored in parquet files.
Polars was my main entrypoint for fast data formatting of this nested data, but for performance reasons, I'd like to use native arrow, using the PyArrow ...
1
vote
1
answer
128
views
restrict to those with data at specific age ranges in R
I have the following long format data frame with columns, id, age, and BMI. I have restricted the dataset such that only people (id) with at least 3 repeated measurements between age 2 weeks and 10 ...
2
votes
2
answers
100
views
How can I replicate rows in R based on the values of another row?
I am wondering how to write a function to replicate rows based on the value within a column, e.g. if there is a difference of > +-0.1 between one row and the next, that row is replicated so that ...
0
votes
1
answer
67
views
A way to concat data from child rows of parent row of a pivot table
I have a large pivot table with the following data. The bold text is the parent row of data and the following years (non-bold data) are the child data of the parent row. Is it possible in Excel or ...
1
vote
0
answers
33
views
Create subset and calculate sums in Python based on a condition
I am currently doing some data manipulation procedures and have run into a problem of how to make subsets based on special conditions.
My example (dataframe) is like this:
Name ID Debt ...
0
votes
2
answers
71
views
r difference in each observation within Id
Assuming I have a dataset like this
id time cd4 sequence
1 -0.741958 548 1
1 -0.246407 893 2
1 0.243669 657 3
2 -2.7296369 464 1
2 -2.2505131 845 2
2 -0....
1
vote
1
answer
329
views
Power Query document not saving changes
I will preface this by saying I am brand new to Excel Power Query. I just learned about it last night and made my source folder containing the csv files to merge (5 total - 2020, 2021, 2022, 2023, 2024 (...
0
votes
1
answer
123
views
Pandas idxmin equivalent for mean
I am trying to filter a very large dataframe that looks like this:
unique id  x  y
1          1  2
1          2  3
1          3  4
2          1  2
2          2  3
2          3  4
to only contain the mean values for each unique id, (e.g. filtered on 'x') like ...