2 votes
1 answer
96 views

I am trying to convert a string that is a categorical data type into a numeric. I found out that I can use pandas.Categorical; unfortunately, accessing the codes attribute gives me an error. Here is a ...
JA-pythonista's user avatar
5 votes
5 answers
194 views

I have a long list of items, and I want to assign each one a number that increases by one every time the value in the list changes. Basically I want to categorize the values in the list. It can be assumed ...
Regina Phalange's user avatar
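
One way to read the question above is run-based numbering: consecutive equal values share a number, and the number goes up by one whenever the value changes. A minimal sketch in plain Python under that reading; the helper name `run_ids` and its `start` parameter are hypothetical and not taken from the question:

```python
from itertools import groupby

def run_ids(values, start=0):
    """Number each run of consecutive equal values (hypothetical helper)."""
    ids = []
    # groupby yields one group per run of consecutive equal items
    for run_number, (_, run) in enumerate(groupby(values), start=start):
        ids.extend(run_number for _ in run)
    return ids

print(run_ids(["a", "a", "b", "b", "a", "c"]))
```

Note that a value that reappears later starts a new run and gets a new number, which differs from category codes, where equal values always map to the same code.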
0 votes
0 answers
32 views

When a label column of type string with high cardinality (>500) is selected in the data classification scenario, the model fails to train even with extended training time. Are there any solutions ...
Yaswanth reddy's user avatar
1 vote
1 answer
113 views

I am using polars to hash some columns in a data set. One column contains lists of strings and the other contains strings. My approach is to cast each column as type string and then hash the columns....
MikeB2019x's user avatar
  • 1,297
0 votes
0 answers
86 views

I am working with Pandas 1.5.3 and using pd.concat to merge two large DataFrames that contain categorical columns. Initially, everything works fine, but after running continuously at scale, the ...
Sanjay's user avatar
  • 1
-1 votes
1 answer
57 views

I have this data frame and I want to create an additional column that tells me the date the category was previously active. DF <- data.frame( Date = rep(c("10-12-2024", "10-17-2024...
FPiper's user avatar
  • 83
2 votes
1 answer
103 views

I am currently working on a study in which we aim to compare the post-surgical complications of patients who underwent a specific type of brain surgery. In one of our analyses, we would like to ...
Massimo Barbagallo's user avatar
2 votes
1 answer
85 views

The Hmisc::describe documentation (on page 76) says: This function determines whether the variable is character, factor, category, binary, discrete numeric, and continuous numeric, and ...
robertspierre's user avatar
1 vote
2 answers
620 views

I'm new to ML and would like to know more about classification. I have a small dataset of n=600 scored samples and thousands of potential metrics, all categorical (True or False). Basically, I would ...
Marwan Haioun's user avatar
0 votes
1 answer
37 views

I am new to R programming. I have tried several pieces of code to analyse the data below so that each question's responses are stacked on each other in a bar chart, to no avail. Cookies ...
Opeyemi Okolie's user avatar
2 votes
1 answer
200 views

While working with private data, I noticed that the ordinal logistic model fitted using the polr function from the MASS package, along with the confidence intervals provided by broom::tidy, does not ...
Isaac Victor Silva Rodrigues I's user avatar
0 votes
1 answer
117 views

I have replicated the choropleth map for discrete colors in R using the method suggested in this link: How to create a chloropleth map in R Plotly based on a Categorical variable? However, as you will ...
Lagrange's user avatar
1 vote
1 answer
92 views

I have 2 DFs with object type columns, which work fine with concatenation. Code df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', None]}) df2 = pd.DataFrame({'A': ['A4', 'A5'], 'B': [None, None]}) ...
Krishna's user avatar
  • 1,632
0 votes
1 answer
64 views

I have a dataset, df, with one dependent variable with levels "0" and "1" and one independent variable with levels "1" and "2". On performing logistic ...
vp_050's user avatar
  • 518
0 votes
3 answers
58 views

I have a pandas dataframe which I am trying to sort on the basis of values in a column, but the sorting is not alphabetical. The sorting is based on a "sorter" list (i.e. a list which gives ...
Alhpa Delta's user avatar
  • 3,700
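
A common approach to this kind of custom-order sort is to turn the column into an ordered categorical whose categories follow the sorter list. A minimal sketch with pandas, assuming the sorter list holds the column's values in the desired order; the helper name `sort_by_custom_order` and the sample data are hypothetical:

```python
import pandas as pd

def sort_by_custom_order(df, column, sorter):
    """Sort df by `column` using the order given in `sorter` (hypothetical helper)."""
    # An ordered categorical sorts by category position, not alphabetically.
    # Values missing from `sorter` become NaN and sort last by default.
    ordered = pd.Categorical(df[column], categories=sorter, ordered=True)
    return df.assign(**{column: ordered}).sort_values(column)

# Made-up example data, not from the question
df = pd.DataFrame({"size": ["M", "L", "S", "M"], "qty": [1, 2, 3, 4]})
print(sort_by_custom_order(df, "size", ["S", "M", "L"]))
```

The returned frame keeps the column as a categorical; cast it back with `astype(str)` if plain strings are needed afterwards.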
1 vote
0 answers
38 views

I have a group of 30 participants who took a test three times. Each time, the independent variable was changed and the participants reported their answers (dependent variable). The independent ...
gladys0313's user avatar
  • 2,699
0 votes
0 answers
35 views

I create a boxplot. I have two categorical variables, 'class' (ENVST 3210, ENVST 2050, and ENVST 555# - x-axis) and 'High Impact' (these are basically number of high impact learning experiences, viz, ...
Debolina Banerjee's user avatar
0 votes
2 answers
175 views

I am working on a huge denormalized table on a SQL server (10 columns x 130m rows). Take this as a data example: import pandas as pd import numpy as np data = pd.DataFrame({ 'status' : ['pending', ...
FábioRB's user avatar
  • 457
0 votes
1 answer
24 views

I have data organised as in this example: data1 <- tibble(seq = factor(1:20), value = rnorm(20, 10, 2), par_a = c(rep("S1", 6), rep("S2", 14)), ...
Radek Jaźwiec's user avatar
0 votes
0 answers
98 views

I am having some trouble getting step_interact() from tidymodels to produce the desired set of predictor variables. I want to include pairwise interactions, but exclude all interactions which are ...
rpc_12345's user avatar
0 votes
1 answer
97 views

I apologize if this is redundant, but I have tried to look for solutions, and have not found anything that appears to be the answer to my question. So, I have time series data for a bunch of variables....
Colin's user avatar
  • 23
0 votes
1 answer
104 views

I would like to make a chart like in this picture instead of a barplot to represent frequencies of several categorical variables. This is a snippet of my data for the variable of interest: c("...
elsich's user avatar
  • 41
1 vote
1 answer
45 views

I got this error: Error in table(st2.affect) : attempt to make a table with >= 2^31 elements when I tried to use a function (or any other proportions function) such as: proportions(table(st2.affect)...
Ola's user avatar
  • 23
0 votes
1 answer
939 views

I'm trying to use the categorical variable support of XGBoost. I'm following XGBoost's own documentation for categorical data (linked here: https://xgboost.readthedocs.io/en/stable/tutorials/categorical....
sena's user avatar
  • 11
1 vote
4 answers
596 views

I have a dataset with a variable called individuals that has many options, and it comes like that. I have observations for a given Day on different individuals (Individual_ID). The different ...
MIGUEL's user avatar
  • 39