0 votes
0 answers
26 views

I have been set a task by my manager to try and predict insurance premiums based on some categories such as job description, number of people employed and turnover. I am comparing between K-Nearest ...
Red_bull's user avatar
0 votes
1 answer
58 views

I'm working with the Stack Overflow 2024 survey. In the CSV file there are several multivalued variables (separated by ;). I want to apply one-hot encoding to the variables Employment and LanguageAdmire by ...
Lev's user avatar
  • 843
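A minimal sketch of one way to one-hot encode semicolon-separated columns, assuming pandas is available. The sample rows and the helper name one_hot_multivalued are hypothetical; only the column names Employment and LanguageAdmire and the ; separator come from the question. It relies on Series.str.get_dummies, which splits each cell on a separator and returns one indicator column per distinct value.

```python
import pandas as pd

# Hypothetical sample rows; the real survey columns hold many more categories.
df = pd.DataFrame({
    "Employment": ["Employed;Student", "Student", "Employed"],
    "LanguageAdmire": ["Python;Rust", "Rust", "Python"],
})


def one_hot_multivalued(frame, column, sep=";"):
    """Return 0/1 indicator columns for a delimiter-separated column."""
    dummies = frame[column].str.get_dummies(sep=sep)
    # Prefix with the source column so names stay unique across columns.
    return dummies.add_prefix(f"{column}_")


encoded = pd.concat(
    [one_hot_multivalued(df, col) for col in ["Employment", "LanguageAdmire"]],
    axis=1,
)
print(encoded)
```

Each row can have several indicators set at once, which is what distinguishes this from ordinary single-label one-hot encoding.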
0 votes
0 answers
15 views

I am working on a binary classification task using an audio dataset, which is already divided into training and testing sets. However, I also need a validation set, so I split the training set into ...
GauravGiri's user avatar
0 votes
1 answer
47 views

I'm currently using MinMaxScaler() on my dataset. However, because my dataset is large, I'm doing a first pass in batches to compute the min and max values for my scaler. I'm using ...
Saffy's user avatar
  • 13
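A sketch of the batched first pass described above, assuming scikit-learn's MinMaxScaler.partial_fit fits the workflow (the excerpt is cut off before it says what is actually being used). The random array and the iter_batches helper are hypothetical stand-ins for a dataset that would really be streamed from disk.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for a dataset too large to load at once.
rng = np.random.default_rng(0)
data = rng.uniform(-5.0, 20.0, size=(1000, 3))


def iter_batches(array, batch_size):
    """Yield consecutive row slices of at most batch_size rows."""
    for start in range(0, len(array), batch_size):
        yield array[start:start + batch_size]


# First pass: accumulate per-feature min and max batch by batch.
scaler = MinMaxScaler()
for batch in iter_batches(data, 128):
    scaler.partial_fit(batch)

# Second pass: transform each batch with the statistics from the full pass.
scaled = np.vstack([scaler.transform(b) for b in iter_batches(data, 128)])
```

After the first pass, scaler.data_min_ and scaler.data_max_ hold the same values a single fit on the whole array would produce.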
0 votes
0 answers
17 views

I'm working on a pandas DataFrame that contains columns with lists and am currently trying the explode method, but I'm not getting the desired output. Instead, it does a Cartesian product, combining all ...
buzzo's user avatar
  • 1
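A small sketch of the difference, assuming pandas 1.3 or later and assuming the lists in each row have matching lengths (the excerpt does not say). The frame and its column names are hypothetical. Exploding one column after another pairs every element with every other element, while passing both columns to a single explode call keeps them aligned.

```python
import pandas as pd

# Hypothetical frame: two list columns whose lists line up element by element.
df = pd.DataFrame({
    "id": [1, 2],
    "names": [["a", "b"], ["c"]],
    "scores": [[10, 20], [30]],
})

# Chained explodes produce a Cartesian product within each original row.
crossed = df.explode("names").explode("scores")

# Exploding both columns in one call keeps elements paired by position.
paired = df.explode(["names", "scores"], ignore_index=True)
print(paired)
```

The multi-column form raises an error when the lists in a row differ in length, so it only applies when the columns are genuinely parallel.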
2 votes
0 answers
65 views

I am fine-tuning a SAM model on my dataset containing train_images and train_masks. I am able to create the dict, but when I call the last command, i.e. loading the dataset from the dict, the kernel dies. It happened ...
Sanju 's user avatar
  • 21
0 votes
1 answer
62 views

I want to train a simple neural network, which has embedding_dim as a parameter: class BoolQNN(nn.Module): def __init__(self, embedding_dim): super(BoolQNN, self).__init__() self....
samuel gast's user avatar
-1 votes
1 answer
189 views

I'm currently working with data of customer reviews on products from Sephora. My task is to classify them by sentiment: negative, neutral, positive. A common technique of text preprocessing is to ...
read data's user avatar
1 vote
0 answers
23 views

When fitting the model in Google Colab there doesn't seem to be any problem. However, when I try to create an interface using Streamlit and pickle, the target encoder doesn't work and I am unable to solve ...
user25546188's user avatar
0 votes
0 answers
52 views

I have to preprocess a feature which is basically a list of number codes encoded as a string, and I want to encode it such that the output is an array of frequencies of each of these numbers. The ...
AKHIL GOPIKUMAR's user avatar
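A rough sketch of the frequency encoding described above, using only the standard library. The comma separator, the sample strings, and the code_frequencies helper are assumptions, since the excerpt does not show the actual string format. Each string becomes a fixed-length list of counts, one position per code in a shared vocabulary.

```python
from collections import Counter


def code_frequencies(encoded, vocabulary, sep=","):
    """Count how often each vocabulary code occurs in a delimited string."""
    counts = Counter(tok.strip() for tok in encoded.split(sep) if tok.strip())
    return [counts[code] for code in vocabulary]


# Hypothetical feature values; the real separator and codes may differ.
rows = ["3,7,3,12", "7", ""]

# Build the vocabulary from the data so every row maps to the same positions.
vocabulary = sorted(
    {tok.strip() for row in rows for tok in row.split(",") if tok.strip()},
    key=int,
)

matrix = [code_frequencies(row, vocabulary) for row in rows]
print(vocabulary)
print(matrix)
```

Fixing the vocabulary up front means unseen codes at prediction time are simply ignored rather than changing the output width.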
1 vote
2 answers
682 views

I am trying to build a custom sigmoid-shaped function because I want to scale my data during preprocessing. Basically, the goal is to obtain a sigmoid shaped function that outputs from 0 to 1 and only ...
cercio's user avatar
  • 89
1 vote
0 answers
85 views

I have the following input: data = { 'Group_A': ['0&1', '1&5', '0&5', '1&7', '3&8', '4&8', '3&5', '4&4'], 'Group_B': ['1&0', '5&7', '0&5'...
deepcurious's user avatar
0 votes
1 answer
838 views

I am working on a project involving Step Functions with SageMaker. I have an existing Step Function that I need to integrate SageMaker into, and I tried adding steps such as processing, model training,...
Gwenda Thomas's user avatar
-4 votes
1 answer
64 views

Sorry for the title, I know it might be rather broad and not very informative. I am facing a problem regarding the analysis of a data set. The participants of my experiments were randomly assigned ...
taboulet's user avatar
0 votes
1 answer
374 views

Trying to filter out rows in which the data of a specific column starts with a given substring. I have a pandas.DataFrame as shown below (simplified): price DRUG_CODE 123 A12D958 234 B564F3C ... ... I'm ...
Warren Chen's user avatar
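A short sketch of prefix filtering with a boolean mask, assuming pandas. The first two rows mirror the simplified frame in the question; the third row and the prefix "A12" are hypothetical additions for illustration. Since "filter out" can mean either keeping or dropping the matches, both are shown.

```python
import pandas as pd

# First two rows follow the question; the third is a hypothetical extra.
df = pd.DataFrame({
    "price": [123, 234, 345],
    "DRUG_CODE": ["A12D958", "B564F3C", "A12X001"],
})

prefix = "A12"  # hypothetical prefix

# na=False treats missing codes as non-matching instead of propagating NaN.
mask = df["DRUG_CODE"].str.startswith(prefix, na=False)

kept = df[mask]      # rows whose code starts with the prefix
dropped = df[~mask]  # rows whose code does not
print(kept)
```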
0 votes
1 answer
33 views

from sklearn.compose import ColumnTransformer from sklearn.preprocessing import StandardScaler, OneHotEncoder, OrdinalEncoder from sklearn.pipeline import Pipeline from sklearn.model_selection import ...
s213439's user avatar
0 votes
0 answers
39 views

Trying to run an LSTM model where the data is separated into a few columns in a CSV, and I'm trying to prepare data from such CSVs. Getting the error ValueError: Failed to convert a NumPy array to a ...
Athul Srinivas's user avatar
1 vote
1 answer
2k views

I've uploaded a dataset to Kaggle (approx. 73 GB), and I'm trying to preprocess this data for model training purposes. This dataset has a large number of missing values, which I am trying to interpolate ...
54m4gr4's user avatar
  • 13
0 votes
1 answer
620 views

I'm trying to make a simple LSTM neural network. I've got time series data which I am splitting into sequences and batches using PyTorch's Dataset and DataLoader. To account for the variable lengths ...
D Danne's user avatar
  • 17
0 votes
0 answers
55 views

I'm new to Python, so I'm sorry if this is a basic one. However, after I ran the code, I got this: TypeError: cannot do positional indexing on RangeIndex with these indexers [ Year Average of PM ...
Sofia's user avatar
  • 1
0 votes
1 answer
102 views

I have 31 features to be input into an ML algorithm. Of these, 22 features already have values in the range 0 to 1. The remaining 9 features vary between 0 and 750. My doubt is if I choose to apply ...
rekha's user avatar
  • 7
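If the choice is to rescale only the nine wide-range features and leave the other 22 untouched, one option is scikit-learn's ColumnTransformer with remainder="passthrough". This is a sketch under that assumption: the random data and the position of the wide columns (the last nine) are hypothetical, and the excerpt is cut off before stating which scaler is being considered.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: 22 features already in [0, 1], 9 features in [0, 750].
rng = np.random.default_rng(0)
unit_features = rng.uniform(0, 1, size=(50, 22))
wide_features = rng.uniform(0, 750, size=(50, 9))
X = np.hstack([unit_features, wide_features])

wide_idx = list(range(22, 31))  # assumed positions of the wide columns

# Scale only the wide columns; pass the remaining columns through unchanged.
ct = ColumnTransformer(
    [("wide", MinMaxScaler(), wide_idx)],
    remainder="passthrough",
)
X_scaled = ct.fit_transform(X)
```

Note that the output puts the transformed columns first and the passthrough columns after them, so the column order differs from the input.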
1 vote
1 answer
38 views

I'm performing data analysis on a dataset whose categorical labels are interrelated. My labels track experimental conditions. In my case, labels track concentrations of combinations of two chemicals ...
WoolyThomas's user avatar
0 votes
0 answers
95 views

from sklearn.preprocessing import MinMaxScaler values = df[['Close']] #values is floats ranging from 0.06 to 190.08 sc = MinMaxScaler() scaled_values = sc.fit_transform(values) descaled_values = sc....
haintaki's user avatar
0 votes
0 answers
57 views

There are approximately 13,000 values in a given column. The function below takes a list of strings as input and performs NER tagging on each word in the list. On average there ...
srinivas muralidharan's user avatar
0 votes
0 answers
93 views

I am trying to run the DLRM preprocessing tasks with Apache Beam: https://github.com/tensorflow/models/tree/master/official/recommendation/ranking/preprocessing. The dataset is Criteo Kaggle 10GB ...
Eric's user avatar
  • 1
