Newest 'xgboost' Questions

-1 votes

0 answers

29 views

Impact of nulls in xgboost [closed]

I have a data set in which state is a one-hot encoded variable. Some states are allowed to use all predictors, some states are not allowed to use certain predictors. If I null out those variables as ...

just_a_guy

1

asked Nov 18 at 18:34

0 votes

0 answers

50 views

Why does XGBoost training (with DMatrix) write heavily to disk instead of using RAM?

I am training an XGBoost model in Python on a dataset with approximately 20k features and 30M records. The features are sparse, and I am using xgboost.DMatrix for training. Problem During training, ...

cool_heisenberg

51

asked Sep 19 at 14:39

2 votes

1 answer

264 views

How to make a Python package that can have two different versions of a dependency?

It is now not uncommon to have a Python package that is distributed in a multitude of different "flavors". This happens often with machine learning packages, e.g. onnxruntime has many "...

MajorTom

385

asked Aug 8 at 15:17

0 votes

1 answer

83 views

None default values in XGBoost regressor model [closed]

I am encountering a problem with the XGBoost regressor. It produces 'NONE' default values, as shown in the figure below. What could be the reason for getting 'NONE' default values for XGBoost ...

tom

75

asked Jul 11 at 7:14

0 votes

1 answer

62 views

Time series patient visits for XGBoost classifier

I’m developing a tree-based model classifier (XGBoost) using some healthcare (patient visits) data. The data has a time dimension, and I want to observe if there is a longitudinal effect for the ...

Sasoo

1

asked Jun 17 at 12:39

0 votes

1 answer

102 views

How to tune the hyperparameter early-stopping of xgboost in mlr3 with auto_tune()?

I want to train an XGBoost model and tune some hyperparameters that are used to preprocess the data. (I reduce the noise of some spectrometry data by applying the Savitzky-Golay filter.) When training the ...

franzi-r

29

asked May 30 at 9:16

2 votes

1 answer

106 views

Why don't shap's explainer.model.predict() and model.predict match?

I have a machine learning model and I calculated SHAP on it using the following code: import shap background = shap.kmeans(X_dev, k=100) explainer = shap.TreeExplainer(model, feature_perturbation="...

Adarsh Wase

1,931

asked May 10 at 19:07

0 votes

1 answer

114 views

Training a small XGBoost model on Databricks crashes with a memory issue, but a similar table works fine

I have hit a bug that has blocked me for a few days. I have a Spark DataFrame with 66 columns and 100K rows. I want to train an XGBoost model on the Databricks platform, but it always crashes. I generated a similar ...

HappyCoding

1

asked May 2 at 16:04

0 votes

1 answer

97 views

Time Series Forecasting Model with XGBoost and Dask Large Datasets Crashing

I'm building a time series forecasting model in Python to predict hourly kWh loads for different customer types at a utility company. The dataset contains ~81 million rows, with hourly load data for ~...

Jared

11

asked Mar 31 at 18:33

-1 votes

1 answer

59 views

generatePartialDependenceData function returns an error when used for a multiclass classification model

I have built an XGBoost multiclass classification model using mlr and I want to visualize the partial dependence for some features. However, if I try to do so using generatePartialDependenceData() I ...

ChickenTartR

27

asked Mar 10 at 18:27

0 votes

1 answer

176 views

XGBoost does not predict properly on input that's equal to training data [closed]

Why does this quite simple XGBoost example produce all nulls even on input that's equivalent to the training data? This looks like a trivial case of input which should not require any fine-tuning of ML,...

ishulz

165

asked Mar 5 at 6:59

1 vote

0 answers

48 views

XGBoost signature converts categorical variables to strings; need to keep categorical variables throughout the process

As part of model logging, I observed an issue. Infer Signature is converting categorical variables into object. I need to log_model and register with the variable as categorical. This is causing model ...

Raju Natra

23

asked Mar 3 at 2:25

0 votes

1 answer

82 views

Process hangs when multiprocessing with XGBoost model batch prediction

Here's a batch prediction case using multiprocessing. Steps: After with mp.Pool(processes=num_processes) as pool, there's a with Dataset(dataset_code) as data in the main process using websocket to ...

Jason

37

asked Feb 27 at 11:05

0 votes

0 answers

84 views

XGBoost bst.predict() output not matching with manual calculation from the (text) tree model for binary:logistic case

I am trying to validate the XGBoost output (booster.predict) for logistic regression wrt my understanding of output calculation via the trees built. I see a difference of around -1.58 factor in all my ...

Error

1

asked Feb 11 at 12:55

0 votes

0 answers

129 views

Predict day-ahead hourly electricity prices after having trained a model using historical data

I have trained an XGBoost model with historical data from 2015 - 2024. I have added some features like weather data, electricity consumption, generation from different sources like nuclear, and other ...

Nafees Mohammad Adil

9

asked Jan 15 at 9:50

32 votes

4 answers

36k views

'super' object has no attribute '__sklearn_tags__'

I am encountering an AttributeError while fitting an XGBRegressor using RandomizedSearchCV from Scikit-learn. The error message states: 'super' object has no attribute '__sklearn_tags__'. This occurs ...

Varshith

333

asked Dec 18, 2024 at 11:45

0 votes

0 answers

91 views

ImportError from Pandas while running XGBoost model on Python

I am trying to run a basic XGBoost model on Python (v 3.8.5), but I am getting an error that I cannot resolve. I'd appreciate your help, thanks! My code is as below: import seaborn as sns import pandas ...

Preetam Pal

49

asked Dec 14, 2024 at 19:59

3 votes

1 answer

604 views

XGBoost/ XGBRanker to produce probabilities instead of ranking scores

I have a dataset of the performance of students in exams which looks like: Class_ID Class_size Student_Number IQ Hours_Studied Score 1 3 3 101 10 ...

Ishigami

592

asked Dec 13, 2024 at 14:20

0 votes

0 answers

134 views

XGBoost and LGBM model size depends on training data size for a given set of params, whereas CatBoost's doesn't

I am comparing models in a walk-forward cross-validation setup, under Python 3.11. For a given set of hyperparameters, XGBoost and LGBM model size (when pickled or saved using the library saving ...

g0bel1n

11

asked Dec 7, 2024 at 19:09

1 vote

1 answer

127 views

Holdout validation set - hyperparameter tuning

I have a large dataset and I have split it into: training set (80%), validation set (10%), test set (10%). On each set, I performed missing values imputation and feature selection (trained on the ...

Mark

15

asked Dec 3, 2024 at 13:26

0 votes

0 answers

63 views

Cannot import name XGBRegressor from xgboost (unknown location)

XGBoost error: unable to import XGBRegressor. I have created an env in VS Code to implement an end-to-end pipeline for a machine learning project. Most of my code has been saved on GitHub. I used a ...

Rahul Poojith

9

asked Nov 28, 2024 at 13:56

0 votes

2 answers

607 views

XGBoost Early Stopping Rounds

My code below keeps blowing up and I can't work out what is going on: import optuna import xgboost as xgb from sklearn.model_selection import train_test_split from sklearn.metrics import ...

CraigBreezey

1

asked Nov 14, 2024 at 16:14

0 votes

1 answer

180 views

How to make shap.plots.scatter with xgboost.DMatrix holding missing data?

I have a dataset with missing data. They are encoded as NaN. This is fine for model fitting with XGBoost. When I want to understand the model, analyzing model importance with SHAP scatter plots, I am ...

LudvigH

4,934

asked Oct 31, 2024 at 11:45

0 votes

1 answer

243 views

More efficient way to stream data to AWS Batch Transform Job

I have a SageMaker process for training and running inference on data in SageMaker: processing job: read input CSV files from S3 and clean up the data, output CSV files to S3; processing job: read in ...

Olek

69

asked Oct 18, 2024 at 19:18

0 votes

1 answer

270 views

Error when calculating SHAP value in xgboost model - feature names are different?

I have trained an XGBoost model using caret and now, I am calculating the mean SHAP value of each predictor using the package SHAPforxgboost, using the following code: library(SHAPforxgboost) ...

a12456

1

asked Sep 24, 2024 at 10:48
