I'm trying to use a random forest with grid search, but this error shows up:
ValueError: Invalid parameter classifier for estimator Pipeline(steps=[('tfidf_vectorizer', TfidfVectorizer()),
('rf_classifier', RandomForestClassifier())]).
Check the list of available parameters with `estimator.get_params().keys()`.
import numpy as np # linear algebra
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn import pipeline,ensemble,preprocessing,feature_extraction,metrics
train=pd.read_json('cleaned_data1')
#split dataset into X , Y
X=train.iloc[:,0]
Y=train.iloc[:,2]
estimators=pipeline.Pipeline([
    ('tfidf_vectorizer', feature_extraction.text.TfidfVectorizer(lowercase=True)),
    ('rf_classifier', ensemble.RandomForestClassifier())
])
print(estimators.get_params().keys())
params = {"classifier__max_depth": [3, None],
          "classifier__max_features": [1, 3, 10],
          "classifier__min_samples_split": [1, 3, 10],
          "classifier__min_samples_leaf": [1, 3, 10],
          # "bootstrap": [True, False],
          "classifier__criterion": ["gini", "entropy"]}
X_train,X_test,y_train,y_test=train_test_split(X,Y, test_size=0.2)
rf_classifier=GridSearchCV(estimators,params, cv=10 , n_jobs=-1 ,scoring='accuracy',iid=True)
rf_classifier.fit(X_train,y_train)
y_pred=rf_classifier.predict(X_test)
metrics.confusion_matrix(y_test,y_pred)
print(metrics.accuracy_score(y_test,y_pred))
I've also tried using these params instead:
param_grid = {
    'n_estimators': [200, 500],
    'max_features': ['auto', 'sqrt', 'log2'],
    'max_depth' : [4,5,6,7,8],
    'criterion' :['gini', 'entropy']
}
but I still get the same error.
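There is no `classifier` step in your pipeline; there is an `rf_classifier`, so the grid keys need the `rf_classifier__` prefix (for example `rf_classifier__max_depth`). Alternatively, you can create `AutoML(algorithms=['Random Forest'], mode='Compete')` and just fit AutoML to run the hyperparameter search: `automl.fit(X, y)`.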
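As a minimal sketch of the prefix fix, the grid below uses keys of the form `<step name>__<parameter name>`, where the step name is `rf_classifier`. The toy corpus, the small grid, and `cv=2` are assumptions made only to keep the example fast; they are not the data or settings from the question. The sketch also leaves out `iid`, which newer scikit-learn releases no longer accept, and uses `min_samples_split` values of at least 2, since an integer value of 1 is rejected.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy corpus standing in for the question's data (hypothetical).
texts = ["good movie", "great film", "nice plot", "loved it",
         "bad movie", "awful film", "poor plot", "hated it"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Same step names as in the question's pipeline.
pipe = Pipeline([
    ("tfidf_vectorizer", TfidfVectorizer(lowercase=True)),
    ("rf_classifier", RandomForestClassifier(n_estimators=10, random_state=0)),
])

# Each key is "<step name>__<parameter name>"; the step is "rf_classifier".
params = {
    "rf_classifier__max_depth": [3, None],
    "rf_classifier__min_samples_split": [2, 3],  # an int here must be >= 2
    "rf_classifier__criterion": ["gini", "entropy"],
}

search = GridSearchCV(pipe, params, cv=2, scoring="accuracy")
search.fit(texts, labels)
print(search.best_params_)
```

The same rule applies to the second grid in the question: bare keys such as `n_estimators` would need to become `rf_classifier__n_estimators` when the estimator passed to `GridSearchCV` is the pipeline.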