I want to build a MultiOutputClassifier in Python using scikit-learn. My features are text values, so I used CountVectorizer(), and to find the best parameters for my model I used GridSearchCV and model.best_params_.
I want the best parameters for both the decision tree and the MultiOutputClassifier.
I get the following error and do not know how to fix it, even after searching everywhere:
ValueError: Invalid parameter criterion for estimator MultiOutputClassifier(estimator=DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
max_features=None, max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, presort=False, random_state=None,
splitter='best'),
n_jobs=None). Check the list of available parameters with `estimator.get_params().keys()`.
How can I fix this error? This is the full code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.feature_extraction.text import CountVectorizer
from sklearn import tree
from sklearn.multioutput import MultiOutputClassifier
from sklearn.metrics import accuracy_score
df = pd.DataFrame({"first": ["yes", "no", "yes", "yes", "no"],
                   "second": ["yes", "no", "no", "yes", "yes"],
                   "third": ["true", "true", "false", "true", "false"]})
#print(df)
features = df.iloc[:,-1]
results = df.iloc[:,:-1]
cv = CountVectorizer()
features = cv.fit_transform(features)
features_train, features_test, result_train, result_test = train_test_split(features, results, test_size = 0.3, random_state = 42)
tuned_tree = {'criterion':['entropy','gini'], 'random_state':[1,2,3,4,5,6,7,8,9,10,11,12,13]}
cls = GridSearchCV(MultiOutputClassifier(tree.DecisionTreeClassifier()), tuned_tree)
model = cls.fit(features_train, result_train)
acc_prediction = model.predict(features_test)
accuracy_test = accuracy_score(result_test, acc_prediction)
print(accuracy_test, model.best_params_)
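
The error message points at the likely cause: GridSearchCV sets the grid's parameters on the MultiOutputClassifier wrapper, which has no criterion parameter of its own. The wrapped tree's parameters are exposed under the estimator__ prefix, as get_params() shows. Below is a minimal sketch of that idea, not a definitive fix. It assumes the same toy data as above, written as plain lists and a NumPy array instead of a DataFrame. It also skips the train/test split and passes cv=2, which are my assumptions, made because five rows are too few for the default number of folds in recent scikit-learn versions.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

# Same toy data as in the question: "third" is the text feature,
# "first" and "second" are the two target columns.
texts = ["true", "true", "false", "true", "false"]
targets = np.array([
    ["yes", "yes"],
    ["no", "no"],
    ["yes", "no"],
    ["yes", "yes"],
    ["no", "yes"],
])

# Turn the text feature into a sparse bag-of-words matrix.
X = CountVectorizer().fit_transform(texts)

wrapped = MultiOutputClassifier(DecisionTreeClassifier())

# List the parameter names GridSearchCV is allowed to tune.
# The tree's parameters appear as "estimator__<name>".
print(sorted(wrapped.get_params().keys()))

# Address the nested tree parameters through the "estimator__" prefix.
param_grid = {
    "estimator__criterion": ["entropy", "gini"],
    "estimator__random_state": [1, 2, 3],
}

# cv=2 is an assumption for this tiny data set, not a recommendation.
search = GridSearchCV(wrapped, param_grid, cv=2)
search.fit(X, targets)

print(search.best_params_)
```

With real data, the grid can be applied to the training split only, as in the original code, and the keys of best_params_ will carry the same estimator__ prefix.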