I am trying to create a Streamlit application that predicts hypertension risk based on patient input data using the MLP (Multilayer Perceptron) model I trained. Below is my Keras model code:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Activation
from tensorflow.keras.utils import to_categorical
# Initialize the MLP model
model = Sequential()
# Input layer using Input(), followed by hidden layer 1 with 24 neurons, ReLU activation
model.add(Input(shape=(X_train.shape[1],))) # Automatically adapt input shape based on data
model.add(Dense(24, activation='relu'))
# Dropout layer to reduce overfitting
model.add(Dropout(0.2))
# Second hidden layer with 16 neurons, ReLU activation
model.add(Dense(16, activation='relu'))
# Dropout layer to reduce overfitting
model.add(Dropout(0.1))
# Output layer with 4 neurons (for multi-class classification), softmax activation
model.add(Dense(4, activation='softmax'))
# Compile the model with Adam optimizer and categorical cross-entropy loss
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Implement early stopping to avoid overfitting
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
# Ensure the target is correctly one-hot encoded
y_train = to_categorical(y_train, num_classes=4)
y_test = to_categorical(y_test, num_classes=4)
# Check the target dimensions after transformation
print("y_train shape:", y_train.shape)
print("y_test shape:", y_test.shape)
# Train the model with training data
history = model.fit(X_train, y_train,
                    epochs=100,
                    batch_size=32,
                    validation_data=(X_test, y_test),
                    callbacks=[early_stopping])
# Evaluate the model
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)
print(f"Test Loss: {test_loss}")
print(f"Test Accuracy: {test_acc}")
Then I tried to create a Streamlit interface with the following code:
import streamlit as st
import numpy as np
import joblib
# Load model
mlp = joblib.load("model1.pkl")
st.title("Hypertension Risk Prediction")
st.write("Enter patient data below to classify hypertension risk.")
# Form input features
age = st.number_input("Age", min_value=0, max_value=100)
gender = st.selectbox("Gender", ["Male", "Female"])
gender_int = 1 if gender == "Female" else 0
sbp = st.number_input("Systolic Blood Pressure", min_value=0, max_value=300)
dbp = st.number_input("Diastolic Blood Pressure", min_value=0, max_value=200)
hr = st.number_input("Heart Rate", min_value=0, max_value=200)
hot = st.selectbox("Hypertension History", ["Yes", "No"])
hot_int = 1 if hot == "Yes" else 0
dizzy = st.selectbox("Dizziness", ["Yes", "No"])
dizzy_int = 1 if dizzy == "Yes" else 0
com = st.selectbox("Comorbidities", ["Yes", "No"])
com_int = 1 if com == "Yes" else 0
# Prediction
if st.button("Predict"):
    # Create input data array for prediction
    input_data = np.array([[age, sbp, dbp, gender_int, hr, hot_int, dizzy_int, com_int]])
    # Ensure the input shape matches the model's expectation
    if input_data.shape[1] == mlp.input_shape[1]:
        test = mlp.predict(input_data)  # Get the predicted probabilities
        kelas = np.argmax(test, axis=1)[0]  # Get the class with the highest probability
        # Display the classification results based on the predicted class
        if kelas == 0:
            st.success("The patient is classified as: **Prehypertension**")
        elif kelas == 1:
            st.warning("The patient is classified as: **Hypertension Stage 1**")
        elif kelas == 2:
            st.warning("The patient is classified as: **Hypertension Stage 2**")
        else:
            st.error("The patient is classified as: **Hypertension Stage 3**")
    else:
        st.error(f"Input data shape mismatch. Expected {mlp.input_shape[1]}, got {input_data.shape[1]}")
However, the Streamlit app always predicts Hypertension Stage 3, no matter what input I provide. The model itself has an average accuracy of 98% and a loss of 6%. Although the data is imbalanced, that doesn't seem to affect Stage 3 predictions, and recall, precision, and F1 score are good overall.
Does anyone know why this might be happening and how I can adjust my Streamlit code to ensure that it predicts correctly based on the model I trained?
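One way to narrow this down is to look at the full softmax output instead of only the argmax. The sketch below is a hypothetical helper (`summarize_prediction` and `CLASS_LABELS` are names introduced here, not part of the original app). It maps a probability row to the four labels used in the app and returns the per-class probabilities. The hard-coded probability vector is a stand-in for `mlp.predict(input_data)[0]`, not real model output.

```python
import numpy as np

# Class labels in the same index order as the if/elif chain in the app.
CLASS_LABELS = [
    "Prehypertension",
    "Hypertension Stage 1",
    "Hypertension Stage 2",
    "Hypertension Stage 3",
]


def summarize_prediction(probs):
    """Return (predicted label, {label: probability}) for one softmax output row."""
    probs = np.asarray(probs, dtype=float).reshape(-1)
    if probs.shape[0] != len(CLASS_LABELS):
        raise ValueError(
            f"Expected {len(CLASS_LABELS)} probabilities, got {probs.shape[0]}"
        )
    label = CLASS_LABELS[int(np.argmax(probs))]
    per_class = {name: round(float(p), 4) for name, p in zip(CLASS_LABELS, probs)}
    return label, per_class


# Hypothetical softmax row standing in for mlp.predict(input_data)[0]
label, per_class = summarize_prediction([0.0, 0.0, 0.0, 1.0])
print(label)
print(per_class)
```

In the app, the dictionary could be shown with `st.write(per_class)`. If the last class sits at or near 1.0 for every input, that would suggest the network is being fed values outside the range it was trained on, rather than a problem in the label mapping.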
.predict() in first script (after .fit())? If it also gives wrong results then problem is only model, not Streamlit...
... .predict() method after training the model, using this code:
predictions = model.predict(X_test)
print("Predictions on the test set: \n", predictions[:3])
The predictions look correct. I’m pretty sure the issue lies in the Streamlit app or how the one-hot encoding is being handled within the Streamlit interface. There could be a mismatch in how input is processed or how the encoding is applied when the data is passed to the model in Streamlit.
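To test the input-processing hypothesis, it may help to build the Streamlit input row with one explicit function and compare it against a row from `X_train`. The sketch below rests on assumptions that the question does not confirm. `FEATURE_ORDER` mirrors the order used in the app's `np.array(...)` call, which may or may not match the training columns. The mean and standard deviation values are toy numbers, included only in case the training features were standardized. `build_input_row` is a hypothetical helper.

```python
import numpy as np

# Assumed column order (copied from the app's input array); replace it with the
# actual column order of X_train if that differs.
FEATURE_ORDER = ["age", "sbp", "dbp", "gender", "hr", "hot", "dizzy", "com"]


def build_input_row(values, feature_order, mean=None, std=None):
    """Arrange raw form values in the given column order and, if training
    statistics are supplied, standardize them the same way as the training data."""
    row = np.array([[float(values[name]) for name in feature_order]])
    if mean is not None and std is not None:
        row = (row - np.asarray(mean, dtype=float)) / np.asarray(std, dtype=float)
    return row


# Toy values for illustration only: not real patient data or training statistics.
form_values = {"age": 50, "sbp": 140, "dbp": 90, "gender": 1,
               "hr": 80, "hot": 1, "dizzy": 0, "com": 0}
toy_mean = [45, 130, 85, 0.5, 75, 0.5, 0.5, 0.5]
toy_std = [10, 20, 10, 0.5, 10, 0.5, 0.5, 0.5]

raw_row = build_input_row(form_values, FEATURE_ORDER)
scaled_row = build_input_row(form_values, FEATURE_ORDER, toy_mean, toy_std)
print(raw_row)
print(scaled_row)
```

If the model was trained on scaled features but the app passes `raw_row`, values such as 140 for systolic pressure are far outside what the network saw, which can push the softmax toward a single class. Printing both rows next to `X_train[:1]` would show whether the scale and column order agree.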