Bangalore House Price Prediction (Real-Time Deployment) and Key Learnings from the Project

 

Building a Web App for House Price Prediction: Deploying the Machine Learning Model

In our previous blog, we explored Exploratory Data Analysis (EDA) and built a predictive model to estimate house prices. After testing multiple regression models, XGBoost Regressor proved to be the most accurate, and we saved the trained model using pickle for future use. Now, it's time to take the next step—deploying our model as a web application!

A web app allows users to interact with the model in real-time, making predictions based on their input. In this blog, we will:

  • Set up a web framework using Streamlit to build an interactive UI.
  • Load the saved model and use it for predictions.
  • Create a user-friendly interface where users can enter property details and get instant price estimates.
  • Deploy the web app so it can be accessed by anyone online.

Step-by-Step Implementation

1. Import Required Libraries

First, we import the necessary libraries:

import streamlit as st
import pandas as pd
import pickle
import sklearn   # not used directly, but must be installed for pickle to load the pipeline
import xgboost   # not used directly, but must be installed for pickle to load the pipeline

2. Load the Machine Learning Model

The trained model is saved in a Model.pkl file and loaded into the app with pickle:

with open('Model.pkl', 'rb') as f:
    model = pickle.load(f)

3. Load the Dataset

We load the cleaned dataset to populate options for dropdown menus:

house=pd.read_csv('Cleaned_data.csv')
area_type = house['area_type'].unique()
availability = house['availability'].unique()
location = house['location'].unique()

4. Set Up the Streamlit Interface

We use Streamlit widgets like selectbox and number_input to collect user input:

st.title("House Price Predictor")

area_type_1 = st.selectbox("Which area type does your house belong to?", area_type)
availability_1 = st.selectbox("Is it available now, or will it be available soon?", availability)
location_1 = st.selectbox("Which location is the house in?", location)
bath_1 = st.number_input("How many bathrooms does the house have?", value=None, placeholder="Type a number...")
bhk_1 = st.number_input("How many bedrooms (BHK) does the house have?", value=None, placeholder="Type a number...")
sqft_1 = st.number_input("What is the square footage of the house?", value=None, placeholder="Type a number...", min_value=300)

5. Make Predictions

When the "Predict" button is clicked, the app collects the inputs and passes them to the

trained model for prediction:

if st.button('Predict'):
    # Build a one-row DataFrame; wrapping the inputs in np.array would
    # coerce the numeric values to strings, so pass them directly instead.
    input_df = pd.DataFrame(
        [[area_type_1, availability_1, location_1, bath_1, bhk_1, sqft_1]],
        columns=["area_type", "availability", "location", "bath", "bhk", "sqft"],
    )
    prediction = model.predict(input_df)
    # The model predicts the price in lakhs; convert to rupees for display.
    st.text("The house may cost approximately Rs. " + str(int(prediction[0] * 100000)))

Deploying the App

You can deploy the app using Streamlit Cloud or other platforms:

  1. Save the code as app.py
  2. Run locally with:

streamlit run app.py
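
For Streamlit Cloud, the repository also needs Model.pkl, Cleaned_data.csv, and a requirements.txt listing the app's dependencies. A minimal example based on the imports above (versions unpinned here; pin them if you need reproducible deployments):

streamlit
pandas
scikit-learn
xgboost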

Example Output

Here’s how the app might look when deployed:

  1. User Input Section
  • Select the area type, availability status, and location, and enter details like square footage, bathrooms, and bedrooms.
  2. Prediction Result
  • On clicking "Predict," the app displays the estimated house price.

Key Learnings from the Bangalore House Price Prediction Blog Series

1. Data Cleaning and Preprocessing is Crucial

  • Handling Missing Values: Cleaning missing or inconsistent data is the foundation of a reliable model. For instance, we addressed missing entries in the bath column and extracted numerical values from ranges in the total_sqft column (sketched in the code after this list).
  • Outlier Removal: Filtering unrealistic data, such as outliers in price-per-square-foot or total square footage, helps improve model accuracy.
  • Feature Engineering: Adding derived features, such as price per square foot, provides more meaningful insights for the model.
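
A minimal sketch of these cleaning steps. The raw file name, the median fill, and the outlier thresholds are illustrative assumptions, not necessarily the exact values used in the series:

import pandas as pd

def convert_sqft(value):
    # total_sqft entries like "2100 - 2850" are ranges; use the midpoint.
    try:
        return float(value)
    except ValueError:
        parts = str(value).split('-')
        if len(parts) == 2:
            return (float(parts[0]) + float(parts[1])) / 2
        return None  # unparseable units (e.g. "34.46Sq. Meter") get dropped

df = pd.read_csv('Bengaluru_House_Data.csv')          # assumed raw file name
df['bath'] = df['bath'].fillna(df['bath'].median())   # fill missing bath counts
df['total_sqft'] = df['total_sqft'].apply(convert_sqft)
df = df.dropna(subset=['total_sqft'])
df['price_per_sqft'] = df['price'] * 100000 / df['total_sqft']  # price is in lakhs
df = df[df['price_per_sqft'].between(1000, 50000)]    # illustrative outlier filter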

2. Exploratory Data Analysis (EDA) Informs Feature Selection

  • Visualizing data distribution and relationships using tools like histograms and correlation heatmaps uncovers patterns and dependencies in the dataset.
  • For example, correlations between features like bath, bhk, and price influenced our feature selection for the predictive model.
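
For instance, a correlation heatmap over the numeric columns takes only a few lines (a sketch; the column names are assumed from the cleaned dataset loaded earlier):

import matplotlib.pyplot as plt
import seaborn as sns

# Correlations between the numeric features and the target
numeric = house[['bath', 'bhk', 'sqft', 'price']]
sns.heatmap(numeric.corr(), annot=True, cmap='coolwarm')
plt.title('Feature correlations')
plt.show()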

3. Choosing the Right Model Matters

  • Model Comparison: Testing multiple regression algorithms, such as Linear Regression, Decision Trees, and XGBoost, ensures that we select the best-performing model (a sketch follows this list).
  • XGBoost: In this project, XGBoost delivered the highest R² score, making it the ideal choice for deployment.
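
A sketch of what that comparison might look like, using the cleaned dataset loaded earlier. The feature and target column names, the train/test split, and the encoding step are assumptions for illustration:

from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor
from xgboost import XGBRegressor

X = house[['area_type', 'availability', 'location', 'bath', 'bhk', 'sqft']]
y = house['price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# One-hot encode the categorical columns so every model sees numeric input
encode = ColumnTransformer(
    [('cat', OneHotEncoder(handle_unknown='ignore'),
      ['area_type', 'availability', 'location'])],
    remainder='passthrough')

for name, reg in [('Linear Regression', LinearRegression()),
                  ('Decision Tree', DecisionTreeRegressor(random_state=42)),
                  ('XGBoost', XGBRegressor(random_state=42))]:
    pipe = make_pipeline(encode, reg)
    pipe.fit(X_train, y_train)
    print(name, r2_score(y_test, pipe.predict(X_test)))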

4. Model Deployment with Streamlit is Straightforward

  • Integration: The trained XGBoost model was serialized using pickle and loaded into the Streamlit app, enabling real-time predictions (the saving side is sketched after this list).
  • User-Friendly Interface: Dropdown menus, number inputs, and buttons make the app intuitive for users to interact with.
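
The saving side of that round trip is a one-liner at the end of the training notebook (assuming pipe is the fitted pipeline):

import pickle

# Serialize the fitted pipeline so the Streamlit app can reload it
with open('Model.pkl', 'wb') as f:
    pickle.dump(pipe, f)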

5. Reproducibility is Key

  • By saving the model pipeline, preprocessing steps, and cleaning logic, the workflow becomes replicable for future iterations or for expanding the app to other cities.

6. Web Application Deployment Completes the Workflow

  • Deploying the app on platforms like Streamlit Cloud ensures the solution is accessible to end-users, bridging the gap between data science and real-world applications.

These learnings emphasize the importance of a comprehensive workflow that spans data cleaning, EDA, model selection, and deployment, culminating in an interactive and practical tool for predicting house prices in Bangalore.

Summary

Comprehensive Data Preprocessing

Handling missing data, inconsistent formats, and outliers ensured the dataset was clean and ready for modeling.

Features such as price_per_sqft were engineered, and categorical variables (e.g., area_type and location) were handled using one-hot encoding.

Pipeline Creation and Model Selection

A Pipeline was created using ColumnTransformer, StandardScaler, and XGBRegressor (sketched below).

XGBoost was selected for its high R² score, demonstrating its suitability for the task.
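
A minimal sketch of that pipeline, with column names assumed from the app's input schema and hyperparameters omitted:

from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from xgboost import XGBRegressor

preprocess = ColumnTransformer([
    ('cat', OneHotEncoder(handle_unknown='ignore'),
     ['area_type', 'availability', 'location']),
    ('num', StandardScaler(), ['bath', 'bhk', 'sqft']),
])
pipe = Pipeline([('preprocess', preprocess), ('model', XGBRegressor())])
pipe.fit(X_train, y_train)  # X_train, y_train as in the training notebook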

Real-Time Web Application

A user-friendly Streamlit interface enabled interactive predictions based on inputs like area type, location, and square footage.

The trained model (Model.pkl) was integrated into the Streamlit app, allowing real-time predictions and effectively bridging the gap between machine learning and end-user interaction.

Conclusion

This project highlights the synergy between data science and web development:

Clean data and powerful models like XGBoost deliver accurate predictions.

Streamlit apps enable non-technical users to interact with machine learning models seamlessly.

The structured pipeline ensures scalability and easy adaptability for other datasets or cities.
