This project uses machine learning models (XGBoost, Random Forest, SVR) to predict electricity consumption of urban buildings based on the Seattle Building Energy Benchmarking 2016 dataset. It includes data preprocessing, model training, evaluation, and a fully interactive Flask dashboard.
Predicting-Urban-Building-Electricity-Consumption/
├── app.py # Flask web application
├── main.py # Main ML pipeline controller
├── data/
│   └── 2016-building-energy-benchmarking.csv
├── models/
│   └── train_xgboost.py # XGBoost training logic
├── preprocessing/
│   └── clean_data.py # Data preprocessing pipeline
├── evaluation/
│   └── evaluate_models.py # Model evaluation and plotting
├── outputs/
│   ├── X_train.csv, y_test.csv ...
│   ├── predictions_xgb.csv
│   ├── model_xgb.pkl
│   └── charts/
│       ├── feature_importance_xgb.png
│       ├── predicted_vs_actual_all_models.png
│       ├── model_comparison_metrics.png
│       └── residuals_analysis.png
├── templates/
│   ├── index.html
│   └── dashboard.html
└── static/ (optional)
- Clone the repository (or copy the project folder)
- Create a virtual environment (optional but recommended)
  python -m venv venv
  venv\Scripts\activate # Windows
- Install dependencies
  pip install -r requirements.txt
- Run the ML pipeline
  python main.py
- Launch the dashboard
  python app.py
| Module | Description |
|---|---|
| ✅ Preprocessing | Handles missing values, encoding, scaling, and feature selection |
| ✅ XGBoost | Trained with both default and tuned parameters |
| ✅ Evaluation | Calculates MAE, RMSE, R² + residual analysis and feature importance charts |
| ✅ Dashboard | Interactive Flask web app with chart visualization and modal enlargement |
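
The Evaluation row lists MAE, RMSE, and R². As a rough illustration of what those numbers measure, here is a dependency-free sketch of the three metrics. The function name `regression_metrics` and the toy values are assumptions made for this example; they are not taken from `evaluate_models.py`, which may compute the metrics differently (for instance with a library).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length sequences of numbers.

    Hypothetical helper for illustration; not part of the project code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    n = len(y_true)
    # Residual = actual minus predicted; the residual analysis charts plot these.
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(r) for r in residuals) / n              # mean absolute error
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    rmse = math.sqrt(ss_res / n)                          # root mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    # R^2 is undefined when the actual values are all identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "RMSE": rmse, "R2": r2}


# Toy values, made up for the example (not real model output).
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

MAE and RMSE are in the same unit as the target, while R² is unitless, so R² is the easier number to compare across differently scaled targets.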
Source: Kaggle - Seattle Building Energy Benchmarking 2016
- Total samples: ~3,300
- Features: Property type, GFA, EnergyStar score, electricity usage, etc.
- Target: SiteEnergyUse(kBtu)
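
To make the feature/target split concrete, the sketch below reads a few CSV rows, drops rows without a target, fills a missing score, one-hot encodes the property type, and scales floor area. Several things here are assumptions for illustration: the column names other than `SiteEnergyUse(kBtu)`, the made-up sample rows, and the choice of median imputation and min-max scaling. `clean_data.py` may use different columns and techniques.

```python
import csv
import io
import statistics

TARGET = "SiteEnergyUse(kBtu)"

# Made-up rows for illustration only; these are not values from the real dataset.
# Column names other than the target are assumed.
SAMPLE_CSV = """PrimaryPropertyType,PropertyGFATotal,ENERGYSTARScore,SiteEnergyUse(kBtu)
Hotel,88434,60,7226362.5
Office,61320,,2768924.0
Hotel,103566,43,
Office,50000,80,1500000.0
"""


def load_rows(text):
    """Parse CSV text and keep only rows that have a target value."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row[TARGET].strip()]


def preprocess(rows):
    """Return (X, y, categories) with imputed, scaled and one-hot encoded features."""
    # Median imputation for missing scores (an assumed strategy).
    known_scores = [float(r["ENERGYSTARScore"]) for r in rows
                    if r["ENERGYSTARScore"].strip()]
    median_score = statistics.median(known_scores)

    # Sorted category list fixes the order of the one-hot columns.
    categories = sorted({r["PrimaryPropertyType"] for r in rows})

    # Min-max scaling of gross floor area to [0, 1] (an assumed strategy).
    gfa = [float(r["PropertyGFATotal"]) for r in rows]
    lo, hi = min(gfa), max(gfa)
    span = (hi - lo) or 1.0  # avoid division by zero if all areas are equal

    X, y = [], []
    for r in rows:
        raw_score = r["ENERGYSTARScore"].strip()
        score = float(raw_score) if raw_score else median_score
        one_hot = [1.0 if r["PrimaryPropertyType"] == c else 0.0 for c in categories]
        X.append([(float(r["PropertyGFATotal"]) - lo) / span, score] + one_hot)
        y.append(float(r[TARGET]))
    return X, y, categories


rows = load_rows(SAMPLE_CSV)
X, y, categories = preprocess(rows)
for features, target in zip(X, y):
    print(features, target)
```

In a real pipeline the imputation and scaling statistics would normally be computed on the training split only and then reused on the test split, so that no information leaks from test data into training.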
This project is developed for academic and educational purposes under Multimedia University (MMU).

