Churn Prediction Project

This project predicts customer churn using an artificial neural network (ANN) trained on a bank customer dataset with the following fields: geography, gender, age, balance, credit score, tenure, number of products, credit card ownership, active-member status, estimated salary, and exit status. After preprocessing, including feature scaling and categorical encoding, the ANN was trained to classify customers as likely to churn or likely to stay. The solution is deployed on Streamlit Cloud, which provides a user-friendly interface for real-time churn predictions so that businesses can act on retention early.

Project Overview:

Problem Statement: Predict whether a customer is likely to churn and estimate their salary.

Goal: By estimating customer salary, the bank can target potential customers for loans, increasing business opportunities. Predicting customer churn enables the bank to take proactive measures to retain customers and reduce churn rates.

Solution: Artificial neural network (ANN) models are trained to predict both churn likelihood and customer salary.

Generated Project Structure:

project_name/
│
├── data/                           # Data-related files
│   ├── raw/                        # Raw, unprocessed data files
│   └── processed/                  # Processed data files (if applicable in the future)
│
├── notebooks/                      # Jupyter notebooks for EDA and experimentation
│   ├── experiments.ipynb           # Experimentation notebook
│   ├── prediction.ipynb            # Prediction notebook
│   └── salaryregression.ipynb      # Salary regression notebook
│
├── pickle/                         # Pickle files for preprocessing
│   ├── label_encoder_gender.pkl    # Label encoder for gender
│   ├── onehot_encoder_geo.pkl      # One-hot encoder for geographical data
│   └── scaler.pkl                  # Scaler for normalization
│
├── logs/                           # Classification-specific logs
│   ├── train/                      # Training logs
│   └── validation/                 # Validation logs
│
├── regression_logs/                # Regression-specific logs
│   ├── train/                      # Training regression logs
│   └── validation/                 # Validation regression logs
│
├── models/                         # Saved models
│   ├── models.h5                   # Classification model file
│   └── regression_model.h5         # Regression model file
│
├── reports/                        # Reports and visualizations
│   └── figures/                    # Plots and visualizations
│
├── app.py                          # Main application script
├── streamlit_regression.py         # Streamlit app for the regression model
├── requirements.txt                # Dependencies and libraries
├── README.md                       # Project overview
└── .gitignore                      # Git ignore file
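
The three files under pickle/ hold fitted preprocessing objects so that the same transformations can be reapplied at prediction time. Below is a minimal sketch of how such files could be written and reloaded. It assumes scikit-learn's LabelEncoder, OneHotEncoder, and StandardScaler (the file names suggest these classes, but the source does not confirm them); the toy values and the temporary directory are illustrative only.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Toy values standing in for the Gender, Geography, and numeric columns.
genders = ["Female", "Male", "Female", "Male"]
geographies = [["France"], ["Spain"], ["Germany"], ["France"]]
numeric = np.array([[619.0, 42.0], [608.0, 41.0], [502.0, 42.0], [699.0, 39.0]])

# Fit one preprocessing object per file listed under pickle/ (assumed classes).
fitted = {
    "label_encoder_gender.pkl": LabelEncoder().fit(genders),
    "onehot_encoder_geo.pkl": OneHotEncoder(handle_unknown="ignore").fit(geographies),
    "scaler.pkl": StandardScaler().fit(numeric),
}

with tempfile.TemporaryDirectory() as tmp:
    # Save each fitted object to disk.
    for name, obj in fitted.items():
        with open(Path(tmp) / name, "wb") as fh:
            pickle.dump(obj, fh)
    # Load them back, as a prediction script would.
    reloaded = {}
    for name in fitted:
        with open(Path(tmp) / name, "rb") as fh:
            reloaded[name] = pickle.load(fh)

# Apply the reloaded objects to new values.
gender_code = reloaded["label_encoder_gender.pkl"].transform(["Male"])
geo_vector = reloaded["onehot_encoder_geo.pkl"].transform([["Germany"]]).toarray()
scaled = reloaded["scaler.pkl"].transform(numeric)
```

Reusing the fitted objects, rather than refitting them on new inputs, keeps the encoding and scaling identical between training and prediction.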

Dataset Information:

Credit Score: The customer's credit score, which reflects their creditworthiness.
Geography: The geographic region where the customer resides (Spain, France, Germany).
Gender: The gender of the customer (Male or Female).
Age: The age of the customer.
Tenure: The length of time the customer has been with the bank.
Balance: The current bank balance of the customer.
Number of Products: The total number of products the customer uses with the bank.
Has Credit Card: Whether or not the customer holds a credit card with the bank.
Is Active Member: Whether or not the customer actively uses the bank's services.
Estimated Salary: The estimated annual salary of the customer.
Exited: Whether or not the customer has left the bank (churned). This is the target of the churn model.
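
To make the schema above concrete, here is a small sketch that parses one CSV row into a typed record using only the standard library. The column names (CreditScore, NumOfProducts, HasCrCard, and so on) follow the usual header of the Kaggle Churn_Modelling.csv file and are an assumption here, as are the Customer class and the sample values.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Customer:
    """One row of the dataset with each field converted to a Python type."""
    credit_score: int
    geography: str
    gender: str
    age: int
    tenure: int
    balance: float
    num_of_products: int
    has_cr_card: bool
    is_active_member: bool
    estimated_salary: float
    exited: bool


def parse_customer(row: dict) -> Customer:
    """Convert one CSV row (all values are strings) into a Customer."""
    return Customer(
        credit_score=int(row["CreditScore"]),
        geography=row["Geography"],
        gender=row["Gender"],
        age=int(row["Age"]),
        tenure=int(row["Tenure"]),
        balance=float(row["Balance"]),
        num_of_products=int(row["NumOfProducts"]),
        has_cr_card=row["HasCrCard"] == "1",
        is_active_member=row["IsActiveMember"] == "1",
        estimated_salary=float(row["EstimatedSalary"]),
        exited=row["Exited"] == "1",
    )


# Illustrative in-memory sample; a real run would open the CSV file instead.
sample = io.StringIO(
    "CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,"
    "HasCrCard,IsActiveMember,EstimatedSalary,Exited\n"
    "600,France,Female,40,3,60000.0,2,1,1,50000.0,0\n"
)
customers = [parse_customer(row) for row in csv.DictReader(sample)]
```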

Project Workflow:

Data Collection:
The dataset was obtained from Kaggle as the primary resource.
Data Preprocessing:
The data was cleaned and transformed using the Pandas library in Python.
Feature Engineering:
Relevant features were selected and engineered to enhance model performance.
Model Selection:
Artificial Neural Networks (ANN) were chosen as the model for predicting customer churn and estimating salary.
Model Training and Optimization:
The ANN model was trained using the preprocessed data, followed by optimization to improve accuracy and performance.
Prediction:
The model predicts both the estimated salary of the customer and whether the customer is likely to churn.
Deployment:
The model was deployed on Streamlit Cloud for interactive real-time predictions, and the code was checked into GitHub for version control and contribution tracking.

Installation:

Follow these steps to set up the environment.

Clone the repository

git clone https://github.com/saichakka10/ANN-Churn-Prediction.git

Change directory

cd ANN-Churn-Prediction

Create a conda environment

conda create -p venv python==3.11 -y

Activate conda environment

conda activate ./venv

Install required packages

pip install -r requirements.txt

Deactivate the environment when you are finished

conda deactivate

Usage:

Experiments:

The experiments.ipynb file is used for data cleaning, preprocessing, splitting the dataset into training and testing sets, and implementing an Artificial Neural Network (ANN) for the classification task. The model predicts whether the customer is likely to churn or not.

Prediction:

The prediction.ipynb file is used to make predictions based on the trained classification model. It outputs the likelihood of customer churn.
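
Before the trained model can score a customer, the raw fields have to be turned into a numeric row in the same column order used during training. The sketch below illustrates that step in plain Python. The column order, the Female=0/Male=1 coding, and the alphabetical one-hot order for Geography are assumptions, and scaling is left out; in the project, the saved encoders and scaler would presumably handle this.

```python
# Assumed one-hot column order for Geography (alphabetical).
GEOGRAPHIES = ["France", "Germany", "Spain"]


def build_feature_row(customer: dict) -> list[float]:
    """Encode one raw customer record as an unscaled numeric feature row."""
    if customer["Geography"] not in GEOGRAPHIES:
        raise ValueError(f"unknown geography: {customer['Geography']}")
    # Assumed label coding: Female -> 0, Male -> 1.
    gender_code = 1.0 if customer["Gender"] == "Male" else 0.0
    geo_one_hot = [1.0 if customer["Geography"] == g else 0.0 for g in GEOGRAPHIES]
    numeric = [
        float(customer["CreditScore"]),
        gender_code,
        float(customer["Age"]),
        float(customer["Tenure"]),
        float(customer["Balance"]),
        float(customer["NumOfProducts"]),
        float(customer["HasCrCard"]),
        float(customer["IsActiveMember"]),
        float(customer["EstimatedSalary"]),
    ]
    return numeric + geo_one_hot


# Illustrative customer; the values are made up.
example = {
    "CreditScore": 600, "Geography": "Germany", "Gender": "Male", "Age": 40,
    "Tenure": 3, "Balance": 60000.0, "NumOfProducts": 2, "HasCrCard": 1,
    "IsActiveMember": 1, "EstimatedSalary": 50000.0,
}
row = build_feature_row(example)
print(len(row))  # 9 numeric or binary values + 3 one-hot values -> 12
```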

Salary Regression:

The salaryregression.ipynb file is used for data cleaning, preprocessing, splitting the dataset into training and testing sets, and implementing an ANN model for regression. This model predicts the estimated salary of the customer.
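
The classification and regression notebooks both train an ANN; the main difference lies in the output layer and the loss. The following Keras sketch shows that difference under stated assumptions: the hidden layer sizes (64 and 32), the Adam optimizer, the feature count of 12, and the build_ann helper are all hypothetical and not taken from the notebooks.

```python
from tensorflow import keras


def build_ann(n_features: int, task: str) -> keras.Model:
    """Small feed-forward network; only the output layer and loss depend on the task."""
    if task == "classification":
        # Sigmoid output gives a churn probability between 0 and 1.
        output = keras.layers.Dense(1, activation="sigmoid")
        loss, metrics = "binary_crossentropy", ["accuracy"]
    elif task == "regression":
        # Linear output (no activation) predicts an unbounded salary value.
        output = keras.layers.Dense(1)
        loss, metrics = "mean_absolute_error", ["mae"]
    else:
        raise ValueError(f"unknown task: {task}")

    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),  # hypothetical layer size
        keras.layers.Dense(32, activation="relu"),  # hypothetical layer size
        output,
    ])
    model.compile(optimizer="adam", loss=loss, metrics=metrics)
    return model


classifier = build_ann(n_features=12, task="classification")
regressor = build_ann(n_features=12, task="regression")
```

Training would then call fit on the scaled feature matrix, with Exited as the target for the classifier and Estimated Salary as the target for the regressor.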

Evaluation:

Metrics Used:

Classification:
Accuracy is used as the evaluation metric to measure how well the model classifies customer churn.
Regression:
The Mean Absolute Error (MAE) is used to evaluate the regression model's performance in predicting the customer's salary.
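
As a worked illustration of the two metrics, the sketch below computes accuracy and MAE by hand on a few made-up values. The function names and sample numbers are illustrative only and are not results from this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up churn labels: 4 of 5 predictions are correct.
churn_true = [0, 1, 0, 0, 1]
churn_pred = [0, 1, 0, 1, 1]
print(accuracy(churn_true, churn_pred))  # 4 / 5 -> 0.8

# Made-up salaries: absolute errors are 2000, 3000, and 1000.
salary_true = [50000.0, 80000.0, 120000.0]
salary_pred = [52000.0, 77000.0, 121000.0]
print(mean_absolute_error(salary_true, salary_pred))  # 6000 / 3 -> 2000.0
```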

Contributing:

Fork the repository.
Create a new branch, commit your changes, and push the branch:
```
git checkout -b feature-branch
git commit -m 'Add feature'
git push origin feature-branch
```
Open a pull request.

Deployment:

The project is deployed on Streamlit Cloud. The following files are used in the deployment:

app.py: This file serves as the main application that provides an interactive user interface. It facilitates the prediction of customer churn, allowing users to input data and view results in real-time.
streamlit_regression.py: This file specifically handles the regression model used for predicting the customer’s estimated salary. It ensures that users can interact with the model and get immediate predictions.
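
A rough sketch of how such an app could be laid out is shown below. It is not the contents of app.py: the widget labels and ranges, the 0.5 threshold, and the render_app and describe_prediction helpers are assumptions, and the model call is passed in as a function so the layout stays independent of how the model is loaded.

```python
from typing import Callable


def describe_prediction(probability: float, threshold: float = 0.5) -> str:
    """Turn a churn probability into the message shown to the user."""
    verdict = "likely to churn" if probability > threshold else "not likely to churn"
    return f"Churn probability: {probability:.2f} ({verdict})"


def render_app(predict: Callable[[dict], float]) -> None:
    """Draw the input widgets and show the prediction returned by `predict`."""
    # Imported here so the helper above can be used without Streamlit installed.
    import streamlit as st

    st.title("Customer Churn Prediction")
    customer = {
        "Geography": st.selectbox("Geography", ["France", "Germany", "Spain"]),
        "Gender": st.selectbox("Gender", ["Male", "Female"]),
        "Age": st.slider("Age", 18, 92, 40),
        "Balance": st.number_input("Balance", min_value=0.0, value=0.0),
        "IsActiveMember": st.selectbox("Is Active Member", [0, 1]),
    }
    # `predict` is expected to wrap preprocessing and the trained model.
    st.write(describe_prediction(predict(customer)))
```

In an app script, render_app would be called at module level with a function that wraps the saved encoders, scaler, and model, and the script would be started with `streamlit run app.py`.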

Streamlit Cloud is a cloud-based platform for deploying and sharing machine learning models as interactive web apps. It lets developers deploy apps directly from a GitHub repository without managing infrastructure, and it redeploys automatically whenever the repository is updated, which makes it easy to share projects with collaborators or stakeholders.

Key Features of Streamlit Cloud:

Interactive Interface: Streamlit provides an easy-to-use interface to display data, plots, and real-time predictions.
Quick Deployment: With minimal setup, you can deploy your app directly from GitHub, making it accessible to others instantly.
Automatic Updates: When changes are made to the codebase or GitHub repository, the app is automatically updated without requiring manual intervention.
Easy Sharing: You can share your app with others by simply providing a link, making it ideal for presentations and collaborations.

License:

This project is licensed under the MIT License.
