Melbourne, Australia Housing Price Prediction
- Reason For Our Project
- Data Source
- Technologies
- Approach
- Flowchart
- Extract, Transform, Load
- Exploratory Data Analysis
- Regression Model Results
- Best model
- The Website
- Going Forward
- Contributors
For the average consumer, finding appropriate housing within a set budget can be a daunting task. Predicting housing prices helps prospective buyers anticipate the price range they can expect and plan their finances accordingly. Price predictions are also useful for property investors who want to know the trend of housing prices in a particular location.
This dataset contains house prices for Melbourne, Australia from September 2017 to September 2019, scraped from publicly available results posted weekly on Domain.com.au. It includes address, type of real estate, suburb, method of selling, rooms, price, real estate agent, date of sale, property size, land size, council area, and distance from the Central Business District. From the dataset provided, we cleaned the data to focus on the features most relevant to our machine learning models, then determined which model gave us the best predicted housing price.
- Jupyter Notebook
- Python
- Numpy
- Pandas
- Seaborn
- Matplotlib
- SciPy
- PostgreSQL/SQLAlchemy
- Scikit-learn
- Flask
- HTML/CSS
- Heroku
- Identify data source
- Collect and clean Melbourne, Australia housing data
- Normalize target feature for better model results
- Create charts and graphs using Pyplot
- Load data into PostgreSQL
- Transform data to be fitted into models
- Test varying machine learning models and determine best option
- Customize HTML and CSS for final application
- Develop Flask application for model deployment
- Deploy dashboard to Heroku
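The model-testing step above can be sketched with scikit-learn as follows. This is a minimal illustration, not our exact pipeline: the data here is synthetic and stands in for the cleaned housing features (e.g. rooms, bathrooms, distance from the CBD).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Synthetic stand-in for the cleaned housing features and (normalized) price
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] - 1.0 * X[:, 2] + rng.normal(scale=0.1, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Fit several candidate regressors and compare R^2 on the held-out set
models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(random_state=42),
}
scores = {name: r2_score(y_test, m.fit(X_train, y_train).predict(X_test))
          for name, m in models.items()}
print(scores)
```

Comparing all candidates on the same held-out split keeps the scores directly comparable; the model with the highest test R² is the one we would carry forward.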
The data was provided to us as a CSV. We verified the datatypes, dropped the columns we didn't need, and renamed some of those we kept. We also checked the dataset for null values and dropped them, then removed any duplicate rows. From there, we plotted a correlation heatmap to determine which features were most correlated with our target and dropped those that weren't.
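A minimal sketch of those cleaning steps in pandas, using a small in-memory stand-in for the CSV (the column names here are illustrative, not the dataset's exact schema):

```python
import pandas as pd

# Tiny stand-in for the Melbourne housing CSV
df = pd.DataFrame({
    "Suburb": ["Abbotsford", "Abbotsford", "Richmond", "Richmond"],
    "Rooms": [2, 2, 3, None],
    "Price": [1_050_000, 1_050_000, 1_465_000, 850_000],
    "SellerG": ["Biggin", "Biggin", "Jellis", "Nelson"],
})

df = (df.drop(columns=["SellerG"])             # drop a column we don't need
        .rename(columns={"Rooms": "Bedrooms"})  # rename kept columns
        .dropna()                               # drop rows with null values
        .drop_duplicates())                     # remove duplicate rows

# Correlation matrix of numeric features, as fed to the heatmap
corr = df.corr(numeric_only=True)
print(df.dtypes)
print(corr)
```

The same chained pattern scales to the full dataset; `corr` is what a Seaborn heatmap would render to pick out the features most related to price.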
Satisfied with our cleaned data, we proceeded to data exploration, using Matplotlib and Seaborn to visualize many different aspects of the data.
We normalized our target feature (price) to improve our predictions.
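One common way to normalize a right-skewed price target is a log transform; this is a sketch under that assumption, not necessarily the exact transform we applied:

```python
import numpy as np

# Example housing prices with a long right tail
prices = np.array([850_000, 1_050_000, 1_465_000, 4_200_000], dtype=float)

# log1p compresses the tail so the target is closer to normally distributed
log_prices = np.log1p(prices)

# Model predictions made on the log scale are mapped back with the inverse
recovered = np.expm1(log_prices)
print(np.allclose(recovered, prices))
```

The key practical point is the round trip: train on `log_prices`, then apply `np.expm1` to predictions before reporting them in dollars.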
We also checked the relationship of price with the number of bedrooms and bathrooms.
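That relationship can be checked numerically as well as visually; a sketch with synthetic rows standing in for the cleaned DataFrame:

```python
import pandas as pd

# Illustrative rows; the real values come from the cleaned Melbourne data
df = pd.DataFrame({
    "Bedrooms": [1, 2, 2, 3, 3, 4],
    "Bathrooms": [1, 1, 2, 2, 2, 3],
    "Price": [550_000, 780_000, 910_000, 1_100_000, 1_250_000, 1_600_000],
})

# Median price by bedroom count shows the trend behind the scatter plots
print(df.groupby("Bedrooms")["Price"].median())

# Pairwise correlation of price with bedrooms and bathrooms
print(df[["Bedrooms", "Bathrooms"]].corrwith(df["Price"]))
```

A positive correlation for both features supports keeping them as model inputs.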
- Visualize data to show housing price trends on a map through an application such as Tableau
- Scrape more recent data and see if the trends and predictions hold true
- Update the look and feel of the website to make it more user friendly
- Tune our models' hyperparameters to address possible overfitting