Airbnb Business Analytics with Orange Data Mining
Discover how to use Orange Data Mining for Airbnb market analysis, including K-Means clustering, regression, and pricing strategies for NYC listings.
Business Analytics Using Orange Data Mining Tools on the Airbnb Dataset
Global Short-Term Accommodation Marketplace for Tourist Services
Data Mining
Business Analytics
Airbnb Dataset
Academic Research Project | Orange Data Mining | NYC Airbnb Open Data
Introduction
The purpose of this project is to apply business analytics methods to a real-world dataset and to create business intelligence using the Orange Data Mining tool.
To carry out the analysis, we used the New York City Airbnb Open Data. This dataset includes the listings available on the platform along with their prices, locations, accommodation types, and engagement levels.
To maximize the return on investment of Airbnb listings in New York City, it is valuable to analyze existing listings. We employed clustering, regression, and association rule mining to gain better insight into the large number of listings.
Business Analytics | Orange Data Mining | Airbnb Dataset
Dataset Understanding
This dataset contains tens of thousands of Airbnb listings across New York City. It provides information about pricing, availability, and guest engagement.
Tens of Thousands of Listings
NYC Coverage
Pricing & Availability Data
Key Variables
Core dataset attributes used in the analysis
Price
Cost per night
Neighbourhood Group
Location category
Room Type
Entire home/Apartment, Private room, Shared room
Availability
Number of days per year the listing is available to book
Number of Reviews
Customer engagement
Reviews per Month
Listing popularity
Minimum Nights
Booking requirement
Business Relevance
The data shows demand and price levels across different areas, which helps Airbnb hosts, property investors, and hospitality businesses set their prices appropriately.
Airbnb Hosts
Optimize daily pricing to maximize occupancy rates
Property Investors
Identify lucrative neighborhoods for future investment
Hospitality Businesses
Analyze local market demand to refine market strategies
Objectives
01
Analyze pricing patterns
02
Identify key influencing factors
03
Segment listings into meaningful groups
04
Generate actionable business insights
Data Cleaning & Preprocessing
Removed unnecessary variables (ID, listing name, host name)
Replaced missing values in reviews_per_month with 0
Removed duplicate records and extreme price values (>1000)
Removed remaining outliers from the dataset
Formatted room type and neighbourhood group fields
Normalised numerical features
Created new price category feature (low/medium/high)
All preprocessing was carried out in Orange Data Mining.
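The cleaning steps listed above can be sketched in plain Python. This is only an illustration, not the Orange workflow used in the project: the toy rows are made up, the column names follow the dataset's naming (for example reviews_per_month), the ">1000" filter is assumed to refer to nightly price, and the low/medium/high cut-offs (100 and 200) are hypothetical.

```python
# Toy rows standing in for the NYC Airbnb data; every value here is made up.
raw_rows = [
    {"id": 1, "name": "Loft", "host_name": "Ann", "neighbourhood_group": " manhattan ",
     "room_type": "Entire home/apt", "price": 250, "reviews_per_month": 1.2},
    {"id": 2, "name": "Room", "host_name": "Bob", "neighbourhood_group": "brooklyn",
     "room_type": "Private room", "price": 80, "reviews_per_month": None},
    {"id": 3, "name": "Room", "host_name": "Bob", "neighbourhood_group": "brooklyn",
     "room_type": "Private room", "price": 80, "reviews_per_month": None},
    {"id": 4, "name": "Penthouse", "host_name": "Cy", "neighbourhood_group": "Manhattan",
     "room_type": "Entire home/apt", "price": 5000, "reviews_per_month": 0.1},
    {"id": 5, "name": "Bunk", "host_name": "Di", "neighbourhood_group": "queens",
     "room_type": "Shared room", "price": 150, "reviews_per_month": 0.5},
]

DROP_COLUMNS = {"id", "name", "host_name"}
PRICE_CAP = 1000  # assumption: the ">1000" filter applies to nightly price


def clean(rows):
    """Drop unused columns, fill missing reviews, tidy text, remove extremes and duplicates."""
    cleaned, seen = [], set()
    for row in rows:
        rec = {k: v for k, v in row.items() if k not in DROP_COLUMNS}
        if rec["reviews_per_month"] is None:
            rec["reviews_per_month"] = 0.0  # no reviews yet
        rec["neighbourhood_group"] = rec["neighbourhood_group"].strip().title()
        rec["room_type"] = rec["room_type"].strip()
        if rec["price"] > PRICE_CAP:
            continue  # extreme value
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # duplicate of an earlier record
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def min_max(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def price_category(price, low_cut=100, high_cut=200):
    """Bin a price into low/medium/high; the cut-offs are hypothetical."""
    if price < low_cut:
        return "low"
    if price < high_cut:
        return "medium"
    return "high"


cleaned = clean(raw_rows)
scaled = min_max([r["price"] for r in cleaned])
for row, norm in zip(cleaned, scaled):
    row["price_norm"] = norm
    row["price_category"] = price_category(row["price"])
```

In Orange itself, comparable steps would typically be assembled from preprocessing widgets rather than written as code; the sketch only makes the order of operations explicit.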
Data Analysis & Descriptive Statistics
Manhattan Pricing
Listings in Manhattan are significantly more expensive than those in the other boroughs.
Room Type Distribution
Private rooms account for a large share of listings, which may reflect hosts renting out spare rooms to offset housing costs.
Review vs Price
There is very little correlation between the number of customer reviews and a listing's price.
The results indicate that location and room type are linked to both a listing's price and its market positioning.
Affordable + Luxury
The market supports both affordable options and luxury pricing at the same time
Low Correlation
Reviews and pricing show very little direct relationship
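The "low correlation" finding is the kind of result a Pearson correlation coefficient expresses. A minimal sketch of that calculation follows, assuming Pearson's r is the measure of interest; the review counts and prices are made-up values, not figures from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up review counts and nightly prices for six hypothetical listings.
reviews = [0, 5, 12, 40, 3, 25]
prices = [120, 90, 150, 110, 200, 95]

r = pearson(reviews, prices)
print(f"Pearson r between reviews and price: {r:.2f}")
```

A value of r near 0 indicates little linear relationship, while values near 1 or -1 indicate a strong one.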
Clustering (K-Means)
We applied K-Means clustering to segment the listings by price, availability and number of reviews.
Cluster 1: Premium
High Prices
Low availability for booking. Premium class listings.
Cluster 2: Moderate
Moderate Prices
Moderate availability. The second largest segment.
Cluster 3: Budget
Low Prices
High availability. Highly competitive segment.
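To make the segmentation step concrete, here is a small K-Means sketch in plain Python. It is not Orange's implementation: the six points are made-up, already-normalised (price, availability, reviews) triples, and the farthest-first initialisation is a simplification chosen to keep the example deterministic.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Farthest-first initialisation: start at the first point, then keep
    adding the point that is farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Made-up, normalised (price, availability, reviews) values.
points = [
    (0.90, 0.10, 0.30), (0.85, 0.15, 0.25),  # high price, low availability
    (0.50, 0.50, 0.50), (0.45, 0.55, 0.45),  # moderate on both
    (0.10, 0.90, 0.60), (0.15, 0.85, 0.70),  # low price, high availability
]

labels, centroids = kmeans(points, k=3)
for point, label in zip(points, labels):
    print(point, "-> cluster", label)
```

The cluster numbers returned are arbitrary; names such as Premium, Moderate, and Budget are assigned afterwards by inspecting each centroid's price and availability.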
Regression Analysis
A regression analysis was performed to examine the variables that affect the price.
Top Predictor #1
Manhattan Location
Whether a listing is located in Manhattan is the strongest predictor of its price.
Top Predictor #2
Entire Home
Whether a listing is an entire home or apartment is the second-strongest predictor of its price.
Across the NYC listings analyzed, these two factors have a far greater effect on price than any other variable.
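One way to read the two predictors is as 0/1 indicator (dummy) variables in a linear regression. The sketch below fits price = b0 + b1 * is_manhattan + b2 * is_entire_home by ordinary least squares using the normal equations. It is an assumption that the project's model took this form, and the six data points are constructed by hand (base 60, Manhattan +80, entire home +70) purely so the fit is easy to check.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(features, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via (X'X) b = X'y."""
    X = [[1.0] + list(f) for f in features]
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(X, targets)) for i in range(p)]
    return solve(xtx, xty)


# Hand-constructed data: (is_manhattan, is_entire_home) -> nightly price.
features = [(0, 0), (1, 0), (0, 1), (1, 1), (1, 1), (0, 0)]
prices = [60, 140, 130, 210, 210, 60]

intercept, b_manhattan, b_entire_home = ols(features, prices)
print(f"base={intercept:.1f}, Manhattan={b_manhattan:.1f}, entire home={b_entire_home:.1f}")
```

Each coefficient is the estimated price difference associated with that attribute while the other is held fixed, which is what allows predictors to be ranked by strength.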
Association Rule Mining
In this study, association rule mining was applied to discover co-occurrences of attributes.
Entire Home Listings
Higher Prices
Entire home listings commanded higher prices
Higher Availability
Lower Prices
Listings with higher availability commanded lower prices
We will investigate these factors further to gain more insight into customer preferences and supply and demand.
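Association rules such as "entire home implies higher price" are usually judged by support, confidence, and lift. The following sketch computes those three measures for the two rules mentioned above. The eight "transactions" are made up, and the item labels (for example room=entire_home) are hypothetical names chosen for the illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent.
    Assumes both sides occur at least once, so no division by zero."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, set(antecedent) | set(consequent))
    confidence = s_ac / s_a
    lift = confidence / s_c
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Each made-up listing is described as a set of discretised attributes.
transactions = [
    {"room=entire_home", "price=high", "availability=low"},
    {"room=entire_home", "price=high", "availability=low"},
    {"room=entire_home", "price=high", "availability=high"},
    {"room=entire_home", "price=low", "availability=high"},
    {"room=private", "price=low", "availability=high"},
    {"room=private", "price=low", "availability=high"},
    {"room=private", "price=high", "availability=low"},
    {"room=shared", "price=low", "availability=high"},
]

entire_to_high = rule_metrics(transactions, {"room=entire_home"}, {"price=high"})
avail_to_low = rule_metrics(transactions, {"availability=high"}, {"price=low"})
print("entire home -> high price:", entire_to_high)
print("high availability -> low price:", avail_to_low)
```

A lift above 1 means the two sides co-occur more often than they would if they were independent, which is the sense in which a rule is considered interesting.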
Data Visualization
We provided graphical representations to support the analysis.
Bar Chart
Bars indicate the average price per neighbourhood group, with Manhattan the priciest.
Pie Chart
Distribution of listings by room type: entire home/apartment, private room, or shared room.
Scatter Plot
Very weak correlation between price and reviews with many outliers.
Box Plot
Highlights variation in price within many of the neighborhoods.
It is often helpful to present the results of a regression in graphical form.
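The numbers behind a bar chart of average prices and a box plot of price spread can be computed directly. The sketch below does so with Python's statistics module, using made-up prices for three neighbourhood groups; the choice of the "inclusive" quartile method is an assumption, and plotting tools may use a slightly different convention.

```python
from collections import defaultdict
from statistics import mean, quantiles

# Made-up (neighbourhood group, nightly price) pairs.
listings = [
    ("Manhattan", 220), ("Manhattan", 180), ("Manhattan", 310), ("Manhattan", 150),
    ("Brooklyn", 120), ("Brooklyn", 95), ("Brooklyn", 160), ("Brooklyn", 70),
    ("Queens", 90), ("Queens", 75), ("Queens", 110), ("Queens", 60),
]

prices_by_group = defaultdict(list)
for group, price in listings:
    prices_by_group[group].append(price)

# Bar chart input: one average price per neighbourhood group.
avg_price = {group: mean(prices) for group, prices in prices_by_group.items()}

# Box plot input: five-number summary per neighbourhood group.
box_stats = {}
for group, prices in prices_by_group.items():
    q1, median, q3 = quantiles(prices, n=4, method="inclusive")
    box_stats[group] = {"min": min(prices), "q1": q1, "median": median,
                        "q3": q3, "max": max(prices)}

for group in avg_price:
    print(group, avg_price[group], box_stats[group])
```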
Business Recommendations
Based on the analysis, hosts can take several steps to improve listing performance.
Dynamic Pricing Strategy
Incorporate dynamic pricing based on demand, location and property type.
Whole Home Listings
Where possible, offer the property as an entire-home listing and focus on areas with higher demand.
Listing Optimization
Improve listing descriptions, images, and customer service to gain more visibility.
Flexible Pricing
Offer a lower nightly rate for longer stays and reduce prices during off-peak periods to encourage more bookings.
Limitations
Geographic Scope
The analysis covers a single city only, so the findings may not generalize to other markets.
External Factors
No allowance has been made for factors such as time of year, market conditions and regulations.
Assumptions
Customer Engagement
Customer reviews are a good indicator for engagement.
Pricing Trends
Pricing trends are assumed to remain constant over time.
Conclusion
01
Business Analytics Applied
This project applied business analytics methods using Orange Data Mining to extract insights from real-world data.
02
Key Finding
The analysis shows that pricing is mainly affected by location and property type.
03
Impact
Business analytics helps in making better decisions.
Thank You
- data-mining
- business-analytics
- orange-data-mining
- airbnb-dataset
- market-analysis
- k-means-clustering
- nyc-data