Made by Bobr AI

Airbnb Business Analytics: NYC Insights with Orange Data Mining

Explore NYC Airbnb listing insights using clustering, regression, and data mining. Learn key pricing factors and business strategies for hosts and investors.

#airbnb-analytics #data-mining #orange-software #business-intelligence #market-analysis #nyc-data #k-means-clustering #predictive-modeling
Academic Project

Business Analytics Using Orange Data Mining Tools on the Airbnb Dataset

Global Short-Term Accommodation Marketplace for Tourist Services

Business Analytics | Orange Data Mining | Airbnb Dataset

Introduction

The purpose of this project is to apply business analytics methods to a real-life dataset and to generate business intelligence using the Orange Data Mining tool.

The analysis uses the New York City Airbnb Open Data, which covers listings on the platform along with their prices, locations, accommodation types, and level of guest engagement.

To maximize the return on investment of Airbnb listings in New York City, it is valuable to analyze existing listings. We applied clustering, regression, and association rule mining to gain better insight into the large number of listings.

Dataset Understanding

This dataset contains tens of thousands of Airbnb listings across New York City. It provides information about pricing, availability, and guest behavior.

Tens of Thousands of Listings
New York City
Pricing, Availability & Behavior
Business Analytics Capstone

Key Variables

Price

Cost per night

Neighbourhood Group

Location category

Room Type

Entire home/Apartment, Private room, Shared room

Availability

Number of days per year the listing is available for booking (availability_365)

Number of Reviews

Customer engagement

Reviews per Month

Listing popularity

Minimum Nights

Booking requirement

Business Relevance

This data helps anyone who wants to understand demand and price levels in different areas, so that Airbnb hosts, property investors, and hospitality businesses can set their prices appropriately.

Airbnb Hosts

Property Investors

Hospitality Businesses

Objectives

1. Analyze Pricing Patterns

Explore price distribution across neighborhoods and property types

2. Identify Key Influencing Factors

Determine which variables most affect listing prices

3. Segment Listings into Meaningful Groups

Apply clustering to discover natural groupings in the data

4. Generate Actionable Business Insights

Provide recommendations for hosts, investors, and hospitality businesses

Data Cleaning & Preprocessing

The data was preprocessed using the Orange Data Mining suite.

Unnecessary variables (ID, listing name, host name) were removed, and missing values in reviews_per_month were replaced with 0.

All duplicate records were removed, along with extreme price values greater than 1,000. The room type and neighbourhood group fields were formatted consistently.

Numerical features were normalised, and a new feature was created that categorises each listing as low, medium, or high price.

Steps Performed

1. Remove unnecessary variables
2. Replace missing values
3. Remove duplicates & outliers
4. Normalize numerical features
5. Create price category feature
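
The steps above were carried out in Orange, but the same logic can be sketched in plain Python. This is a minimal illustration only: the sample rows are invented, the column names follow the usual layout of the NYC Airbnb Open Data file, and the low/medium/high cut-offs (100 and 200) are hypothetical, since the slides do not state the thresholds that were used.

```python
# Invented sample rows for illustration only; not taken from the real dataset.
rows = [
    {"id": 1, "name": "Cozy room", "host_name": "Ana", "neighbourhood_group": "Brooklyn",
     "room_type": "Private room", "price": 60, "reviews_per_month": None},
    {"id": 2, "name": "Loft", "host_name": "Ben", "neighbourhood_group": "Manhattan",
     "room_type": "Entire home/apt", "price": 250, "reviews_per_month": 1.5},
    {"id": 3, "name": "Loft", "host_name": "Ben", "neighbourhood_group": "Manhattan",
     "room_type": "Entire home/apt", "price": 250, "reviews_per_month": 1.5},
    {"id": 4, "name": "Penthouse", "host_name": "Cy", "neighbourhood_group": "Manhattan",
     "room_type": "Entire home/apt", "price": 5000, "reviews_per_month": 0.2},
    {"id": 5, "name": "Bunk", "host_name": "Di", "neighbourhood_group": "Queens",
     "room_type": "Shared room", "price": 120, "reviews_per_month": 0.8},
]

DROP_COLUMNS = {"id", "name", "host_name"}  # step 1: variables not needed for analysis
MAX_PRICE = 1000                            # step 3: outlier cut-off from the slide
LOW_BELOW, HIGH_FROM = 100, 200             # step 5: hypothetical category cut-offs


def price_category(price):
    """Map a nightly price to low / medium / high (assumed thresholds)."""
    if price < LOW_BELOW:
        return "low"
    if price < HIGH_FROM:
        return "medium"
    return "high"


def preprocess(rows):
    seen, cleaned = set(), []
    for row in rows:
        rec = {k: v for k, v in row.items() if k not in DROP_COLUMNS}
        if rec["reviews_per_month"] is None:       # step 2: fill missing values with 0
            rec["reviews_per_month"] = 0.0
        key = tuple(sorted(rec.items()))           # duplicates are judged after dropping columns
        if key in seen or rec["price"] > MAX_PRICE:
            continue
        seen.add(key)
        cleaned.append(rec)
    if not cleaned:
        return cleaned
    prices = [r["price"] for r in cleaned]
    lo, hi = min(prices), max(prices)
    for rec in cleaned:
        # step 4: min-max normalisation to the range [0, 1]
        rec["price_norm"] = (rec["price"] - lo) / (hi - lo) if hi > lo else 0.0
        rec["price_category"] = price_category(rec["price"])
    return cleaned


cleaned = preprocess(rows)
```

Min-max scaling is only one possible reading of "normalised"; standardising to zero mean and unit variance would be an equally plausible choice.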

Data Analysis & Descriptive Statistics

Location Pricing

Listings in Manhattan are significantly more expensive than those in the other neighbourhood groups.

Listing Types

Private rooms account for a large share of listings, which may reflect hosts renting out spare rooms to offset housing costs.

Price Range

The market supports both affordable and luxury pricing. There is very little correlation between number of reviews and price.

The results show that location and property type are linked to both listing value and market image.
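
The two descriptive checks mentioned here, average price per neighbourhood group and the correlation between reviews and price, can be sketched as follows. The listings are invented for illustration, so the numbers they produce say nothing about the real dataset; the function names are hypothetical as well.

```python
from collections import defaultdict
from math import sqrt

# (neighbourhood_group, price, number_of_reviews): invented values, illustration only.
listings = [
    ("Manhattan", 200, 12),
    ("Manhattan", 240, 3),
    ("Brooklyn", 100, 40),
    ("Brooklyn", 120, 5),
    ("Queens", 80, 9),
]


def mean_price_by_group(listings):
    """Average nightly price for each neighbourhood group."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of prices, count]
    for group, price, _ in listings:
        totals[group][0] += price
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


means = mean_price_by_group(listings)
r = pearson([price for _, price, _ in listings],
            [reviews for _, _, reviews in listings])
```

A coefficient near zero on the real data would correspond to the "very little correlation" finding reported above.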

Machine Learning

Clustering (K-Means)

We applied K-Means clustering to segment the listings by price, availability and number of reviews.

Cluster A — Premium

High prices and low availability for booking. Exclusive, high-demand listings.

Cluster B — Moderate

Moderate prices and moderate availability. Second largest segment.

Cluster C — Budget

Low price and high availability. Highly competitive segment.
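
To make the segmentation step concrete, here is a minimal K-Means sketch in plain Python. It is not the Orange widget's implementation: the six points are invented, already scaled to [0, 1] in the order (price, availability, number of reviews), and the starting centroids are fixed by hand so that the run is repeatable.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=50):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid where it was
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Invented, pre-normalised points: (price, availability, number_of_reviews).
points = [
    (0.90, 0.10, 0.30), (0.80, 0.20, 0.20),   # high price, low availability
    (0.50, 0.50, 0.50), (0.45, 0.55, 0.40),   # moderate price and availability
    (0.10, 0.90, 0.60), (0.15, 0.80, 0.70),   # low price, high availability
]
labels, centroids = kmeans(points, centroids=[points[0], points[2], points[4]])
```

On this toy input the three clusters line up with the premium, moderate, and budget groups described above; on real data the result depends on the number of clusters chosen and on the initialisation.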

Regression Analysis

A regression analysis was performed to examine the variables that affect the price.

The strongest predictors of a listing's price are whether it is located in Manhattan and whether it is an entire home; their impact far exceeds that of any other variable.

Top Price Predictors

Manhattan Location: High
Entire Home Type: High
Availability: Medium
Number of Reviews: Low
Minimum Nights: Low
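
The slides do not say which regression model was used, so the following is only a sketch of one common option: ordinary least squares with two dummy variables (located in Manhattan, entire home). The prices are invented and constructed to follow price = 50 + 80 * manhattan + 60 * entire_home exactly, so the recovered coefficients illustrate the method rather than any finding from the dataset.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations (X^T X) beta = X^T y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * target for row, target in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# (is_manhattan, is_entire_home, price): invented rows, illustration only.
data = [(0, 0, 50), (1, 0, 130), (0, 1, 110), (1, 1, 190), (1, 1, 190), (0, 0, 50)]
X = [[1.0, float(m), float(e)] for m, e, _ in data]  # leading 1.0 is the intercept term
y = [float(p) for _, _, p in data]

intercept, coef_manhattan, coef_entire_home = ols(X, y)
```

In a fit like this, a larger coefficient on a dummy variable means a larger average price difference associated with that attribute, which is how a High / Medium / Low ranking of predictors could be read.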
Pattern Discovery

Association Rule Mining

Association rule mining was applied to discover co-occurrences of attributes in the dataset.

Entire Home → Higher Price

Entire home listings consistently commanded higher prices across all neighborhood groups.

Higher Availability → Lower Price

Listings with higher availability tended to command lower prices, which may indicate lower demand.

We will investigate these factors further to gain more insight into customer preferences and supply and demand.
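
Rules such as "Entire Home → Higher Price" are usually judged by support, confidence, and lift. The sketch below computes those three measures for a single rule over a handful of invented listings; the item labels (entire_home, price_high, and so on) are hypothetical discretised attributes, not names from the dataset.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_c = sum(1 for t in transactions if consequent <= t)
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = count_both / n                                 # share of listings matching both sides
    confidence = count_both / count_a if count_a else 0.0    # P(consequent | antecedent)
    lift = confidence / (count_c / n) if count_c else 0.0    # confidence relative to the base rate
    return support, confidence, lift


# Each listing is represented as a set of discretised attributes (invented data).
transactions = [
    {"entire_home", "price_high", "availability_low"},
    {"entire_home", "price_high", "availability_high"},
    {"entire_home", "price_low", "availability_high"},
    {"private_room", "price_low", "availability_high"},
    {"private_room", "price_low", "availability_low"},
    {"private_room", "price_high", "availability_low"},
    {"shared_room", "price_low", "availability_high"},
    {"entire_home", "price_high", "availability_low"},
]

support, confidence, lift = rule_metrics(transactions, {"entire_home"}, {"price_high"})
```

A lift above 1 means the consequent appears more often alongside the antecedent than it does overall, which is the sense in which a rule points to a positive association.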

Data Visualization

Bar Chart

Bars indicate average price per neighbourhood group, with Manhattan the priciest.

Pie Chart

Distribution of listings by room type: entire home/apartment, private room, or shared room.

Scatter Plot

Shows a very weak correlation between price and number of reviews, with many outliers.

Box Plots

Highlight the variation in price within many of the neighbourhoods.

Business Recommendations

After reviewing the analysis, several improvements can be implemented.

1. Dynamic Pricing Strategy

Adopt dynamic pricing based on demand, location, and property type.

2. Whole Home Listings

Where possible, offer the property as an entire-home listing and prioritise areas with higher demand.

3. Listing Optimization

Improve listing descriptions, images, and customer service to gain more visibility.

4. Booking Incentives

Offer a lower nightly rate for longer stays and reduce prices during off-peak periods to encourage more bookings.
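
As a small illustration of the booking-incentive idea, the function below applies a long-stay discount and an off-peak discount to a base nightly rate. Every number in it (a 7-night threshold, 10% and 15% discounts, additive stacking) is a hypothetical placeholder rather than a value derived from the analysis.

```python
LONG_STAY_NIGHTS = 7           # hypothetical threshold for a "long" stay
LONG_STAY_DISCOUNT_PCT = 10    # hypothetical discount for long stays
OFF_PEAK_DISCOUNT_PCT = 15     # hypothetical discount for off-peak dates


def nightly_rate(base_rate, nights, off_peak):
    """Return the discounted nightly rate; discounts add up in percentage points."""
    discount_pct = 0
    if nights >= LONG_STAY_NIGHTS:
        discount_pct += LONG_STAY_DISCOUNT_PCT
    if off_peak:
        discount_pct += OFF_PEAK_DISCOUNT_PCT
    return base_rate * (100 - discount_pct) / 100


short_peak = nightly_rate(200, nights=2, off_peak=False)
long_peak = nightly_rate(200, nights=7, off_peak=False)
long_off_peak = nightly_rate(200, nights=7, off_peak=True)
```

A host would tune these parameters against observed occupancy; the sketch only shows the shape of such a rule.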

Limitations

Acknowledging the constraints of this study helps in contextualizing the findings and improving future research.

Geographic Scope

The analysis is based on data for one city only.

External Factors

No allowance has been made for factors such as time of year, market conditions and regulations.

Assumptions

Foundational premises underlying the analysis

Customer Engagement

Customer reviews are assumed to be a good indicator of engagement.

Pricing Trends

Pricing trends are assumed to be constant.

Final Summary

Conclusion

This project applied business analytics methods using Orange Data Mining to extract insights from real-world data.

The analysis shows that pricing is mainly affected by location and property type.

Business analytics helps in making better decisions.

Orange Data Mining

Clustering · Regression · Association Rules

NYC Airbnb Open Data
