A deep dive into Seattle Airbnb homes

Felipe Saavedra
Dec 30, 2020

A review of occupancy rates, pricing, and the factors that define a home's price

Introduction

Over 4 million hosts share their homes on Airbnb, which is more than the top five hotel brands combined (according to Business Insider). For many travelers, Airbnb is the first stop when planning a trip anywhere in the world.

In this post, we will explore an Airbnb dataset for Seattle and compare it with the equivalent dataset for Boston. The analysis uses two kinds of data: the description and features of each home, and its availability and price per day.

The Seattle dataset has 3,818 homes with availability information between January 2016 and January 2017 (the calendar file). The same file for Boston contains 3,585 homes with availability between September 2016 and September 2017. We will also look at the features of the 3,818 Seattle homes.

After looking at the dataset, I propose three questions, based on availability, price, and features (and their relation to price). Specifically:

1. How many days are homes available vs. not available to rent? Is there any seasonality? How do the two cities compare?

2. How do rental prices behave in each city, and how do they compare between the two cities?

3. Are there any home features that predict the price in Seattle?

Question 1: How many days are homes available vs. not available to rent? Is there any seasonality? How do the two cities compare?

For the first question, I used the “calendar.csv” file, whose “available” field indicates whether a specific home is available for rent on a given day. Seattle and Boston behave differently (see charts below).

Since the two datasets do not cover the same period, the information was arranged by month and day (excluding the year) in order to compare seasonality between the cities.
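The month-day arrangement described above can be sketched roughly as follows. This is only an illustration using the Python standard library, not the code behind the charts: the tuple layout (listing id, ISO date, flag) and the 't'/'f' encoding of the availability flag are assumptions about the calendar file, and the rows are made-up toy values.

```python
from collections import defaultdict
from datetime import date


def availability_by_month_day(rows):
    """Return {(month, day): share of listing-days marked available}.

    rows: iterable of (listing_id, iso_date, flag) tuples, where flag is
    't' or 'f' (assumed encoding of the availability field).
    """
    totals = defaultdict(int)
    available = defaultdict(int)
    for _listing_id, iso_date, flag in rows:
        d = date.fromisoformat(iso_date)
        key = (d.month, d.day)  # drop the year so both cities share one axis
        totals[key] += 1
        if flag == "t":
            available[key] += 1
    return {key: available[key] / totals[key] for key in sorted(totals)}


# Toy rows, not real data: the same calendar day seen in two different years.
toy_rows = [
    (1, "2016-01-04", "t"),
    (2, "2016-01-04", "f"),
    (1, "2017-01-04", "t"),
    (2, "2016-01-05", "t"),
]
rates = availability_by_month_day(toy_rows)
print(rates)
```

Keying on (month, day) means that January 4, 2016 and January 4, 2017 land in the same bucket, which is what lets two datasets with different date ranges be plotted on a common calendar axis.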

Seattle has a higher availability rate than Boston (67% vs 49%). In addition, Seattle shows a small but positive trend, which indicates that more hosts are making homes available for rent over time. In contrast, Boston has a very flat availability rate (except between August and September). Another insight from the data is that Seattle's availability peaks at the end and beginning of the year, whereas Boston maintains its highest availability rate between December and March.

An interesting point about availability is the difference between the two cities in the number and percentage of homes that were never available during the year:

1. Homes without any availability all year in Seattle: 95 → 2.49% of total homes

2. Homes without any availability all year in Boston: 679 → 18.94% of total homes

The histogram of days available per home (over the year) shows that most Seattle homes are available almost all year, whereas in Boston many homes have no or few available days during the year.
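The per-home counts behind the never-available figures and the histogram could be computed along these lines. This is a sketch under assumptions: the row layout and 't'/'f' flag are guesses about the calendar file, the helper names are hypothetical, and the rows are toy values. The last two lines only re-derive the percentages quoted above from the counts given in the text.

```python
from collections import Counter


def days_available_per_listing(rows):
    """Map each listing id to the number of days flagged as available."""
    days = {}
    for listing_id, _iso_date, flag in rows:
        days[listing_id] = days.get(listing_id, 0) + (1 if flag == "t" else 0)
    return days


def never_available(days):
    """Return (count, share) of listings with zero available days."""
    count = sum(1 for n in days.values() if n == 0)
    return count, count / len(days)


def histogram(days, bin_width=30):
    """Count listings per bin of available days; keys are bin lower edges."""
    bins = Counter((n // bin_width) * bin_width for n in days.values())
    return dict(sorted(bins.items()))


# Toy rows, not real data.
toy_rows = [
    (1, "2016-01-04", "t"), (1, "2016-01-05", "t"),
    (2, "2016-01-04", "f"), (2, "2016-01-05", "f"),
    (3, "2016-01-04", "t"), (3, "2016-01-05", "f"),
]
days = days_available_per_listing(toy_rows)
never_count, never_share = never_available(days)
toy_hist = histogram(days, bin_width=1)

# Re-deriving the shares quoted in the text from the stated counts.
seattle_share = f"{95 / 3818:.2%}"
boston_share = f"{679 / 3585:.2%}"
print(days, never_count, toy_hist, seattle_share, boston_share)
```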

Question 2: How do rental prices behave in each city, and how do they compare between the two cities?

The mean price in Seattle is lower than in Boston, at 137.9 USD per night vs 198.4 USD; this could be related to how much of the total home capacity is available. In addition, prices in Seattle are flatter than in Boston (lower standard deviation).

Looking at how prices evolve during the year, we notice that prices in Seattle do not move with home availability. In Boston, however, prices go up at the same moments that the number of available homes goes down. This suggests that Boston hosts react quickly to changes in the market: lower supply (with the same demand) leads to higher prices.
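One way to put a number on "prices move against availability" is a Pearson correlation between the daily availability rate and the daily mean price. The post reads this from the charts, so the sketch below is an assumption about how it could be quantified, and the two series are toy values chosen to move in exactly opposite directions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy series, not real data: availability falls while price rises.
toy_availability = [0.60, 0.55, 0.50, 0.45]
toy_mean_price = [180.0, 190.0, 200.0, 210.0]
r = pearson(toy_availability, toy_mean_price)
print(round(r, 6))
```

A value close to -1 would match the Boston pattern described above, while a value near 0 would match Seattle, where prices do not follow availability.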

In terms of price evolution, Seattle does not show any trend, only a small seasonal effect at the end of autumn. Boston, on the other hand, shows a small price increase with two peaks, at the end of the year and during the second quarter, both associated with lower home availability.

Question 3: Are there any home features that predict the price in Seattle?

Finally, let us see whether we can predict home prices, or at least identify some relevant variables that define the price of a home. For this model, I will study the “listings.csv” file from Seattle. This dataset describes 3,818 homes in Seattle with their main features.

One important issue with the file is that the “square_feet” column has many missing values (more than 97% of the total). For this reason it cannot be used, even though it sounds like a potentially useful variable.

Building the model required cleaning the price data (our dependent variable) and transforming it into a number. With this information, I studied the correlation between price and the independent variables to understand which of them could be good predictors. See detail below.
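The price-cleaning step might look something like this. It assumes the raw price is stored as text with a dollar sign and thousands separators (for example "$1,250.00"); the helper name `parse_price` is hypothetical.

```python
def parse_price(raw):
    """Convert a string such as '$1,250.00' to a float; None if blank."""
    if raw is None:
        return None
    cleaned = raw.strip().replace("$", "").replace(",", "")
    return float(cleaned) if cleaned else None


# Example strings in the assumed format, not rows from the dataset.
examples = ["$85.00", "$1,250.00", ""]
parsed = [parse_price(value) for value in examples]
print(parsed)
```

Returning None for blank values keeps missing prices visible, so they can be dropped explicitly before fitting rather than silently becoming zero.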

Based on the table above, I analyzed the relationship (using boxplots) between price and some independent variables (bedrooms, bathrooms, accommodates, room type, bed type, and guests included).

The variables analyzed in the boxplots were the ones that changed the price the most; however, we should still test them in the model.

Using linear regression, I fitted a model with price as the dependent variable and “accommodates”, “bathrooms”, “bedrooms”, and “room type” as independent variables.
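To make the modelling step concrete, here is a minimal ordinary least squares sketch in plain Python. It is not the code used for the results in this post: the room type labels, the choice of "Entire home/apt" as the dropped baseline, the toy listings, and the toy coefficients are all assumptions made for illustration. In practice a library such as scikit-learn or statsmodels would normally do this.

```python
def one_hot_room_type(room_type, levels=("Private room", "Shared room")):
    """Dummy-encode room type; 'Entire home/apt' is the dropped baseline."""
    return [1.0 if room_type == level else 0.0 for level in levels]


def fit_ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Toy listings, not real data: (accommodates, bathrooms, bedrooms, room type)
toy_listings = [
    (2, 1, 1, "Entire home/apt"), (4, 2, 2, "Entire home/apt"),
    (4, 1, 2, "Private room"), (2, 1, 0, "Shared room"),
    (6, 2, 3, "Entire home/apt"), (3, 1, 1, "Private room"),
    (1, 1, 1, "Shared room"), (5, 3, 2, "Entire home/apt"),
]
# Arbitrary toy coefficients used only to generate noise-free prices:
# intercept, accommodates, bathrooms, bedrooms, private room, shared room
true_betas = [50.0, 10.0, 20.0, 15.0, -25.0, -40.0]

X = [[float(acc), float(bath), float(bed)] + one_hot_room_type(room)
     for acc, bath, bed, room in toy_listings]
y = [true_betas[0] + sum(b * v for b, v in zip(true_betas[1:], x)) for x in X]

betas = fit_ols(X, y)
print([round(b, 6) for b in betas])
```

Because the toy prices are generated without noise, the fit should recover the toy coefficients almost exactly; real listings would leave residuals.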

The model explains almost 60% of the behavior of price, as detailed in the chart below, which compares the real price data points (blue dots) with the model prediction (line).
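The post does not name the metric behind "almost 60%". Assuming it refers to the coefficient of determination (R²) of the regression, the definition is the following, where y_i is the observed price of home i, ŷ_i is the model's prediction, and ȳ is the mean observed price:

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

Under that reading, a value near 0.6 means the model's errors are about 40% as large, in squared terms, as those of always predicting the mean price.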

Finally, an additional insight from the model comes from its betas. A beta can be interpreted as the impact on price of a one-unit change in the corresponding independent variable, holding the others fixed. See the table below for more detail:

According to the table above, bathrooms and bedrooms have a similar impact on price: when a home has an extra bathroom or bedroom, the price increases by around US$30 per day. On the other hand, when you share a room you will pay US$74 less per day than when you rent an entire house/apt.
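Written out, a model of this kind takes roughly the following form. The dummy variables "private" and "shared" are my own labels for the room type categories, assuming the entire house/apt is the baseline the other room types are compared against:

```latex
\text{price}_i = \beta_0
  + \beta_1\,\text{accommodates}_i
  + \beta_2\,\text{bathrooms}_i
  + \beta_3\,\text{bedrooms}_i
  + \beta_4\,\text{private}_i
  + \beta_5\,\text{shared}_i
  + \varepsilon_i
```

In this notation, the figures quoted above correspond to β₂ ≈ β₃ ≈ 30 and β₅ ≈ −74, all in US$ per day.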

Conclusion

In this analysis we looked at Airbnb data from Seattle and Boston to answer three questions related to home availability, price evolution, and predicting prices from a set of features.

One useful insight was that availability in Boston was lower than in Seattle, with many homes lacking a single available day during the year.

Another important point was that Boston is more expensive than Seattle. Boston homes also showed more price variability, and it was related to availability: when availability went down, prices went up.

Finally, using the Seattle listings dataset, we found variables that allowed us to predict home prices with a model that explains almost 60% of the price behavior. This information is useful if you want to rent out your home and need to set the price according to its features.

