Data Overview

We used two data sources: Yelp’s academic dataset, provided for the Yelp Dataset Challenge (Round 9), and the Zillow Nevada Neighborhood Boundaries shapefile.

Yelp Data Description

  • Geographic Coverage: 4 countries (US, UK, Germany, Canada) and 15 cities
  • Review Data: 4.1 million reviews from 1 million users for 144,000 businesses
  • Business Data: Restaurants, Beauty & Spas, Pet Services, Home Cleaning, etc.

For our project, we are interested only in restaurant data for Las Vegas, Nevada, in the United States. We therefore work with subsets of the full dataset's files rather than the full files.
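
As an illustration of this restriction, the following is a minimal sketch of how Las Vegas restaurants could be selected from the business file. It is not necessarily the procedure we ran. It assumes the file is in JSON-lines form (one JSON object per line) and that each record carries `city`, `state`, and `categories` fields with the values "Las Vegas", "NV", and "Restaurants"; these field names and values, and the helper names `is_las_vegas_restaurant` and `filter_businesses`, are assumptions made for the example.

```python
import json


def is_las_vegas_restaurant(business):
    """Return True if the record is in Las Vegas, NV and tagged as a restaurant.

    Assumed field names: "city", "state", "categories".
    """
    categories = business.get("categories") or []
    # Assumption: categories may be a list or a comma-separated string,
    # depending on the dataset release; handle both.
    if isinstance(categories, str):
        categories = [c.strip() for c in categories.split(",")]
    return (
        business.get("city") == "Las Vegas"
        and business.get("state") == "NV"
        and "Restaurants" in categories
    )


def filter_businesses(lines):
    """Parse JSON-lines records and keep only Las Vegas restaurants."""
    records = (json.loads(line) for line in lines if line.strip())
    return [b for b in records if is_las_vegas_restaurant(b)]
```

Passing an open file handle for the business file (or any iterable of JSON strings) to `filter_businesses` returns the matching records as a list of dictionaries, preserving their original order.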

Business Data Table

The Business dataset includes geographic information for each business (state, neighborhood, and coordinates), the business name (with business ID), categories, star rating, review count, and more than 1.1 million business attributes such as hours, parking availability, and ambience. For our comparative analysis, we created a subset of the “Business Dataset” containing restaurant businesses in Las Vegas, Nevada, U.S. The following data table gives a flavor of the business data used in our analysis:

Review Data Table

The Review dataset contains the review text, star rating (out of 5), business ID, user ID, etc. For our comparative analysis, we created a subset of the “Review Dataset” by sampling 10% of the reviews for restaurants in Las Vegas, Nevada, U.S.
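
The 10% sample can be sketched as follows. This is a minimal illustration, not necessarily the exact procedure we used: it assumes the review file is in JSON-lines form, that each review carries a `business_id` field, and that the sample is a simple random sample without replacement. The function name `sample_reviews` and the fixed `seed` are hypothetical choices made for reproducibility of the example.

```python
import json
import random


def sample_reviews(review_lines, business_ids, fraction=0.10, seed=0):
    """Keep reviews for the given businesses, then draw a simple random sample.

    review_lines: iterable of JSON strings, one review per line (assumed format).
    business_ids: set of business IDs in the restaurant subset.
    fraction:     share of the matching reviews to keep (10% by default).
    seed:         assumed fixed seed so the sample can be reproduced.
    """
    reviews = [
        review
        for review in (json.loads(line) for line in review_lines if line.strip())
        if review.get("business_id") in business_ids
    ]
    sample_size = round(len(reviews) * fraction)
    # random.Random(seed).sample draws without replacement.
    return random.Random(seed).sample(reviews, sample_size)
```

Here `business_ids` would be the set of IDs from the Las Vegas restaurant subset of the Business dataset, so that the two subsets stay linked through the business ID.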