Data extraction/crawling from Yelp and OpenTable

  • Status Closed
  • Budget $250 - $750 USD
  • Total Bids 21

Project Description

Hello, I need a web crawler to download and parse data from [url removed, login to view] and www.opentable.com, mainly the product ratings, review text, and reviewer characteristics. You can send me the results as CSV files or as a SQL database file.

You will first need to match restaurants across the two websites in 6 major cities: New York, Los Angeles, Chicago, Houston, Philadelphia, and Phoenix.
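As a rough illustration of the matching step, one possible approach is to compare normalized restaurant names within the same city and keep the best pair above a similarity cutoff. The sketch below is only an assumption about how this could be done: the record layout (dicts with `merchantID`, `name`, `city`), the function names, and the 0.85 threshold are all hypothetical, and a real matcher would likely also compare addresses or phone numbers.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, spell out '&', drop punctuation, and collapse whitespace."""
    name = name.lower().replace("&", " and ")
    name = re.sub(r"[^a-z0-9 ]", "", name)
    return " ".join(name.split())


def match_restaurants(yelp_rows, opentable_rows, threshold=0.85):
    """Pair each Yelp record with the most similar OpenTable record in the same city.

    Returns a list of (yelp_merchantID, opentable_merchantID, score) tuples.
    The threshold is an assumed cutoff, not a tuned value.
    """
    # Group OpenTable records by city so names are only compared within a city.
    by_city = {}
    for row in opentable_rows:
        by_city.setdefault(row["city"], []).append(row)

    matches = []
    for y in yelp_rows:
        best, best_score = None, 0.0
        for o in by_city.get(y["city"], []):
            score = SequenceMatcher(
                None, normalize(y["name"]), normalize(o["name"])
            ).ratio()
            if score > best_score:
                best, best_score = o, score
        if best is not None and best_score >= threshold:
            matches.append((y["merchantID"], best["merchantID"], round(best_score, 3)))
    return matches
```

Grouping by city first keeps the number of name comparisons manageable and avoids pairing same-named restaurants in different cities.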

Once the restaurants are matched, you will need to download variables such as ratings for them.

Variables to download (Yelp):

(a) Merchant Data: merchant name, website URL, ZIP code, state, city, hours of operation, business website URL, parking, wifi, merchantID

(b) Review Data: Repeated for the same merchant (sort by time stamp): time stamp, rating, text, words, check-in, useful, funny, cool, reviewID, reviewerID

(c) Reviewer Data: number of friends, number of reviews, number of tips, number of fans, number of local photos, tenure (months since join), location (state and city), average of ratings (may be difficult), elite, reviewerID
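Two of the variables above are derived rather than read directly from the page: "words" (taken here to mean the word count of the review text) and "tenure (months since join)". A minimal sketch of how they might be computed, together with the per-merchant time ordering, follows. The function names, the dict keys, and the choice of counting whole elapsed months are assumptions for illustration only.

```python
from datetime import date


def word_count(text):
    """Count whitespace-separated words in a review's text."""
    return len(text.split())


def tenure_months(join_date, as_of):
    """Whole months elapsed between the reviewer's join date and the as-of date."""
    months = (as_of.year - join_date.year) * 12 + (as_of.month - join_date.month)
    if as_of.day < join_date.day:
        months -= 1  # the current month has not fully elapsed yet
    return max(months, 0)


def sort_reviews(reviews):
    """Order reviews by merchant, then by time stamp (oldest first)."""
    return sorted(reviews, key=lambda r: (r["merchantID"], r["timestamp"]))
```

For example, `tenure_months(date(2012, 3, 15), date(2014, 3, 15))` gives 24, while one day earlier it gives 23.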

Variables to download (OpenTable):

(a) Merchant Data: merchant name, website URL, merchantID

(b) Review Data: Repeated for the same merchant (sort by time stamp): time stamp, overall_rating, food_rating, ambiance_rating, service_rating, text, words, helpful, unhelpful, reviewID, reviewerID

(c) Reviewer Data: review count, first review date, featured reviews, average rating, reviewerID
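If the data is delivered as a SQL database file, the three OpenTable groups above map naturally onto three tables linked by merchantID and reviewerID. The following is a sketch using SQLite. The table names, column types, and the `create_db` helper are assumptions, not a required layout, and the Yelp variables could be stored the same way.

```python
import sqlite3

# Column names follow the variable list; types and table names are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ot_merchant (
    merchantID    TEXT PRIMARY KEY,
    merchant_name TEXT,
    website_url   TEXT
);
CREATE TABLE IF NOT EXISTS ot_reviewer (
    reviewerID        TEXT PRIMARY KEY,
    review_count      INTEGER,
    first_review_date TEXT,
    featured_reviews  INTEGER,
    average_rating    REAL
);
CREATE TABLE IF NOT EXISTS ot_review (
    reviewID        TEXT PRIMARY KEY,
    merchantID      TEXT REFERENCES ot_merchant(merchantID),
    reviewerID      TEXT REFERENCES ot_reviewer(reviewerID),
    time_stamp      TEXT,
    overall_rating  REAL,
    food_rating     REAL,
    ambiance_rating REAL,
    service_rating  REAL,
    text            TEXT,
    words           INTEGER,
    helpful         INTEGER,
    unhelpful       INTEGER
);
"""


def create_db(path=":memory:"):
    """Open (or create) the SQLite file at `path` and ensure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Reviews for one merchant can then be read back in time order with a query such as `SELECT * FROM ot_review WHERE merchantID = ? ORDER BY time_stamp`, provided time stamps are stored in a sortable format like ISO 8601.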

Feel free to ask any questions.
