•All the bike sharing activities
•Over 350 stations
•Over 13,000 trips a day this past summer
•Duration – Duration of trip
•Start date – Includes start date and time
•End date – Includes end date and time
•Start station – Includes starting station name and number
•End station – Includes ending station name and number
•Bike # – Includes ID number of bike used for the trip
•Member Type – Lists whether user was a Registered (annual or monthly) or Casual (1 to 5 day) member. NOTE: The 3-day membership replaced the 5-day during Fall '11.
•AWS
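As a rough illustration of the fields listed above, the sketch below reads trip rows into a small record type. The CSV header names, the timestamp layout in `DATE_FORMAT`, and the sample row are all assumptions made for the example; check them against the real download before relying on them.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Assumed timestamp layout; adjust to match the downloaded files.
DATE_FORMAT = "%m/%d/%Y %H:%M"


@dataclass
class Trip:
    duration: str        # kept as raw text; the posting does not give its format
    start: datetime
    end: datetime
    start_station: str   # station name and number
    end_station: str
    bike_id: str
    member_type: str     # "Registered" or "Casual"


def read_trips(handle):
    """Yield one Trip per CSV row, using the assumed header names."""
    for row in csv.DictReader(handle):
        yield Trip(
            duration=row["Duration"],
            start=datetime.strptime(row["Start date"], DATE_FORMAT),
            end=datetime.strptime(row["End date"], DATE_FORMAT),
            start_station=row["Start station"],
            end_station=row["End station"],
            bike_id=row["Bike #"],
            member_type=row["Member Type"],
        )


# Made-up sample row, for illustration only.
SAMPLE = (
    "Duration,Start date,End date,Start station,End station,Bike #,Member Type\n"
    "600,6/1/2012 8:05,6/1/2012 8:15,Station A (31000),Station B (31001),W00001,Registered\n"
)

trips = list(read_trips(io.StringIO(SAMPLE)))
```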
•Download, curate and organize the data so that you can query it when it is loaded on the Hadoop cluster
•Load it on Big Data infrastructure:
•Understand how to manage it on a cluster (AWS/VM)
•Create processes for accessing and querying the data
•Provide a set of query tools/scripts using Pig/Hive/Impala to query the data on the cluster
•A large part of the process here will be to set standard query scripts on Pig and Hive/Impala to allow the user to examine the dataset
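The "curate" step above could look something like the following sketch, which drops rows with unparseable timestamps and writes the rest as header-less, tab-delimited text, a layout that is easy to load into a delimited-text table. The input header names and the raw timestamp layout (`IN_FORMAT`) are assumptions, and the sample rows are made up; the output uses the `yyyy-MM-dd HH:mm:ss` form that Hive reads as a timestamp.

```python
import csv
import io
from datetime import datetime

IN_FORMAT = "%m/%d/%Y %H:%M"       # assumed layout of the raw timestamps
OUT_FORMAT = "%Y-%m-%d %H:%M:%S"   # yyyy-MM-dd HH:mm:ss


def curate(raw, out):
    """Copy well-formed rows from raw CSV to tab-delimited text.

    Returns a (kept, dropped) pair of row counts.
    """
    reader = csv.DictReader(raw)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    kept = dropped = 0
    for row in reader:
        try:
            start = datetime.strptime(row["Start date"], IN_FORMAT)
            end = datetime.strptime(row["End date"], IN_FORMAT)
        except (KeyError, TypeError, ValueError):
            dropped += 1   # missing column, short row, or bad timestamp
            continue
        writer.writerow([
            row["Duration"],
            start.strftime(OUT_FORMAT),
            end.strftime(OUT_FORMAT),
            row["Start station"],
            row["End station"],
            row["Bike #"],
            row["Member Type"],
        ])
        kept += 1
    return kept, dropped


# Made-up rows: one valid, one with a broken start date.
RAW = (
    "Duration,Start date,End date,Start station,End station,Bike #,Member Type\n"
    "600,6/1/2012 8:05,6/1/2012 8:15,Station A (31000),Station B (31001),W00001,Registered\n"
    "600,not a date,6/1/2012 8:15,Station A (31000),Station B (31001),W00002,Casual\n"
)

cleaned = io.StringIO()
result = curate(io.StringIO(RAW), cleaned)
```

Counting dropped rows rather than silently skipping them makes it easier to spot a wrong format assumption: if most rows are dropped, `IN_FORMAT` probably does not match the files.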
Final purpose:
Build a prediction model that predicts the demand for a certain station at a certain time.
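One simple starting point for such a model is a historical-average baseline, sketched below: for each station, weekday and hour, it predicts the mean number of departures seen on past days. This is only an illustrative baseline under stated assumptions, not the model the posting asks for; the function names `fit_baseline` and `predict` and the sample departures are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def fit_baseline(departures):
    """Build a lookup of mean hourly departures.

    departures: iterable of (station, datetime) pairs, one per trip start.
    Returns {(station, weekday, hour): mean trips per observed day}.
    """
    # Trips per (station, calendar date, hour).
    counts = defaultdict(int)
    for station, when in departures:
        counts[(station, when.date(), when.hour)] += 1

    # Average those hourly counts over the days that share a weekday.
    totals = defaultdict(int)
    days = defaultdict(int)
    for (station, day, hour), n in counts.items():
        key = (station, day.weekday(), hour)
        totals[key] += n
        days[key] += 1
    return {key: totals[key] / days[key] for key in totals}


def predict(model, station, when):
    """Predicted departures for a station in the hour containing `when`."""
    return model.get((station, when.weekday(), when.hour), 0.0)


# Made-up history: two Mondays at the same station.
history = [
    ("Station A", datetime(2012, 6, 4, 8, 5)),
    ("Station A", datetime(2012, 6, 4, 8, 30)),
    ("Station A", datetime(2012, 6, 11, 8, 10)),
]
model = fit_baseline(history)
forecast = predict(model, "Station A", datetime(2012, 6, 18, 8, 45))
```

Note that this version averages only over days with at least one trip in the given hour, so it overestimates demand at quiet stations; a fuller model would count zero-trip days and add features such as weather or holidays.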
Hi, I have more than 3 years of experience in Hadoop technologies. Contact me for more details.
Relevant Skills and Experience: Please review my profile.
Proposed Milestones: $222 USD - Project fee