BASIC EMR ON MAP-R V3 AND AWS S3 - repost

Status: CLOSED
Bids: 5
Avg Bid (EUR): 2422
Project Budget (EUR): €750 - €1500

Project Description:
1- Scope:

Provide a basic Hadoop Map-R V.3 environment on Amazon Web Services. This phase only covers a basic trial environment; there is no need to provide 24x7 tools or extra code.

The main aim is to analyse data from several text input sources stored on S3 and to start a trial period.


2- Tools:

We provide the AWS account for the project and the Map-R V.3 Hadoop clusters, with free administration for implementing this project.


3- Deliverables:

- Script code for automatic, AWS API based set-up of Map-R V.3 for a given number of master and compute nodes.
- Set-up scripts capable of using EC2 on-demand nodes:
o For real-time 24x7 live queries.
o For nightly batch processes.
- Basic Java code providing basic routines such as:
o Joins of tables from several text sources.
o Gaussian statistics: mean, standard deviation, etc.
o Basic counting and basic mathematical routines.
o Output of computed tables as text or to MySQL.

- Skype sessions totalling 4 hours to train skilled IT staff coming from the PHP and JavaScript world.

- Documented source code.
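
As an illustration of the "Gaussian statistics" and counting routines listed above, here is a minimal sketch in plain Java. It is only an assumption about how such a routine could look, not a required design: the class name RunningStats and its methods are hypothetical, and it uses Welford's running method so that partial results from different input splits can be merged, which is what a Hadoop combiner or reducer would need.

```java
/** Running count, mean and standard deviation (Welford's method). Hypothetical helper. */
final class RunningStats {
    private long count;
    private double mean;
    private double m2; // sum of squared deviations from the current mean

    /** Adds one observation. */
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    /** Folds in a partial result, e.g. one computed on another input split. */
    void merge(RunningStats other) {
        if (other.count == 0) {
            return;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = count + other.count;
        double delta = other.mean - mean;
        // Update m2 first: it needs the counts and means from before the merge.
        m2 += other.m2 + delta * delta * count * other.count / total;
        mean += delta * other.count / total;
        count = total;
    }

    long count() { return count; }

    double mean() { return mean; }

    /** Population standard deviation; returns 0 when no value was added. */
    double stdDev() { return count == 0 ? 0.0 : Math.sqrt(m2 / count); }

    public static void main(String[] args) {
        RunningStats stats = new RunningStats();
        for (double minutes : new double[] {30, 45, 60}) {
            stats.add(minutes);
        }
        System.out.println("n=" + stats.count() + " mean=" + stats.mean() + " sd=" + stats.stdDev());
    }
}
```

In a MapReduce job the mapper would typically emit one RunningStats per key and the reducer would call merge on them; that wiring is left out here because it depends on the cluster set-up.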


4- Input sources:

The project is intended for analysing logs from remote connected devices and joining them with central text tables.

- Several text files for remote devices, stored on S3:
o Characteristics of remote devices (>400.000 TV sets)
• Brand
• Programmed parameters
• Available channels
• Geo location
o Log text of remote devices
• Real-time logging of visits
• Number of visits
• Duration
• TV station tuned in at each moment
• Demographic type of the home where the device is installed.


o TV stations' programme schedule
• Show type: movie, talk show, debate
• Start time, end time.
• Celebrities involved in the show.
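
The exact layout of the text files is not specified here. Purely as an illustration, the sketch below assumes a hypothetical tab-separated log line with four fields (device id, station, start time in epoch seconds, duration in seconds) and shows how one such line could be parsed in Java. The class name ViewingEvent, the field order and the separator are all assumptions and would need to be adapted to the real files.

```java
/** One viewing-log line. The four-field, tab-separated layout is a hypothetical example. */
final class ViewingEvent {
    final String deviceId;
    final String station;
    final long startEpochSeconds;
    final long durationSeconds;

    ViewingEvent(String deviceId, String station, long startEpochSeconds, long durationSeconds) {
        this.deviceId = deviceId;
        this.station = station;
        this.startEpochSeconds = startEpochSeconds;
        this.durationSeconds = durationSeconds;
    }

    /** Parses one line; throws IllegalArgumentException on a malformed line. */
    static ViewingEvent parse(String line) {
        String[] f = line.split("\t", -1); // -1 keeps trailing empty fields
        if (f.length != 4) {
            throw new IllegalArgumentException("expected 4 fields but got " + f.length);
        }
        // Long.parseLong throws NumberFormatException, a kind of IllegalArgumentException.
        return new ViewingEvent(f[0], f[1], Long.parseLong(f[2]), Long.parseLong(f[3]));
    }

    public static void main(String[] args) {
        ViewingEvent e = parse("tv-0001\tch1\t1700000000\t1800");
        System.out.println(e.deviceId + " watched " + e.station + " for " + e.durationSeconds + " s");
    }
}
```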


6- Expected outputs.

- Several combinations of the above.

- Mean time spent per TV set type on each type of show.

o Mean time
o Standard deviation
o Top celebrities watched

- Samples of joins from several sources.

- Real-time query set-up in case a real-time response is needed.

- Batch set-up for time-consuming queries or for running the whole set of queries.
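
To make the join and "mean time per TV set type and show type" outputs more concrete, here is a small in-memory sketch in plain Java. Everything about the data layout is assumed for illustration: tab-separated lines, devices as (deviceId, brand), schedule as (station, start, end, showType), views as (deviceId, station, time, minutes), with the brand standing in for the TV set type. The class and method names are hypothetical. It also simplifies by attributing a whole visit to the show running at tune-in time, and it does not cover the celebrity ranking.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hash join of three text sources followed by a grouped mean. Hypothetical layout. */
final class ViewingJoin {

    /** Returns mean viewing minutes keyed by "brand|showType". */
    static Map<String, Double> meanMinutesByBrandAndShowType(
            List<String> deviceLines, List<String> scheduleLines, List<String> viewLines) {
        // Build side 1: deviceId -> brand.
        Map<String, String> brandByDevice = new HashMap<>();
        for (String line : deviceLines) {
            String[] f = line.split("\t");
            brandByDevice.put(f[0], f[1]);
        }
        // Build side 2: station -> schedule slots {station, start, end, showType}.
        Map<String, List<String[]>> slotsByStation = new HashMap<>();
        for (String line : scheduleLines) {
            String[] f = line.split("\t");
            slotsByStation.computeIfAbsent(f[0], k -> new ArrayList<>()).add(f);
        }
        // Probe side: match each view to its device and to the slot covering its time.
        Map<String, double[]> sumAndCount = new TreeMap<>();
        for (String line : viewLines) {
            String[] f = line.split("\t");
            String brand = brandByDevice.get(f[0]);
            String showType = showTypeAt(slotsByStation.get(f[1]), Long.parseLong(f[2]));
            if (brand == null || showType == null) {
                continue; // inner join: rows without a match are dropped
            }
            double[] acc = sumAndCount.computeIfAbsent(brand + "|" + showType, k -> new double[2]);
            acc[0] += Double.parseDouble(f[3]); // total minutes
            acc[1] += 1;                        // number of visits
        }
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    /** Finds the show type whose [start, end) interval contains the given time. */
    private static String showTypeAt(List<String[]> slots, long time) {
        if (slots == null) {
            return null;
        }
        for (String[] s : slots) {
            if (Long.parseLong(s[1]) <= time && time < Long.parseLong(s[2])) {
                return s[3];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> devices = new ArrayList<>();
        devices.add("d1\tBrandA");
        List<String> schedule = new ArrayList<>();
        schedule.add("ch1\t0\t60\tmovie");
        List<String> views = new ArrayList<>();
        views.add("d1\tch1\t10\t30");
        System.out.println(meanMinutesByBrandAndShowType(devices, schedule, views));
    }
}
```

Holding the device and schedule tables in memory only works while they stay small (the >400.000 device rows would probably still fit); on the cluster the same logic would normally be expressed as a map-side or reduce-side join.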


7- Time table.

- Needed within four weeks / end of January – first week of September.
- We provide an AWS zone with all the text sources inside, ready for use.
- E-mail / Skype contact on weekdays, 9-18h CET, for immediate support with any doubts or clarifications.

8- References:

- No project will be awarded without clear and outstanding references for Hadoop implementations on AWS.
- Map-R experience is a plus.

Skills required:
Big Data, Hadoop, Java, Map Reduce, NoSQL Couch & Mongo
About the employer:
Verified


Bids:
- € 1527 in 10 days
- € 1444 in 20 days
- € 2888 in 60 days
- € 1250 in 20 days
- € 5000 in 7 days