Make an Elastic MapReduce application to search a large dataset

Status: CLOSED
Bids: 9
Avg Bid (USD): $609
Project Budget (USD): $250 - $750

Project Description:
We are conducting early-stage research for a business idea relating to WordPress.

We have identified a pre-crawled corpus of web documents here:

http://aws.amazon.com/datasets/41740

We are seeking someone who can build an Elastic MapReduce application to search the corpus using the techniques explained here: https://aws.amazon.com/amis/common-crawl-quick-start

We want to grep the corpus for the URLs of WordPress sites, for example by searching each page's HTML for the text wp-content or by some similar technique.
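
To illustrate the kind of check we have in mind, here is a minimal sketch in Python of the map and reduce logic only. It assumes each crawled page has already been decoded into a (url, html) pair; reading the corpus files and wiring the job into Elastic MapReduce are not shown. The function names and the extra wp-includes marker are our own illustrative assumptions, not part of the corpus tooling.

```python
import re
from urllib.parse import urlsplit

# "wp-content" is the marker named in this posting; "wp-includes" is an
# assumed extra marker standing in for "some similar technique".
WP_MARKERS = re.compile(r"wp-content/|wp-includes/", re.IGNORECASE)


def is_wordpress(html):
    """Return True if the page HTML contains a WordPress path marker."""
    return WP_MARKERS.search(html) is not None


def map_record(url, html):
    """Map step: emit (host, 1) for each page that looks like WordPress."""
    if is_wordpress(html):
        host = urlsplit(url).hostname  # already lowercased by urlsplit
        if host:
            yield host, 1


def reduce_counts(pairs):
    """Reduce step: sum the matching page count per host."""
    counts = {}
    for host, n in pairs:
        counts[host] = counts.get(host, 0) + n
    return counts


def find_wordpress_hosts(records):
    """Run map and reduce in-process over an iterable of (url, html) pairs."""
    pairs = (pair for url, html in records for pair in map_record(url, html))
    return reduce_counts(pairs)
```

Counting matching pages per host, rather than emitting every URL, keeps the output small and gives a rough confidence signal: a host with many matching pages is more likely to be a real WordPress site than one with a single stray link.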

Once we have this list of WordPress sites, further jobs will be available, but producing the list is the first step.

Skills required:
Amazon Web Services, Big Data, Data Mining, Software Architecture
Project posted by: sekurely (United States, Verified)


Bids:
bradaric: $700 in 2 days
alphaedge999: $1000 in 3 days
tcly315: $666 in 3 days
shomratkutub: $250 in 3 days
gauravkumar37: $765 in 5 days
helmot: $499 in 3 days
topmoose: $650 in 3 days
JuventusMaximus: $700 in 7 days
jeremtank: $250 in 5 days