HTML Scraping - repost

Looking for someone to build HTML scrapers for various websites that scrape, clean, and store the data in a database.

The code should be easy to run and able to generate CSV files or save the data locally.

You will be required to provide code for scraping at least 30 websites. Examples of the type of site include William Hill, [url removed, login to view], World Betting Exchange, and others.

You will need to parse the web pages, normalise the data, and build a UI that offers basic functionality.
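As a rough illustration of the parse-and-normalise step, here is a minimal sketch using only the Python standard library. Python is used purely for illustration; the posting specifies PHP for the production script. The table layout, the `SAMPLE_HTML` markup, and the names `OddsTableParser`, `fractional_to_decimal`, and `parse_odds` are all assumptions made for this example and do not reflect any real bookmaker page.

```python
from html.parser import HTMLParser


class OddsTableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed data rows
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append([cell.strip() for cell in self._row])
            self._row = None


def fractional_to_decimal(odds):
    """Convert fractional odds such as '5/2' to decimal odds."""
    numerator, denominator = odds.split("/")
    return round(int(numerator) / int(denominator) + 1, 2)


def parse_odds(html):
    """Parse a hypothetical two-column odds table into normalised records."""
    parser = OddsTableParser()
    parser.feed(html)
    return [
        {"team": team.title(), "decimal_odds": fractional_to_decimal(odds)}
        for team, odds in parser.rows
    ]


# Hypothetical markup, invented for this sketch.
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>Odds</th></tr>
  <tr><td> arsenal </td><td>5/2</td></tr>
  <tr><td>CHELSEA</td><td>11/10</td></tr>
</table>
"""

if __name__ == "__main__":
    for record in parse_odds(SAMPLE_HTML):
        print(record)
```

Each real site would need its own parser, since markup differs between bookmakers; only the normalised record shape (team name, decimal odds) would be shared.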

Every line of code must be documented.

Data will need to be cleaned, normalised, and stored both on AWS and in CSV files.
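The cleaning and CSV export step might look something like the sketch below, again in Python for illustration only. The field names in `FIELDS` and the helpers `clean_records`, `write_csv`, and `to_csv_text` are hypothetical. Uploading the resulting file to AWS is left out, since the posting does not say which AWS service is intended.

```python
import csv
import io

# Hypothetical column set; the real schema is not given in the posting.
FIELDS = ["bookmaker", "team", "decimal_odds"]


def clean_records(records):
    """Drop incomplete rows and exact duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for record in records:
        # Skip rows where any required field is missing or empty.
        if any(record.get(field) in (None, "") for field in FIELDS):
            continue
        key = tuple(record[field] for field in FIELDS)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({field: record[field] for field in FIELDS})
    return cleaned


def write_csv(records, stream):
    """Write records to any text stream (an open file or an in-memory buffer)."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)


def to_csv_text(records):
    """Clean the records and return the CSV output as a string."""
    buffer = io.StringIO()
    write_csv(clean_records(records), buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {"bookmaker": "ExampleBook", "team": "Arsenal", "decimal_odds": 3.5},
        {"bookmaker": "ExampleBook", "team": "Arsenal", "decimal_odds": 3.5},
        {"bookmaker": "ExampleBook", "team": "", "decimal_odds": 2.0},
    ]
    print(to_csv_text(sample))
```

To save locally instead of building a string, `write_csv` can be given a file opened with `open(path, "w", newline="")`.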

You will need to sign an NDA before the project is assigned.

All real-time data (odds values, team details, etc.) will be scraped from various bookmaker sites and exchanges using their individual APIs/feeds. The script for this purpose will be written in PHP, and HipHop VM will be used to achieve superior performance through just-in-time compilation (alternatively, C++ or Java). All processed data will then be passed/streamed to the TRIDENT API, which will be integrated on top of Storm.

Skills: Engineering, MySQL, PHP, Software Architecture, Website Management


About the Employer:
( 0 reviews ) United Kingdom

Project ID: #4785094