We are looking for a competent PHP programmer who can take the time to get familiar with a simple scraping framework based on QueryPath v3.0 (querypath.org) and extend it to scrape more websites. We will provide the source code and documentation. You don't need to be familiar with QueryPath as long as you have some familiarity with jQuery syntax and a good knowledge of PHP.
Specific product information must be scraped from each website and output as a file containing a JSON data set.
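To illustrate the general shape of such an output file, here is a minimal sketch in Python (the deliverable itself is a PHP file; this only shows the idea of writing product records to a JSON file). The field names, the sample record, and the `write_products` helper are hypothetical placeholders, not the actual schema, which is defined in the format document.

```python
import json

# Hypothetical records: the field names and values are placeholders only.
# The real schema is whatever the JSON format document specifies.
products = [
    {
        "name": "Example Product",
        "price": 1299.0,
        "currency": "AED",
        "url": "http://example.com/product/1",
    },
]


def write_products(records, path):
    """Write the scraped product records to `path` as a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps non-ASCII product names readable in the file.
        json.dump(records, fh, ensure_ascii=False, indent=2)


write_products(products, "products.json")
```

Running the sketch creates `products.json` in the working directory, containing a JSON array with one object per product.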
More details on the JSON format can be found here: docs.google.com/document/d/1NGsstkbMZLorkLn1Mk5FGNkVZJjYHxjNaSGqpwl9z78/edit?usp=sharing
A sample JSON file can be found here:
Our preferred workflow is to use GitHub. We have a repository for the simple scraper framework that you can fork; commit your scraper code to your fork and then open a pull request. However, work should also be submitted (as a PHP file) through Elance for audit purposes.
For your proposal, please consider http://uae.souq.com/ae-en/ to be representative of the complexity of the websites to be scraped.
Be able to communicate on Skype.
Provide prompt updates on progress
Begin work within 24 hours of the job being awarded to you.
Complete development of a scraper in five (5) days.
Note: we will be available every day to answer your questions promptly through e-mail.
When the job is awarded, a milestone will be created for each scraper and the payment for that milestone placed in escrow. Milestone dates will be set according to the required timeline outlined above.
Delays in scraper submission will lead to a payment deduction of 10% for every day past the five-day limit, per scraper.
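As a rough illustration of how the deduction adds up, here is a minimal Python sketch. It rests on several assumptions that are not stated in the posting: the 10% is taken from the original milestone amount for each late day (flat, not compounding), days are counted as whole days, and the deduction cannot exceed the full milestone payment. The function name `payout_after_delay` and its parameters are hypothetical.

```python
from decimal import Decimal


def payout_after_delay(milestone_amount, days_taken, limit_days=5,
                       rate_per_day=Decimal("0.10")):
    """Return the milestone payout after the late-delivery deduction.

    Assumption: a flat 10% of the original milestone amount is deducted
    per whole day beyond the limit, capped at 100% of the milestone.
    """
    amount = Decimal(str(milestone_amount))
    days_late = max(0, days_taken - limit_days)
    # Cap the total deduction so the payout never goes below zero.
    deduction = min(Decimal("1"), rate_per_day * days_late)
    return amount * (Decimal("1") - deduction)


on_time = payout_after_delay(200, 5)    # delivered on day 5: no deduction
two_late = payout_after_delay(200, 7)   # two days late: 20% deducted
```

Under these assumptions, a scraper with a 200-unit milestone delivered on day 7 would pay out 160, and one delivered ten or more days past the limit would pay out nothing.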
The following will lead to job cancellation:
Being unresponsive (i.e. ignoring our communications for more than a day).
Repeated delays in the submission of work.
Not adhering to other requirements of the job (e.g. being unable to commit work on GitHub).