Data & Image Harvesting

Status: IN PROGRESS
Bids: 16
Avg Bid (USD): $396
Project Budget (USD): $250 - $750

Project Description:
PLEASE SEE THE ATTACHED SAMPLE SPREADSHEET(S) FOR THE REQUIRED FORMATTING OF THE EXTRACTED DATA.

I need an expert scraper to develop a Windows-based application (to be used on a computer running Windows 8) that can successfully scrape the following site.

https://www.doubletakeoffers.com/ (I only need the free coupons)

The application has the following requirements:

1. Select USA states to query
2. Select Category

The results are to be written to the attached preformatted Excel spreadsheets.

The results needed from the site are as follows, as shown in the "this data.pdf" file:

1a. Logo
1. Merchant name
2. Merchant address & phone number
3. Accurate Lat/Long coordinates for address (under Directions link)
4. Product name
5. Product description
6. Web address link if applicable
7. Product expiration date (for free coupons)
8. I only need the "Free Coupons Deals" data and not the "Premium Deals" data (look at bottom of DoubleTake page after product search).
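The field list above could be modeled as one record per coupon. The following is only a minimal sketch: the class name `CouponRecord`, its field names, and the `deal_type` label are assumptions made for illustration and do not come from the attached spreadsheets.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CouponRecord:
    """One scraped coupon. All field names here are hypothetical."""
    item_id: int
    logo_url: str                   # item 1a
    merchant_name: str              # item 1
    address: str                    # item 2
    phone: str                      # item 2
    latitude: float                 # item 3
    longitude: float                # item 3
    product_name: str               # item 4
    product_description: str        # item 5
    web_address: Optional[str]      # item 6, only "if applicable"
    expiration_date: Optional[str]  # item 7
    deal_type: str                  # assumed label: "free" or "premium"


def keep_free_coupons(records: List[CouponRecord]) -> List[CouponRecord]:
    """Item 8: keep "Free Coupons Deals" records and drop "Premium Deals"."""
    return [r for r in records if r.deal_type == "free"]
```

Making the web address and expiration date optional reflects the "if applicable" wording; how the site actually marks a deal as free or premium would have to be checked on the page itself.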


I need the "scraperspreadsheet with mediaxls" Excel file populated with data as shown. I also need the logos saved in the specific directories shown in the "mediaxls" Excel file.

The logos shall be saved to directories in the following format: uplimg/state/category_name/item_id, where the state and category come from the search boxes shown in "scraper layout.pdf".
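A minimal sketch of the uplimg/state/category_name/item_id layout follows. It assumes that item_id is the last directory in the path (it could equally be meant as the file name) and that state and category names are lowercased with spaces and punctuation turned into underscores. The helpers `slug`, `logo_dir`, and `save_logo`, and the default file name `logo.png`, are hypothetical.

```python
import re
from pathlib import Path


def slug(text: str) -> str:
    """Lowercase and collapse non-alphanumeric runs to "_" (assumed naming rule)."""
    return re.sub(r"[^a-z0-9]+", "_", text.strip().lower()).strip("_")


def logo_dir(root, state: str, category: str, item_id: int) -> Path:
    """Build <root>/uplimg/<state>/<category_name>/<item_id>."""
    return Path(root) / "uplimg" / slug(state) / slug(category) / str(item_id)


def save_logo(root, state, category, item_id, image_bytes: bytes,
              filename: str = "logo.png") -> Path:
    """Create the directory if needed and write the logo bytes into it."""
    target = logo_dir(root, state, category, item_id)
    target.mkdir(parents=True, exist_ok=True)
    path = target / filename
    path.write_bytes(image_bytes)
    return path
```

For example, `logo_dir("E:/scrape", "New York", "Food & Dining", 42)` would point at `E:/scrape/uplimg/new_york/food_dining/42`, where `E:/scrape` stands in for a folder on the external drive.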

So the software will populate both spreadsheets for each record. I have two separate spreadsheets because the software I am using was written to work with two separate but connected spreadsheets. I need the "scraperspreadsheet with mediaxls" spreadsheet to connect to the image directory in the "mediaxls" spreadsheet.
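One way to keep the two spreadsheets connected is to write the same item_id into both for every record. The sketch below uses plain CSV files from the Python standard library as a stand-in for the Excel files, and the column names and the file names `listings.csv` and `media.csv` are hypothetical; the real layout is defined by the attached spreadsheets.

```python
import csv
from pathlib import Path

LISTING_COLUMNS = ["item_id", "merchant_name", "product_name"]  # hypothetical
MEDIA_COLUMNS = ["item_id", "image_path"]                       # hypothetical


def append_row(csv_path, header, row):
    """Append one row, writing the header first if the file is new."""
    path = Path(csv_path)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=header)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def write_record(out_dir, item_id, merchant_name, product_name, image_path):
    """Write one record to both files, linked by the shared item_id."""
    out = Path(out_dir)
    append_row(out / "listings.csv", LISTING_COLUMNS,
               {"item_id": item_id, "merchant_name": merchant_name,
                "product_name": product_name})
    append_row(out / "media.csv", MEDIA_COLUMNS,
               {"item_id": item_id, "image_path": str(image_path)})
```

Because rows are appended rather than rewritten, rerunning the application adds to the existing files instead of replacing them.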

The images should be scraped and saved in the appropriate directories. The files and images (I know they will take up a lot of space) will be saved on my large external hard drives, then uploaded to my server.

I need a standalone application that I can run as needed, but the item_id cannot repeat or reset when I rerun the application. I will pay top dollar for this application and have no budget limitations. Only serious bidders will be considered and I will pay top dollar if you can do this accurately. I WILL ACCEPT YOUR BID IF YOU UNDERSTAND THE REQUIREMENTS AND CAN EXECUTE PRECISELY.
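The requirement that item_id never repeats or resets between runs implies the last issued ID has to be stored outside the program. A minimal sketch, assuming a single small counter file and a single running instance of the application (the function name `next_item_id` and the file name are hypothetical):

```python
from pathlib import Path


def next_item_id(counter_file) -> int:
    """Return the next unused item_id and persist it for later runs."""
    path = Path(counter_file)
    # Start from 0 on the very first run, otherwise from the stored value.
    last = int(path.read_text().strip()) if path.exists() else 0
    new_id = last + 1
    # Write to a temporary file first, then swap it in, so an interrupted
    # run is less likely to leave a half-written counter behind.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(str(new_id))
    tmp.replace(path)
    return new_id
```

Calling `next_item_id("item_id.txt")` once per record would continue from the stored value on every rerun, provided the counter file is kept alongside the spreadsheets and never deleted.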

Skills required:
PHP, Software Architecture, Web Scraping
Additional Files: this-data.png, scraper layout.pdf, mediaxls.pdf, scraperspreadsheet with mediaxls.pdf, mediaxls.xlsx, scraperspreadsheet with mediaxls.xlsx
Project posted by:
jamesojackson United States
Verified


$257 in 7 days
$947 in 7 days
$315 in 3 days
$289 in 3 days
$250 in 2 days
$350 in 1 day
$333 in 7 days
$250 in 3 days
$333 in 7 days
$555 in 5 days