I need to scrape a section of a popular website.
The data that needs to be scraped should be organised into 6 separate CSV files (or tabs).
The files are:
1. Customer Information
2. Customer Attribute 1
3. Customer Attribute 2
4. Customer Attribute 3
5. Customer Attribute 4
6. Customer Attribute 5
As you can see, there are five attributes listed. Each attribute will be linked to the first file, "Customer Information", by the customer name and ID.
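To illustrate the linking described above, here is a minimal sketch of how scraped listings might be split into the six tables, with every row carrying the customer ID and name as its key. The file names, the record layout (`info`, `attribute_1` to `attribute_5`) and the column names are assumptions made for illustration only; the real columns should follow the attached spreadsheet.

```python
import csv
import io

# Hypothetical file names; the real ones should follow the attached spreadsheet.
FILES = ["customer_information"] + [f"customer_attribute_{i}" for i in range(1, 6)]


def split_records(records):
    """Split scraped listings into six tables keyed by customer_id and customer_name."""
    tables = {name: [] for name in FILES}
    for rec in records:
        key = {"customer_id": rec["customer_id"], "customer_name": rec["customer_name"]}
        # Listings with no "info" still get a row, so attributes always have a parent.
        tables["customer_information"].append({**key, **rec.get("info", {})})
        for i in range(1, 6):
            # A listing may have zero, one, or many rows for each attribute.
            for row in rec.get(f"attribute_{i}", []):
                tables[f"customer_attribute_{i}"].append({**key, **row})
    return tables


def to_csv(rows):
    """Serialise a list of dicts to CSV text, using the union of keys as the header."""
    fields = []
    for row in rows:
        for name in row:
            if name not in fields:
                fields.append(name)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Listings that lack an attribute simply contribute no rows to that attribute's file, and missing columns are written as empty cells.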
The URLs of the website to be scraped are in another attached file. The URLs contain a paginated index of the listings to be scraped, in alphabetical order, with pagination on each letter of the alphabet.
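One way to walk an index like this is to loop over the letters and, for each letter, request pages until one comes back empty. The sketch below assumes a made-up URL pattern (`INDEX_URL`) and a caller-supplied `fetch_listing_urls` function that returns the listing links found on an index page; neither comes from the actual site, whose real URLs are in the attached file.

```python
import string

# Hypothetical URL pattern; the real index URLs are in the attached file.
INDEX_URL = "https://example.com/listings/{letter}?page={page}"


def crawl_index(fetch_listing_urls, letters=string.ascii_uppercase, max_pages=1000):
    """Yield listing URLs letter by letter, following pagination until a page is empty.

    fetch_listing_urls is a caller-supplied function that takes an index page URL
    and returns the list of listing URLs found on that page.
    """
    seen = set()
    for letter in letters:
        # max_pages is a safety cap in case the site never returns an empty page.
        for page in range(1, max_pages + 1):
            urls = fetch_listing_urls(INDEX_URL.format(letter=letter, page=page))
            if not urls:
                break
            for url in urls:
                # Skip listings that appear on more than one index page.
                if url not in seen:
                    seen.add(url)
                    yield url
```

If the site signals the last page differently (for example with a "next" link), the stopping condition would need to change accordingly.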
Not all the listings have all 5 attributes or the required information for the first tab. Wherever possible, please scrape all the required data.
The attribute data can be found by clicking a tab on each of the listings. I have attached some screenshots for further reference that also outline where the data required for scraping can be found.
The successful candidate must use multiple proxy servers to avoid being banned by the site. Alternatively, if you know another way of avoiding getting banned or booted, please suggest it.
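As a rough illustration of the proxy requirement, the sketch below rotates through a list of proxies using only the Python standard library, and waits a little longer after each failed attempt. The function name, the proxy addresses and the retry settings are all assumptions for illustration; a real job would also need to respect whatever rate the site tolerates.

```python
import itertools
import random
import time
import urllib.error
import urllib.request


def fetch_with_proxies(url, proxies, attempts=5, base_delay=1.0,
                       opener_factory=None, sleep=time.sleep):
    """Fetch url, switching to the next proxy and backing off after each failure."""
    if opener_factory is None:
        def opener_factory(proxy):
            # Route both http and https traffic through the chosen proxy.
            handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
            return urllib.request.build_opener(handler)

    pool = itertools.cycle(proxies)
    last_error = None
    for attempt in range(attempts):
        proxy = next(pool)
        try:
            with opener_factory(proxy).open(url, timeout=30) as resp:
                return resp.read()
        except (urllib.error.URLError, OSError) as exc:
            last_error = exc
            # Exponential backoff with jitter so retries do not arrive in lockstep.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
    raise RuntimeError(f"all {attempts} attempts failed for {url}") from last_error
```

A call might look like `fetch_with_proxies(page_url, ["http://10.0.0.1:8080", "http://10.0.0.2:8080"])`, where both addresses are placeholders.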
Files can be delivered as separate CSVs, and data formatting must be as per the attached spreadsheet. Please also provide a raw data file of all the data scraped.
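The format of the raw data file is not specified here. One possible approach, shown purely as an assumption, is to save each scraped listing as one JSON object per line before any formatting is applied, so the CSVs can be regenerated later without scraping again. The function names below are hypothetical.

```python
import json


def dump_raw(records, path):
    """Write one JSON object per line, keeping the scraped values unmodified."""
    with open(path, "w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")


def load_raw(path):
    """Read the raw file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```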
If you have any questions please don't hesitate to ask.
Preference will be given to those who can supply a sample. I'd like to kick off with a Skype call.