*Advanced* Web Scraper

Status: CLOSED
Bids: 17
Avg Bid (CAD): N/A
Project Budget (CAD): $8 - $15 / hr

Project Description:
Advanced Web Scraper of eCommerce content: Data and Pictures

MUST work FAST, FAST, FAST, EFFICIENT and MASTERLY

*** Do not bid if you cannot deliver what we’re asking for!


25,000 Products:
- Products and details are the same between both sites and the domain address is only slightly different
- Properties and specs are NOT the same for every product, nor are they all formatted the same
Site 1a: Retail
Site 1b: Wholesale - Private Account

50,000 Products
- Products and details are the same between both sites, but domain addresses are fundamentally different
- Properties and specs are NOT the same for every product, nor are they all formatted the same, some categories may also be different
Site 2a: Retail - Turnkey site
Site 2b: Wholesale - Private

2,500 Products
- Products and details are often the same between both sites, but there are differences and the domain addresses are fundamentally different
- Properties and specs are NOT the same for every product, nor are they all formatted the same, some categories may also be different
Site 3a: Retail
Site 3b: Wholesale

2,500 Products
Site 4: Retail

1,500 Products
Site 5: Retail



Output file will need to be integrated…

- as a database to populate a website
- to update a website with product and detail changes as well as stock
- for financial analysis - Price comparison, wholesale vs retail; profit and loss, break even
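
To illustrate the financial analysis use, here is a minimal sketch in Python of a wholesale vs retail comparison with a break-even figure. The class name, field names, and the fixed-cost input are assumptions made for the example; they are not part of any existing file format.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class PriceComparison:
    """One product as it appears on a wholesale and a retail site (hypothetical fields)."""
    sku: str
    wholesale: float
    retail: float

    @property
    def margin(self) -> float:
        # Profit per unit before any other costs.
        return self.retail - self.wholesale

    @property
    def margin_pct(self) -> float:
        # Margin as a percentage of the retail price; 0 if the retail price is 0.
        return 100.0 * self.margin / self.retail if self.retail else 0.0


def break_even_units(fixed_costs: float, wholesale: float, retail: float) -> Optional[int]:
    """Units that must be sold to cover fixed costs, or None if each unit loses money."""
    margin = retail - wholesale
    if margin <= 0:
        return None
    return math.ceil(fixed_costs / margin)
```

For example, a product bought at 6.00 and sold at 10.00 has a 40% margin, and 1,000.00 in fixed costs is covered after 250 units.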

MUST HAVE FEATURES:

- Windows based GUI

- Crash Proof

- Pause, stop, continue options

- Visual and audible alerts when scrape is complete

- "Save to..." option to save scrape to a specific folder and defaulting future scrapes to that folder

- Date and Time stamp for each scrape, so previous scrapes are NOT overwritten

- Picture scrape (1000x1000 or bigger)

- Check box to scrape product data and/or picture

- Check box to scrape only certain categories (include main and sub-categories)

- Have "Check all"/"Uncheck all" feature

- Scrape History: Continue from last successful scrape point should scraper fail, but I would expect this to rarely or never happen

- Scrape History: Continue from last successful scrape point should our computer crash

- Scrape delay feature (in seconds)
- Scrape # Products between delays
- Auto-update the website database and price comparison, archiving previous data

- Output Report in Excel: outlining the scrape process and highlighting "Errors"

- Output options: Excel, CSV, XML, etc. (options for use with MySQL, Drupal, Zend/Symfony)

- Option to merge all scrapes into one file to compare product details and pricing
... Including a filter to associate comparable details when the product name is different, even though it is the same product

- Password and Password change option for private sites

- Simple options for user to review and update scraper due to website changes (including, but not limited to, product, data, http fields, links, address, etc.)

- If there are other useful features I have not noted, please specify what they are and how they are of use

- Must provide support and updates for scraper functionality and accuracy
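
As a rough illustration of three of the features above (date/time-stamped output, a delay after every N products, and continuing from the last successful scrape point), here is a minimal Python sketch. Everything named in it is an assumption for the example: the `fetch` callable stands in for whatever downloads and parses one product page, and the checkpoint is simply a text file with one finished URL per line.

```python
import json
import time
from datetime import datetime
from pathlib import Path
from typing import Callable, Iterable, Optional


def timestamped_dir(base: Path, now: Optional[datetime] = None) -> Path:
    """Create a new folder per scrape so earlier scrapes are never overwritten."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H%M%S")
    out = base / f"scrape_{stamp}"
    out.mkdir(parents=True, exist_ok=False)
    return out


def run_scrape(
    urls: Iterable[str],
    fetch: Callable[[str], dict],      # hypothetical: returns one product record
    checkpoint: Path,                  # one completed URL per line
    out_file: Path,                    # one JSON record per line
    batch_size: int = 50,              # "scrape # products between delays"
    delay_seconds: float = 2.0,        # "scrape delay (in seconds)"
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scrape every URL not yet in the checkpoint; return how many were scraped."""
    done = set()
    if checkpoint.exists():
        done = set(checkpoint.read_text(encoding="utf-8").splitlines())

    processed = 0
    with out_file.open("a", encoding="utf-8") as out, \
            checkpoint.open("a", encoding="utf-8") as ckpt:
        for url in urls:
            if url in done:
                continue  # finished in an earlier run
            record = fetch(url)  # an exception here leaves the checkpoint intact
            out.write(json.dumps(record) + "\n")
            out.flush()
            ckpt.write(url + "\n")  # mark as done only after the record is saved
            ckpt.flush()
            done.add(url)
            processed += 1
            if processed % batch_size == 0:
                sleep(delay_seconds)
    return processed
```

Because a URL is written to the checkpoint only after its record is saved, rerunning `run_scrape` with the same checkpoint and output file after a failure skips the finished products and picks up at the first unfinished one.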

*** A high attention to detail is a MUST because product properties/specifications and their respective labels vary.
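
As an illustration of the merge filter described above (associating the same product when its name differs between sites), here is a minimal Python sketch using the standard library's `difflib`. The normalization rules and the 0.85 threshold are assumptions for the example and would need tuning against the real product names.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and sort the words so word order does not matter."""
    name = re.sub(r"[^a-z0-9]+", " ", name.lower())
    return " ".join(sorted(name.split()))


def similarity(a: str, b: str) -> float:
    """Similarity score between 0.0 and 1.0 for two product names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_products(
    retail_names: List[str],
    wholesale_names: List[str],
    threshold: float = 0.85,  # assumed cut-off, not a recommended value
) -> Dict[str, str]:
    """Map each retail name to its closest wholesale name, if it is close enough."""
    matches = {}
    for retail in retail_names:
        best = max(wholesale_names, key=lambda w: similarity(retail, w), default=None)
        if best is not None and similarity(retail, best) >= threshold:
            matches[retail] = best
    return matches
```

Names that fall below the threshold are left unmatched, so they can be listed in the output report for manual review instead of being paired incorrectly.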

If you have read and understood the details outlined, at the beginning of your message, tell us:

What software are you an expert in to create our scraper (Python, PHP, C, other)?


* All our projects are tracked using Freelancer Time tracker, but our "billing and progress" is tracked via Snagit Desktop Video Recorder. We have ongoing projects with website developers and graphic designers/illustrators. This process works for us and our contractors. It's simple to use and as long as you have a strong work ethic and integrity, it is a simple process to record the work you do for us and upload the videos on a mutually shared Dropbox or Google Drive for us to receive. Bidding on this project means you are accepting the terms outlined in this post and the attached employment agreement.

Additional Project Description:
10/26/2013 at 12:57 HKT
Once hired, the attached document and an NDA (which will be sent to you via private messaging) will need to be filled out, each section initialed, signed and dated.

** Payment options: Submitted work that meets with our approval is paid via Freelancer or PayPal.
We have a 5.0 rating for a reason: we are a serious and trustworthy employer. The Contractors we hire provide a high quality service, clarify and understand what we want and do everything they can to meet and exceed our expectations. This is how we do business: we keep our business model and relationships simple. Give us what we ask for and what we want, and we’ll pay you for your time. It’s that simple.
We honor quality products with guaranteed payment. We do not provide milestones to Freelancers we have NOT worked with before.

*** We are a small company with a big heart, but we have run into a few shady contractors to whom this point applies. We will not tolerate dishonorable behavior. If you make a small mistake, we'll let it slide, once. If you make a big mistake, you'll need to fix it at your own expense. Freelancers who are suspected of stretching out project time by working inefficiently, creating code/design errors or failing to provide a complete, tested and satisfactory product will be terminated from our project immediately, billed time will be forfeited and you and/or your agency will be blacklisted from our hiring pool.
Furthermore, we reserve the right to reject any article, design or project creation which is not, in our opinion, of publishable or usable standard, or does not meet with our requirements. We are fair and reasonable, but any work which cannot be completed to our satisfaction will not be accepted and payment will consequently be forfeited.

**** Basic fine print:
Upon selecting your custom creation/work, the Freelancer grants all rights of use of the created pieces to us, the customer. The granted rights of use comprise a worldwide, exclusive, sublicensable, perpetual, irrevocable, royalty-free license to use, copy, modify, display, and publish the selected work.

Hours of work: 40
Project Duration: < 1 week
Skills required: C# Programming, Data Mining, Python, Software Architecture, Web Scraping
Additional Files: FuzzyPeaches - Employment Agreement (Freelancer).pdf
Project posted by:
FuzzyPeaches Canada
Verified

