Website Crawler

This project was successfully completed by Mikhail19 for $400 USD in 14 days.

Project Budget
$250 - $750 USD
Project Description

We want to build crawling software that can extract product details from 10 sites in the USA and store them in an SQL database.

The software should have start, pause, and stop functionality. After a full run, it should monitor for any changes in pricing and stock. If a new URL or product is added to a site, it should be added to the database.
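One way the start, pause, and stop controls could be wired is with two thread events, sketched below in Python. This is only an illustration under assumptions: the class name `CrawlController`, its methods, and the `fetch` callback are hypothetical and not part of the brief, and the actual implementation language is whatever the project settles on.

```python
import threading


class CrawlController:
    """Hypothetical controller: runs a crawl loop that can be paused or stopped."""

    def __init__(self):
        self._running = threading.Event()  # set = not paused
        self._running.set()
        self._stopped = threading.Event()
        self._thread = None
        self.processed = []                # results of fetch(), in order

    def start(self, urls, fetch):
        """Start crawling `urls` in a background thread using `fetch(url)`."""
        self._thread = threading.Thread(
            target=self._loop, args=(list(urls), fetch), daemon=True
        )
        self._thread.start()

    def _loop(self, urls, fetch):
        for url in urls:
            self._running.wait()           # blocks here while paused
            if self._stopped.is_set():
                break
            self.processed.append(fetch(url))

    def pause(self):
        self._running.clear()

    def resume(self):
        self._running.set()

    def stop(self):
        self._stopped.set()
        self._running.set()                # wake a paused worker so it can exit

    def join(self, timeout=None):
        if self._thread is not None:
            self._thread.join(timeout)
```

The worker checks the events between pages, so a pause or stop takes effect after the page currently being fetched, not in the middle of it.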

We will start with one site and later extend the architecture to the other 9 sites. I need a quote for one site only. The software can be built in [url removed, login to view] with a MySQL database.

Please contact me only if you can prove you have done similar work before.

Some important features

Block Category for Extracting

For each site, I will create a list of categories or subcategories from which the software should not extract any data.
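A minimal sketch of how such a block list might be checked, assuming each category is represented as a path from the top-level category down to the subcategory. The function names and the sample categories are illustrative assumptions, not taken from any of the target sites.

```python
def normalize(path):
    """Lower-case and trim each level so comparisons ignore case and spacing."""
    return tuple(part.strip().lower() for part in path)


def is_blocked(category_path, blocked_paths):
    """Return True if the path is a blocked category or sits beneath one."""
    path = normalize(category_path)
    return any(path[:len(blocked)] == blocked for blocked in blocked_paths)


# Hypothetical block list for one site.
BLOCKED = {
    normalize(["Electronics", "Refurbished"]),
    normalize(["Clearance"]),
}
```

With this prefix match, blocking a category also blocks everything below it, while sibling categories such as `["Electronics", "Phones"]` are still crawled.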

Fields will vary by category. You need to propose the fields you will be extracting.

The software should also store the category structure and the URL of the product page.

Images should be extracted from the site and stored in a local folder. The corresponding path and file name should be stored in the database. If a product has multiple images, all of them need to be stored. No more than 1000 images should be stored in a single folder, and no folder should contain more than 1000 subfolders. If the main folder reaches 1000 folders, start creating folders inside another subfolder.

Folder names should start with ProductImage + time, for example ProductImages181230.
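The folder limits above can be met by deriving the path from a running image counter. The sketch below is one possible layout, not a required one: the two-level structure, the `_000` style suffixes, and the function name `image_path` are assumptions, and the time stamp is passed in as an opaque string because the brief does not define its format beyond the example.

```python
from pathlib import PurePosixPath

MAX_PER_FOLDER = 1000  # limit for both images and subfolders, per the brief


def image_path(root, stamp, index, filename):
    """Map the index-th stored image (0-based) to a folder under `root`.

    Layout (assumed): root / ProductImages<stamp>_<parent> / <leaf> / filename
    Each leaf holds at most 1000 images and each parent at most 1000 leaves.
    """
    leaf = index // MAX_PER_FOLDER           # global leaf-folder number
    parent = leaf // MAX_PER_FOLDER          # which parent folder
    if parent >= MAX_PER_FOLDER:
        raise ValueError("root would exceed 1000 parent folders")
    leaf_in_parent = leaf % MAX_PER_FOLDER   # leaf position inside the parent
    return (
        PurePosixPath(root)
        / f"ProductImages{stamp}_{parent:03d}"
        / f"{leaf_in_parent:03d}"
        / filename
    )
```

For example, image 0 goes to `ProductImages181230_000/000/`, image 1000 rolls over to `ProductImages181230_000/001/`, and image 1,000,000 starts a new parent folder `ProductImages181230_001/000/`. The returned path and file name are what would be written to the database.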

Each product should have an auto-generated product code.
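The brief does not specify a format for the code. As one hedged possibility, it could be derived from the database's auto-increment row ID plus a short site prefix; both the format and the name `product_code` below are assumptions.

```python
def product_code(site_prefix: str, row_id: int) -> str:
    """Build a code such as 'S1-00000042' from a site prefix and a row ID.

    The prefix and the 8-digit zero padding are illustrative choices only.
    """
    if row_id < 0:
        raise ValueError("row_id must be non-negative")
    return f"{site_prefix.upper()}-{row_id:08d}"
```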

Export of Product

1. The software should export newly added data to a folder every hour.
2. The software should export modified data (for example, price and stock) to a folder every hour.
3. The same files need to be uploaded to an FTP location.
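The three export steps above could be sketched as follows. This is a simplified illustration under assumptions: product rows are plain dictionaries keyed by product URL with only `url`, `price`, and `stock` fields, CSV is assumed as the export format, and the function names are hypothetical. The hourly scheduling itself is not shown, and the FTP helper uses Python's standard `ftplib` with placeholder connection details.

```python
import csv
from ftplib import FTP
from pathlib import Path

WATCHED_FIELDS = ("price", "stock")  # fields whose changes count as "modified"


def diff_snapshots(previous, current):
    """Compare two {url: row} snapshots and return (added, modified) rows."""
    added = [row for url, row in current.items() if url not in previous]
    modified = [
        row
        for url, row in current.items()
        if url in previous
        and any(row[f] != previous[url][f] for f in WATCHED_FIELDS)
    ]
    return added, modified


def export_csv(rows, path):
    """Write rows to a CSV file with a fixed (assumed) column set."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["url", "price", "stock"])
        writer.writeheader()
        writer.writerows(rows)


def upload(path, host, user, password):
    """Upload one exported file to an FTP server (connection details assumed)."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        with open(path, "rb") as fh:
            ftp.storbinary(f"STOR {Path(path).name}", fh)
```

A scheduler would call `diff_snapshots` once per hour against the previous run, write the two lists to separate files, and pass each file to `upload`.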

We also need a separate EXE to manage all settings, for example schedule settings, blocked categories, and the image folder directory.

The software should be able to connect through a proxy server to fetch the data.
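As a rough sketch of the proxy requirement, Python's standard `urllib.request` can route requests through a proxy with `ProxyHandler`. The helper name `build_proxy_opener` and the proxy address in the comment are placeholders, not values from the brief.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests via `proxy_url`."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


# Example use (placeholder proxy address, performs a real network request):
# opener = build_proxy_opener("http://proxy.example.com:8080")
# html = opener.open("https://example.com/", timeout=30).read()
```

Keeping the proxy URL as a parameter lets the separate settings program supply it without changes to the crawler code.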

You can start by building a prototype with screens. Once I approve the prototype, you can start the database design. Once the database design is approved, you can start the coding work.
