We want to build crawling software that extracts product details from 10 sites in the USA and stores them in an SQL database.
The software should have start, pause, and stop functionality. After the first full run, it should monitor the sites for any changes in pricing and stock. If a new URL or product appears, it should be added to the database.
We will start with one site and later extend the architecture to the other 9 sites. I need a quote for one site only. The software can be written in VB.NET with a MySQL database.
Please contact me only if you can prove you have done similar work before.
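To make the start, pause, and stop requirement concrete, here is a minimal sketch of how a crawl loop could be controlled. It is written in Python only for brevity; the real software can still be VB.NET. The names `CrawlController` and `fetch` are hypothetical and are not part of any existing code.

```python
import threading


class CrawlController:
    """Runs a crawl loop in a background thread that can be paused and stopped."""

    def __init__(self, urls, fetch):
        self._urls = list(urls)
        self._fetch = fetch  # hypothetical callable: fetch(url) processes one page
        self._cond = threading.Condition()
        self._paused = False
        self._stopped = False
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def pause(self):
        with self._cond:
            self._paused = True

    def resume(self):
        with self._cond:
            self._paused = False
            self._cond.notify_all()

    def stop(self):
        with self._cond:
            self._stopped = True
            self._cond.notify_all()
        self.join()

    def join(self):
        """Wait until the crawl loop has finished or has been stopped."""
        if self._thread is not None:
            self._thread.join()

    def _loop(self):
        for url in self._urls:
            with self._cond:
                # Sleep while paused; a stop request always wakes the worker.
                self._cond.wait_for(lambda: self._stopped or not self._paused)
                if self._stopped:
                    return
            self._fetch(url)
```

In this sketch a pause takes effect between two pages, so a page that is already being fetched is always finished first.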
Some important features
Block Category for Extracting
For each site, I will provide a list of categories or subcategories from which the software should not extract any data.
Fields will vary by category. You need to propose the fields you will be extracting.
The software should also store the category structure and the URL of each product page.
Images should be extracted from the site and stored in a local folder. The corresponding path and file name should be stored in the database. If a product has multiple images, all of them need to be stored. No more than 1000 images should be stored in a single folder, and no folder should contain more than 1000 subfolders. If the main folder reaches 1000 folders, you should start creating folders inside another subfolder.
Folder names should start with ProductImages followed by the time, for example ProductImages181230.
Each product should have an auto-generated product code.
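One possible way to respect the 1000-image and 1000-subfolder limits is sketched below, in Python for brevity. It assumes a running, zero-based image counter (for example an auto-increment ID from the database) and derives the folder from that counter. The `Group` level, the `_000` suffix, and the function name `image_path` are my own assumptions for illustration. The time stamp is passed in by the caller and is assumed to stay the same for every image that goes into one folder.

```python
from pathlib import PurePosixPath

MAX_PER_FOLDER = 1000  # limit from the requirement, used for images and subfolders


def image_path(root, image_index, stamp, filename):
    """Map a zero-based running image counter to a storage path.

    Assumed layout: root / Group<g> / ProductImages<stamp>_<f> / filename
    Each leaf folder holds at most 1000 images and each Group folder
    holds at most 1000 leaf folders, as long as one stamp is used per leaf.
    """
    folder_index = image_index // MAX_PER_FOLDER   # which leaf folder overall
    group_index = folder_index // MAX_PER_FOLDER   # which parent of leaf folders
    leaf = f"ProductImages{stamp}_{folder_index % MAX_PER_FOLDER:03d}"
    return PurePosixPath(root) / f"Group{group_index:03d}" / leaf / filename


# Image number 1000 is the first image of the second leaf folder.
print(image_path("images", 1000, "181230", "p1_1.jpg"))
```

The returned path is what would be stored in the database next to the product. A design that counts files on disk instead of using a counter would also work, but it is slower once the folders fill up.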
Export of Product
1. The software should export newly added data into a folder every hour.
2. The software should export modified data, for example price and stock, into a folder every hour.
3. The same files need to be uploaded to an FTP location.
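The split between newly added and modified data in the export could be computed roughly as follows. This is a sketch in Python under my own assumptions: each snapshot is a dictionary keyed by product URL, and the field names `price` and `stock` are placeholders for whatever the final database design uses.

```python
def diff_products(previous, current):
    """Compare two snapshots keyed by product URL.

    Returns (new, modified): products that were not seen before, and
    products whose price or stock differs from the previous snapshot.
    """
    new, modified = [], []
    for url, product in current.items():
        old = previous.get(url)
        if old is None:
            new.append(product)
        elif (old["price"], old["stock"]) != (product["price"], product["stock"]):
            modified.append(product)
    return new, modified


# Hypothetical data: one price change and one new product.
previous = {"/p/1": {"url": "/p/1", "price": 10.0, "stock": 5}}
current = {
    "/p/1": {"url": "/p/1", "price": 12.0, "stock": 5},
    "/p/2": {"url": "/p/2", "price": 7.5, "stock": 0},
}
new, modified = diff_products(previous, current)
```

The two lists would then be written to separate files in the export folder and uploaded to the FTP location.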
We also need a separate EXE for all settings, for example schedule settings, blocked categories, and the image folder directory.
The software should be able to connect through a proxy server to fetch the data.
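As an illustration of the proxy requirement, the sketch below routes requests through a single proxy using Python's standard library. The proxy address and the function name `make_opener` are hypothetical; in the real software the address would come from the settings program.

```python
import urllib.request


def make_opener(proxy_url):
    """Build a URL opener that sends HTTP and HTTPS requests through one proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical proxy address, no request is sent by creating the opener.
opener = make_opener("http://proxy.example:8080")
# A page would then be fetched like this:
# html = opener.open("https://example.com/product/1", timeout=30).read()
```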
You can start by building a prototype with screens. Once I approve the prototype, you can start the database design. Once the database design is approved, you can start the coding work.