I need someone to develop a price comparison website in Django or web2py, using Scrapy (or something better) and Selenium. The code must be documented in English. Scrapy (or the alternative) should crawl a variable number of separate sites, using a number of "spiders", and pull out product details such as Product ID, Title, Price, Vendor, Description, Image, URL and Stock Position. This information should then be stored in a PostgreSQL database and displayed using web2py/Django. There should also be a way to change the product URLs to affiliate links.
This is an easy project for someone who has done this before. If you have examples of previous work, this will go in your favour, so please reference them. Additionally, if you have advice on a better architecture or solution, I am open to ideas.
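As a rough illustration only (not a required design), the product record and the affiliate-link rewrite could look something like the Python sketch below. The class name, the field names, the `to_affiliate_url` helper and the `aff` query parameter are placeholders made up for this example; each real affiliate programme defines its own link format.

```python
from dataclasses import dataclass
from decimal import Decimal
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


@dataclass
class Product:
    """One scraped product (hypothetical field names based on the list above)."""
    product_id: str
    title: str
    price: Decimal
    vendor: str
    description: str
    image_url: str
    url: str
    stock_position: str


def to_affiliate_url(url: str, param: str, affiliate_id: str) -> str:
    """Return `url` with the query parameter `param` set to `affiliate_id`.

    Any existing value of `param` is replaced; other query parameters are kept.
    """
    parts = urlparse(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, affiliate_id))
    return urlunparse(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    item = Product(
        product_id="42",
        title="Example kettle",
        price=Decimal("19.99"),
        vendor="Example Shop",
        description="A placeholder product.",
        image_url="https://shop.example/img/42.jpg",
        url="https://shop.example/item?id=42",
        stock_position="In stock",
    )
    print(to_affiliate_url(item.url, "aff", "abc123"))
```

The original URL stays in the record and the affiliate form is computed when the link is displayed, so the affiliate ID can be changed later without scraping again.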
a) The Products Table in the server database to be automatically populated by the scraper. The required fields are Product ID, Title, Price, Vendor, Stock Position, Payment Options and Delivery Time.
b) Easy extensibility (with some Python coding) to add more sites in future.
c) To meet the above, the scraper to be implemented as two modules: the "Scraper Module" and the "Parameter Module".
f) The scraped URLs (those reached from the primary URL) to be saved in a Database Table with a "processed" flag, so that these can be skipped if scraping needs to be resumed after an interruption.
g) Primary URLs also to be saved with the date of last successful scraping, to enable scheduling of periodic repeat scrapings.
h) While executing scraping, only those fields that have changed since the last scrape are to be extracted, and the original table entry for the product is to be "updated" as required. In the case of new products, the details are to be "inserted" as a new row in the Products Table.
j) Performance must be adequate to enable scraping of the sites in order to generate the Products database.
k) There should also be a way for the URLs of the compared products within the website to be changed to affiliate links.
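To show roughly what I mean by points f), g) and h), here is a minimal Python sketch. It uses the standard-library `sqlite3` module only as a self-contained stand-in for PostgreSQL. The table names, column names and helper functions are all assumptions for illustration, not a required schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one table per requirement f), g) and a).
SCHEMA = """
CREATE TABLE primary_urls (
    url TEXT PRIMARY KEY,
    last_scraped_at TEXT              -- NULL until the first successful scrape
);
CREATE TABLE scraped_urls (
    url TEXT PRIMARY KEY,
    primary_url TEXT NOT NULL REFERENCES primary_urls(url),
    processed INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE products (
    product_id TEXT PRIMARY KEY,
    title TEXT, price TEXT, vendor TEXT, stock_position TEXT,
    payment_options TEXT, delivery_time TEXT
);
"""

# Fixed list of column names; never built from scraped input.
PRODUCT_FIELDS = ("title", "price", "vendor", "stock_position",
                  "payment_options", "delivery_time")


def pending_urls(conn, primary_url):
    """f) URLs under a primary URL that still need scraping after a restart."""
    rows = conn.execute(
        "SELECT url FROM scraped_urls "
        "WHERE primary_url = ? AND processed = 0 ORDER BY url",
        (primary_url,))
    return [row[0] for row in rows]


def mark_processed(conn, url):
    """f) Set the processed flag once a URL has been scraped."""
    conn.execute("UPDATE scraped_urls SET processed = 1 WHERE url = ?", (url,))


def mark_primary_scraped(conn, primary_url):
    """g) Record the time of the last successful scrape (UTC, ISO 8601)."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute("UPDATE primary_urls SET last_scraped_at = ? WHERE url = ?",
                 (now, primary_url))


def save_product(conn, product_id, scraped):
    """h) Insert a new product, or update only the fields that changed.

    Returns the list of field names that were written.
    """
    columns = ", ".join(PRODUCT_FIELDS)
    row = conn.execute(
        f"SELECT {columns} FROM products WHERE product_id = ?",
        (product_id,)).fetchone()
    if row is None:
        marks = ", ".join("?" for _ in PRODUCT_FIELDS)
        values = [scraped.get(field) for field in PRODUCT_FIELDS]
        conn.execute(
            f"INSERT INTO products (product_id, {columns}) VALUES (?, {marks})",
            (product_id, *values))
        return list(PRODUCT_FIELDS)
    current = dict(zip(PRODUCT_FIELDS, row))
    changed = {field: scraped[field] for field in PRODUCT_FIELDS
               if field in scraped and scraped[field] != current[field]}
    if changed:
        assignments = ", ".join(f"{field} = ?" for field in changed)
        conn.execute(
            f"UPDATE products SET {assignments} WHERE product_id = ?",
            (*changed.values(), product_id))
    return list(changed)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO primary_urls (url) VALUES ('https://shop.example/')")
    conn.execute("INSERT INTO scraped_urls (url, primary_url) "
                 "VALUES ('https://shop.example/p/1', 'https://shop.example/')")
    print(save_product(conn, "1", {"title": "Kettle", "price": "19.99"}))
    print(save_product(conn, "1", {"title": "Kettle", "price": "17.99"}))
    mark_processed(conn, "https://shop.example/p/1")
    mark_primary_scraped(conn, "https://shop.example/")
    print(pending_urls(conn, "https://shop.example/"))
```

In this sketch the page is still fetched on every run, and the comparison against the stored row decides which columns get written. Whether unchanged pages can be skipped entirely (for example via HTTP caching headers) would depend on the target sites.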
Budget: USD 200 to USD 300