Create Python App to find/save/update products from different e-commerce websites.

$50-200 USD

In Progress
Posted over 1 year ago

Paid on delivery
All the websites to be scraped are located in the US: Amazon, Best Buy, Newegg, and Costco.

Scenario: I will store the category URL for each website in the db. The app should collect every product from those categories and save all required information to the db. This applies to all 4 websites, and the find/save task should run automatically every day. In addition, the app should update the price, title, stock count, and review count of every product stored in the db at the interval we specify in the db settings collection. To update a product, the app should use the stored product link to open the correct product details page. You should use Scrapy to extract data from the websites.

The app should be built so that we can add another site for crawling just by adding a file to the "extractor" folder. Each of the 4 websites will have its own .py file in the "extractor" folder to crawl and find products, and the app should check that folder and run every website file in it. We will add more websites in the future and should not have to change the app's source code to do so: adding the new website's file to the extractor folder should be enough for the app to include that website when crawling and finding products.

For now we will have [login to view URL], [login to view URL], [login to view URL], and [login to view URL]. If we decide to add eBay and Target, the only thing we should need to do is add [login to view URL] and [login to view URL] with the required functions in those files. Function names should be the same in every website's .py file, so the app can handle all current and future websites easily. In other words, we can add a new website by duplicating one of the current crawler .py files, changing some of the HTML tags in it so that it finds the correct data on the page, and uploading the new file to the extractor folder. The app should then include it when crawling and finding products.

The database structure will be prepared by me and provided to the developer to follow.
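The "extractor" folder requirement could be approached with a small loader like the one below. This is only a sketch under stated assumptions: the function names `find_products` and `update_product` are hypothetical (the posting says only that the names must be identical in every website file), and the folder path is whatever the app is configured with. It shows the discovery step only, not the Scrapy crawling itself.

```python
import importlib.util
from pathlib import Path

# Hypothetical names: the posting requires identical function names in every
# website file but does not say what they are.
REQUIRED_FUNCTIONS = ("find_products", "update_product")


def load_extractors(folder):
    """Import every .py file in `folder` and return {site_name: module}."""
    extractors = {}
    for path in sorted(Path(folder).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helper files
        spec = importlib.util.spec_from_file_location(
            f"extractor_{path.stem}", str(path)
        )
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Reject a file that does not expose the agreed function names.
        missing = [
            name for name in REQUIRED_FUNCTIONS
            if not callable(getattr(module, name, None))
        ]
        if missing:
            raise AttributeError(f"{path.name} is missing: {', '.join(missing)}")
        extractors[path.stem] = module
    return extractors
```

With a loader of this kind, adding a site means dropping one more file into the folder; the main app would call `load_extractors("extractor")` at the start of each run and loop over the returned modules.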
We will use a US regional proxy to reduce the chance of being blocked or blacklisted by the crawled websites. We will use multiprocessing or threading for parallel tasks, so that products from all websites are found and saved at the same time.

First of all, read the project details. What is your plan to keep the e-commerce websites from blocking the app when we need to update all product prices many times a day? We need to find new products continuously and update the details of current products in the db at least 4 times a day.

We have defined 5 steps for this project. Milestones will be created one by one during these steps. No milestone will be released until every task is finished.

1st task - The app runs correctly, finds products on all required websites, and saves the required data to the db. In this step the app should find products based on what we need and make sure they are saved in MongoDB.
2nd task - Implement the update task. In this step the app should be able to update stored products with the new price, new title, new sale type, new shipping fee, new product status, etc. The update task should run as threads or multiple processes. We should find the best model for running in parallel.
3rd task - Implement the logic for running tasks without being blocked or blacklisted.
4th task - Improve the accuracy and speed of the find task and the update task.
5th task - Write the documentation and deploy the apps to servers. In this step the app should be deployed on different Ubuntu/CentOS servers, run continuously without any issue, and work exactly as we want.
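The parallel find/save requirement could be sketched as below, using a thread pool from the standard library. The `tasks` mapping and its zero-argument callables are hypothetical stand-ins for each site's find or update routine. Note that Scrapy runs on a Twisted reactor that is normally started once per process, so a real version may need one process per site instead of threads; which model is best is the open question the 2nd task raises.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_for_all_sites(tasks, max_workers=4):
    """Run one callable per site in parallel.

    `tasks` maps a site name to a zero-argument callable (hypothetical).
    Returns (results, errors) so that one failing site does not stop the rest.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(task): site for site, task in tasks.items()}
        for future in as_completed(futures):
            site = futures[future]
            try:
                results[site] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[site] = exc
    return results, errors
```

Collecting errors per site, instead of letting the first exception propagate, means a block on one website does not prevent the other sites from being saved or updated in the same run.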
Project ID: 35115440

About the project

20 proposals
Remote project
Active 1 yr ago

About the client

San Jose, United States
5.0
11
Payment method verified
Member since Aug 19, 2018

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)