Closed

Scrape a website & insert into database & perform some tasks with the information

I need someone to write some software that will archive every listing posted on a particular website and use that information as described in the features section of this post.

Basic logic of program:

1. Send a request to a website that returns listings in XML format

2. Check each listing against a MySQL database

3. Send a web request to each new listing individually to get all the information

4. Features 1, 2, 3 (explained in detail below)

5. Upload images from the listings to Amazon S3

6. Add the information for each listing to a MySQL database

7. Sleep before looping back to step 1 (Read feature 4)

Limitations:

The website returns at most 20 listings per request (Step 1). If every listing on a page is new, keep requesting the next page of listings until a previously archived listing is found, so no listings are missed. (During peak times it is possible for more than 20 listings to be posted within the minimum sleep period of 2 minutes.)
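The paging rule above could be sketched roughly as follows. This is only an illustration under assumptions: `fetch_page` is a hypothetical callable that returns the listing IDs on a given page (newest first), `known_ids` stands in for the lookup against the MySQL archive, and `max_pages` is an invented safety cap that is not part of the brief.

```python
from typing import Callable, List, Set


def collect_new_listings(
    fetch_page: Callable[[int], List[str]],  # hypothetical: page number -> listing IDs
    known_ids: Set[str],                     # stand-in for the MySQL archive lookup
    max_pages: int = 50,                     # assumed safety cap, not in the brief
) -> List[str]:
    """Request pages until a previously archived listing shows up."""
    new_ids: List[str] = []
    seen: Set[str] = set()
    for page in range(1, max_pages + 1):
        ids = fetch_page(page)
        if not ids:
            break  # ran out of listings
        found_known = False
        for listing_id in ids:
            if listing_id in known_ids:
                found_known = True
            elif listing_id not in seen:
                # Listings can shift onto the next page while we are paging,
                # so skip anything already collected in this run.
                seen.add(listing_id)
                new_ids.append(listing_id)
        if found_known:
            break  # caught up with the archive, no need for another page
    return new_ids
```

The number of pages this loop had to request is also the natural input for the automatic sleep adjustment described in feature 4.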

Features:

1. Create a table that tracks listings that are from the same user (identified by two values found in the listing). Keep a tally of how many listings that user has posted and a tally of how many of those listings are unique. (I suggest this be done on a separate thread so as not to slow down the scraping.)

2. If enabled, check each new listing's price against comparable listings on another website (web request to an API), and calculate the average value for comparable listings using the archive of listings in my database. Use these figures to decide whether the listing is undervalued by a configurable amount/percent and, if so, send an alert (Amazon SNS and database entry). (This must be done on a separate thread so as not to slow down the scraping.)

3. Check each listing against search criteria, which can be configured by adding rows of criteria to a MySQL database, and send an alert (Amazon SNS and database entry) if a new listing satisfies those criteria. (These will be simple criteria, such as whether the listing's price is >100, or whether the listing is a specific model, etc.) (This must be done on a separate thread so as not to slow down the scraping.)

4. Adjust the sleep time automatically so as to minimize the number of pages requested before finding previous listings (explained in Limitations). The sleep time before looping has a minimum of 2 minutes, a maximum of 15 minutes from 7AM - 11PM, and a maximum of 2 hours from 11PM-7AM.

5. Once daily, check each active listing in the database against the website to see if the listing has been updated or deleted. If it has been updated, save the changes to the database as a new row. If it has been deleted, change the status in the database so the listing will not be checked again. (I suggest this be a separate script run by a cron job.)
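As a rough illustration of features 2 and 4, the undervalue check and the sleep bounds might look something like the sketch below. The sleep limits come from the brief; everything else is an assumption made for the example: the halve/grow-by-25% adjustment policy, treating 7:00 as daytime and 23:00 as night, and the function and parameter names.

```python
from datetime import datetime
from statistics import mean
from typing import Sequence

MIN_SLEEP = 2 * 60           # 2 minutes (from the brief)
DAY_MAX = 15 * 60            # 15 minutes, 7AM - 11PM (from the brief)
NIGHT_MAX = 2 * 60 * 60      # 2 hours, 11PM - 7AM (from the brief)


def max_sleep_seconds(now: datetime) -> int:
    """Upper bound for the sleep time at the given local time."""
    # Assumption: 07:00 counts as day, 23:00 counts as night.
    return DAY_MAX if 7 <= now.hour < 23 else NIGHT_MAX


def next_sleep_seconds(current: float, pages_fetched: int, now: datetime) -> int:
    """Assumed policy: shorten the sleep when more than one page was needed,
    lengthen it slowly when a single page was enough, then clamp."""
    proposed = current / 2 if pages_fetched > 1 else current * 1.25
    return int(min(max(proposed, MIN_SLEEP), max_sleep_seconds(now)))


def is_undervalued(price: float, comparable_prices: Sequence[float],
                   threshold_pct: float) -> bool:
    """True when price is at least threshold_pct percent below the average."""
    if not comparable_prices:
        return False  # nothing to compare against, so no alert
    return price <= mean(comparable_prices) * (1 - threshold_pct / 100)
```

For example, with a 15% threshold and comparables averaging 100, a listing priced at 80 would trigger an alert while one priced at 90 would not. The actual alerting (Amazon SNS and database entry) is left out of this sketch.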

Requirements:

1. Must run on a Linux server

2. Error Handling (Website down, website responds with unexpected data, etc)

3. Log activity/errors in a text file. Send an alert if errors occur (Amazon SNS and entry into database)
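One possible shape for requirements 2 and 3 is a small retry wrapper that logs every failure and calls an alert hook once it gives up. This is a sketch under assumptions: the names `configure_logging` and `with_retries`, the retry count, and the delay are invented for illustration, and `on_failure` is a hypothetical placeholder for the Amazon SNS and database alert.

```python
import logging
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
log = logging.getLogger("scraper")


def configure_logging(path: str) -> None:
    """Write activity and errors to a plain text file."""
    logging.basicConfig(
        filename=path,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,            # assumed value, not from the brief
    delay_seconds: float = 5.0,   # assumed value, not from the brief
    on_failure: Optional[Callable[[str], None]] = None,  # hypothetical alert hook
) -> Optional[T]:
    """Run action, logging each failure; alert and return None if all attempts fail."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:  # website down, unexpected data, etc.
            log.error("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                time.sleep(delay_seconds)
    if on_failure is not None:
        on_failure(f"giving up after {attempts} attempts")
    return None
```

A caller would wrap each web request or parsing step in `with_retries`, so one bad response is logged and reported without stopping the main loop.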

The program can be coded in any language that can run on a Linux VPS and take advantage of the multiple IP addresses the server has. PHP would be preferred.
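To illustrate what using the server's multiple IP addresses could mean in practice, here is a minimal sketch in Python (chosen only for the example; the same idea applies in PHP). It rotates through a list of local addresses and binds each outgoing connection to one of them using the standard library's `source_address` parameter. The addresses and helper names are placeholders, not details from the brief.

```python
import http.client
import itertools
from typing import Iterator, Sequence


def ip_rotation(local_ips: Sequence[str]) -> Iterator[str]:
    """Endlessly cycle through the server's local IP addresses."""
    return itertools.cycle(local_ips)


def open_connection(host: str, source_ip: str,
                    timeout: float = 30.0) -> http.client.HTTPSConnection:
    """Prepare an HTTPS connection bound to a specific local address.

    Port 0 lets the operating system pick the outgoing port. Nothing is
    sent over the network until a request is made on the connection.
    """
    return http.client.HTTPSConnection(
        host, timeout=timeout, source_address=(source_ip, 0)
    )


# Placeholder addresses from the documentation range, not real server IPs.
rotation = ip_rotation(["192.0.2.10", "192.0.2.11"])
connection = open_connection("example.com", next(rotation))
```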

Skills: Data Entry, Linux, MySQL, PHP, Web Scraping


About the Employer:
( 3 reviews ) Regina, Canada

Project ID: #10165405

28 freelancers are bidding on average $609 for this job

mituld

Hi I work towards providing reliable, relevant and robust IT solutions at most competitive prices to my customers. I ensure 100% customer satisfaction so lets start Thanks

$670 CAD in 18 days
(301 Reviews)
7.9
jubair7

Experienced TEAM HERE to work for your project. Let's discuss more and finalize the project and cost. We have our own workplace with 7+ working PCs and laptops, with ~15mbps internet connections. We also have a back

$800 CAD in 15 days
(263 Reviews)
7.4
scriptphp87

Hello, I'm a professional programmer for web programming with php language to build the system website, Besides, I'm also expert in MySQL , HTML,HTML5,CSS, JS I'm always top in Vietnam freelancer

$789 CAD in 20 days
(206 Reviews)
7.7
ehsankayani

HI. I am an expert in developing automated tools and scraping scripts and you will tons of similar bots with requirement as you need here (ip switching , useragent switching, storing images to Amazon s3 buckets etc

$526 CAD in 10 days
(125 Reviews)
7.3
hwanghendra

Hello, what is the website?

$722 CAD in 10 days
(411 Reviews)
7.0
Toperfection

Hello there, We are interested in offering our Data Entry and Web Scrapping services for this job requirement. We have completed various projects related to web scrapping making use of both automated and manual

$250 CAD in 5 days
(115 Reviews)
7.4
mantislin

Hi sir, I am scraping expert, I have did too many similar projects, please check my feedback then you will know. Can you tell me more details? then I will provide demo data for you. Thanks, Kimi

$699 CAD in 6 days
(255 Reviews)
7.2
shengui

Hi.. I am very interested in your project, because I am an expert in C/C++, C#, php, asp.net, web scraping, web automation, selenium and others. Please contact me, then we can discuss about the details and the price.

$515 CAD in 3 days
(79 Reviews)
6.8
$947 CAD in 10 days
(215 Reviews)
7.0
nuprogramer

Hope you are doing great. I am interested to provide you my services. I have more than 5 years experience in providing professional website development services and worked with almost every type of project. So this is

$1111 CAD in 14 days
(67 Reviews)
6.7
vsolcorp

Hello, Sir I am from vSol CORP (TEAM: 19 employees). I have checked your project requirement. If you like to have a look on some sample data then please let us know. Kindly interview us to review our competencies & hir

$250 CAD in 4 days
(198 Reviews)
7.2
phpXpertbd

Dear Sir, I'm very much delighted to let you know that i did data scraping with PHP-cURL, PhantomJS, Node.js, Selenium from many sites. I just scraped the data from web site and then wrote the data in mysql database

$400 CAD in 10 days
(56 Reviews)
6.6
rkmomin

Hi, Ok, we will do it. Please contact for more discussion. Please see our profile for our recent reviews and feedback. Websites URL: http://ww

$500 CAD in 12 days
(104 Reviews)
6.8
prashushinde9

================== Amazon MWS API Experts ================== NOTE: Most of the requirement of your project scope is already completed by us and we have demo for you as well. We are Amazon MWS API experts and comple

$773 CAD in 20 days
(47 Reviews)
6.8
gangabass

I'm one of the best web scraping experts here that's why I'm sure you'll be impressed with my work. I can create such scraper in less than 5 days and I can offer you best price here. You have pretty good project

$631 CAD in 5 days
(290 Reviews)
6.6
shreeyait

Hello Sir, Hope you are fine there. We are having good experience with CorePHP projects and the reason we came across here to give the best output to your project with supreme quality. We have develope

$555 CAD in 10 days
(56 Reviews)
6.9
Verz1Lka

Hello! I'm web scraping expert and i can done your project. I use python language and scrapy framework. My scripts works on windows, mac or linux, but linux is preferably. I can schedule scripts on server if it is re

$599 CAD in 10 days
(57 Reviews)
6.0
baddesigner

I am arif from Pakistan, having 7 years of experience in software house in the IT-Programming and Graphic Design. I can realize almost everything you need databases and languages - it does not matter because they ar

$1052 CAD in 10 days
(40 Reviews)
5.9
$663 CAD in 30 days
(45 Reviews)
5.9
stdhtelkom

Hello, I am very interested. I am very familiar with this jobs and has done many of that. I have long term relationship with some clients for this kind of jobs and has done almost a thousand of the jobs. Lookin

$600 CAD in 10 days
(26 Reviews)
6.0