Custom Web Spider/Crawler

Status: In progress
Bids: 25
Avg Bid (USD): $446
Project Budget (USD): $250 - $750

Project Description:
I'm looking to create a web spider/crawler that will crawl and index the websites I specify in order to track changes. Specifically, I want to know when a page on a target website has been changed or a new page has been added.
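
To illustrate how a crawler could discover the pages of a target site, here is a minimal sketch that extracts same-site links from one page's HTML. It assumes Python's standard library only; the names `LinkCollector` and `same_site_links` are hypothetical and not part of any existing crawler, and fetching the pages is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url, html):
    """Return the set of absolute URLs that share page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = set()
    for href in collector.hrefs:
        # Resolve relative links and drop "#fragment" parts so the same
        # page is not counted twice.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            links.add(absolute)
    return links
```

A full crawl would repeat this for every newly found URL until no unvisited same-site links remain.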

While I'm completely open to suggestions, I was thinking the best way to do it would be to have the spider visit the target site. When the spider crawls, it will:

1. Mark any new URLs it finds

2. Mark any variations to pages previously found (in the previous crawl). To do this, the spider compares each page's file size with the size recorded in the previous crawl; a difference indicates a change on that page.
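
The two steps above can be sketched as a comparison between two crawl snapshots. This is only an illustration under stated assumptions: each crawl is stored as a dictionary mapping a URL to its page size in bytes, and `diff_crawls` is a hypothetical name. Pages that disappeared since the previous crawl are ignored, since the description does not ask for them.

```python
def diff_crawls(previous, current):
    """Compare two crawl snapshots (dicts of URL -> size in bytes).

    Returns (new_urls, changed_urls), both sorted for stable output.
    """
    # Step 1: URLs seen now that were not seen in the previous crawl.
    new_urls = sorted(url for url in current if url not in previous)
    # Step 2: URLs seen in both crawls whose size differs.
    changed_urls = sorted(
        url
        for url in current
        if url in previous and current[url] != previous[url]
    )
    return new_urls, changed_urls


# Example data (made up for illustration).
previous_crawl = {"http://example.com/": 5120, "http://example.com/about": 2048}
current_crawl = {
    "http://example.com/": 5300,
    "http://example.com/about": 2048,
    "http://example.com/news": 900,
}
new_pages, changed_pages = diff_crawls(previous_crawl, current_crawl)
```

One design note: a size comparison misses edits that leave the byte count unchanged. Storing a hash of each page's content instead of its size would catch those, at the cost of hashing every page on each crawl.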

Then there would be a way for me to generate an exportable (CSV) report of the new pages and altered pages on that site.
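
As a rough sketch of what such a report could look like, the function below writes one CSV row per page with a `status` column. The column names and the function name `build_report` are assumptions made for illustration, not a required format.

```python
import csv
import io


def build_report(new_urls, changed_urls):
    """Return CSV text listing new and changed pages, one row per URL."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url", "status"])
    for url in new_urls:
        writer.writerow([url, "new"])
    for url in changed_urls:
        writer.writerow([url, "changed"])
    return buffer.getvalue()
```

The returned text can be saved to a `.csv` file or sent as a download from a web page, which would suit either the web-based or the desktop setup.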

I'm also aware of the list of open-source web crawlers at http://en.wikipedia.org/wiki/Web_crawlers#Open-source_crawlers; you can use one of those if you're able to modify it to meet my needs and requirements.

Also, I'm completely open to any type of setup. Ideally this would be completely web-based, but I'm open to a desktop setup if necessary.

Skills required:
Java, JavaScript, Linux, MySQL, PHP
Project posted by: domain81, United States (Verified)


$250 in 7 days
$400 in 4 days
$520 in 12 days
$750 in 7 days
$280 in 4 days
$250 in 10 days
$250 in 15 days
$1000 in 15 days
$700 in 10 days
$250 in 5 days