Closed

Custom Web Spider/Crawler

This project was awarded to n3vermind for $250 USD.

Project Budget: $250 - $750 USD
Total Bids: 25
Project Description

I'm looking to create a web spider/crawler that will crawl and index any websites I specify in order to track changes. Specifically, my goal is to track target websites closely enough to know when a page has been changed or a new page has been added.

While I'm completely open to suggestions, I was thinking the best way to do it would be to have the spider visit the target site. When the spider crawls, it will:

1. Mark any new URLs it finds

2. Mark any changes to pages found in the previous crawl. To do this, the spider compares each page's file size with the size recorded in the previous crawl; a difference indicates a change on that page.
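To make the two steps above concrete, here is a minimal Python sketch of how they might work. It is only an illustration under stated assumptions, not a required design: the names `LinkExtractor`, `extract_links`, and `diff_crawls` are hypothetical, each crawl is assumed to be stored as a mapping from URL to page size in bytes, and only links on the same host as the target site are followed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkExtractor(HTMLParser):
    """Collects absolute, same-site links from one HTML page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        # Assumption: only pages on the same host count as part of the target site.
        if urlparse(absolute).netloc == urlparse(self.base_url).netloc:
            self.links.add(absolute)


def extract_links(base_url, html):
    """Return the set of same-site URLs linked from the given HTML text."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def diff_crawls(previous, current):
    """Compare two crawls, each a dict of URL -> page size in bytes.

    Returns (new_urls, changed_urls), both sorted.
    """
    # Step 1: URLs seen in this crawl but not in the previous one.
    new_urls = sorted(u for u in current if u not in previous)
    # Step 2: URLs seen in both crawls whose size differs.
    changed_urls = sorted(
        u for u in current if u in previous and current[u] != previous[u]
    )
    return new_urls, changed_urls
```

One caveat with the size comparison: an edit that leaves the byte count the same would go unnoticed. Storing a content hash next to the size would be one way to catch those cases, if that turns out to matter.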

Then there would be a way for me to generate an exportable (CSV) report of new pages and altered pages on that site.
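As a rough idea of what the CSV export could look like, here is a short Python sketch. The function name `write_change_report` and the two-column layout (`url`, `status`) are assumptions for illustration only; the actual columns are open to suggestions.

```python
import csv


def write_change_report(new_urls, changed_urls, out):
    """Write one CSV row per new or changed page to the open text file `out`."""
    writer = csv.writer(out)
    # Assumed layout: the page URL plus a status of "new" or "changed".
    writer.writerow(["url", "status"])
    for url in new_urls:
        writer.writerow([url, "new"])
    for url in changed_urls:
        writer.writerow([url, "changed"])
```

To save the report to disk, the file could be opened with `open("report.csv", "w", newline="")` and passed in as `out`; the file name here is just an example.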

Also, I'm aware of the list of open source web crawlers in [url removed, login to view]; you can use one of those too if you're able to modify it to meet my needs and requirements.

Also, I'm completely open to any type of setup. Ideally this would be completely web based, but I'm open to a desktop setup if necessary.
