I'm looking to create a web spider/crawler that will crawl and index any websites I specify in order to track changes. Specifically, my goal is to monitor target websites closely enough that I will know whenever a page has been changed or a new page has been added.
While I'm completely open to suggestions, I was thinking the best way to do it would be to have the spider visit the target site. When the spider crawls, it will:
1. Mark any new URLs it finds
2. Mark any pages that have changed since the previous crawl. To do this, the spider compares each page's file size against the one recorded in the previous crawl to detect a change on that page.
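The comparison step above could be sketched roughly as follows. This is a minimal illustration, not a finished implementation: the function names and the index structure are my own assumptions, and it uses a content hash rather than file size alone, since two different page versions can happen to have the same size.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """Hash the page body; more robust than file size alone,
    because distinct pages can share the same size."""
    return hashlib.sha256(content).hexdigest()

def classify_page(url: str, content: bytes, previous_index: dict) -> str:
    """Compare this crawl's page against the previous crawl.

    previous_index maps URL -> fingerprint stored from the last crawl.
    Returns "new", "changed", or "unchanged".
    """
    if url not in previous_index:
        return "new"
    if previous_index[url] != fingerprint(content):
        return "changed"
    return "unchanged"
```

After each crawl, the stored index would be refreshed with the new fingerprints so the next run compares against the latest state.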
Then there would be a way for me to generate an exportable (CSV) report of new pages and altered pages on that site.
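The report step could look something like the sketch below, using Python's standard csv module. The column names and the `(url, status)` pair format are assumptions for illustration; the real tool would write to a downloadable file rather than an in-memory buffer.

```python
import csv
import io

def export_report(results, out):
    """Write one CSV row per new or altered page.

    results is an iterable of (url, status) pairs produced by the
    crawl-comparison step; "unchanged" pages are skipped.
    """
    writer = csv.writer(out)
    writer.writerow(["url", "status"])
    for url, status in results:
        if status in ("new", "changed"):
            writer.writerow([url, status])

# Usage: write to an in-memory buffer (a real file would work the same way)
buf = io.StringIO()
export_report([("http://example.com/a", "changed"),
               ("http://example.com/b", "unchanged")], buf)
```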
Also, I'm aware of the list of open source web crawlers at http://en.wikipedia.org/wiki/Web_crawlers#Open-source_crawlers; you're welcome to use one of those if you're able to modify it to meet my needs and requirements.
Also, I'm completely open to any type of setup. Ideally this would be entirely web based, but I'm open to a desktop setup if necessary.