I'm looking for an expert with Nutch or Scrapy experience to help me set up a web crawler that scans websites and web files and then updates a database with the information it finds.
Client-based user interface:
1. create/edit/remove rules
a. real-time webpage scan
b. real-time webpage + crawl scan (crawl means it follows links on the website to other pages, and then scans these pages, for X levels)
c. real-time file scan
d. real-time multiple/batch file scan
rules can be executed one time, continuously, or at a fixed interval (every x minutes)
rules can be turned on/off
2. Display live status of the engine in an
3. Write to the database according to the structure predefined in the rules
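
To make the rule requirements above a bit more concrete, here is a rough sketch in plain Python (standard library only, not actual Nutch or Scrapy code). It shows one possible shape for a rule, a check for whether a rule is due to run, and a crawl that follows links for X levels. All names here (`Rule`, `ScanType`, `is_due`, `crawl`, `get_links`) are hypothetical and only for illustration; treating "continuously" as an interval of 0 minutes is also an assumption.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List, Optional


class ScanType(Enum):
    PAGE = "page"              # 1a: single webpage
    PAGE_CRAWL = "page_crawl"  # 1b: webpage plus linked pages
    FILE = "file"              # 1c: single file
    FILE_BATCH = "file_batch"  # 1d: multiple/batch files


@dataclass
class Rule:
    name: str
    scan_type: ScanType
    target: str
    max_depth: int = 0                        # the "X levels" for PAGE_CRAWL
    interval_minutes: Optional[float] = None  # None = one time, 0 = continuously (assumption)
    enabled: bool = True                      # rules can be turned on/off


def is_due(rule: Rule, last_run: Optional[float], now: float) -> bool:
    """Return True if the rule should run now. Times are in seconds."""
    if not rule.enabled:
        return False
    if last_run is None:
        return True   # never ran yet
    if rule.interval_minutes is None:
        return False  # one-time rule that already ran
    return now - last_run >= rule.interval_minutes * 60


def crawl(start: str,
          get_links: Callable[[str], Iterable[str]],
          max_depth: int) -> List[str]:
    """Visit start, then pages up to max_depth links away, breadth first."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the configured level
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order
```

In this sketch `get_links` stands in for whatever fetches a page and extracts its links, so the depth logic can be tried against a small in-memory map of pages before any real crawler is wired in.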