I'm looking for an expert with Nutch or Scrapy experience to help me set up a web crawler that scans websites and web files and then updates a database with the extracted information.
Client-based user interface:
1. create/edit/remove rules
a. real-time webpage scan
b. real-time webpage + crawl scan (crawl means it follows links on the website to other pages, and then scans these pages, for X levels)
c. real-time file scan
d. real-time multiple/batch file scan
Rules can be executed once, continuously, or at a fixed interval (every X minutes).
Rules can be turned on/off.
2. Display live status of the engine in an
3. Write to the database according to the structure predefined in the rules
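
To make the rule requirements more concrete, here is a rough sketch of how a rule, its schedule check, and the "X levels" crawl limit could be modelled. It is only an illustration in plain Python, not Nutch or Scrapy code: the `Rule` fields, `is_due`, `crawl`, and the `get_links` callback are all hypothetical names I made up for this example. (In Scrapy itself, the depth limit would most likely be handled with the `DEPTH_LIMIT` setting instead.)

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class Rule:
    """Hypothetical rule record; every field name here is an assumption."""
    name: str
    target: str                       # start URL or file path
    scan_type: str                    # "page", "crawl", "file", or "batch"
    max_depth: int = 0                # the "X levels" for crawl scans
    mode: str = "once"                # "once", "continuous", or "interval"
    interval_minutes: int = 0         # only used when mode == "interval"
    enabled: bool = True              # rules can be turned on/off
    last_run: Optional[float] = None  # timestamp in seconds, None if never run


def is_due(rule: Rule, now: float) -> bool:
    """Decide whether a rule should run at time `now` (seconds)."""
    if not rule.enabled:
        return False
    if rule.mode == "once":
        return rule.last_run is None
    if rule.mode == "continuous":
        return True
    if rule.mode == "interval":
        if rule.last_run is None:
            return True
        return now - rule.last_run >= rule.interval_minutes * 60
    raise ValueError(f"unknown mode: {rule.mode}")


def crawl(
    start: str,
    max_depth: int,
    get_links: Callable[[str], Iterable[str]],
) -> List[Tuple[str, int]]:
    """Breadth-first crawl that follows links at most `max_depth` levels.

    `get_links` is a stand-in for fetching a page and extracting its links.
    Returns (url, depth) pairs in the order the pages would be scanned.
    """
    seen = {start}
    order = [(start, 0)]
    queue = deque(order)
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the configured level
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                order.append((link, depth + 1))
                queue.append((link, depth + 1))
    return order
```

For example, with a small in-memory link map standing in for a real website, `crawl("a", 2, lambda u: site.get(u, []))` visits the start page plus pages up to two links away, and `is_due(rule, time.time())` tells the engine whether an enabled rule should run now.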
Sir, first and foremost allow me to thank you for inviting me to this interesting project. I truly believe you chose the right developer for the job. Please see my message for more details, thank you.