I would like to have a Perl program developed that
1) connects to a database
2) fetches a list of RSS feed URLs
3) walks through that list item by item
4) uses a structured dictionary of keywords to identify feeds of interest
5) stores the content of each interesting feed in a temporary file or data structure
6) assembles one email containing all the interesting feed messages
7) sends that email out to a given email address
The Perl crawler should not call external Unix programs such as wget; it should work entirely on its own, using only CPAN modules and a MySQL database connection. A minimal sketch follows below.
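To make the requirements concrete, here is a minimal sketch of how the main loop could look using only CPAN modules (DBI, LWP::UserAgent, XML::RSS, MIME::Lite). All table names, credentials, addresses, and keywords below are placeholder assumptions, not part of the specification:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use LWP::UserAgent;
use XML::RSS;
use MIME::Lite;

# 1) connect to the MySQL database (assumed table: feeds(url))
my $dbh = DBI->connect('DBI:mysql:database=crawler;host=localhost',
                       'user', 'password', { RaiseError => 1 });

# 2) fetch the list of RSS feed URLs
my $urls = $dbh->selectcol_arrayref('SELECT url FROM feeds');

# 4) structured keyword dictionary: topic => list of keywords
my %keywords = (
    technology => [ 'perl', 'cpan', 'mysql' ],
    science    => [ 'genome', 'physics' ],
);

my $ua = LWP::UserAgent->new( timeout => 30, agent => 'rss-crawler/0.1' );
my @interesting;                        # 5) temporary in-memory store

# 3) walk through the feed list item by item
for my $url (@$urls) {
    my $res = $ua->get($url);           # pure Perl HTTP fetch, no wget
    next unless $res->is_success;

    my $rss = XML::RSS->new;
    eval { $rss->parse( $res->decoded_content ); 1 } or next;  # skip broken feeds

    for my $item ( @{ $rss->{items} } ) {
        my $text = lc join ' ', grep { defined } $item->{title}, $item->{description};
        for my $topic ( keys %keywords ) {
            if ( grep { index( $text, lc $_ ) >= 0 } @{ $keywords{$topic} } ) {
                push @interesting, "[$topic] $item->{title}\n$item->{link}\n";
                last;
            }
        }
    }
}

# 6) + 7) assemble one email with all matches and send it via SMTP
if (@interesting) {
    my $msg = MIME::Lite->new(
        From    => 'crawler@example.com',
        To      => 'recipient@example.com',
        Subject => 'Interesting feed items',
        Data    => join( "\n", @interesting ),
    );
    $msg->send( 'smtp', 'localhost' );  # Net::SMTP under the hood, no sendmail binary
}
$dbh->disconnect;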
The crawler will be called once a day by cron.
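For the daily invocation, a crontab entry along these lines would do (the time and script path are placeholders):

# run the crawler every day at 06:00
0 6 * * * /usr/bin/perl /home/crawler/rss_crawler.pl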
The crawler will work on feeds in different languages.
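On the language point: in the sketch above, $res->decoded_content already converts each response to Perl characters according to the charset the server declares. A defensive fallback and an explicit charset on the outgoing mail (again assumptions, not requirements) could look like this:

use Encode qw(decode encode);
my $xml = $res->decoded_content               # honors the declared charset
       // decode('UTF-8', $res->content);     # fallback if decoding fails
# when building the mail, declare the charset explicitly:
#   Type => 'text/plain; charset=UTF-8',
#   Data => encode('UTF-8', $body),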
The crawler should remember feeds that have already been processed.
It could remember every feed ever parsed.
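One simple way to remember processed items, assuming a MySQL table such as seen_items(guid VARCHAR(255) PRIMARY KEY) and reusing $dbh and $rss from the sketch above, is an INSERT IGNORE check per item:

my $seen = $dbh->prepare('INSERT IGNORE INTO seen_items (guid) VALUES (?)');

for my $item ( @{ $rss->{items} } ) {
    my $guid = $item->{guid} || $item->{link};   # fall back to the link
    next unless defined $guid;
    $seen->execute($guid);
    next if $seen->rows == 0;   # 0 affected rows: GUID already seen, skip it
    # ... keyword matching and push to @interesting as in the main sketch
}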
Please feel free to ask questions.
The crawler could be developed in other languages like Java, Python, or Comega. Let's discuss.