We need a crawler library for Java which performs the following:
-It should visit 7 different download websites that we will provide you.
-From those download websites it should grab the list of new software added today (from the "What's new" list of each site).
-Based on this list, it should visit the details page of each program and collect the details in a Java class (program name, description, link to screenshot, size, etc.). We have already created an interface for the details and will send it to you at project start or after bidding.
-For 2 download sites we need an extended version that crawls all programs and returns them to us.
-Some notes on the sites:
-We will provide them to you after bidding
-Some of them offer RSS feeds, which you may be able to use
-4 of them are in German, but we can help you if you need some parts translated.
-If a page is slow or not available, the crawler must time out instead of waiting indefinitely
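To illustrate the timeout note above, here is a minimal sketch, assuming Java 11 or later and the JDK's built-in `java.net.http.HttpClient`. The class name `TimeoutFetcher`, the method `fetch` and the timeout values are placeholders for illustration only and are not part of the interface we will send.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Optional;

/** Hypothetical helper: fetches a page but gives up on slow or unreachable sites. */
final class TimeoutFetcher {
    private final HttpClient client;
    private final Duration requestTimeout;

    TimeoutFetcher(Duration connectTimeout, Duration requestTimeout) {
        // connectTimeout limits how long we wait for the connection to be established.
        this.client = HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        this.requestTimeout = requestTimeout;
    }

    /** Returns the page body, or an empty Optional if the site is slow, down or answers with a non-200 status. */
    Optional<String> fetch(String url) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(requestTimeout) // limits how long we wait for the response
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200
                    ? Optional.of(response.body())
                    : Optional.empty();
        } catch (IOException e) {
            // Covers refused connections as well as HttpTimeoutException,
            // which is a subclass of IOException.
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
            return Optional.empty();
        }
    }
}
```

A caller could create one instance, for example `new TimeoutFetcher(Duration.ofSeconds(5), Duration.ofSeconds(20))`, and reuse it for all sites. Whether a failed page should be skipped or retried is up to your design.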
Regarding your solution:
-We need pure Java
-We need clean code + documentation
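As an example of the kind of pure, documented Java we have in mind, the following rough sketch reads an RSS 2.0 feed with the JDK's own XML parser and keeps only the items published on a given day. It assumes the feed uses standard `item`, `title`, `link` and `pubDate` elements with RFC 1123 dates. The names `RssNewItems`, `Entry` and `addedOn` are made up for this illustration and are not the interface we will provide.

```java
import java.io.StringReader;
import java.time.LocalDate;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: extracts the feed items that were published on a given day. */
final class RssNewItems {

    /** Hypothetical value holder for one feed item. */
    static final class Entry {
        final String title;
        final String link;

        Entry(String title, String link) {
            this.title = title;
            this.link = link;
        }
    }

    /** Returns all items whose pubDate falls on {@code day}; items without a parsable date are skipped. */
    static List<Entry> addedOn(String rssXml, LocalDate day) {
        List<Entry> result = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so untrusted feeds cannot pull in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(rssXml)));
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                if (publishedOn(childText(item, "pubDate"), day)) {
                    result.add(new Entry(childText(item, "title"), childText(item, "link")));
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parsable RSS document", e);
        }
        return result;
    }

    /** True if the RFC 1123 date string falls on the given day (in the feed's own time zone). */
    private static boolean publishedOn(String pubDate, LocalDate day) {
        try {
            ZonedDateTime published =
                    ZonedDateTime.parse(pubDate, DateTimeFormatter.RFC_1123_DATE_TIME);
            return published.toLocalDate().equals(day);
        } catch (DateTimeParseException e) {
            return false; // missing or non-standard date
        }
    }

    /** Text of the first descendant element with the given tag name, or an empty string. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }
}
```

Sites without RSS, or with feeds that do not follow this layout, would need their own HTML-based extraction. The sketch only covers the simple feed case.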