We are looking for a web scraper that selects a website URL and profile from our database, visits the site, scrapes all the jobs from it (across multiple pages), and returns the parsed results as XML. The XML must then be posted via HTTP and include variables from the initial MySQL lookup.
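As a rough illustration of the final step, the sketch below serialises scraped jobs to XML, carries over values from the site lookup, and prepares the HTTP POST. The element names (`jobs`, `job`), the `site` dictionary keys, the field names and the endpoint URL are all placeholders we made up for this example; the real ones would come from the full specification.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_jobs_xml(site, jobs):
    """Serialise scraped jobs to XML, carrying over values from the site lookup."""
    # Root attributes hold the variables that came from the database row (hypothetical names).
    root = ET.Element("jobs", {"site_id": str(site["id"]), "profile": site["profile"]})
    for job in jobs:
        job_el = ET.SubElement(root, "job")
        # Each field becomes a child element; field names must be valid XML tag names.
        for field, value in job.items():
            ET.SubElement(job_el, field).text = value
    return ET.tostring(root, encoding="utf-8")


def build_post_request(endpoint, payload):
    """Prepare (but do not send) the HTTP POST that would deliver the XML."""
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


# Placeholder data standing in for the MySQL row and the scraped results.
site = {"id": 7, "profile": "example-profile", "url": "https://example.com/jobs"}
jobs = [{"title": "Engineer", "location": "Leeds"}]

payload = build_jobs_xml(site, jobs)
request = build_post_request("https://example.com/import", payload)
# urllib.request.urlopen(request) would actually send it.
```

Keeping serialisation and delivery in separate functions means the XML can be checked before anything is sent.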
The scraper must be dynamic so that we can quickly and easily add more sites with different values to extract. Ideally the values would be stored in the database, and it should be able to handle any type of data structure.
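One possible shape for this, as a minimal sketch only: each site has rows in a rules table, and the scraper applies whatever rules it finds, so adding a site means adding rows rather than code. The table name `site_fields`, its columns and the sample HTML are invented for illustration. `sqlite3` stands in for MySQL so the example runs anywhere, and regular expressions stand in for whatever selector type (CSS, XPath, etc.) the final design uses.

```python
import re
import sqlite3

# sqlite3 is a stand-in for MySQL here (assumption); the table layout is hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE site_fields (site_id INTEGER, field TEXT, pattern TEXT)")
db.executemany(
    "INSERT INTO site_fields VALUES (?, ?, ?)",
    [
        (1, "title", r'<h2 class="title">(.*?)</h2>'),
        (1, "location", r'<span class="loc">(.*?)</span>'),
    ],
)


def load_rules(conn, site_id):
    """Fetch the per-site extraction rules as {field: compiled pattern}."""
    rows = conn.execute(
        "SELECT field, pattern FROM site_fields WHERE site_id = ?", (site_id,)
    )
    return {field: re.compile(pattern, re.S) for field, pattern in rows}


def extract_jobs(html, rules, block_pattern=r'<div class="job">(.*?)</div>'):
    """Split the page into job blocks, then apply every rule to each block."""
    jobs = []
    for block in re.findall(block_pattern, html, re.S):
        job = {}
        for field, pattern in rules.items():
            match = pattern.search(block)
            # A field the site does not provide is recorded as None rather than failing.
            job[field] = match.group(1).strip() if match else None
        jobs.append(job)
    return jobs


page = (
    '<div class="job"><h2 class="title">Engineer</h2><span class="loc">Leeds</span></div>'
    '<div class="job"><h2 class="title">Analyst</h2></div>'
)
jobs = extract_jobs(page, load_rules(db, 1))
```

Supporting a new site with different fields would then be a matter of inserting rows into the rules table; the block pattern could be stored per site in the same way.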
The scraper would run once a day for each site.
It must be able to handle large volumes of data.
It must be able to cope with any number of sites, whether 1 or 100,000.
The returned XML should be validated.
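What "validated" means will depend on the full specification. As a minimal sketch, the check below confirms that the XML is well-formed and that each job carries a set of required fields. The root element name `jobs`, the `job` element and the `REQUIRED_FIELDS` list are assumptions made for this example; if the specification supplies an XSD, validating against that schema would be the more thorough option.

```python
import xml.etree.ElementTree as ET

# Hypothetical list; the real required fields would come from the specification.
REQUIRED_FIELDS = ("title", "location")


def validate_jobs_xml(payload):
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError as exc:
        # Malformed XML cannot be inspected further, so report and stop.
        return [f"not well-formed: {exc}"]
    problems = []
    if root.tag != "jobs":
        problems.append(f"unexpected root element <{root.tag}>")
    for index, job in enumerate(root.findall("job"), start=1):
        for field in REQUIRED_FIELDS:
            # Missing and empty elements are treated the same way.
            if not (job.findtext(field) or "").strip():
                problems.append(f"job {index}: missing <{field}>")
    return problems


good = "<jobs><job><title>Engineer</title><location>Leeds</location></job></jobs>"
incomplete = "<jobs><job><title>Engineer</title></job></jobs>"
print(validate_jobs_xml(good))
print(validate_jobs_xml(incomplete))
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a site's output in a single run.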
A full specification will be provided.