## Needs
- A tool to recover previous versions of web sites, including sub-pages, from the web archive ([login to view URL])

## Tech Specs (to be detailed)
- Must be server-based: Linux, PHP or Perl, plus MySQL
- Rotating proxy support
- Master/slave agent architecture preferred but not required
- Export of the whole extraction as [login to view URL] (containing all HTML/GIF files in the correct structure)
- All data extraction must run in background jobs, independent of the UI
- All background jobs must be monitored from the UI (e.g. percent complete, errors that need attention)

## Front End Specs (to be detailed)
- Web-based UI
- Fast work on checkboxes/selections without round trips (AJAX-supported to improve usability), giving "instant" updates of the UI
- English
- Colored highlighting of errors and of fields that still need to be filled out

## Basic work process (to be detailed)
- Create a new project and specify the main domain (extra info parameters like name, topic, etc. to be specified)
- Specify an approximate date (e.g. May 1 2004)
- Optionally specify known sub-pages of the domain (giving a list of N start points for the crawl)
- The system then crawls web archives for (previously existing) pages
- The system retrieves old copies of the website that were probably active at that time, as well as related versions
- The system stores the content in the MySQL DB
- The system mails the admin when retrieval is complete, with a download link to the [login to view URL] with the archive
- Additionally, the 2 previous and 2 following versions are retrieved from the archive and provided as a download as well (optional)
- If none of the retrieved versions matches, the admin can edit the list of URLs and their available archive dates and manually set the "baseline" to combine from the archive. The data is then taken either from the online archive or from the already retrieved DB

Further specs will follow once the winning bidder has been chosen.
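The version-selection step of the work process (a baseline near the approximate date, plus 2 previous and 2 following versions) could be sketched roughly as follows. This is only an illustration under stated assumptions: the capture dates for a URL are assumed to be already available as a list of `datetime` values, the function name `select_versions` and the result keys are hypothetical, and Python is used purely for illustration even though the posting asks for PHP or Perl.

```python
from bisect import bisect_left
from datetime import datetime


def select_versions(snapshots, target, neighbours=2):
    """Pick the snapshot closest to `target`, plus up to `neighbours`
    earlier and later snapshots (hypothetical helper, not part of the spec).

    snapshots: iterable of datetime capture dates for one URL.
    Returns a dict with keys 'previous', 'baseline' and 'following'.
    """
    ordered = sorted(set(snapshots))
    if not ordered:
        return {"previous": [], "baseline": None, "following": []}

    # Index of the first snapshot at or after the target date.
    pos = bisect_left(ordered, target)
    # Only the snapshot just before and the one at/after the target can be closest.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(ordered)]
    # On a tie, min() keeps the first candidate, i.e. the earlier snapshot.
    best = min(candidates, key=lambda i: abs(ordered[i] - target))

    return {
        "previous": ordered[max(0, best - neighbours):best],
        "baseline": ordered[best],
        "following": ordered[best + 1:best + 1 + neighbours],
    }


if __name__ == "__main__":
    # Made-up capture dates, for illustration only.
    captures = [
        datetime(2003, 10, 1), datetime(2004, 3, 15), datetime(2004, 4, 20),
        datetime(2004, 6, 10), datetime(2004, 9, 1),
    ]
    result = select_versions(captures, datetime(2004, 5, 1))
    print(result["baseline"].date())
```

If the automatically chosen baseline does not match, the same list of capture dates is what the admin would edit to set the "baseline" manually, so keeping it as plain sorted data makes that override straightforward.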
## Deliverables
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
## Platform
Linux, PHP/Perl, MySQL, AJAX