We need a Windows application that we can leave running on a PC and that will crawl a business directory on the web.
We need to be able to specify limits (e.g. just gather .[url removed, login to view] domains, or just a specific category in the directory) and compile the following data.
It needs to log the page name, URL, and results for the following criteria:
1) Does the site use frames (y/n)
2) Does the page have a Doctype declaration (y/n) + what is it (html, xhtml, strict etc)
3) Does the page code validate (can integrate an open source validator)
4) What is the Google PageRank of the page (this is optional)
5) What was the directory category
6) Whether the page is built using CSS+divs or tables for layout (we are looking for ideas on how to determine this)
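As a rough illustration of how criteria 1, 2 and 6 might be checked, here is a minimal Python sketch using only the standard library. Everything in it is an assumption rather than part of the requirement: the names `PageAudit`, `classify_doctype`, `guess_layout` and `audit_page` are hypothetical, "uses frames" is taken to mean `<frameset>`/`<frame>` tags only, and the layout check is a crude heuristic (nested tables, or more tables than divs, suggests table layout) that a real tool would need to refine.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collects frame, doctype and layout facts while parsing one page."""

    def __init__(self):
        super().__init__()
        self.doctype = None        # raw doctype text, or None if absent
        self.uses_frames = False
        self.table_count = 0
        self.div_count = 0
        self.nested_table = False
        self._table_depth = 0

    def handle_decl(self, decl):
        # Called for <!DOCTYPE ...>; decl excludes the "<!" and ">".
        if decl.lower().startswith("doctype"):
            self.doctype = decl

    def handle_starttag(self, tag, attrs):
        if tag in ("frameset", "frame"):   # assumption: iframes are not counted
            self.uses_frames = True
        elif tag == "div":
            self.div_count += 1
        elif tag == "table":
            self.table_count += 1
            self._table_depth += 1
            if self._table_depth > 1:
                self.nested_table = True

    def handle_endtag(self, tag):
        if tag == "table":
            self._table_depth = max(0, self._table_depth - 1)


def classify_doctype(decl):
    """Reduce a raw doctype to a short label such as 'xhtml strict'."""
    if decl is None:
        return ""
    d = decl.lower()
    family = "xhtml" if "xhtml" in d else "html"
    for flavour in ("strict", "transitional", "frameset"):
        if flavour in d:
            return family + " " + flavour
    return family


def guess_layout(audit):
    """Heuristic only: nested or dominant tables suggest table layout."""
    if audit.nested_table or audit.table_count > audit.div_count:
        return "tables"
    if audit.div_count > 0:
        return "css+divs"
    return "unknown"


def audit_page(html):
    """Return a dict of the per-page results for one HTML document."""
    audit = PageAudit()
    audit.feed(html)
    audit.close()
    return {
        "uses_frames": audit.uses_frames,
        "has_doctype": audit.doctype is not None,
        "doctype": classify_doctype(audit.doctype),
        "layout": guess_layout(audit),
    }
```

Calling `audit_page(html_text)` on each fetched page would give one result row per page. Validation (criterion 3) and PageRank (criterion 4) are deliberately left out, since they depend on external tools or services.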
Please state as much detail as possible about how you will achieve this and how the logged data can be accessed (ideally as an Excel spreadsheet exportable from the app), and post examples if you have done similar things.
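One simple way the logged data could be made accessible, sketched here under stated assumptions, is a CSV file that Excel opens directly. The column names in `COLUMNS` and the helpers `rows_to_csv` and `save_report` are hypothetical and chosen only to mirror the criteria listed above; a native .xls export would need a separate library.

```python
import csv
import io

# Hypothetical column layout, one column per logged item.
COLUMNS = [
    "page_name", "url", "uses_frames", "has_doctype", "doctype",
    "validates", "page_rank", "category", "layout",
]


def _cell(value):
    """Render booleans as y/n and missing values as empty cells."""
    if isinstance(value, bool):
        return "y" if value else "n"
    if value is None:
        return ""
    return value


def rows_to_csv(rows):
    """Serialise a list of result dicts to CSV text with a header row."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(
        buf, fieldnames=COLUMNS, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({key: _cell(val) for key, val in row.items()})
    return buf.getvalue()


def save_report(rows, path):
    """Write the report to disk; the BOM helps Excel detect UTF-8."""
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        handle.write(rows_to_csv(rows))
```

Optional results such as `validates` or `page_rank` can simply be omitted from a row, in which case the corresponding cells are left empty.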
11 freelancers are bidding an average of $238 for this job.
Hi, I haven't actually worked on the exact specs that you have written down, but I have worked on data extraction from a particular site, so I can give it a try. Regards, Vibhu.
I can do this in Delphi. I will use a MySQL 3 server (included in the installation procedure). I have experience with Pascal programming. I can also do this in PHP for UNIX. I already have a page ranking module ... in PHP.
I can do it. The program will be created using Delphi. Information can be saved in .csv files or directly into MS Excel. I can send a project design in UML on the first day if needed.