I want to crawl the AICTE website and fetch information (such as address, phone number, etc.) about colleges.
Specifically, these are colleges approved for 2012-13, and AICTE has assigned a unique application number to each one. I have around 20k college application IDs.
The link below is the page where users search for colleges. It works only in IE and seems to be built with Siebel Web Engine.
[url removed, login to view]+12-13+Public+Domain+New+Search+View
Sample Application ids
My requirement is to crawl the site for the 20k college application IDs and generate one HTML file per college, so that I can parse each HTML file and extract the details myself. I am looking for the crawling part only. A program that works for two IDs is sufficient. You can use any language to crawl the pages.
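To make the expected output concrete, here is a rough Python sketch of the kind of crawl loop I have in mind: one saved HTML file per application ID. It is only an illustration under assumptions. `SEARCH_URL` and the `appid` query parameter are hypothetical placeholders (the real link is not reproduced here), and since the page seems to run on Siebel Web Engine, a plain GET request may well not be enough; session cookies, form posts, or browser automation might be needed instead. The `fetch` argument can be swapped for whatever retrieval method actually works.

```python
import time
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Callable, Iterable, List

# Hypothetical placeholder: the real search URL is not given in this post.
SEARCH_URL = "https://example.invalid/search"


def fetch_html(app_id: str) -> str:
    """Download the result page for one application ID.

    Assumption: the site answers a simple GET with an 'appid' parameter.
    A Siebel-based page may need a session or a real browser instead.
    """
    query = urllib.parse.urlencode({"appid": app_id})
    request = urllib.request.Request(
        f"{SEARCH_URL}?{query}",
        # The site reportedly works only in IE, so send an IE-like user agent.
        headers={"User-Agent": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(
    app_ids: Iterable[str],
    out_dir: str,
    fetch: Callable[[str], str] = fetch_html,
    delay: float = 1.0,
) -> List[str]:
    """Save one '<app_id>.html' file per ID and return the IDs that failed.

    Assumption: application IDs are safe to use as file names.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failed: List[str] = []
    for app_id in app_ids:
        target = out / f"{app_id}.html"
        if target.exists():
            continue  # already downloaded, so an interrupted run can resume
        try:
            html = fetch(app_id)
            target.write_text(html, encoding="utf-8")
        except Exception:
            failed.append(app_id)  # keep going; failed IDs can be retried later
        time.sleep(delay)  # pause between requests to avoid hammering the server
    return failed
```

Calling `crawl(ids, "pages")` with a list of application IDs would write files such as `pages/<id>.html` and return the list of IDs that could not be fetched, which matters for a 20k-ID run that will probably need retries.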