I want to crawl the AICTE website and fetch information (such as address and phone number) about colleges.
Specifically, these are colleges approved for 2012-13, and AICTE has assigned a unique application number to each of them. I have around 20k college application IDs.
The link below is where users search for colleges. It works only in IE and appears to be built with the Siebel Web Engine.
Sample application IDs
My requirement is to crawl the site for the 20k college application IDs and generate one HTML file per college, so that I can parse each file and extract the details myself. I am only looking for the crawling part. A program that works for two IDs is sufficient, and you can use any language to crawl the pages.
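
To make the expected output concrete, here is a rough sketch in Python of the loop I have in mind: one saved HTML file per application ID. It is not a working crawler for this site. `URL_TEMPLATE` is a hypothetical placeholder (I do not know the real request URL or parameter name), and the IE `User-Agent` header assumes the IE restriction is only a header check, which may well be wrong for a Siebel application that depends on session state or scripted interaction. The names `fetch_page` and `crawl` are my own.

```python
import time
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical placeholder: the real search URL and parameter name are unknown.
URL_TEMPLATE = "http://example.invalid/college?application_id={app_id}"
# Assumption: presenting an IE User-Agent is enough to get past the IE-only check.
IE_USER_AGENT = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"


def fetch_page(app_id, timeout=30):
    """Download the raw HTML bytes for one application ID."""
    url = URL_TEMPLATE.format(app_id=urllib.parse.quote(str(app_id)))
    request = urllib.request.Request(url, headers={"User-Agent": IE_USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def crawl(app_ids, out_dir, fetch=fetch_page, delay_seconds=1.0):
    """Save one <application id>.html file per ID; return (saved, failed)."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []
    for app_id in app_ids:
        target = out_path / f"{app_id}.html"
        if target.exists():
            # Already downloaded on an earlier run, so the crawl can be resumed.
            saved.append(target)
            continue
        try:
            # fetch() runs before the file is opened, so a failed download
            # leaves no empty file behind.
            target.write_bytes(fetch(app_id))
            saved.append(target)
        except OSError as exc:  # urllib's URLError and timeouts are OSError subclasses
            failed.append((app_id, str(exc)))
        time.sleep(delay_seconds)  # pause between requests to avoid hammering the server
    return saved, failed
```

Calling `crawl(["<id 1>", "<id 2>"], "pages")` with two of the sample IDs would be enough for my purposes. The `fetch` argument is there so the download step can be swapped for something browser-driven if plain HTTP requests turn out not to work.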