In Progress

Web crawler to spider search engine results

I need a Windows XP program to do the following:

1. The user types in a search term and a file extension, and selects whether to crawl only within each search result's domain or to follow every site linked to within the tree.

2. The program then performs a Google search on the search term entered by the user.

3. For each resulting URL, follow all links on every page from that URL down (either only within the same domain as the search result or across all domains, as specified by the user) and record which pages contain the file type input by the user, and how many files of that type each page contains. As the program runs, progress statistics (# links found, # links explored, # files found, etc.) should be displayed.

4. Output every URL found to a .csv file, together with the number of files of the file type on that page, sorted in descending order by that number.
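
To make the link-following and file-counting requirement concrete, here is a rough sketch of how one already-downloaded page might be scanned. It is written in Python purely for illustration (the delivered program would be in C or Delphi), and the names `LinkCollector`, `scan_page` and `same_domain_only` are made up for this example. It assumes a "file" is any link whose path ends with the chosen extension, and that "same domain" means an identical host name.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def scan_page(page_url, html, extension, same_domain_only):
    """Return (file_count, links_to_follow) for one downloaded page.

    Assumption: a link counts as a file when its path ends with the
    extension; every other http(s) link is a candidate page to crawl.
    """
    collector = LinkCollector()
    collector.feed(html)
    suffix = "." + extension.lower().lstrip(".")
    start_host = urlparse(page_url).netloc.lower()
    file_count = 0
    to_follow = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if parsed.path.lower().endswith(suffix):
            file_count += 1
        elif not same_domain_only or parsed.netloc.lower() == start_host:
            to_follow.append(absolute)
    return file_count, to_follow
```

A full crawler would call something like this for each page it fetches, keep a set of already-visited URLs to avoid loops, and update the progress counters (links found, links explored, files found) after each call.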

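The .csv output requirement could look roughly like the following. Again this is only an illustrative Python sketch, not the required implementation: `write_report` is a hypothetical name, the `url,file_count` header row is my own assumption, and the input is assumed to be a mapping from page URL to the number of matching files found on that page.

```python
import csv


def write_report(counts, csv_path):
    """Write one row per URL, highest file count first.

    counts: dict mapping page URL -> number of matching files on it.
    The header row is an assumption; the posting does not specify one.
    """
    rows = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["url", "file_count"])
        writer.writerows(rows)
    return rows
```

Pages with zero matching files are still written, since the requirement is to list every URL found.
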
Skills: C Programming, Delphi, Windows Desktop

