In Progress

Web crawler to spider search engine results

I need a Windows XP program to do the following:

1. The user types in a search term and a file extension, and selects whether to crawl only within each search result's domain or to follow links through every site linked to within the tree.

2. The program then performs a Google search on the search term entered by the user.

3. For each resulting URL, follow all links on every page from that URL down (either only within the same domain as the search result or across all domains, as specified by the user) and record which pages contain files of the type entered by the user, and how many such files each page contains. As the program runs, progress statistics (# links found, # links explored, # files found, etc.) should be displayed.

4. Output to a .csv file every URL found, together with the number of files of the given file type on that page, sorted in descending order by that number.
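
To make the per-page logic and the report format concrete, here is a rough sketch of what I have in mind. It is written in Python only for brevity (the deliverable is still a Windows program in C or Delphi), and it skips the Google search and the page fetching entirely. The names `analyze_page`, `format_report`, `root_host` and the `url`/`file_count` column headers are placeholders I made up, and I am assuming a "file" means a link whose path ends in the chosen extension.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def analyze_page(page_url, html, extension, root_host=None):
    """Return (file_count, links_to_follow) for one already-fetched page.

    root_host is the host of the original search result; pass None to
    follow links on any domain (assumed meaning of the user's choice).
    """
    parser = LinkCollector()
    parser.feed(html)
    suffix = "." + extension.lower().lstrip(".")
    file_count = 0
    links_to_follow = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if parts.path.lower().endswith(suffix):
            file_count += 1  # a file of the requested type
        elif root_host is None or parts.netloc.lower() == root_host.lower():
            links_to_follow.append(absolute)
    return file_count, links_to_follow


def format_report(counts):
    """Return CSV text, one row per URL, highest file count first."""
    rows = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["url", "file_count"])
    writer.writerows(rows)
    return buffer.getvalue()
```

The crawler would call `analyze_page` for each page it downloads, add the returned links to its queue (skipping URLs already visited), store the count per URL, and write the result of `format_report` to the .csv file at the end. The progress statistics can be updated from the same values after each page.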

Skills: C Programming, Delphi, Windows Desktop


About the Employer:
( 9 reviews ) Lexington, United States

Project ID: #24143