I need someone who can build a custom bot. I use Windows and can run Perl, .Net, etc. I do not want it to be web-based, as that has proved slower, but if you insist, I'll give it a try.
Here's how it could work.
In order to extract the data, you first need an active session with the website. That means I need a window (or a portion of one that I can visibly see) that stays open and that I log in to. (I'll keep this window open and occasionally click around to keep the session active.)
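One way the "stay logged in" requirement might be handled (a sketch only, not the required approach): the bot can ride on the operator's browser session by sending the same session cookie with each request. The cookie name and value below are placeholders I made up; the real one would be copied from the open browser window.

```python
# Sketch: reuse the cookie from the operator's logged-in browser window so
# scripted requests are seen as part of the same active session.
# SESSION_COOKIE is a placeholder, not a real value.
import urllib.request

SESSION_COOKIE = "ASP.NET_SessionId=PLACEHOLDER"  # copied from the open browser

def fetch(url: str) -> str:
    """Fetch a page, sending the session cookie so the site treats us as logged in."""
    req = urllib.request.Request(url, headers={"Cookie": SESSION_COOKIE})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

The same idea ports directly to Perl (LWP::UserAgent with a cookie jar) or .Net (HttpClient with a CookieContainer).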
I need the ability to load a list of URLs to extract data from. I am OK with either the text or the code being extracted, whichever is faster. The output needs to be saved in CSV format. The HTML code is at the bottom of the page and runs from line 156-228. Every page has the same layout.
Notice in the code that each field has a field name (i.e., Name, Address, etc.). I want to keep these labels in place so that each value looks like "Name: Chris Jones". That way, when I sort the data, the labels will help keep it organized in case things get shifted around.
Each field name will need its own column in the CSV: Name, Address, City, State, Zip, etc.
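The extraction and CSV steps above could look something like this sketch. The field list and the "Label: value" matching are assumptions on my part (the sample HTML from lines 156-228 was not available to me), so the pattern would need adjusting to the real markup; note how each stored value keeps its label, per the requirement.

```python
# Sketch: pull "Label: value" pairs out of a page and write one CSV column
# per field, keeping the label inside the value (e.g. "Name: Chris Jones").
# FIELDS and the regex are assumptions; adapt them to the real page markup.
import csv
import re

FIELDS = ["Name", "Address", "City", "State", "Zip"]  # assumed column set

def extract_fields(html: str) -> dict:
    """Return one row per page, with each value prefixed by its field name."""
    row = {}
    for field in FIELDS:
        m = re.search(rf"{field}:\s*([^<\r\n]+)", html)
        row[field] = f"{field}: {m.group(1).strip()}" if m else ""
    return row

def write_csv(rows: list, path: str) -> None:
    """Save all extracted rows to a CSV file, one column per field."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A missing field simply yields an empty cell, so sorting in a spreadsheet still lines the columns up.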
This will need to support multiple threads, since it needs to be fast.
I do not need to download any images or other files. I just need the text from each page.
Each URL is in this format: domain.com/xxx/programDetails.aspx?sid=1991658
PS - Due to the nature of the data, I cannot provide a link for you to test against. I realize this impacts the job, but I cannot give out the data.
SAMPLE CODE ATTACHED -- (Runs from line 156-228)