Using VBScript and VBA to combine web pages in Excel

In Progress Posted Sep 18, 2008 Paid on delivery

I have a list of about 100 links. The work to be done is to use VBScript (VBS) and VBA to:

1) Check whether each link is valid. The links are in the attached txt file.

Note that the user can change the [url removed, login to view] file in the future.

Build a worksheet ("links") in an Excel spreadsheet file. For each link, flag its validity with a valid/invalid sign in a separate column (the "validity" column) next to the "link" column.
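The validity check in step 1 could be sketched in VBScript roughly as follows. This is only an illustration under stated assumptions: the file name links.txt is a placeholder (the real file name is not shown in this posting), "valid" is assumed to mean the server answers with a 2xx or 3xx status, and the function name IsLinkValid is made up for the example. Some servers reject HEAD requests, so a real script may need to fall back to GET.

```vbscript
Option Explicit

' Returns True when the server answers with a 2xx or 3xx status.
' Any error (bad host name, timeout, malformed URL) counts as invalid.
Function IsLinkValid(url)
    Dim http
    IsLinkValid = False
    On Error Resume Next
    Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    http.setTimeouts 5000, 5000, 10000, 10000 ' resolve, connect, send, receive (ms)
    http.open "HEAD", url, False
    http.send
    If Err.Number = 0 Then
        IsLinkValid = (http.status >= 200 And http.status < 400)
    End If
    Err.Clear
    On Error GoTo 0
End Function

' Read one link per line from the text file (placeholder name) and
' print the flag that would go into the "validity" column.
Dim fso, ts, link
Set fso = CreateObject("Scripting.FileSystemObject")
Set ts = fso.OpenTextFile("links.txt", 1) ' 1 = ForReading
Do Until ts.AtEndOfStream
    link = Trim(ts.ReadLine)
    If Len(link) > 0 Then
        If IsLinkValid(link) Then
            WScript.Echo link & vbTab & "valid"
        Else
            WScript.Echo link & vbTab & "invalid"
        End If
    End If
Loop
ts.Close
```

Because the links are re-read from the text file on every run, later changes to that file are picked up without touching the code.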

2) Download those links and save each webpage as an html or txt file on the local drive. For link1, the file name should be link1-new.html. At the same time, save the previously downloaded version (the previous [url removed, login to view]) as [url removed, login to view], yyyymmdd being today's date.

E.g., the link [url removed, login to view] should be saved as link1-new, and the old one as link1-20080909.
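One possible shape for the download-and-rename step, again as a hedged VBScript sketch rather than a finished solution. The names Stamp and DownloadAndRotate, the .html extension on the dated copy, and the choice to fetch the page before renaming the old file (so a failed download leaves the previous copy untouched) are all assumptions made for this example.

```vbscript
Option Explicit

' Formats a date as yyyymmdd, e.g. 20080909.
Function Stamp(d)
    Stamp = Year(d) & Right("0" & Month(d), 2) & Right("0" & Day(d), 2)
End Function

' Fetches url, renames an existing <baseName>-new.html to
' <baseName>-yyyymmdd.html, then saves the fresh page as <baseName>-new.html.
Sub DownloadAndRotate(url, folder, baseName)
    Dim fso, http, stream, newPath, oldPath
    Set fso = CreateObject("Scripting.FileSystemObject")
    newPath = fso.BuildPath(folder, baseName & "-new.html")
    oldPath = fso.BuildPath(folder, baseName & "-" & Stamp(Date) & ".html")

    ' Download first, so nothing on disk changes if the request fails.
    Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    http.open "GET", url, False
    http.send
    If http.status <> 200 Then
        Err.Raise vbObjectError + 1, "DownloadAndRotate", "HTTP " & http.status
    End If

    ' Keep the previous download under today's date.
    If fso.FileExists(newPath) Then
        If fso.FileExists(oldPath) Then fso.DeleteFile oldPath, True
        fso.MoveFile newPath, oldPath
    End If

    ' Save the raw response bytes so the page encoding is preserved.
    Set stream = CreateObject("ADODB.Stream")
    stream.Type = 1              ' 1 = adTypeBinary
    stream.Open
    stream.Write http.responseBody
    stream.SaveToFile newPath, 2 ' 2 = adSaveCreateOverWrite
    stream.Close
End Sub
```

A call such as DownloadAndRotate url, "C:\pages", "link1" would then leave link1-new.html and, if an earlier download existed, a dated copy next to it; the folder path here is only a placeholder.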

3) Compare [url removed, login to view] with [url removed, login to view]. If they are the same, mark a status column (the "download status" column) with "same-yyyymmdd" in the "links" worksheet.

If they are not the same, or [url removed, login to view] did not exist (i.e., it's a new link), then copy the webpage to the Excel spreadsheet (sheet name "all-pages"), and put the modified page below the existing rows (i.e., "all-pages" may already have many rows from other links and previous downloads).

The 1st column of "all-pages" should be the link address

The 2nd column of the "all-pages" should be today's date

The 3rd column of the "all-pages" should be "same/new", explained below

The 4th and later columns hold the downloaded webpage
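The comparison in step 3 might be sketched like this in VBScript. The function names ReadFileText and CompareDownloads and the three return values are assumptions for illustration; the sketch compares the two saved files byte for byte as text and does not try to ignore cosmetic differences such as timestamps embedded in a page.

```vbscript
Option Explicit

' Reads a whole file as text; returns "" for an empty file
' (calling ReadAll on an empty file would raise an error).
Function ReadFileText(fso, path)
    Dim ts
    ReadFileText = ""
    If fso.GetFile(path).Size > 0 Then
        Set ts = fso.OpenTextFile(path, 1) ' 1 = ForReading
        ReadFileText = ts.ReadAll
        ts.Close
    End If
End Function

' Returns "new" when there is no previous copy, "same" when both
' files have identical content, and "changed" otherwise.
Function CompareDownloads(newPath, oldPath)
    Dim fso
    Set fso = CreateObject("Scripting.FileSystemObject")
    If Not fso.FileExists(oldPath) Then
        CompareDownloads = "new"
    ElseIf fso.GetFile(newPath).Size <> fso.GetFile(oldPath).Size Then
        CompareDownloads = "changed" ' different sizes cannot match
    ElseIf ReadFileText(fso, newPath) = ReadFileText(fso, oldPath) Then
        CompareDownloads = "same"
    Else
        CompareDownloads = "changed"
    End If
End Function
```

A result of "same" would map to writing "same-yyyymmdd" in the "download status" column, while "new" and "changed" would both trigger the copy into "all-pages".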

4) For each new row added to the "all-pages" sheet (i.e., rows from webpages that have changed), compare it with the old rows from the same link in the "all-pages" sheet. Mark each new row that matches an old row with "same" in the 3rd column of "all-pages"; otherwise, mark it "new".
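The row marking in step 4 could be approached with a dictionary lookup, sketched below in VBScript. It assumes each row of a page can be reduced to a single text value for comparison and that "same" means an identical text already exists among the old rows for that link; the name FlagRows and the sample values are hypothetical.

```vbscript
Option Explicit

' oldRows and newRows are arrays of row text for ONE link.
' Returns an array of "same"/"new" flags, one per entry in newRows.
Function FlagRows(oldRows, newRows)
    Dim seen, i, flags()
    Set seen = CreateObject("Scripting.Dictionary")

    ' Remember every row text already stored for this link.
    For i = LBound(oldRows) To UBound(oldRows)
        seen(oldRows(i)) = True
    Next

    ' Flag each freshly added row by looking it up.
    ReDim flags(UBound(newRows))
    For i = LBound(newRows) To UBound(newRows)
        If seen.Exists(newRows(i)) Then
            flags(i) = "same"
        Else
            flags(i) = "new"
        End If
    Next
    FlagRows = flags
End Function

' Example with made-up row text.
Dim result
result = FlagRows(Array("Paper A", "Paper B"), Array("Paper B", "Paper C"))
WScript.Echo Join(result, ", ")
```

In the workbook itself, the two arrays would be filled from the "all-pages" rows whose 1st column equals the link being processed, and the flags written back to the 3rd column.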

5) The information we need is the paper title, link, and authors. Note that these pages can have very different formats.

6) We need VBS and VBA source code, not an application.

Thanks.

Excel Visual Basic

Project ID: #317063

About the project

7 proposals Remote project Active Sep 28, 2008