Google Tag Website Spider / Scraper

Status: IN PROGRESS
Bids: 4
Avg Bid (USD): N/A
Project Budget (USD): $30 - $250

Project Description:
To design a web scraper that takes a CSV list of web domains and scans every page and sub-page of each domain for the presence of the Google Website Optimizer JavaScript tag.
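
As a rough illustration of the detection step, the Python sketch below checks a page's HTML source for strings that the Website Optimizer control script is assumed to contain. The marker strings (`utmx_section`, `siteopt.js`, `utmxkey`) and the function name `has_gwo_tag` are assumptions for illustration only and should be checked against a real example of the tag before being relied on.

```python
import re

# Assumed markers of the Website Optimizer tag; verify against a real example.
GWO_MARKERS = [
    re.compile(r"utmx_section\s*\(", re.IGNORECASE),
    re.compile(r"siteopt\.js", re.IGNORECASE),
    re.compile(r"utmxkey", re.IGNORECASE),
]


def has_gwo_tag(html):
    """Return True if any of the assumed markers appears in the page source."""
    return any(pattern.search(html) for pattern in GWO_MARKERS)
```

Matching on several independent markers rather than one exact snippet is meant to tolerate minified or slightly altered copies of the tag. A tag that is injected only at runtime by another script would not be visible to this kind of static check.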

Further details:

- the application should take a CSV list of web domains as input and scan all pages and sub-pages of each domain for the presence of the Google Website Optimizer content generation tag. The tag can be obtained by registering at https://www.google.com/analytics/siteopt/splash and setting up a dummy test, or I can provide an example

- the proposed means of detecting the tag must ensure that all instances of the tag are detected. I will defer to your technical expertise on this matter

- the output should be a list of those domains which include the specified tag, specifying the pages where the tag was found

- programming language used does not matter for this project

- application / script must be able to run on a Windows XP PC

- application must be capable of working from a list of 1 million domains (Alexa top 1m sites list, too large to be attached but can be downloaded / supplied if interested)

- I anticipate that the scan will take some time, so this application must be able to run unattended and have a pause function. In case of an error it should quit without losing progress.
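
The unattended, pausable, progress-preserving behaviour described above could be sketched along the following lines in Python. This is only one possible approach under stated assumptions: the function names, the `PAUSE` flag file, the progress file, and the `check_domain` callable (which would crawl one domain and return the pages where the tag was found) are all hypothetical, and the domain is assumed to be in the last column of each CSV row.

```python
import csv
import os


def load_progress(progress_path):
    """Return how many input rows were already processed (0 if none)."""
    if not os.path.exists(progress_path):
        return 0
    with open(progress_path) as f:
        text = f.read().strip()
    return int(text) if text else 0


def save_progress(progress_path, count):
    """Write the checkpoint to a temp file, then swap it into place."""
    tmp = progress_path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(count))
    os.replace(tmp, progress_path)


def run_scan(domains_csv, results_csv, progress_path, check_domain,
             pause_flag="PAUSE"):
    """Scan domains in order, resuming after the last checkpointed row."""
    done = load_progress(progress_path)
    with open(domains_csv, newline="") as src, \
            open(results_csv, "a", newline="") as out:
        writer = csv.writer(out)
        for index, row in enumerate(csv.reader(src)):
            if index < done or not row:
                continue  # already processed, or a blank line
            if os.path.exists(pause_flag):
                break  # hypothetical pause mechanism: create this file to stop
            domain = row[-1].strip()  # assumption: domain is the last column
            pages = list(check_domain(domain))  # an error here keeps the checkpoint
            for page in pages:
                writer.writerow([domain, page])
            out.flush()
            save_progress(progress_path, index + 1)
    return load_progress(progress_path)
```

If `check_domain` raises an error, the run stops with the checkpoint still pointing at the failed domain, so a restart picks up from there. Creating a file named `PAUSE` stops the run before the next domain, and deleting it and restarting resumes the scan.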

Skills required:
C Programming, Internet Marketing, SEO, Visual Basic, Website Design
About the employer:
Verified