Closed

Google Tag Website Spider / Scraper

This project was awarded to MAnkita for $125 USD.

Project Budget: $30 - $250 USD
Total Bids: 4
Project Description

The goal is to design a web scraper that takes a list of web domains from a CSV file and scans every page and sub-page of each domain for the presence of the Google Website Optimizer JavaScript tag.
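As a rough illustration of the detection step, here is a minimal Python sketch that checks one page's HTML for the tag. The marker strings `utmx_section` and `siteopt.js` are assumptions about what the Website Optimizer control script contains and should be replaced with strings taken from the real tag. The function name `has_optimizer_tag` is hypothetical. Matching is limited to script bodies and script `src` attributes so that ordinary page text mentioning the markers is not counted.

```python
import re

# ASSUMPTION: substrings expected inside the Website Optimizer tag.
# Replace these with strings copied from the actual tag before relying on them.
ASSUMED_MARKERS = ("utmx_section", "siteopt.js")

# Inline script bodies, e.g. <script>...</script>
SCRIPT_BODY_RE = re.compile(
    r"<script\b[^>]*>(.*?)</script>", re.IGNORECASE | re.DOTALL
)
# External script URLs, e.g. <script src="...">
SCRIPT_SRC_RE = re.compile(
    r"<script\b[^>]*\bsrc\s*=\s*[\"']?([^\"'\s>]+)", re.IGNORECASE
)


def has_optimizer_tag(html, markers=ASSUMED_MARKERS):
    """Return True if any marker occurs in a script body or script src URL."""
    haystacks = SCRIPT_BODY_RE.findall(html) + SCRIPT_SRC_RE.findall(html)
    return any(
        marker.lower() in text.lower()
        for text in haystacks
        for marker in markers
    )
```

A regular-expression scan like this is only a starting point. Tags injected at run time by other scripts would not appear in the downloaded HTML, so covering every case may need a different approach.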

Further details:

- the application should take a CSV list of web domains as input and scan all pages and sub-pages for the presence of the Google Website Optimizer content generation tag. The tag can be obtained by registering at [url removed, login to view] and setting up a dummy test, or I can provide an example

- the proposed means of detecting the tag must ensure that all cases of the tag are detected. I will defer to your expert technical view on this matter

- the output should be a list of those domains which include the specified tag, specifying the pages where the tag was found

- programming language used does not matter for this project

- application / script must be able to run on a Windows XP PC

- application must be capable of working from a list of 1 million domains (Alexa top 1m sites list, too large to be attached but can be downloaded / supplied if interested)

- I anticipate that the scan will take some time, so this application must be able to run in an unattended mode and have a pause function. In case of error it should quit without losing progress.
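The unattended, pausable and resumable run described above could be sketched in Python roughly as follows. This is one possible design under stated assumptions, not a finished implementation. The function `run_scan`, the progress file, and the `PAUSE` file convention are all hypothetical names introduced for illustration. `check_domain` stands for whatever crawl-and-detect routine is used and is assumed to return the list of page URLs where the tag was found. Rows are assumed to hold the domain in the last column, as in a "rank,domain" list.

```python
import csv
import os
import time


def run_scan(domains_csv, results_csv, progress_file, check_domain,
             pause_file="PAUSE", poll_seconds=5.0):
    """Scan domains in order, resuming after the last completed row."""
    done = 0
    if os.path.exists(progress_file):
        with open(progress_file) as f:
            done = int(f.read().strip() or 0)

    with open(domains_csv, newline="") as src, \
            open(results_csv, "a", newline="") as out:
        writer = csv.writer(out)
        for index, row in enumerate(csv.reader(src)):
            if index < done or not row:
                continue  # already processed in an earlier run, or blank
            while os.path.exists(pause_file):
                time.sleep(poll_seconds)  # paused until the file is removed
            domain = row[-1].strip()  # ASSUMPTION: domain is the last column
            pages = check_domain(domain)  # an exception here stops the run
            for page_url in pages:
                writer.writerow([domain, page_url])
            out.flush()
            # Save progress only after results are flushed, via a temp file
            # so an interrupted write cannot corrupt the saved position.
            tmp = progress_file + ".tmp"
            with open(tmp, "w") as f:
                f.write(str(index + 1))
            os.replace(tmp, progress_file)
```

Creating a file named `PAUSE` in the working directory pauses the scan before the next domain, and deleting it resumes. If `check_domain` raises an error, the run stops with the progress file pointing at the failed domain, so the next run starts there. A crash between flushing results and saving progress would cause one domain to be rescanned, which may produce duplicate rows for that domain.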
