Web site scraping/mapping v2

Project Budget (AUD)
$250 - $750

Project Description:

We need the following:

For each of the six sites below, we need a map of the product hierarchy and the number of products within each lowest-level category.

The result will look like the attached spreadsheet. There will be six spreadsheets, one for each brand, each with 3 tabs.

We also need screen grabs of just the menus: one of the main menu as it appears when you hover over each top menu item, and one of the left-hand nav as it appears when you click on each top menu item. So each spreadsheet will have twice as many images as there are top-level menu categories. You can use FireShot for this.
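As a quick check on the image count, the sketch below lists the screen grabs expected for one brand: two per top-level menu item. The function name, the file-naming scheme, and the example menu items are assumptions made for illustration only; they are not part of the brief.

```python
from typing import List


def screenshot_plan(brand: str, top_menu: List[str]) -> List[str]:
    """Return the expected screen-grab file names for one brand.

    Two images per top-level menu item: the hover menu and the left-hand nav.
    The naming scheme is hypothetical.
    """
    plan = []
    for item in top_menu:
        slug = item.lower().replace(" ", "-")
        plan.append(f"{brand}_{slug}_hover.jpg")    # main menu on hover
        plan.append(f"{brand}_{slug}_leftnav.jpg")  # left-hand nav after click
    return plan


if __name__ == "__main__":
    # Example menu items are made up for the demonstration.
    for name in screenshot_plan("brand-a", ["Womens", "Toys", "Electricals"]):
        print(name)
```

A checklist like this makes it easy to spot a missing image before the spreadsheet is delivered.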

This will require some intelligence in working out how each site actually navigates to each category. Navigation can also vary by area within a site, e.g. electronics may differ from toys. However, once you figure it out, it is easy. E.g. on John Lewis, for Toys, click on "by age", then click on "toys by age", and then on "view all". Then you can copy the results to Notepad and then into Excel.

Each of the web sites approaches the category presentation differently.

The way to get the hierarchy is to click on each element of the main menu, scrape the categories in the left-hand nav, then click on each of those and scrape the left-hand nav categories again, until there are no more categories. REMEMBER – each site gets you to categories differently. When there are no more categories below, you need to get the number of products in the lowest-level category. This can be taken from either the number of products shown on the lister page or the number of results in brackets next to the price field in the left-hand navigation.
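If you choose to automate, the click-and-scrape loop described above amounts to a depth-first walk of the left-hand nav. The following is only a minimal sketch under stated assumptions: `FAKE_SITE`, `fetch_left_nav`, `split_count` and `walk` are hypothetical names, the site is simulated with an in-memory dictionary rather than real page loads, and counts are assumed to appear in brackets after the label, e.g. "Outdoor toys (12)". Each real site will need its own way of fetching the nav.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for a site: page name -> labels in its left-hand nav.
# A real run would replace fetch_left_nav with code that loads the page.
FAKE_SITE: Dict[str, List[str]] = {
    "Toys": ["Toys by age", "Outdoor toys (12)"],
    "Toys by age": ["0-2 years (40)", "3-5 years (55)"],
}

# Matches a label followed by a count in brackets, e.g. "Outdoor toys (12)".
COUNT_RE = re.compile(r"^(.*?)\s*\((\d+)\)\s*$")

Row = Tuple[Tuple[str, ...], Optional[int]]


def split_count(label: str) -> Tuple[str, Optional[int]]:
    """Split 'Outdoor toys (12)' into ('Outdoor toys', 12); count may be None."""
    match = COUNT_RE.match(label)
    if match:
        return match.group(1), int(match.group(2))
    return label, None


def fetch_left_nav(page: str) -> List[str]:
    """Return the left-hand nav labels for a page (simulated here)."""
    return FAKE_SITE.get(page, [])


def walk(label: str, path: Tuple[str, ...] = (),
         fetch: Callable[[str], List[str]] = fetch_left_nav) -> List[Row]:
    """Depth-first walk; returns (category path, product count) per leaf."""
    name, count = split_count(label)
    here = path + (name,)
    children = fetch(name)
    if not children:  # no more categories below: this is a lowest-level category
        return [(here, count)]
    rows: List[Row] = []
    for child in children:
        rows.extend(walk(child, here, fetch))
    return rows


if __name__ == "__main__":
    for category_path, product_count in walk("Toys"):
        print(" > ".join(category_path), product_count)
```

On a live site the nav can link back to pages already visited, so a real crawler would also need a guard against revisiting the same URL.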

You can do this manually or automatically. We estimate 3-4 man-hours per site, though we have not proven this. You will have to make your own judgement. The maximum revision to the price if it takes longer than expected is 15%.

Needs to be completed within 3 days.

Sites are:

Additional Project Description:
12/05/2012 at 10:30 PKT

We want the hierarchy captured every time something is clicked on, first from the main menu and then in the left-hand nav. You no longer need to work out what is a category, as you will capture everything and we will work it out afterwards.

You will not need to click on any links in the left-hand nav under a heading of price, colour, brand, size or age.

So, for example, if you click on Womens, record this as level 1 and record everything in the left-hand nav as level 2; then click on Dresses in the left-hand nav and record everything that appears in the left-hand nav as level 3, and so on. However, you never need to click on anything that comes under a heading of price, colour, size, age or brand, though we do want to know whether those headings appear in the nav at each level.
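For anyone automating the revised approach, the level-numbering rule can be sketched as below. This is an illustration under assumptions, not a required method: `FAKE_NAV` and `record` are hypothetical names, the left-hand nav is simulated as a dictionary of headings to links, and the example links are made up. Links under the excluded headings are never followed, but the headings themselves are noted for each page.

```python
from typing import Dict, List, Optional, Tuple

# Headings that should never be clicked, per the brief.
FACET_HEADINGS = {"price", "colour", "brand", "size", "age"}

# Hypothetical stand-in for a site: page -> {left-nav heading: [links]}.
FAKE_NAV: Dict[str, Dict[str, List[str]]] = {
    "Womens": {"Category": ["Dresses", "Coats"], "Brand": ["Acme"]},
    "Dresses": {"Category": ["Maxi dresses"], "Price": ["Under 50"],
                "Colour": ["Red"]},
}

Row = Tuple[int, str, str]


def record(page: str, level: int = 1,
           nav: Optional[Dict[str, Dict[str, List[str]]]] = None,
           rows: Optional[List[Row]] = None) -> List[Row]:
    """Record (level, item, facet headings seen in the nav after clicking it)."""
    if nav is None:
        nav = FAKE_NAV
    if rows is None:
        rows = []
    sections = nav.get(page, {})
    # Note which excluded headings appear at this level without clicking them.
    facets = sorted(h for h in sections if h.lower() in FACET_HEADINGS)
    rows.append((level, page, ", ".join(facets)))
    for heading, links in sections.items():
        if heading.lower() in FACET_HEADINGS:
            continue  # seen and noted above, but never followed
        for link in links:
            record(link, level + 1, nav, rows)
    return rows


if __name__ == "__main__":
    for level, item, facets in record("Womens"):
        print(level, item, facets or "-")
```

The third column records which of the excluded headings were present in the nav at that point, which is the information requested "at each level".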

You still need to record the number of products appearing in the lister, and to save the page screenshot as a JPG in which the text is readable.

Although we have simplified this, the price range stays the same. We will not consider bids of more than 3 days.


Skills required:
Transcription, Web Scraping
Additional Files: Examplse+for+freelancer.xlsx

Hire greggfletcher
$300 in 3 days
$350 in 10 days
$250 in 3 days
$599 in 7 days
$250 in 5 days
Hire webscrapinggurus
$500 in 3 days
$251 in 3 days
$750 in 10 days
Hire bob1982
$750 in 6 days
$250 in 1 day