Website crawlers / scrapers, pref. Python

  • Status Closed
  • Budget N/A
  • Total Bids 20

Project Description


I need 6 website crawlers / scrapers.

Please create 6 scripts, one for each site, that I will run from the command line.

The output should be plain JSON or XML files (whichever you prefer); no database interaction is needed.

Each script should create 2 output files:

* The list of catalogs

* The list of items
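As a rough illustration of the two-file output, here is a minimal sketch using only the Python standard library and JSON. The function name `write_outputs`, the file names `catalogs.json` and `items.json`, and the record layout are assumptions made for the example, not part of the requirements.

```python
import json
from pathlib import Path


def write_outputs(catalogs, items, out_dir="."):
    """Write the two output files and return their paths.

    `catalogs` and `items` are lists of plain dicts; the file names
    below are hypothetical and can be changed freely.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, records in (("catalogs", catalogs), ("items", items)):
        path = out / f"{name}.json"
        with path.open("w", encoding="utf-8") as fh:
            # ensure_ascii=False keeps non-ASCII product names readable
            json.dump(records, fh, ensure_ascii=False, indent=2)
        paths[name] = path
    return paths
```

A Scrapy-based script could produce the same files through its own export mechanism; the sketch only shows the expected shape of the result.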

I would like these scripts to be developed in Python, based on the Scrapy framework. If you want to use any other language or framework, please explain why you've chosen it.

Please note that each product on these sites has 3 types of images. Save the URL of each image:

* small – you see it near product description (135x173 px)

* medium – it's displayed on the same page, when you click on the small image (324x416 px)

* large – it's displayed when you click on the medium image (1920x2462 px)
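One possible way to carry the three image URLs in each item record is sketched below. The class names, the `id`, `name` and `catalog_id` fields, and the example.com URLs are all assumptions for illustration; only the small / medium / large split comes from the description above.

```python
from dataclasses import dataclass, asdict


@dataclass
class ProductImages:
    small: str   # thumbnail next to the product description
    medium: str  # shown after clicking the small image
    large: str   # shown after clicking the medium image


@dataclass
class Item:
    id: str          # hypothetical field
    name: str        # hypothetical field
    catalog_id: str  # hypothetical link back to the catalog list
    images: ProductImages


def item_to_record(item):
    """Convert an Item (including nested images) into a plain dict for JSON output."""
    return asdict(item)


example = Item(
    id="sku-1",
    name="Sample product",
    catalog_id="cat-1",
    images=ProductImages(
        small="https://example.com/img/sku-1_small.jpg",
        medium="https://example.com/img/sku-1_medium.jpg",
        large="https://example.com/img/sku-1_large.jpg",
    ),
)
record = item_to_record(example)
```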

For each Catalog item please save the Name, ID and the Parent ID.
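The Name / ID / Parent ID requirement amounts to storing the catalog tree as a flat list. A minimal sketch follows, assuming the scraper has already collected the catalogs as nested dicts with a hypothetical `children` key; top-level catalogs get a parent ID of `None`.

```python
def flatten_catalogs(tree, parent_id=None):
    """Flatten a nested catalog tree into rows of id, name and parent_id.

    `tree` is a list of dicts with "id", "name" and an optional
    "children" list (an assumed intermediate format).
    """
    rows = []
    for node in tree:
        rows.append({"id": node["id"], "name": node["name"], "parent_id": parent_id})
        # Recurse into sub-catalogs, passing this node's id as their parent
        rows.extend(flatten_catalogs(node.get("children", []), node["id"]))
    return rows


tree = [
    {"id": 1, "name": "Women", "children": [{"id": 2, "name": "Dresses"}]},
    {"id": 3, "name": "Men"},
]
rows = flatten_catalogs(tree)
```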

This will be a fixed-price assignment. For a proficient programmer, this should take no more than a couple of hours.

If you can comfortably complete this job, there is a great opportunity for many future jobs.

Payment will only be made upon completion of all 6 scrapers. No deposit will be made whatsoever!
