Closed

[PYTHON] Parsing data from websites to CSV

This project received 12 bids from talented freelancers with an average bid price of €189 EUR.

Employer working
Skills Required
Project Budget: N/A
Total Bids: 12
Project Description

Hi Freelancers,

As part of my work, I frequently need to parse a large amount of data from three different websites.
So I would like a Python script that performs this task and writes all the data to CSV files.
Three CSV outputs: one for PTE Academic Test Centers, one for IELTS Test Centers, and one for IELTS Institutions Accepting.
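To illustrate the CSV output step, here is a minimal sketch using only the standard library. The function name `write_rows` and the column names are hypothetical placeholders; the real columns would come from the sample CSV files. Opening the file with `newline=""` matters on Windows, where the csv module would otherwise write a blank line after every row.

```python
import csv

# Hypothetical column names, for illustration only.
COLUMNS = ["country", "city", "name", "address"]


def write_rows(path, fieldnames, rows):
    """Write a list of dicts to a CSV file with a header row."""
    # newline="" prevents the extra blank lines that appear on Windows.
    # utf-8-sig adds a byte order mark, which helps Excel detect UTF-8.
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Each of the outputs could then be produced by one call, for example `write_rows("pte_test_centers.csv", COLUMNS, rows)`, where the file name is again only an assumption.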

-- WEBSITES --

- PTE Academic Test Centers
[url removed, login to view]
Country by country, city by city (some countries are split into cities; see the USA for an example)

- IELTS Institutions Accepting
[url removed, login to view]
Go country by country, page by page (some countries have more than one page of Institutions Accepting; see the USA for an example)

- IELTS Test Centers & Prices
The script needs to parse the global list of all the test centers
AND visit each test center in turn to parse its price (please see the attached files)
[url removed, login to view]
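The "country by country, page by page" traversal described above could be sketched as follows. This is only an outline under stated assumptions: `fetch_page` is a hypothetical callable that downloads and parses one results page and returns a list of rows, and an empty list is assumed to mean that the page does not exist.

```python
def crawl(countries, fetch_page):
    """Yield every row for every country, following pages until one is empty.

    fetch_page(country, page_number) is a hypothetical callable supplied by
    the caller; it returns a list of rows, or an empty list past the last page.
    """
    for country in countries:
        page = 1
        while True:
            rows = fetch_page(country, page)
            if not rows:
                break  # no more pages for this country
            for row in rows:
                yield row
            page += 1
```

Keeping the download logic behind `fetch_page` means the loop can be tested without touching the network, and the same loop could serve the city-by-city case by passing cities instead of countries.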


YOU NEED TO PAY ATTENTION TO THE SOURCE CODE OF THESE PAGES:
For PTE Academic Test Centers ([url removed, login to view]), you will need to deal with some unusual iframes.
For IELTS Test Centers & Institutions Accepting ([url removed, login to view] + [url removed, login to view]), you will need to deal with the cryptic [url removed, login to view] values such as __VIEWSTATE__ and the like.
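As a rough sketch of what handling these pages might involve, the snippet below collects hidden form inputs and iframe sources from a page's HTML using only the standard library. It assumes that the cryptic values are hidden `<input>` fields that must be sent back with each form submission, and that the iframe content has to be fetched separately from its `src` URL; neither assumption has been checked against the actual sites. The names `FormStateParser` and `extract_state` are hypothetical.

```python
from html.parser import HTMLParser


class FormStateParser(HTMLParser):
    """Collect hidden <input> fields and <iframe> sources from one page."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = {}
        self.iframe_sources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            name = attributes.get("name")
            if name:
                # Valueless attributes are reported as None; store "" instead.
                self.hidden_fields[name] = attributes.get("value") or ""
        elif tag == "iframe" and attributes.get("src"):
            self.iframe_sources.append(attributes["src"])


def extract_state(html):
    """Return (hidden_fields, iframe_sources) found in an HTML string."""
    parser = FormStateParser()
    parser.feed(html)
    return parser.hidden_fields, parser.iframe_sources
```

The hidden fields returned for one page would be merged into the data posted to request the next page, and each iframe source would be downloaded and parsed as a page in its own right. Scrapy users may prefer the framework's own form handling over a hand-written parser.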

THIS SCRIPT NEEDS TO RUN UNDER WINDOWS. I want a script that works under Windows, not only on Linux.
So the use of the Grab module ([url removed, login to view]) is forbidden, as it does not work on Windows.
Scrapy, Twisted, and others are welcome.

This script should offer the user three options:

1) Parsing IELTS (Test Centers & Prices + Institutions Accepting)
2) Parsing PTE Test Centers
3) Parsing 1) and 2)
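The option menu could be sketched as a small lookup from the user's choice to the list of parsing tasks to run. The task names `"ielts"` and `"pte"` and the function name `select_tasks` are hypothetical, and numbering the combined option as 3 is an assumption.

```python
def select_tasks(choice):
    """Map a menu choice to the list of parsing tasks to run."""
    # Assumption: the combined option is numbered 3.
    menu = {
        "1": ["ielts"],
        "2": ["pte"],
        "3": ["ielts", "pte"],
    }
    key = choice.strip()
    if key not in menu:
        raise ValueError("unknown option: %r" % choice)
    return menu[key]
```

A caller would typically read the choice with `input("Option: ")`, pass it to `select_tasks`, and run one parser per returned task name.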

Attached is an archive with five CSV documents that represent the five outputs I want from this script.


Please BID on this project ONLY if you have the skills to do this job. I will then contact you by private message to find out how you plan to approach it.

Thank you in advance :-)

Carto.
