HTML/web page parser

This project received 14 bids from talented freelancers with an average bid price of $499 USD.

Employer working
Project Budget: $250 - $750 USD
Total Bids: 14
Project Description

Do you have experience parsing HTML, using a headless browser to navigate websites, and turning HTML content and HTML tables into structured data? Then you are needed. Ideally I'd like this written in C#, but I'll take Perl, Python, or Java if the HTML parsing experience is there.

Job Description:
We need a talented application and database developer to create a program that recursively goes through given web pages, parses them and turns the content into structured data.
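
As a rough illustration of "going through given web pages", here is a minimal Python sketch that follows links outward from a start page. It is only one possible shape for the task, not the existing shell program. The names `LinkCollector`, `crawl`, `fetch`, and `SITE` are hypothetical, and `fetch` stands in for whatever actually retrieves a page (an HTTP client or a headless browser); here it is a dictionary lookup so the example runs without network access. An explicit stack is used instead of literal recursion so that deep sites do not hit the recursion limit.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Visit pages reachable from start_url and return {url: html}.

    `fetch` is a caller-supplied function mapping a URL to an HTML
    string, or to None when the page cannot be retrieved (assumption).
    """
    visited = set()
    pages = {}
    stack = [start_url]
    while stack and len(pages) < max_pages:
        url = stack.pop()
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in visited:
                stack.append(absolute)
    return pages


# Hypothetical in-memory "site" used in place of real HTTP requests.
SITE = {
    "http://example.test/": '<a href="a.html">A</a> <a href="/b.html#top">B</a>',
    "http://example.test/a.html": '<a href="/">home</a>',
    "http://example.test/b.html": "<p>leaf</p>",
}
pages = crawl("http://example.test/", SITE.get)
```

A real version would likely also restrict the crawl to the given domains and respect rate limits; those details are left out here.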

- parse HTML and add text and HTML to an RDBMS (preferably SQL Server, but I'm open to others)
- identify tables in HTML, parse content and add to relational tables
- clean data and consolidate over different time periods
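
The first two items above could be sketched roughly as follows. This is a minimal Python example under stated assumptions, not the project's actual design: SQLite stands in for SQL Server so the example is self-contained, the `page` and `cell` table layouts are invented for illustration, and `TableExtractor`, `store_tables`, and `SAMPLE` are hypothetical names. Nested tables, `rowspan`, and `colspan` are not handled.

```python
import sqlite3
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects each <table> as a list of rows of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Basic cleaning: collapse runs of whitespace in the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def store_tables(conn, page_url, html):
    """Save the raw HTML plus one row per table cell; return the cell count."""
    conn.execute("CREATE TABLE IF NOT EXISTS page (url TEXT PRIMARY KEY, html TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cell ("
        "url TEXT, table_idx INTEGER, row_idx INTEGER, col_idx INTEGER, value TEXT)"
    )
    conn.execute("INSERT OR REPLACE INTO page VALUES (?, ?)", (page_url, html))
    parser = TableExtractor()
    parser.feed(html)
    rows = [
        (page_url, t, r, c, value)
        for t, table in enumerate(parser.tables)
        for r, row in enumerate(table)
        for c, value in enumerate(row)
    ]
    conn.executemany("INSERT INTO cell VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical sample page; real input would come from the scraper.
SAMPLE = """
<table>
  <tr><th>Year</th><th>Revenue</th></tr>
  <tr><td>2012</td><td> 1,200 </td></tr>
</table>
"""
conn = sqlite3.connect(":memory:")
count = store_tables(conn, "http://example.test/report", SAMPLE)
```

Storing one row per cell keeps the schema independent of each table's column layout; mapping cells onto typed, purpose-built relational tables would be a later step once the source tables are understood.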

A shell program has already been written, and I will personally oversee the development very closely. There are many details and scenarios to handle in order to get data cleanly from the materials we are looking to scrape.

Your qualifications:

- A work style that is extremely detail oriented
- Strong communication skills
- A complete Elance profile
- References or an established reputation

Desired Skills
Database Programming, HTML parsing, headless browser, regular expressions
