HTML/web page parser
- Status Closed
- Budget $250 - $750 USD
- Total Bids 14
Do you have experience parsing HTML, using a headless browser to navigate websites, and turning HTML content and HTML tables into structured data? Then you are needed. Ideally I'd like this written in C#, but I'll take Perl, Python, or Java if the HTML parsing experience is there.
We need a talented application and database developer to create a program that recursively goes through given web pages, parses them and turns the content into structured data.
- parse HTML and add text and HTML to an RDBMS (preferably SQL Server, but I'm open to others)
- identify tables in HTML, parse content and add to relational tables
- clean data and consolidate over different time periods
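As a rough illustration of the first two tasks above, here is a minimal sketch in Python (one of the languages the posting accepts) that stores a page's raw HTML and flattens each HTML table into relational rows. It is not the existing shell program. It assumes SQLite from the standard library as a stand-in for SQL Server, well-formed tables with explicit closing tags, and no nested tables. The names `TableExtractor`, `store_page`, `pages`, and `cells` are hypothetical.

```python
import sqlite3
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a document as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables, each a list of rows
        self._table = None  # table currently being read
        self._row = None    # row currently being read
        self._cell = None   # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None and self._table is not None:
            self._table.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = None
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def store_page(conn, url, html):
    """Save the raw HTML and one row per table cell (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(id INTEGER PRIMARY KEY, url TEXT, html TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cells "
        "(page_id INTEGER, table_idx INTEGER, row_idx INTEGER, "
        "col_idx INTEGER, value TEXT)"
    )
    cur = conn.execute("INSERT INTO pages (url, html) VALUES (?, ?)", (url, html))
    page_id = cur.lastrowid

    parser = TableExtractor()
    parser.feed(html)
    parser.close()

    rows = [
        (page_id, t, r, c, value)
        for t, table in enumerate(parser.tables)
        for r, row in enumerate(table)
        for c, value in enumerate(row)
    ]
    conn.executemany("INSERT INTO cells VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return page_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    sample = (
        "<table><tr><th>Year</th><th>Total</th></tr>"
        "<tr><td>2010</td><td> 1,200 </td></tr></table>"
    )
    store_page(conn, "http://example.com/report", sample)
    for record in conn.execute("SELECT * FROM cells"):
        print(record)
```

Storing one row per cell keeps the schema independent of each table's shape. Cleaning the data and consolidating it over different time periods would be separate steps on top of this, and real pages would likely need a more forgiving parser and a headless browser for navigation.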
A shell program has already been written, and I will personally oversee the development very closely. There are many details and scenarios to handle in order to get data cleanly from the materials we are looking to scrape.
- A work style that is extremely detail oriented
- Strong communication skills
- A complete Elance profile
- References or an established reputation
Database Programming, HTML parsing, headless browser, regular expressions