Web Scraping of news outlets using C++ into NoSQL databases

  • Status Completed
  • Budget $2 - $8 USD / hour
  • Total Bids 3

Project Description

We are looking for a programmer to develop a C++ scraper for financial news blogs. The code should be reasonably commented and should run on parallel threads. The program should:

Authenticate itself (if necessary) on the website

Create a JSON object saving the contents of the article
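As a rough illustration of the JSON step, here is a minimal sketch that uses only the C++ standard library. The `Article` struct, its field names, and the helpers `json_escape` and `to_json` are hypothetical and are not specified by this posting; a real implementation would more likely use an established JSON library.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical article record; the field names are assumptions for illustration.
struct Article {
    std::string source;
    std::string url;
    std::string title;
    std::string body;
};

// Escape a string so it can be placed inside a JSON string literal.
std::string json_escape(const std::string& in) {
    std::ostringstream out;
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out << "\\\""; break;
            case '\\': out << "\\\\"; break;
            case '\n': out << "\\n";  break;
            case '\r': out << "\\r";  break;
            case '\t': out << "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters become \u00XX escapes.
                    out << "\\u" << std::hex << std::setw(4) << std::setfill('0')
                        << static_cast<int>(c) << std::dec;
                } else {
                    out << c;
                }
        }
    }
    return out.str();
}

// Serialize one article as a compact JSON object.
std::string to_json(const Article& a) {
    return "{\"source\":\"" + json_escape(a.source) +
           "\",\"url\":\"" + json_escape(a.url) +
           "\",\"title\":\"" + json_escape(a.title) +
           "\",\"body\":\"" + json_escape(a.body) + "\"}";
}
```

Calling `to_json` on a filled-in `Article` yields a single-line JSON object that can be stored as a document or written to a file.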

Some websites that will be scraped are:

The Wall Street Journal - [url removed, login to view]

Seeking Alpha - [url removed, login to view]

The Motley Fool - [url removed, login to view]

More websites will be added later, so the scraper should be built from generic components and be easily extensible.
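One possible way to combine extensibility with parallel threads is a common interface per site, with each site scraped in its own asynchronous task. The sketch below is only an assumption about how this could be structured: `SiteScraper`, `StubScraper`, and `run_all` are hypothetical names, and the stub returns placeholder strings instead of performing real HTTP requests or logins.

```cpp
#include <future>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-site interface: a new outlet is supported by adding a subclass.
struct SiteScraper {
    virtual ~SiteScraper() = default;
    virtual std::string name() const = 0;
    // Sites without a login can keep the default, which always succeeds.
    virtual bool authenticate() { return true; }
    // Returns one JSON document (as a string) per scraped article.
    virtual std::vector<std::string> scrape() = 0;
};

// Stand-in implementation that fabricates placeholder results without network access.
class StubScraper : public SiteScraper {
public:
    StubScraper(std::string name, int count) : name_(std::move(name)), count_(count) {}
    std::string name() const override { return name_; }
    std::vector<std::string> scrape() override {
        std::vector<std::string> out;
        for (int i = 0; i < count_; ++i) {
            out.push_back(name_ + " article " + std::to_string(i));
        }
        return out;
    }
private:
    std::string name_;
    int count_;
};

// Scrape every site on its own thread and collect the results in site order.
std::vector<std::string> run_all(const std::vector<std::unique_ptr<SiteScraper>>& scrapers) {
    std::vector<std::future<std::vector<std::string>>> jobs;
    for (const auto& s : scrapers) {
        SiteScraper* site = s.get();
        jobs.push_back(std::async(std::launch::async,
            [site]() -> std::vector<std::string> {
                if (!site->authenticate()) return {};
                return site->scrape();
            }));
    }
    std::vector<std::string> all;
    for (auto& job : jobs) {
        std::vector<std::string> part = job.get();  // blocks until that site is done
        all.insert(all.end(), part.begin(), part.end());
    }
    return all;
}
```

With this layout, supporting another outlet means writing one more subclass and adding it to the vector passed to `run_all`. On GCC and Clang, threaded code like this is typically built with `-pthread`.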

The results will be structured as JSON, preferably inserted into a MongoDB instance (CouchDB may also be used) or, for testing purposes, written to JSON files.
