Blog aggregator similar to Techmeme

  • Status Closed
  • Budget N/A
  • Total Bids 12

Project Description

I am looking to develop a platform similar to [url removed, login to view] but for a different industry. I have a collection of approximately 1,200 blogs that can be used to seed the site.

The site would behave much like Techmeme in that it would:

1) Scrape websites/blogs for data

2) Collect and index the data

3) Cluster articles/posts that relate to each other, either algorithmically or by way of machine learning

4) Present the data in real time in a structured, easy-to-navigate way

5) Provide a backend that allows an administrator/user to do some "editorializing", such as tagging one article/post in a cluster as the top story. The backend also needs to manage all aspects of the website: sponsorships, users, database updates, adding new URLs to the scraper, etc.

6) Provide a means to organize and promote sponsorships throughout the site
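To make step 3 concrete, here is a minimal sketch of one way related posts could be grouped: TF-IDF vectors compared by cosine similarity, with a greedy single-pass assignment. This is an illustration under assumptions, not a required design. It uses only the Python standard library rather than any particular clustering tool, and the names `tokenize`, `tfidf_vectors`, `cosine`, `cluster`, the stopword list, and the 0.3 threshold are all hypothetical choices that would need tuning on real data.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "for", "and", "is", "at", "by", "with"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed inverse document frequency keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering.

    Each document joins the existing cluster whose first member it most
    resembles, provided the similarity reaches the threshold; otherwise it
    starts a new cluster. Returns a list of lists of document indices.
    """
    vectors = tfidf_vectors(docs)
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(vec, vectors[members[0]])
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters
```

Calling `cluster(headlines)` on a list of post titles returns groups of indices into that list. The "top story" tagging from step 5 could then be stored as one chosen index per group. A single-pass approach like this is easy to run incrementally as new posts arrive, at the cost of being sensitive to arrival order.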

Based on my research, this project could be accomplished using a combination of Apache Nutch, Solr, Hadoop, and Mahout.
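As a rough illustration of the indexing role that Solr (which is built on Lucene) would play in such a stack, the sketch below builds a toy inverted index in plain Python. It is not Solr's API and is not meant to replace it; `build_index` and `search` are hypothetical names, and the post ids and texts are assumed to come from whatever the crawler stores.

```python
import re
from collections import defaultdict


def build_index(posts):
    """Map each term to the set of post ids whose text contains it.

    `posts` is assumed to be a dict of post id -> post text.
    """
    index = defaultdict(set)
    for post_id, text in posts.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(post_id)
    return index


def search(index, query):
    """Return the ids of posts that contain every query term (AND semantics)."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    # Copy the first posting set so intersecting never mutates the index.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A real search server adds ranking, stemming, and incremental updates on top of this basic structure, which is why an off-the-shelf index is a sensible choice for step 2.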

This will likely be deployed on a cloud platform such as Amazon Web Services (AWS).

Type of Website: News Media / Informational Content

Other Skills: Hadoop, Mahout, Nutch, Solr, Lucene, Java
