Blog aggregator similar to Techmeme

Avg Bid (USD)
Project Budget (USD): $3000 - $5000

Project Description:
I am looking to develop a platform similar to Techmeme, but for a different industry. I have a collection of approximately 1200 blogs that can be used to seed it.

The site would behave much like Techmeme in that it would:
1) scrape websites/blogs for data
2) collect and index the data
3) cluster related articles/posts, either algorithmically or by way of machine learning
4) present the data in real time in a structured, easy-to-navigate way
5) provide a backend that allows an administrator/user to do some "editorializing", such as tagging one article/post in a cluster as the top story. The backend also needs to manage all aspects of the website: sponsorships, users, database updates, adding new URLs to the scraper, etc.
6) provide a means to organize and promote sponsorships throughout the site.
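
To make steps 3 and 5 more concrete, here is a minimal sketch of one possible approach, not Techmeme's actual method and not Mahout. It assumes plain TF-IDF weighting, cosine similarity, and a greedy single-pass grouping, all written against the Python standard library. The function names, the sample headlines, and the 0.2 threshold are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())
    vectors = []
    for counts in counts_per_doc:
        # Smoothed IDF keeps every weight positive, even for very common terms.
        vectors.append({
            term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, threshold=0.2):
    """Greedy single-pass clustering (hypothetical threshold).

    Each document joins the first cluster whose seed (first) document is
    at least `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    vectors = tfidf_vectors(docs)
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def top_story(members, pinned=None):
    """Pick a cluster's top story: the editor's pinned index if it belongs
    to the cluster, otherwise the first (seed) document."""
    if pinned is not None and pinned in members:
        return pinned
    return members[0]


# Hypothetical sample headlines, used only to exercise the functions.
posts = [
    "Acme launches new solar panel for rooftops",
    "New rooftop solar panel launched by Acme",
    "Regulators approve offshore wind farm permits",
    "Acme solar panel launch draws investor interest",
]
clusters = cluster(posts, threshold=0.2)
for members in clusters:
    print(posts[top_story(members)], "->", members)
```

A single greedy pass is order-dependent and would not scale to 1200 feeds updating in real time without further work, but it shows the shape of the data the backend would need: clusters of article IDs plus an optional editor-pinned top story per cluster.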

Based on my research, this project could be accomplished using a combination of Apache Nutch, Solr, Hadoop, and Mahout.
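
As a rough illustration of how the presentation layer might read from a Solr index, the sketch below only builds a URL for Solr's standard `/select` handler and does not contact a server. The base URL, the core name `articles`, and the field names `title` and `published` are assumptions made for the example; a real schema would define its own.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, text, rows=10):
    """Build a Solr /select URL returning the newest documents matching `text`.

    `title` and `published` are hypothetical field names.
    """
    params = {
        "q": text,                  # user query text
        "df": "title",              # default field to search (assumed)
        "sort": "published desc",   # newest first (assumed date field)
        "rows": rows,               # page size
        "wt": "json",               # ask for a JSON response
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr core named "articles".
url = build_solr_query("http://localhost:8983/solr/articles", "solar panel", rows=5)
print(url)
```

The returned URL could then be fetched by whatever HTTP client the front end uses; the crawl itself (Nutch) and the indexing step are outside the scope of this sketch.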

This will likely be deployed on a platform like Amazon AWS.

Type of Website: News Media / Informational Content
Other Skills: hadoop, mahout, nutch, solr, lucene, java

Skills required:
Apache Solr, Hadoop, Java, Website Design

Bids:
$4950 in 45 days
$3000 in 40 days
$5000 in 45 days
$5000 in 30 days
$5000 in 90 days
$4500 in 90 days
$4000 in 50 days
$5000 in 45 days
$5000 in 90 days
$4000 in 1 day