Closed

Clustering of news websites

This is a complex project (similar to Google News). Please bid only if you have very good experience in scraping, data mining, databases (such as MySQL), and HTML.

*Scraping: I need to scrape 40 news websites continuously for news title, image, and content. I want to be able to easily add new news websites myself (e.g. with XPath expressions). This means I will have to understand your code. A fancy frontend with nice-looking buttons is not needed; I'm a computer scientist with 5+ years of industry experience.

*Data mining: Similar news articles should be identified and clustered into a single news item.

*Database: Everything should be stored in a database.

*View: A website should show the clustered news with links to each news website. The layout of the website is very simple; it's no big deal.
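One way to make new sites easy to add is to keep the XPath expressions as data, separate from the extraction code. The following is only a minimal sketch under stated assumptions: the site name, the `SITE_RULES` layout, the `extract_article` helper, and the sample page are all hypothetical. It uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath and requires well-formed markup; real news pages would likely need a tolerant HTML parser with fuller XPath support.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-site rules: adding a news site means adding one entry here.
SITE_RULES = {
    "example-news": {
        "title": ".//h1[@class='headline']",
        "image": ".//img[@class='lead']",
        "content": ".//div[@class='body']/p",
    },
}


def extract_article(site, markup):
    """Apply one site's XPath rules to a single well-formed article page."""
    rules = SITE_RULES[site]
    root = ET.fromstring(markup)
    title = root.find(rules["title"])
    image = root.find(rules["image"])
    paragraphs = root.findall(rules["content"])
    return {
        "title": "".join(title.itertext()).strip() if title is not None else None,
        "image": image.get("src") if image is not None else None,
        "content": "\n".join("".join(p.itertext()).strip() for p in paragraphs),
    }


# Made-up sample page standing in for a downloaded article.
SAMPLE_PAGE = """<html><body>
<h1 class="headline">Storm hits coast</h1>
<img class="lead" src="/img/storm.jpg"/>
<div class="body"><p>First paragraph.</p><p>Second paragraph.</p></div>
</body></html>"""

article = extract_article("example-news", SAMPLE_PAGE)
print(article)
```

With this layout, supporting another website would only require a new `SITE_RULES` entry with three expressions, without touching the extraction function.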

This might sound easy to you, but it's more complex than I have described. The similar_text or levenshtein functions in PHP are *not* an adequate solution, because they are very limited and produce very poor clustering results.
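To illustrate why character-level string distances fall short, here is a minimal sketch of one common alternative: TF-IDF term weighting with cosine similarity, followed by a greedy single-pass grouping. This is not the method the project prescribes; the function names, the smoothed IDF formula, the threshold of 0.3, and the sample headlines are all assumptions made for illustration. Because it compares bags of words, it is insensitive to word order, which edit distance is not.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed variant) keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Greedy pass: join the first cluster whose first member is similar enough."""
    vectors = tfidf_vectors(docs)
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up headlines: two stories, each reported with different word order.
headlines = [
    "Storm hits northern coast, thousands without power",
    "Thousands without power after storm hits coast",
    "Central bank raises interest rates again",
    "Interest rates raised again by central bank",
]
clusters = cluster(headlines)
print(clusters)  # [[0, 1], [2, 3]]
```

A production system would likely need more than this, for example stemming, stop-word removal, and a clustering step that does not depend on input order, but the sketch shows the basic idea of comparing articles by weighted vocabulary rather than by character edits.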

This is how it will go down: Place your bid and write me a PM. I will write back with all the information about the project so you can see if you are up to it.

The budget is $3000-3500. I could do the job myself, but I don't have time. I don't care about the user interface; I care about business logic. You will have to provide full source code. To show me that you read everything, write "Project1" when bidding. Thanks.

Skills: Data Mining, Data Processing, HTML, MySQL, Web Scraping

About the Employer:
( 0 reviews ) Hamburg, Germany

Project ID: #1154751

22 freelancers are bidding on average $3858 for this job

NishantBamb

Project1. Please refer your Inbox. Thank you.

$4000 USD in 30 days
(150 Reviews)
7.3
tomydeveloper

Hello, "Project1". We are scraping / data mining / MySQL database experts. Please check PMB. Thanks

$3500 USD in 55 days
(166 Reviews)
7.2
roamsoft

Please check your PM. Thanks

$3500 USD in 45 days
(14 Reviews)
6.8
LinkPlusOffshore

Please see PMB.

$3500 USD in 75 days
(4 Reviews)
6.3
zeke

"Project1" Please see PMB for examples of my previous projects related to web scraping. Very interested in this project, please provide details. Thank you, Zeke

$5000 USD in 30 days
(74 Reviews)
6.2
ehsankayani

Hi, I have strong web scraping experience. Unfortunately I am new to this website, but I find this project very interesting. Kindly check my profile on another freelance site, http://www.peopleperhour.com/freelancer

$3000 USD in 30 days
(38 Reviews)
6.1
aruhat

Dear Client, Please see PM. Regards, Chandni

$4000 USD in 18 days
(13 Reviews)
6.1
jeevanoss

Pls see the PMB

$3500 USD in 40 days
(23 Reviews)
5.9
inavigator

Project1... We have built a similar application. Please check PMB

$3500 USD in 30 days
(2 Reviews)
5.4
scriptmindz

Can I start?

$6500 USD in 90 days
(12 Reviews)
4.7
menkaur

Project1. Please, check your PMB

$3000 USD in 30 days
(2 Reviews)
3.7
mktgamigo

Hi, we are a team with 7+ years of experience. We can start the project immediately.

$5000 USD in 30 days
(1 Review)
2.6
golabs1

Hi ditonline... I understand the challenges you describe with this project regarding levenshtein functions etc. I assume the project is to be done in PHP and be server-side. I may need to use a mys...

$3000 USD in 25 days
(3 Reviews)
2.2
cocksure

"Project1" plz check pmb thx

$3000 USD in 5 days
(2 Reviews)
2.0
dragonuet

Project1. I can do it.

$4000 USD in 30 days
(1 Review)
1.8
softfinder500

PLEASE CHECK PM

$4000 USD in 65 days
(0 Reviews)
0.0
riktatech

Please check our PMB

$5000 USD in 75 days
(1 Review)
0.0
Feddo

The basis for "Project1" is already finished! Looking forward to customizing the program to suit your solution. Please refer to your private message.

$3000 USD in 30 days
(0 Reviews)
0.0
DeepakMalik1234

I have worked 4 years as a software developer. Familiar with enterprise technologies (Documentum, different application servers) and many programming languages. Technical skills: Documentum (2 years) DC, WebTop, ...

$3100 USD in 20 days
(0 Reviews)
0.0
deamon1767

Project1. Based on your needs, I would suggest using Joomla and Econtent for this project. It will handle it perfectly. We have all needed software in our archive. We have read your requirements and understand your nee...

$3765 USD in 50 days
(1 Review)
0.0