Erlang: large-scale concurrent real-time website scanning

Cancelled

I'm looking for expert consulting on the feasibility (cost in terms of hardware and development) of scanning several million webpages in real time or close to real time. Longer-term employment may result if you can provide a solution that is feasible and within my budget. If you're interested, you will need to write a small program to demonstrate the feasibility and speed of your proposed solution. Knowledge of Python and MySQL is a plus but not required. This project may close as soon as I find someone suitable (it may not stay open the full 60 days).

We currently have several million HTML websites (none with an RSS feed) stored in a database. I'd like them scanned in real time or nearly real time, because the market data needs to be updated as fast as possible. I'm wondering if Erlang can do this.
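To make the kind of demonstration program more concrete, here is a minimal sketch of the bounded-concurrency pattern such a scanner would likely rely on. It is written in Python rather than Erlang purely for illustration. The names `fetch`, `scan_all`, `run_demo`, the `example.invalid` URLs and the limit of 100 in-flight requests are all assumptions, and `fetch` only sleeps instead of touching the network.

```python
import asyncio


async def fetch(url: str) -> str:
    """Hypothetical stand-in for an HTTP GET; sleeps instead of using the network."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


async def scan_all(urls, max_in_flight=100):
    """Scan every URL concurrently, with at most `max_in_flight` fetches at once."""
    sem = asyncio.Semaphore(max_in_flight)

    async def scan_one(url):
        async with sem:  # wait for a free slot before fetching
            return url, await fetch(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(scan_one(u) for u in urls))


def run_demo(n=500):
    """Build n placeholder URLs (assumed, not real sites) and scan them."""
    urls = [f"http://example.invalid/page/{i}" for i in range(n)]
    return asyncio.run(scan_all(urls))


if __name__ == "__main__":
    results = run_demo()
    print(f"scanned {len(results)} pages")
```

In Erlang the same idea would typically be expressed with one lightweight process per URL and a pool limiting how many are active at once. Whether that reaches the required rate depends on network and parsing costs that this sketch does not model.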

At present, these websites are scanned by brute force using approximately 1,000 IPs and three servers, each with hex-core Xeon processors. These are all connected to a database server on the same network. With this setup, each site is scanned approximately 1-3 times every 1-2 hours, depending on server load. This is too slow for our needs.
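For a rough sense of scale, the figures above can be turned into request rates. The sketch below is a back-of-envelope estimate only. The page count of 3,000,000 and the 60-second refresh target are assumptions (the posting says only "several million" and "nearly real time"), and "1-3 times every 1-2 hours" is read as a range from one scan per two hours to three scans per hour.

```python
# Back-of-envelope throughput estimate for the scanning workload.
PAGES = 3_000_000        # ASSUMPTION: the posting only says "several million"
IPS = 1_000              # from the posting
REFRESH_SECONDS = 60     # ASSUMPTION: one possible meaning of "nearly real time"


def requests_per_second(pages: int, scans: float, hours: float) -> float:
    """Average fetch rate needed to scan `pages` pages `scans` times in `hours` hours."""
    return pages * scans / (hours * 3600)


# Current system, slowest and fastest reading of "1-3 times every 1-2 hours"
slowest = requests_per_second(PAGES, scans=1, hours=2)
fastest = requests_per_second(PAGES, scans=3, hours=1)

# Rate needed to revisit every page once per REFRESH_SECONDS
target = PAGES / REFRESH_SECONDS
target_per_ip = target / IPS

if __name__ == "__main__":
    print(f"current rate: {slowest:.1f} to {fastest:.1f} requests/s")
    print(f"target rate:  {target:.1f} requests/s ({target_per_ip:.1f} per IP)")
```

Under these assumptions the current setup averages roughly 417 to 2,500 requests per second, while a one-minute refresh would need about 50,000 requests per second, or 50 per IP. Different assumptions about page count or refresh interval scale the result linearly.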

Skills: Erlang


Project ID: #912916

1 freelancer is bidding on average $750 for this job

gasparch

Please see PM

$750 USD in 15 days
(1 Review)
3.1