Closed

Erlang: large-scale, concurrent, real-time website scanning

This project received 1 bid, at a price of $750 USD.

Project Budget: N/A
Total Bids: 1
Project Description

I'm looking for expert consulting on the feasibility (cost in terms of hardware and development) of scanning several million webpages in real time or close to real time. Longer-term employment may result if you can provide a solution that is feasible and within my budget. If you're interested, you would need to write a small program to demonstrate the feasibility and speed of your proposed solution. Knowledge of Python and MySQL is a plus, but not required. This project may close as soon as I find someone suitable (it may not stay open the full 60 days).

We currently have several million HTML websites (with no RSS feeds) to scan, stored in a database. I'd like them scanned in real time or nearly real time, because the market data needs to be updated as fast as possible. I'm wondering whether Erlang can do this.
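For bidders thinking about the demonstration program, here is a minimal sketch of the concurrency pattern in question: a fixed pool of lightweight workers pulling URLs from a shared queue. It is written in Python rather than Erlang, purely to illustrate the shape of the problem; an Erlang version would use processes instead of coroutines. The names `scan_all`, `fake_fetch`, and `demo` are hypothetical, and `fake_fetch` only simulates network latency instead of making real HTTP requests.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Iterable, Tuple


async def scan_all(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[str]],
    workers: int = 100,
) -> Dict[str, object]:
    """Fetch every URL using a fixed pool of concurrent workers."""
    queue: asyncio.Queue = asyncio.Queue()
    for url in urls:
        queue.put_nowait(url)

    results: Dict[str, object] = {}

    async def worker() -> None:
        # Each worker drains the shared queue until it is empty.
        while True:
            try:
                url = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            try:
                results[url] = await fetch(url)
            except Exception as exc:
                # Record the failure so one bad site does not stop the scan.
                results[url] = exc

    await asyncio.gather(*(worker() for _ in range(workers)))
    return results


async def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP request (assumption: about 10 ms latency)."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


def demo(n: int = 1000, workers: int = 100) -> Tuple[int, float]:
    """Scan n placeholder URLs; return (pages scanned, seconds elapsed)."""
    urls = [f"http://example.invalid/{i}" for i in range(n)]
    start = time.perf_counter()
    results = asyncio.run(scan_all(urls, fake_fetch, workers))
    return len(results), time.perf_counter() - start


if __name__ == "__main__":
    count, elapsed = demo()
    print(f"scanned {count} pages in {elapsed:.2f}s")
```

Replacing `fake_fetch` with a real HTTP client and timing the run against a sample of the actual site list would give a first estimate of achievable pages per second on a single machine.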

Currently, these websites are scanned by brute force using approximately 1,000 IP addresses and three servers, each with hex-core Xeon processors. The servers are all connected to a database server on the same network. With this setup, the scanning rate is approximately 1-3 passes every 1-2 hours, depending on server load. This is too slow for our needs.
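To put the figures above in perspective, here is a rough back-of-the-envelope calculation of the request rate involved. It rests on assumptions not stated in the posting: "several million" is taken as 3,000,000 pages, the stated rate is read as 1-3 full passes over the list every 1-2 hours, and "nearly real time" is taken to mean one full pass per minute. The function name `required_rps` is hypothetical.

```python
def required_rps(num_pages: int, passes: float, window_seconds: float) -> float:
    """Average requests per second to scan every page `passes` times per window."""
    return num_pages * passes / window_seconds


PAGES = 3_000_000   # assumption: "several million" taken as 3 million
IPS = 1_000         # from the posting: approximately 1,000 IP addresses

# Current setup: between 1 pass per 2 hours and 3 passes per 1 hour.
current_low = required_rps(PAGES, 1, 2 * 3600)
current_high = required_rps(PAGES, 3, 1 * 3600)

# Assumed target: one full pass every 60 seconds.
target = required_rps(PAGES, 1, 60)

print(f"current: {current_low:.0f} to {current_high:.0f} requests/s")
print(f"per IP at the high end: {current_high / IPS:.1f} requests/s")
print(f"target (1 pass/min): {target:.0f} requests/s")
```

Under these assumptions, the target is roughly 20 to 120 times the current rate, which suggests the feasibility question is as much about bandwidth, IP capacity, and database write throughput as about the choice of language.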
