Closed

Efficiently process many HTTP POST requests simultaneously

This project received 8 bids from talented freelancers with an average bid price of $526 USD.

Skills Required:
Project Budget: N/A
Total Bids: 8
Project Description

IMPORTANT:
Please read the explanation very carefully and make sure you understand the task well. Only respond if you are 100% sure that you can do the job.

Currently I'm using PHP's multi_curl (curl_multi) to execute a large number of PHP scripts simultaneously. Around 100 HTTP requests have to be processed at the same time. Those requests go both to the same server and to remote servers. Along with each HTTP request, a POST string is also sent to the PHP scripts. The problem is that the current solution, which uses PHP's built-in multi_curl functions, is very slow. Also, most of the scripts are not fired right away, meaning there is a delay in starting them.
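For reference, a batch-style curl_multi setup like the one described above usually looks roughly like the sketch below. The target URL, POST string, and request count are placeholders, not the actual scripts from this project:

<?php
// Rough sketch of a batch-style curl_multi run with a POST payload per request.
// The target URL, POST string, and request count are placeholder assumptions.

$urls      = array_fill(0, 100, 'http://example.com/worker.php'); // ~100 target scripts
$postData  = 'key=value&foo=bar';                                  // POST string sent with each request

$mh        = curl_multi_init();
$handles   = array();
$responses = array();

// Create one cURL handle per request and attach it to the multi handle.
foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // capture the response instead of printing it
    curl_setopt($ch, CURLOPT_POST, true);           // send the request as HTTP POST
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// Run all handles; this loop only finishes once the slowest request is done,
// which is exactly the delay problem described above.
do {
    $status = curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for network activity instead of spinning
    }
} while ($running && $status == CURLM_OK);

// Collect the responses and clean up.
foreach ($handles as $i => $ch) {
    $responses[$i] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);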

The tasks:

1. Configure a Linux server properly so it can process the scripts efficiently. Describe every configuration step very precisely, in such a way that anyone can do it by just following each step.
2. Build a new PHP script ([url removed, login to view]) that makes 200 requests to the script below on the same server. Note that the URL is just an example; you can choose any website that can easily handle that number of requests.

Keep in mind the following:

I need a step-by-step description of how to solve this problem. Before every line of code, please add a comment explaining what it does and what its input and output are. I prefer not to install any additional software, but if you have good arguments for it, this is an option. Also read the background information below on the current method I'm using.

The script (which will be called 200 times simultaneously):

Background information on multi-curl

A more efficient implementation of curl_multi()

curl_multi is a great way to process multiple HTTP requests in parallel in PHP. curl_multi is particularly handy when working with large data sets (like fetching thousands of RSS feeds at one time). Unfortunately there is very little documentation on the best way to implement curl_multi. As a result, most of the examples around the web are either inefficient or fail entirely when asked to handle more than a few hundred requests.

The problem is that most implementations of curl_multi wait for each set of requests to complete before processing them. If there are too many requests to process at once, they usually get broken into groups that are then processed one at a time. The problem with this is that each group has to wait for the slowest request to download. In a group of 100 requests, all it takes is one slow one to delay the processing of 99 others. The larger the number of requests you are dealing with, the more noticeable this latency becomes.
The solution is to process each request as soon as it completes. This eliminates the wasted CPU cycles from busy waiting. I also created a queue of cURL requests to allow for maximum throughput. Each time a request is completed, I add a new one from the queue. By dynamically adding and removing links, we keep a constant number of links downloading at all times. This gives us a way to throttle the amount of simultaneous requests we are sending. The result is a faster and more efficient way of processing large quantities of cURL requests in parallel.
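A hedged sketch of that queue-based ("rolling") approach is shown below: a fixed number of transfers is kept active, and a new request is started from the queue each time one finishes. The function name, window size, URLs, POST string, and callback are illustrative assumptions, not code from this project; the usage example at the end mirrors task 2 (200 requests):

<?php
// Sketch of a rolling curl_multi queue: a constant number of requests is kept
// in flight, and each completed request is processed as soon as it arrives.

function rolling_curl(array $urls, $postData, $windowSize, callable $onDone)
{
    $mh     = curl_multi_init();
    $queue  = $urls;   // requests that have not been started yet
    $active = 0;       // handles currently attached to the multi handle

    // Helper: take the next URL off the queue and start it.
    $startNext = function () use (&$queue, &$active, $mh, $postData) {
        if (!$queue) {
            return;
        }
        $ch = curl_init(array_shift($queue));
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
        curl_multi_add_handle($mh, $ch);
        $active++;
    };

    // Fill the window with the first batch of requests.
    for ($i = 0; $i < $windowSize; $i++) {
        $startNext();
    }

    do {
        curl_multi_exec($mh, $running);
        curl_multi_select($mh); // sleep until at least one transfer has activity

        // Process every request that just finished, then top the window back up.
        while ($info = curl_multi_info_read($mh)) {
            $ch = $info['handle'];
            $onDone(curl_multi_getcontent($ch), curl_getinfo($ch, CURLINFO_HTTP_CODE));
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
            $active--;
            $startNext();
        }
    } while ($running || $active > 0 || $queue);

    curl_multi_close($mh);
}

// Example use: 200 POST requests with never more than 20 in flight at once.
rolling_curl(
    array_fill(0, 200, 'http://example.com/worker.php'),
    'key=value',
    20,
    function ($body, $httpCode) {
        // process each response here as soon as it arrives
    }
);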
