PHP HTML DOM Web Scraping Issues

  • Status Closed
  • Budget $30 - $250 USD
  • Total Bids 17

Project Description


I'm using the 'Simple HTML DOM Parser' to scrape a few pages from Amazon. It works some of the time, but Amazon keeps returning a CAPTCHA page that stops the script, because it detects that I'm using a scraper. I can usually request about 3 pages before the script is stopped with the following message: "Sorry, we just need to make sure you're not a robot. For best results, please make sure your browser is accepting cookies."

What I've Tried:

I've tried setting user-agent strings. I've also tried spacing the requests 45-60 seconds apart. Neither has worked consistently.
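
For reference, a minimal sketch of what those two attempts might look like in PHP, assuming the pages are fetched with cURL and then handed to the parser. The helper names (`build_curl_options`, `random_delay_seconds`, `looks_like_captcha`, `fetch_page`), the cookie file, and the timeout are hypothetical and are not taken from the actual test page. The cookie jar is included only because the block message mentions cookies; nothing here is known to get past the check.

```php
<?php
// Illustrative sketch only: all function names and values below are assumptions.

/** Build the cURL options for one request: user agent plus a persistent cookie file. */
function build_curl_options(string $url, string $userAgent, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies here when the handle closes
        CURLOPT_COOKIEFILE     => $cookieFile, // send previously stored cookies
        CURLOPT_TIMEOUT        => 30,
    ];
}

/** Pick a pause between requests, matching the 45-60 second spacing described above. */
function random_delay_seconds(int $min = 45, int $max = 60): int
{
    return random_int($min, $max);
}

/** Detect the block page by the text quoted in the project description. */
function looks_like_captcha(string $html): bool
{
    return stripos($html, 'not a robot') !== false;
}

/** Fetch one page; returns null if the transfer itself fails. */
function fetch_page(string $url, string $userAgent, string $cookieFile): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $userAgent, $cookieFile));
    $html = curl_exec($ch);
    return $html === false ? null : $html;
}
```

A caller would `sleep(random_delay_seconds())` between calls to `fetch_page()`, skip any response for which `looks_like_captcha()` returns true, and pass the remaining HTML to the library's `str_get_html()` so the existing parsing code stays unchanged.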

What I Need:

I've set up a test page for someone to get working properly (consistently). It contains the scraping function and includes the 'Simple HTML DOM' library. For now the script just echoes all of the returned HTML, so you can see whether the page came back or whether Amazon blocked the request with a CAPTCHA. I'd like to keep the library I'm using ([url removed, login to view]) because I have other scripts based on it. I also need this completed ASAP: tonight, or tomorrow at the latest.

Thank you.
