PHP5 CLI - Crawl for host and domain names

This project received 4 bids from talented freelancers with an average bid price of ₱10000 PHP.

Project Description

I need a PHP 5.3+ CLI crawler that uses cURL and DOM to extract host and domain names from websites. Crawling must read and follow [url removed, login to view] files. It must be multi-threaded so that crawling is fast and efficient. The crawler takes a host name supplied by a JSON data feed ([url removed, login to view]) and returns, in JSON format, a list of ALL domains and hostnames that the site links to. The list must be unique, so no hostname or domain is repeated. It will then be submitted via an API to another script. The system MUST be very memory efficient and follow the recommended PHP 5.3+ programming standards.
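As a rough illustration of the unique-list requirement, here is a minimal sketch that reduces a batch of URLs to a de-duplicated, JSON-encoded list of host names. The function name `collectHosts` and the sample URLs are assumptions made for this example, not part of the brief. Storing hosts as array keys is one way to keep memory use low, since duplicates collapse as they are added.

```php
<?php
// Hypothetical helper: reduces a list of URLs to a sorted list of unique hosts.
function collectHosts(array $urls)
{
    $seen = array(); // hosts are stored as keys, so repeats overwrite themselves
    foreach ($urls as $url) {
        $host = parse_url($url, PHP_URL_HOST);
        // Relative links and malformed URLs yield no host and are skipped.
        if (is_string($host) && $host !== '') {
            $seen[strtolower($host)] = true; // host names are case-insensitive
        }
    }
    // PHP turns purely numeric keys into integers, so cast back to strings.
    $hosts = array_map('strval', array_keys($seen));
    sort($hosts, SORT_STRING);
    return $hosts;
}

// Sample input, invented for illustration.
$links = array(
    'http://example.com/a',
    'https://cdn.example.net/lib.js',
    'http://EXAMPLE.com/b',
    '/relative/path',
);
echo json_encode(collectHosts($links)), PHP_EOL;
// prints ["cdn.example.net","example.com"]
```

The sketch uses `array()` rather than the short `[]` syntax so that it stays valid on PHP 5.3.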

Host and domain names should be extracted from images, scripts, and href values, but the code should allow more sources to be added later.
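One way to keep the extraction open to expansion is to drive it from a tag-to-attribute map, so that a new source is a configuration change rather than a code change. The sketch below assumes a hypothetical function `extractLinkValues` and a hypothetical map; only `DOMDocument` and the libxml functions are standard PHP.

```php
<?php
// Hypothetical helper: returns the raw attribute values for each tag in the map.
function extractLinkValues($html, array $tagAttributes)
{
    if (trim($html) === '') {
        return array(); // DOMDocument::loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world HTML is often invalid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $values = array();
    foreach ($tagAttributes as $tag => $attribute) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $value = trim($node->getAttribute($attribute));
            if ($value !== '') {
                $values[] = $value;
            }
        }
    }
    return $values;
}

// Assumed starting map; add entries such as 'iframe' => 'src' to widen the crawl.
$sources = array(
    'img'    => 'src',
    'script' => 'src',
    'a'      => 'href',
    'link'   => 'href',
);

$html = '<a href="http://example.com/">home</a><img src="http://cdn.example.net/i.png">';
print_r(extractLinkValues($html, $sources));
```

The returned values are still full URLs or relative paths; reducing them to host names would be a separate step.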

This script will only be run from the Debian command line using PHP, so make sure you really know CLI before bidding. This is the first of MANY small projects that will link together, so a clear, well-documented approach is essential.
