In Progress

Website Spider/Crawler

Notice: This project is for a FUNCTION written in PHP, not an entire website!

Project Analysis:

I need this to run by calling one function (which may call other supporting functions if needed).

function spider($url, $follow_javascript, $return_404, $use_robots_file, $use_meta_tags)

{

// spider() follows the given website URL, lists all links on the page, and then follows those links until there are no more to follow (a "search engine spider").
// A branch ends when a link would "leave" the domain or when no links are found on the page. The spider is limited to the domain provided!
// The output must not list duplicate entries.
// The spider needs to be able to follow JavaScript and href locations (including popups), as well as unique and dynamic links. As long as it is a readable page, it needs to be crawled (including pages generated by query strings).
// If a link includes rel="nofollow", the spider is to ignore that URL.

= Parameters

@url - determines website to crawl

@follow_javascript - determines if it will follow javascript locations or not: ([url removed, login to view] & [url removed, login to view])

@return_404 - determines if it will return errors on bad links. If this is set to true and a bad link is found, it needs to return the page the link was found on, and the bad link itself.

@use_robots_file - determines if the spider will follow the rules of the [url removed, login to view] file. If set to true, rules need to be followed exactly.

@use_meta_tags - determines if the spider will follow the spider meta tags. If set to true, rules need to be followed exactly.

}
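
To make the link rules in the description concrete, here is a minimal sketch of the per-page step only: collecting links, skipping rel="nofollow", staying on the starting host, and removing duplicates. It is not the awarded solution. The helper name extract_same_host_links() is hypothetical, and the sketch assumes the page HTML has already been fetched and that hrefs are either absolute or root-relative. JavaScript locations, the robots rules, and 404 reporting are not covered.

```php
<?php
// Hypothetical helper (not part of the posting): returns the unique,
// same-host links found in one page of HTML.
function extract_same_host_links(string $html, string $pageUrl): array
{
    $base = parse_url($pageUrl);
    $origin = $base['scheme'] . '://' . $base['host'];

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue; // nothing to follow
        }
        // rel may hold several space-separated tokens; skip if one is "nofollow".
        $rel = preg_split('/\s+/', strtolower($a->getAttribute('rel')), -1, PREG_SPLIT_NO_EMPTY);
        if (in_array('nofollow', $rel, true)) {
            continue;
        }
        // Assumption: only absolute and root-relative hrefs are handled here.
        if ($href[0] === '/' && substr($href, 0, 2) !== '//') {
            $href = $origin . $href;
        }
        $parts = parse_url($href);
        if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
            continue; // mailto:, javascript:, and other non-page links
        }
        if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
            continue;
        }
        if (strcasecmp($parts['host'], $base['host']) !== 0) {
            continue; // the link would leave the domain
        }
        // Drop the fragment so "#section" variants do not count as new pages.
        $href = explode('#', $href, 2)[0];
        $links[$href] = true; // array keys de-duplicate
    }
    return array_keys($links);
}
```

A full spider() would call a helper like this for each fetched page and keep a queue of URLs not yet visited, stopping when the queue is empty.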

Skills: PHP

About the Employer:
( 1 review ) Columbus, United States

Project ID: #40145

Awarded to:

cliver

Hello, Please look at the PMB. Thanks, Sergey

$300 USD in 3 days
(18 Reviews)
6.2

8 freelancers are bidding on average $281 for this job

websoftinfo

Our bid is for really very high quality work for your Website Spider/Crawler that will be made to be upgradable in case you need some upgrades in future. We will always be available for upgrades. Our bid includes six w...

$300 USD in 10 days
(66 Reviews)
6.4

bsoist

Please see PMB for details.

$200 USD in 5 days
(36 Reviews)
6.3

rsdsoft

Hello. I have a lot of experience in parsing and extracting data from catalogs, sites, and simple HTML pages. I just finished developing crawlers (products with descriptions, etc.) for these catalogs: www.mcmaster. ...

$300 USD in 5 days
(21 Reviews)
6.1

smirnoff

At your service

$250 USD in 3 days
(14 Reviews)
5.6

navrajsharma

Hi! I have the script. It only needs to be customised for robots.txt. If interested, PMB.

$300 USD in 10 days
(0 Reviews)
0.0

Tilani

I can assure you that we can provide you with the best solution. You can learn more about our company by visiting our website, http://www.bccomputersltd.com. We can assure you that we will provide you with the best q...

$300 USD in 10 days
(0 Reviews)
0.0

vermasourav

I have read the project specifications and I am ready to do this. I assure you that you will get 100% satisfaction.

$300 USD in 5 days
(0 Reviews)
0.0