Closed

Website crawler for HTML content

I need a crawler that identifies phrases in the HTML of websites, for example "google analytics".
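The phrase check described above can be sketched as a plain substring search over the raw HTML. This is a minimal sketch in Python, for illustration only (the posting itself asks for PHP). The function name `find_phrases` is hypothetical, and case-insensitive matching is an assumption, since the posting does not say whether case matters.

```python
def find_phrases(html: str, phrases: list[str]) -> dict[str, bool]:
    """Report, for each phrase, whether it occurs anywhere in the raw HTML.

    Assumption: matching is case-insensitive and works on the raw markup,
    so a phrase inside a <script> tag or an attribute also counts.
    """
    haystack = html.casefold()
    return {phrase: phrase.casefold() in haystack for phrase in phrases}
```

Because the phrase list is an ordinary argument, the roughly 5 phrases stay a user-controlled input.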

There will be about 5 phrases in total, and I want the phrase list to be an input that I can control. I also want to control the depth of the crawl, i.e., how many levels "deep" the crawler goes into the website (e.g., home page --> about us --> management would be 3 layers deep).

Also, I want to be able to control the total number of pages crawled per site, e.g., cut off the search after 100 pages crawled.
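The depth limit and the per-site page cap can be sketched as a breadth-first crawl. This is a rough Python illustration under stated assumptions, not the requested PHP deliverable. The names `crawl_site`, `LinkParser`, `max_depth`, and `max_pages` are hypothetical. The home page is counted as depth 1, to match the "3 layers deep" example, and only links on the same host are followed (also an assumption). The `fetch` argument is any function that returns a page's HTML as a string, or `None` on failure, which keeps the sketch independent of a particular HTTP library.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_site(start_url, phrases, fetch, max_depth=3, max_pages=100):
    """Breadth-first crawl of one site; returns (matches per phrase, pages crawled)."""
    host = urlparse(start_url).netloc
    queue = deque([(start_url, 1)])  # the home page counts as depth 1
    seen = {start_url}
    found = {phrase: [] for phrase in phrases}
    pages = 0
    while queue and pages < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:  # failed fetches do not count toward the cap
            continue
        pages += 1
        text = html.casefold()
        for phrase in phrases:
            if phrase.casefold() in text:
                found[phrase].append(url)
        if depth >= max_depth:  # do not follow links below the depth limit
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute URL, no #fragment
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return found, pages
```

Breadth-first order means that when the page cap is hit, the pages closest to the home page have been checked first.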

Finally, the crawler needs to be able to crawl 20,000 sites in about a week. Therefore, the winning bidder needs to be able to build a "fast" crawler, e.g., one that uses multi-threading. I will also need to be able to upload the URLs of the websites I want to crawl.
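To put the speed requirement in numbers: if every one of the 20,000 sites reached the example cap of 100 pages, that would be 2,000,000 pages in 604,800 seconds, or roughly 3.3 pages per second sustained for the whole week. The Python sketch below, again only an illustration and not the requested PHP, computes that rate and shows one way to crawl several sites in parallel with a thread pool. The names `required_pages_per_second`, `crawl_all`, and `crawl_one`, and the default of 20 workers, are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800


def required_pages_per_second(sites, max_pages_per_site, seconds=SECONDS_PER_WEEK):
    """Worst-case sustained page rate needed to finish within the time budget."""
    return sites * max_pages_per_site / seconds


def crawl_all(urls, crawl_one, workers=20):
    """Run crawl_one(url) for every URL on a thread pool; returns {url: result}."""
    urls = list(urls)  # allow any iterable, e.g. lines of an uploaded file
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input URLs
        return dict(zip(urls, pool.map(crawl_one, urls)))
```

Threads suit this workload because the crawler spends most of its time waiting on network responses rather than computing.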

The crawler needs to be completed within a couple of days.

This is something that was already asked for a couple of months ago by somebody else, but I need it as well now.

Skills: PHP


About the Employer:
( 0 reviews ) Hoorn, Netherlands

Project ID: #556542

6 freelancers are bidding on average $177 for this job

wildlily980

I'm interested in it. Check PMB for details.

$150 USD in 7 days
(47 Reviews)
6.3
numatido

Hi, Please check your PM. Thanks.

$150 USD in 2 days
(2 Reviews)
2.8
svetlinb

Contact me to clarify details on the project

$150 USD in 2 days
(0 Reviews)
0.0
mrtuannm

Hi, please see some websites we've developed: [url removed, login to view] [url removed, login to view] ... and at [url removed, login to view] we've created a price search engine website, which has several crawler modules to ...

$230 USD in 3 days
(0 Reviews)
0.0
nzpiknik

Hello, thank you for your clear specification and requirements. I wish all jobs on [url removed, login to view] were as clear and concise as your post. I suggest having a screen where you would enter (a) the phrases to se...

$200 USD in 7 days
(0 Reviews)
0.0
alphacoms

I can do this in PHP. This will be a multi-threaded script, so to speak. PHP doesn't natively support multi-threading, but there are some tricks to implement it. I have similar experience.

$180 USD in 7 days
(0 Reviews)
0.0