Experienced programmer for large scale crawl

In Progress Posted 7 years ago Paid on delivery

URGENT PROJECT, ONLY FOR SOMEONE WHO IS AVAILABLE FULL TIME FOR THE NEXT FEW DAYS.

I have a list of a few million domains. For each one, I need to request 1-5 pages and extract some data from the HTML using regex.

I will provide a powerful server. You need to write the crawling/scraping code and run it on that server. The result for each domain will be the HTML files plus a JSON file with the values I will ask you to extract.
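As a rough illustration of the per-domain output described above, here is a minimal Python sketch. The posting does not say which values are to be extracted or how the files should be named, so the two patterns (`title`, `emails`), the `page_N.html` and `result.json` file names, and the function names are all hypothetical placeholders.

```python
import json
import re
from pathlib import Path

# Hypothetical patterns: the posting does not specify the values to extract.
PATTERNS = {
    "title": re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL),
    "emails": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def extract_values(html):
    """Apply the regex patterns to one page and collect the matches."""
    title_match = PATTERNS["title"].search(html)
    return {
        "title": title_match.group(1).strip() if title_match else None,
        # De-duplicate and sort so the JSON output is stable between runs.
        "emails": sorted(set(PATTERNS["emails"].findall(html))),
    }


def save_domain_result(out_dir, domain, pages):
    """Write each page's HTML plus one JSON summary for the domain.

    `pages` maps a page path (for example "/") to its HTML text.
    Returns the path of the JSON file (assumed layout, not from the posting).
    """
    domain_dir = Path(out_dir) / domain
    domain_dir.mkdir(parents=True, exist_ok=True)
    summary = {}
    for index, (path, html) in enumerate(pages.items()):
        (domain_dir / f"page_{index}.html").write_text(html, encoding="utf-8")
        summary[path] = extract_values(html)
    json_path = domain_dir / "result.json"
    json_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return json_path
```

Keeping the raw HTML next to the JSON means the extraction can be re-run later with different patterns without crawling the domains again.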

This is a large-scale crawl, so you must have experience in multithreaded crawling, and in general you need to know all the standard tricks of web crawling.
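For illustration, multithreaded fetching of this kind could be sketched in Python with a thread pool from the standard library. This is only one possible approach under assumed defaults: the function names, the `http://` scheme, the user agent string, the timeout, and the worker count are all assumptions, not requirements from the posting.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_url(url, timeout=10):
    """Download one page and return its body as text."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-crawler/0.1"}  # assumed UA string
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_domains(domains, paths=("/",), fetch=fetch_url, max_workers=32):
    """Fetch every path of every domain concurrently.

    Returns {domain: {path: html or None}}; None marks a failed request.
    """
    results = {domain: {} for domain in domains}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(fetch, f"http://{domain}{path}"): (domain, path)
            for domain in domains
            for path in paths
        }
        for future in as_completed(futures):
            domain, path = futures[future]
            try:
                results[domain][path] = future.result()
            except Exception:
                # One unreachable site should not stop the whole crawl.
                results[domain][path] = None
    return results
```

With millions of domains, the list would likely be fed to `crawl_domains` in batches, so that neither the pending futures nor the downloaded HTML have to fit in memory at once.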

Please bid and tell me about your experience in web crawling.

Linux Programming Web Scraping

Project ID: #11467082

About the project

13 proposals Remote project Active 7 years ago

Awarded to:

Crazometer

Hello, I'm a scraping expert and would be able to build you a multi-threaded [login to view URL] would be built in Node.js and then dump the data into a NoSQL database. If needed, we would also be able to scale it out to multi…

$36 USD / hour
(6 Reviews)
5.3

13 freelancers are bidding on average $1306/hour for this job

gangabass

Please give me more details about pages you want to check on each domain so I can estimate completion time. Thanks. Roman

$1052 USD / hour
(374 Reviews)
7.5

sergioes

Hi, I have 10+ years of experience in web scraping, and I'm completely available for the next few weeks. Regards, Sergio.

$45 USD / hour
(61 Reviews)
5.8

thewebscraper

Hi, I am an expert web scraper. I will use Python with mechanize and subprocess for web crawling and multiprocessing.

$55 USD / hour
(44 Reviews)
5.5

LeadSoft

Hello, my name is Adrian. I own a software development company and I can provide you with at least one dedicated full-time senior web developer with over 7 years of experience for an hourly rate of $25. I have a senior d…

$27 USD / hour
(3 Reviews)
5.5