Looking for a specialist in web site crawling

CLOSED
Bids: 4
Avg Bid (USD): $16 / hr
Project Budget (USD): $8 - $15 / hr

Project Description:
We are looking for freelancers who specialize in web site crawling. We are working on several projects that require full crawls of web sites such as http://www.parlament.ch. For large web sites we typically define several subsites that serve as better starting points for the crawler. The result should be the complete text contained in the web site. Text in PDF files and in HTML tables must also be crawled and included in the result. Once the crawler is set up correctly for a specific site, we typically expect the site contents to be crawled periodically (e.g. once a week).
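To illustrate the kind of setup described above, here is a minimal sketch of a crawl that starts from several subsite URLs, stays on one host, and collects the visible text of each page, including text inside HTML tables. It is not the method used on these projects and not how Apache Nutch works internally. The names `TextAndLinks`, `crawl`, `fetch`, `allowed_host` and `max_pages` are hypothetical and chosen only for this example. PDF text extraction is not shown; it would need a separate library.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class TextAndLinks(HTMLParser):
    """Collects visible text (table cells included) and href targets."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside script or style elements.
        if not self._skip and data.strip():
            self.text_parts.append(data.strip())


def crawl(seeds, fetch, allowed_host, max_pages=100):
    """Breadth-first crawl from several seed URLs.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that returns the HTML of a URL as a string, or None on failure.
    Returns a dict mapping each crawled URL to its extracted text.
    """
    queue = deque(seeds)
    seen = set(seeds)
    results = {}
    while queue and len(results) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = TextAndLinks()
        parser.feed(html)
        results[url] = " ".join(parser.text_parts)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link, _ = urldefrag(urljoin(url, href))
            if urlparse(link).hostname == allowed_host and link not in seen:
                seen.add(link)
                queue.append(link)
    return results
```

In a real run, `fetch` would wrap an HTTP client and should respect robots.txt and rate limits. Passing it in as a parameter keeps the traversal logic testable without network access.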

We are looking for someone who is experienced in this data-gathering process and can manage all steps (setting up the crawler, improving the crawler, managing document content updates, transferring data to our server).
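One possible way to handle the document content update step between periodic crawls is to keep a fingerprint of each page's text and compare it with the next run. The sketch below assumes each crawl yields a mapping from URL to extracted text. The function names `fingerprint` and `diff_crawls` are hypothetical, and hashing is only one of several reasonable change-detection strategies.

```python
import hashlib


def fingerprint(text):
    """SHA-256 hex digest of the extracted text, used as a change marker."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_crawls(previous, current):
    """Compare the last run's fingerprints with this run's texts.

    previous: {url: fingerprint} stored after the last crawl
    current:  {url: text} produced by the new crawl
    Returns (new_fingerprints, added, changed, removed), where the last
    three are sorted lists of URLs.
    """
    new_fp = {url: fingerprint(text) for url, text in current.items()}
    added = sorted(u for u in new_fp if u not in previous)
    changed = sorted(
        u for u in new_fp if u in previous and previous[u] != new_fp[u]
    )
    removed = sorted(u for u in previous if u not in new_fp)
    return new_fp, added, changed, removed
```

Only the URLs in `added` and `changed` would then need to be transferred to the server, and `new_fingerprints` would be stored for the following run.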

Crawling can be done with Apache Nutch or any other crawling software the specialist recommends.

Hours of work: Unspecified
Project Duration: Ongoing
Skills required: Apache, XML
About the employer: Verified


$12 / hr, 10 hr/week
$9 / hr, 10 hr/week
$35 / hr, 1 hr/week
$8 / hr, 20 hr/week