Crawl a website, extract the content, store it in a DB and make it searchable

This project received 25 bids from talented freelancers with an average bid price of $824 USD.

Project Description

Hello, I need someone who is an expert in web scraping.

1. Crawl [url removed, login to view] & [url removed, login to view]
2. Use the best technique to crawl the websites, using Nutch or another engine.
3. Extract all the file names, sizes, source URLs, source titles, filehosts, download links, etc.
4. Store them in a database.
5. Make the data searchable using Elasticsearch or Sphinx; I don't know which one is better.
6. Produce well-formatted JSON or XML search results for specific queries.
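
To make steps 3 to 6 more concrete, here is a minimal sketch of the kind of pipeline I have in mind. It is only an illustration under several assumptions: it uses Python's standard library and SQLite with a simple LIKE query as a stand-in for Nutch and Elasticsearch/Sphinx, the function names (extract_records, store, search) and the table layout are hypothetical, and file size is left out because it cannot be read from the page HTML alone.

```python
import json
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the page title and every <a href> target on one page."""

    def __init__(self, source_url):
        super().__init__()
        self.source_url = source_url
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.source_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_records(html, source_url):
    """Step 3: turn one crawled page into a list of file records."""
    parser = LinkExtractor(source_url)
    parser.feed(html)
    records = []
    for link in parser.links:
        parsed = urlparse(link)
        records.append({
            "file_name": parsed.path.rsplit("/", 1)[-1],
            "filehost": parsed.netloc,
            "download_link": link,
            "source_url": source_url,
            "source_title": parser.title.strip(),
        })
    return records


def store(conn, records):
    """Step 4: save records, skipping download links already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "file_name TEXT, filehost TEXT, download_link TEXT UNIQUE, "
        "source_url TEXT, source_title TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO files VALUES "
        "(:file_name, :filehost, :download_link, :source_url, :source_title)",
        records,
    )
    conn.commit()


def search(conn, query):
    """Steps 5 and 6: substring search on file name, returned as JSON."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT * FROM files WHERE file_name LIKE ? ORDER BY file_name",
        (f"%{query}%",),
    ).fetchall()
    return json.dumps(
        {"query": query, "count": len(rows), "results": [dict(r) for r in rows]},
        indent=2,
    )
```

A real solution would replace the LIKE query with a proper search index and would probably need an extra request per file to learn its size, but the JSON shape returned by search (query, count, results) is roughly the kind of output I expect.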

Also, please tell me what guarantee you offer for this project.
