We are looking for someone who can build a public search engine using Elasticsearch, with either Apache Nutch or the Constellio system for crawling.
What we need:
1. Crawl [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view] and [url removed, login to view]
2. Use the best technique to crawl up to 1-2 million pages per day.
3. Extract all the file names and download links.
4. Store them in our database.
5. Make them searchable inside our search engine.
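To make the extraction and indexing steps above concrete, here is a minimal sketch in Python using only the standard library. It is not how Nutch or Constellio do it; it only illustrates the shape of the data. The extension list, the index name `files`, and the field names `file_name` and `download_url` are assumptions chosen for illustration. The last helper turns the daily crawl target into a sustained fetch rate.

```python
import json
import posixpath
from html.parser import HTMLParser
from urllib.parse import unquote, urljoin, urlparse

# Assumption: the posting does not say which extensions count as "files".
FILE_EXTENSIONS = {".zip", ".rar", ".pdf", ".mp3", ".avi", ".iso"}


class FileLinkParser(HTMLParser):
    """Collects file name / absolute download URL pairs from <a href> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        url = urljoin(self.base_url, href)
        name = unquote(posixpath.basename(urlparse(url).path))
        ext = posixpath.splitext(name)[1].lower()
        if ext in FILE_EXTENSIONS:
            self.files.append({"file_name": name, "download_url": url})


def extract_files(html, base_url):
    """Returns the file links found in one crawled page."""
    parser = FileLinkParser(base_url)
    parser.feed(html)
    return parser.files


def to_bulk_body(docs, index_name="files"):
    """Builds a newline-delimited JSON body for the Elasticsearch _bulk endpoint."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def pages_per_second(pages_per_day):
    """Sustained fetch rate needed to reach a daily page target."""
    return pages_per_day / 86400
```

At 1 to 2 million pages per day, the crawler has to sustain roughly 11.6 to 23.1 pages per second around the clock, which is why the fetching itself would be distributed across several workers rather than run from a single script like this one.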
We have the overall idea, but we are looking for someone who can advise us, as a consultant, on how to carry out this project and then provide the technology to get it started.
Hi, we are interested in your project and have read your requirements. We have completed 150+ large projects in the last 5 years. Please check the private message board for details.