We are looking for someone able to build a public search engine using either Elasticsearch with Apache Nutch for crawling, or the Constellio system.
What we need:
1. Crawl [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view] and [url removed, login to view]
2. Use the best technique to crawl 1 to 2 million pages per day.
3. Extract all file names and download links.
4. Store them in our database.
5. Make them searchable inside our search engine.
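
As a rough illustration of the extraction step and the crawl-rate target above, here is a minimal Python sketch using only the standard library. It is not tied to Nutch, Elasticsearch, or Constellio; the extension list, the record field names (`file_name`, `download_url`, `source_page`), and the function names are all assumptions made for the example, and a real crawler would also need politeness rules, deduplication, and error handling.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import unquote, urljoin, urlparse

# Assumption: which extensions count as "downloadable files" is project-specific.
FILE_EXTENSIONS = {".zip", ".rar", ".pdf", ".mp3", ".avi"}


class DownloadLinkParser(HTMLParser):
    """Collects file name + download link records from <a href> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page they were found on.
        url = urljoin(self.base_url, href)
        # The file name is the last path segment, with %-escapes decoded.
        name = unquote(basename(urlparse(url).path))
        dot = name.rfind(".")
        if dot != -1 and name[dot:].lower() in FILE_EXTENSIONS:
            # Hypothetical record shape, ready to store or index as a document.
            self.records.append(
                {"file_name": name, "download_url": url, "source_page": self.base_url}
            )


def extract_download_links(html, base_url):
    """Return one record per link on the page that points to a file."""
    parser = DownloadLinkParser(base_url)
    parser.feed(html)
    return parser.records


def required_pages_per_second(pages_per_day):
    """Sustained fetch rate needed to reach a daily page target."""
    return pages_per_day / 86_400  # seconds in a day


if __name__ == "__main__":
    page = '<a href="/files/My%20Report.pdf">report</a> <a href="/about">about</a>'
    for record in extract_download_links(page, "http://example.com/page"):
        print(record)
    print(round(required_pages_per_second(2_000_000), 1), "pages/second")
```

At the upper target of 2 million pages per day, the crawler has to sustain roughly 23 pages per second around the clock, which is why the crawling technique matters more than the extraction logic.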
We have the general idea, but we are looking for someone who can advise us as a consultant on how to realize this project, and then provide the technology to get it started.
19 freelancers are bidding on average $2806 for this job
We have worked on a similar project: we crawled and indexed a 7M video website and provided a search engine for it based on Sphinx. I've read all the requirements. I know we can do it. Please check your PM.
Hi, I'm an experienced Apache Solr 4.0 developer. I have done a similar project in the past and can share details if required. I can complete this project within the given time frame.
"I've read all requirement", and we have built a search engine, so this project is within our expertise. If you have a fair budget, please check your private messages for an example and choose the best team accordingly. Thanks.