Create Multi-Threaded Distributed Web Crawler on AWS
This project received 4 bids from talented freelancers with an average bid price of $179 USD.
This is much, much simpler than a typical 'web crawler'. It needs to be run as cheaply as possible (preferably on AWS).
The software has 2 simple functions:
1. URLS: Fetch webpages using a multi-threaded approach. The URLs are simply pulled from the db, along with the extraction class to use for each one.
2. EXTRACTION CLASSES: Classes that can easily extract data from HTML by following a given pattern and insert it into the db, also using a multi-threaded approach.
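To make the two functions above concrete, here is a minimal sketch in Python, not a required implementation. Everything named in it is an assumption for illustration only: the `urls` and `results` tables, their columns, the `TitleExtractor` class, and the `EXTRACTORS` registry are hypothetical, and SQLite stands in for whatever db is actually used.

```python
import re
import sqlite3
import urllib.request
from concurrent.futures import ThreadPoolExecutor


class TitleExtractor:
    """Hypothetical extraction class: pulls the <title> text out of a page."""
    pattern = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)

    def extract(self, html):
        return [m.strip() for m in self.pattern.findall(html)]


# Hypothetical registry: maps the class name stored in the db to the class.
EXTRACTORS = {"TitleExtractor": TitleExtractor}


def fetch(url, timeout=10):
    """Download one page and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def process(job, fetcher=fetch):
    """Fetch one URL and run the extraction class named in its db row."""
    url, extractor_name = job
    try:
        html = fetcher(url)
    except Exception:
        return url, []  # a failed download simply yields no values
    return url, EXTRACTORS[extractor_name]().extract(html)


def crawl(conn, fetcher=fetch, workers=8):
    """Pull (url, extractor) rows from the db, process them in threads, store results."""
    jobs = conn.execute("SELECT url, extractor FROM urls").fetchall()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda job: process(job, fetcher), jobs))
    # Writes stay on the calling thread, because a sqlite3 connection
    # cannot be shared across threads by default.
    for url, values in results:
        conn.executemany(
            "INSERT INTO results (url, value) VALUES (?, ?)",
            [(url, value) for value in values],
        )
    conn.commit()
    return len(results)
```

In this sketch the threads handle both the download and the extraction, while db writes are done afterwards in one place. Adding a new pattern would mean writing another class with an `extract` method and registering its name.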
You should follow this Perl approach and make sure your solution achieves similar, if not better, results.
[url removed, login to view]
(Further reading: [url removed, login to view] )
For an experienced programmer, I expect this to take no longer than a day, as the instructions are laid out above. The budget is therefore very low; bid accordingly.