I'd like to gather some data for an academic project to study the electronic book market.
The Internet Archive (Wayback Machine) crawled the websites of interest to me during the relevant period, and I'd like your help to
(1) Crawl the Internet Archive and save the HTML pages of interest
(2) Extract the relevant fields from the HTML into a comma-separated (CSV) file ready for data analysis packages.
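As a starting point for step (1), here is a minimal sketch of how captures might be listed through the Wayback Machine's CDX search endpoint. It is an illustration under assumptions, not a finished crawler: the helper names (build_cdx_query, parse_cdx_json, snapshot_url) are made up for this example, the default dates simply reflect the 2010 window requested in this post, and the page URL passed in is a placeholder. It assumes the CDX JSON output is a list whose first row is the column header.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(page_url, start="20100101", end="20100531"):
    """Build a CDX query URL listing at most one successful capture per day."""
    params = {
        "url": page_url,
        "from": start,
        "to": end,
        "output": "json",
        "filter": "statuscode:200",  # keep only captures that returned HTTP 200
        "collapse": "timestamp:8",   # first 8 digits are YYYYMMDD: one row per day
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_json(text):
    """Convert the CDX JSON response (header row + data rows) into dicts."""
    rows = json.loads(text) if text.strip() else []
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def snapshot_url(timestamp, original):
    """URL of an archived page; the id_ suffix requests the page as captured,
    without the Wayback Machine's link rewriting and toolbar."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"
```

The query URL could then be fetched with any HTTP client (for example urllib.request.urlopen), with polite delays between requests, and each row's timestamp and original URL passed to snapshot_url to download the HTML.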
The webpages of interest are product pages for books or e-book reader devices in the following period, venues, and categories:
January 2010 - May 2010 (one capture a day if available)
Amazon, Barnes & Noble
Physical books, Kindle/Nook books (not textbooks, newspapers, etc.)
Device itself: Kindle and Nook.
Books listed as bestsellers, award winners, editor's picks, best books, book club selections, etc.
We can discuss whether it's easier to get all books or just the popular books.
Fields of interest: title, author, publisher, number of reviews, rating, list price, discount price, prices of other formats, whether listed as a bestseller, sales rank, ISBN, category.
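To make the target output concrete, the following sketch shows one possible CSV layout for the fields listed above, written with Python's standard csv module. The column names are my own assumption, and capture_date and venue are extra columns added here only so that rows from different days and sites can be told apart. Fields that cannot be found on a page are left blank rather than guessed.

```python
import csv
import io

# Hypothetical column names, one per requested field, plus capture_date and venue.
FIELDS = [
    "capture_date", "venue", "title", "author", "publisher",
    "num_reviews", "rating", "list_price", "discount_price",
    "other_format_prices", "is_bestseller", "sales_rank", "isbn", "category",
]


def write_records(records, out):
    """Write extracted records (dicts) as CSV; missing fields become empty cells."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="", extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Usage with a made-up record; titles containing commas are quoted automatically.
buffer = io.StringIO()
write_records([{"venue": "Amazon", "title": "Example, A Novel"}], buffer)
```

When writing to a real file, open it with newline="" as the csv module documentation recommends, so that line endings are not doubled on Windows.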
(1) Small sample: preferably delivered by May 14th.
Amazon only: one day each in mid-March, mid-April, and mid-May 2010.
(2) Full data set: deadline negotiable, but preferably completed before June 5th.
(3) Possible future projects to extract 2005-2013 data if the initial run goes well.
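For the small sample in (1), a capture may not exist on the exact day wanted, so one option is to take the capture closest to a target date. The sketch below assumes 14-digit Wayback timestamps (YYYYMMDDhhmmss) and treats the 15th of each month as "mid-month"; both the function name and the choice of the 15th are assumptions for illustration.

```python
from datetime import datetime


def nearest_capture(timestamps, target_date):
    """Return the 14-digit timestamp closest to target_date (YYYY-MM-DD), or None."""
    if not timestamps:
        return None
    target = datetime.strptime(target_date, "%Y-%m-%d")
    return min(
        timestamps,
        key=lambda ts: abs(datetime.strptime(ts, "%Y%m%d%H%M%S") - target),
    )


# Assumed mid-month targets for the small sample.
SAMPLE_TARGETS = ["2010-03-15", "2010-04-15", "2010-05-15"]
```

Calling nearest_capture once per target date, for each product page, would give the three Amazon snapshots requested for the sample.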
Hi alicealisa, I can get the info you need. I have a lot of experience getting info from websites and I'm available immediately. For more info, please see my pm.