WebScraper -- 2
zł30-90 PLN
Paid on delivery
I need a web scraping application to scrape product reviews from the Polish website ceneo.pl. It has to work as an ETL process (Extract, Transform, Load): extract the raw data, transform it into clean, useful data, and finally load it into a database.
Data to be scraped for each review (by class):
a. product-reviewer (if the information is missing, it should write "Anonim")
b. reviewer-recommendation
c. datetime (when posted)
d. review-score-count
e. product-review-body
f. pros
g. cons
h. vote-yes-(some number?) vote-no-(...): how many thumbs up and how many thumbs down. If the value is "0", the field should remain empty in the database.
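To make the two defaulting rules above concrete (a missing reviewer becomes "Anonim", a vote count of 0 stays empty), here is a minimal sketch of the transform step. It is written in Python purely for illustration; the dictionary keys `vote-yes` and `vote-no`, the output column names, and both function names are assumptions, not names taken from ceneo.pl.

```python
from typing import Optional


def clean_votes(raw: Optional[str]) -> Optional[int]:
    """Return the vote count, or None when it is missing, malformed or zero."""
    text = (raw or "").strip()
    if not text.isdigit():
        return None
    count = int(text)
    # A count of 0 must stay empty (NULL) in the database.
    return count if count > 0 else None


def transform_review(raw: dict) -> dict:
    """Turn one raw scraped review into a clean row (key names are assumed)."""
    reviewer = (raw.get("product-reviewer") or "").strip()
    return {
        "reviewer": reviewer or "Anonim",  # default for a missing reviewer
        "body": (raw.get("product-review-body") or "").strip(),
        "votes_yes": clean_votes(raw.get("vote-yes")),
        "votes_no": clean_votes(raw.get("vote-no")),
    }
```

The remaining fields (recommendation, datetime, score, pros, cons) would be cleaned the same way, one entry per field.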
Data to be scraped for each product:
a. Device Type
b. brand
c. Model
d. Additional comments
First, the user needs to input the ID of a product to scrape (e.g. [login to view URL]; the number is the ID). Then there should be 2 buttons: [Extract Transform Load], which should stop after each step (Extract -> output of raw data [continue button], Transform -> output of clean data [continue button], Load -> output of the data loaded to the DB [Back to choices button with the question "Do you want to clear DB?"; if yes -> clear and show home page, if no -> show home page]), and [ETL], which should show the final data in the database (with the same Back to choices button as before).
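One possible way to support both buttons with the same code is to model the pipeline as a generator that pauses after each stage. This is only a sketch in Python for illustration; `etl_steps`, `run_all`, and the `transform` and `load` callables are hypothetical names, and a web version would keep the intermediate state between requests instead of in a generator.

```python
def etl_steps(raw_items, transform, load):
    """Yield (step name, output) after each stage so the UI can pause there."""
    yield "extract", raw_items                 # raw data as scraped
    clean = [transform(item) for item in raw_items]
    yield "transform", clean                   # cleaned data
    loaded = load(clean)
    yield "load", loaded                       # whatever the load step reports


def run_all(steps):
    """The one-click [ETL] button: run every stage and keep the final output."""
    result = None
    for _name, result in steps:
        pass
    return result
```

With the step-by-step button, each press of "continue" corresponds to one `next()` call on the generator; the [ETL] button simply drains it with `run_all`.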
The application must also have a button to clear the database on the home page, to show how it is loaded, and the app should delete every file used for the ETL process when it is finished. Of course there can't be any duplicates in the database. There should be information about how many reviews were scraped. Of course it should work for all subpages, if there are any.
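The no-duplicates rule, the review count and the clear button can be sketched with a unique constraint on the reviews table. The example below uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; in MySQL the equivalent statement would be `INSERT IGNORE`. The table layout and the choice of (product_id, reviewer, posted_at) as the uniqueness key are assumptions, since the brief does not say what identifies a review.

```python
import sqlite3


def create_schema(conn):
    """Create the reviews table with a uniqueness key (assumed columns)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS reviews (
               product_id TEXT NOT NULL,
               reviewer   TEXT NOT NULL,
               posted_at  TEXT NOT NULL,
               body       TEXT,
               UNIQUE (product_id, reviewer, posted_at)
           )"""
    )


def count_reviews(conn):
    return conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]


def load_reviews(conn, rows):
    """Insert rows, silently skipping duplicates; return how many were new."""
    before = count_reviews(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO reviews (product_id, reviewer, posted_at, body) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return count_reviews(conn) - before


def clear_database(conn):
    """Backs the 'clear DB' button: remove every stored review."""
    conn.execute("DELETE FROM reviews")
    conn.commit()
```

Running the load twice for the same product then reports 0 new reviews the second time, which is an easy way to demonstrate that duplicates are rejected.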
The application should allow displaying opinions from the database by the product they describe. There should be an option to export all scraped data to a .csv file and each review to a .txt file (e.g. a button next to each review, and a button to export all scraped data to CSV).
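The two export options could look roughly like the following, shown in Python for illustration only. The column list in `FIELDS` is an assumed subset of the review fields, and both function names are hypothetical; the functions return strings so the caller decides whether to write a file or send a download.

```python
import csv
import io

# Assumed subset of columns; a real export would list every review field.
FIELDS = ["reviewer", "score", "body"]


def reviews_to_csv(reviews):
    """Serialize all reviews to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(reviews)
    return buffer.getvalue()


def review_to_text(review):
    """Render a single review as plain 'field: value' lines for a .txt file."""
    return "\n".join(f"{name}: {review.get(name) or ''}" for name in FIELDS)
```

The `csv` module handles quoting, so a review body containing commas or quotes still produces a valid file.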
I prefer PHP with the curl and simplehtmldom libraries, HTML5, Bootstrap and MySQL for it.
It does not have to look extraordinary; it has to work.
I need this app no later than 12 January 2017.
Thanks in advance!
Edit: I need it by tomorrow, 13.01.2017!!
I need it by 14.01.2017!!!
Project ID: #12766854