I want to scrape a significant amount of data from [url removed, login to view] as part of my thesis research. Please visit the website before you bid. There are two parts to this project, and I want both parts done by the same bidder.
1) I need a program that can help me compare average salary to average company rating. Under the Salaries section of the website, salaries are listed by job title. Under the Reviews section of the website, each review, sorted by job title, includes a company rating. I want a file with columns for company name, job title, lowest salary, highest salary, average salary, lowest company rating, highest company rating, and average company rating. The job title will be the key link between the salary and company rating data, but there may be inconsistencies. For example, if a job title has a review with a company rating but no salary data, I still want it included in the file. I want as much information as possible from www.glassdoor.com.
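The join described in part 1 could be sketched roughly as follows. This is only a minimal illustration, assuming the salary and review records have already been collected into lists of dictionaries. The field names (`company`, `job_title`, `salary`, `rating`) and the helpers `summarize` and `build_rows` are hypothetical, and no scraping is shown.

```python
from collections import defaultdict
from statistics import mean


def summarize(values):
    """Return (lowest, highest, average), or three Nones if there is no data."""
    if not values:
        return (None, None, None)
    return (min(values), max(values), mean(values))


def normalize(title):
    """Assumed normalization: trim whitespace and lowercase the job title."""
    return title.strip().lower()


def build_rows(salaries, reviews):
    """Full outer join of salary and rating data on (company, job title)."""
    salary_by_key = defaultdict(list)
    rating_by_key = defaultdict(list)
    for s in salaries:
        salary_by_key[(s["company"], normalize(s["job_title"]))].append(s["salary"])
    for r in reviews:
        rating_by_key[(r["company"], normalize(r["job_title"]))].append(r["rating"])

    rows = []
    # The union of keys keeps titles that appear on only one side.
    for key in sorted(set(salary_by_key) | set(rating_by_key)):
        company, title = key
        rows.append(
            (company, title)
            + summarize(salary_by_key.get(key, []))
            + summarize(rating_by_key.get(key, []))
        )
    return rows
```

Each row has the eight requested columns in order. A job title with ratings but no salary data still produces a row, with the three salary columns left empty.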
2) I need a program that can help me compare average salary to text comments. The motivation is similar to the first part of the project, but instead of company ratings, I want the entire text of the comments from each review. I want a file with columns for company name, job title, lowest salary, highest salary, average salary, all "pros" comments, all "cons" comments, and all "advice to senior management" comments. Each review on the website is divided into those sections, so the sorting shouldn't be difficult. Again, the job title will be the key link, and I want as much information as possible from www.glassdoor.com.
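Gathering the comment text for part 2 might look something like the sketch below. It assumes each review is already available as a dictionary. The keys `pros`, `cons`, and `advice`, the `" | "` separator, and the helper `collect_comments` are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed names for the three review sections.
SECTIONS = ("pros", "cons", "advice")


def collect_comments(reviews):
    """Group comment text by (company, job title), one joined string per section."""
    grouped = defaultdict(lambda: {name: [] for name in SECTIONS})
    for review in reviews:
        key = (review["company"], review["job_title"].strip().lower())
        for name in SECTIONS:
            # A section may be missing or None; treat both as empty.
            text = (review.get(name) or "").strip()
            if text:
                grouped[key][name].append(text)
    return {
        key: {name: " | ".join(texts) for name, texts in sections.items()}
        for key, sections in grouped.items()
    }
```

The resulting dictionary uses the same (company, job title) key as the salary data, so the two can be combined into one file.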
3) Another feature of the program is to compile a list of companies that have high revenues but poor reviews. I want to be able to go through the companies and their reviews, ranked from the lowest performing to the highest performing relative to their revenues. All of their details should then be provided: phone number, email, contact names, a link to their current review trend, etc. It should be possible to compile this data and export it to an Excel sheet so that I can go through the list quickly if I want to call the companies. This should then tie in with a data scraper or something similar, so that I can call these companies with their specific data on hand, such as phone numbers and emails. I have attached an Excel sheet showing how I would like the data to be assembled.
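One possible way to shortlist such companies is sketched below. It assumes the company records are already collected as dictionaries. The field names (`revenue`, `avg_rating`), the two cut-off parameters, and the helpers `flag_companies` and `to_csv` are hypothetical. The output is CSV text, which Excel can open.

```python
import csv
import io


def flag_companies(companies, min_revenue, max_rating):
    """Keep companies with high revenue and a poor rating, worst rated first."""
    flagged = [
        c for c in companies
        if c["revenue"] >= min_revenue and c["avg_rating"] <= max_rating
    ]
    # Sort by rating ascending; break ties by revenue descending.
    return sorted(flagged, key=lambda c: (c["avg_rating"], -c["revenue"]))


def to_csv(rows, fieldnames):
    """Serialize the rows to CSV text, keeping only the listed columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The thresholds are left as parameters because the brief does not define what counts as "high revenue" or "poor reviews".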
I should be able to do a variety of search queries.
1. Industry type vs rating
2. Employee size vs rating
3. Industry type vs trending down and their rating
Omit 1 and 2 on this project list and focus on 3.
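Query 3 could be approached along the lines of the sketch below. It assumes each company record carries an `industry` and a chronological list of `ratings` (oldest first). Those field names, the helpers, and the choice of a least-squares slope as the definition of "trending down" are all assumptions, not something the brief specifies.

```python
from collections import defaultdict


def slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def trending_down_by_industry(companies):
    """Map each industry to (name, latest rating, slope) for declining companies."""
    result = defaultdict(list)
    for company in companies:
        ratings = company["ratings"]  # assumed chronological, oldest first
        if not ratings:
            continue
        trend = slope(ratings)
        if trend < 0:
            result[company["industry"]].append((company["name"], ratings[-1], trend))
    return dict(result)
```

A negative slope means the ratings fall over time. Companies with flat or rising ratings are left out of the result.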
Also, one other thing: search queries should be possible on job postings, and the results should be included in the data.