I need someone with experience scraping data from a web service/API to help me with this work. If you are a student, do not apply. I need a professional with Python scraping experience who can finish it for me. Knowledge of visualization is helpful for the bonuses.
Here is the job. We have a website that makes web service calls like a survey interview: depending on the answer to the previous question, it displays the next question. I need to get all the content of the survey.
The first request to the server is something like:
It creates a session variable for the current session, to be used in the second request, and returns the first content (the next question).
Then there is a series of questions for this session, answered with the value Yes, No, or Back (the first question does not have Back), called with a URL like the one below:
http://localhost/?m=NAME&session=530cc4c3000082aafmd&navigationMethod=[Yes, No, Back]
Each time we click the Yes/No/Back button, the navigationMethod variable changes accordingly and the server returns JSON content for the next question:
At the end, there is no question, just a JSON response with a lot of information. At that point the session variable is killed, so we cannot go back to the previous question.
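A minimal sketch of how the step URL shown above could be built in Python. The parameter names `m`, `session`, and `navigationMethod` are taken from the example URL; the function name `build_step_url` is hypothetical. The first, session-creating request is not shown because its URL is not given here.

```python
from urllib.parse import urlencode

# The three button values named in the description.
VALID_METHODS = ("Yes", "No", "Back")


def build_step_url(base_url, name, session, method):
    """Return the URL for one navigation step of the survey."""
    if method not in VALID_METHODS:
        raise ValueError(f"navigationMethod must be one of {VALID_METHODS}")
    query = urlencode({"m": name, "session": session, "navigationMethod": method})
    return f"{base_url}?{query}"


step_url = build_step_url("http://localhost/", "NAME", "530cc4c3000082aafmd", "Yes")
print(step_url)
```

The resulting URL can then be fetched with any HTTP client and the response body parsed as JSON.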
What I need you to do is write a Python script that scrapes all the data and saves it to a text file.
What I think we need is a breadth-first or depth-first search function that traverses the tree and saves each state when we click a button. The state includes both the position of the current node in the hierarchy (like 126.96.36.199.1.2.1 ... [1 for Yes, 2 for No]) and the content (the JSON returned). When the session variable no longer works, we create a new session variable, restore the previous state we saved, and continue until we have traversed all the leaves of the tree.
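The traversal described above could be sketched as follows. This is only one possible approach, under stated assumptions: `start_session`, `step`, and `is_leaf` are hypothetical callables standing in for the real HTTP requests, and the in-memory `make_fake_server` exists only so the sketch runs without a network. It uses an explicit stack instead of recursion, follows the Yes branch within one session, and replays the saved path in a fresh session for each No branch.

```python
import itertools

CHOICES = {1: "Yes", 2: "No"}  # 1 for Yes, 2 for No, as in the hierarchy notation


def crawl(start_session, step, is_leaf):
    """Depth-first traversal that replays a saved path in a new session.

    start_session() -> (session_id, first_json)
    step(session_id, choice) -> JSON for the next screen ("Yes" or "No")
    is_leaf(json) -> True for the final, question-free response
    """
    results = {}    # dotted path such as "1.2.1" -> content; "" is the root
    pending = [()]  # saved paths (tuples of 1/2) still to explore
    while pending:
        path = pending.pop()
        session, content = start_session()
        for choice in path:  # replay saved answers to get back to this node
            content = step(session, CHOICES[choice])
        while True:
            results[".".join(map(str, path))] = content
            if is_leaf(content):
                break  # the session is dead now; the next path gets a new one
            pending.append(path + (2,))  # visit the No sibling later
            path = path + (1,)
            content = step(session, "Yes")
    return results


def make_fake_server(tree):
    """In-memory stand-in for the web service (an assumption, for the demo)."""
    sessions, ids = {}, itertools.count()

    def view(node):
        return {k: v for k, v in node.items() if k not in ("Yes", "No")}

    def start_session():
        sid = f"s{next(ids)}"
        sessions[sid] = tree
        return sid, view(tree)

    def step(sid, choice):
        node = sessions[sid][choice]  # KeyError if the session was killed
        if "result" in node:
            del sessions[sid]         # the server kills the session at the end
        else:
            sessions[sid] = node
        return view(node)

    return start_session, step


TREE = {
    "q": "Q0",
    "Yes": {"q": "Q1", "Yes": {"result": "A"}, "No": {"result": "B"}},
    "No": {"result": "C"},
}
start, do_step = make_fake_server(TREE)
results = crawl(start, do_step, is_leaf=lambda c: "result" in c)
for node_path, node_content in sorted(results.items()):
    print(repr(node_path), node_content)
```

Each node costs one replay of its path in the worst case, so the number of requests grows with both the number of nodes and their depth.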
Below are the requirements:
+ It should be a function that I can call from the command line like:
CMD >>>> python [url removed, login to view] -DES1 -VAL1 -UNIT1 -DES2 -VAL2 -DES3 -VAL3 [url removed, login to view]
+ The content of the text file should let me track the hierarchy of questions: which one is displayed first, which one is second, and which one is the child of which. [Because the post is limited to 4000 characters, I put the sample in a text file.]
The depth of each tree may be up to 50 levels, or even more.
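One way the text file could keep the hierarchy trackable is one JSON record per line, each carrying its dotted path, its parent path, and its depth. This is a sketch of an assumed layout, not the format from the sample file; the field names `path`, `parent`, `depth`, and `content` and the helper names are hypothetical.

```python
import io
import json


def path_key(path):
    """Sort key that orders dotted paths parent-first (pre-order)."""
    return tuple(int(part) for part in path.split(".")) if path else ()


def to_records(nodes):
    """Turn {dotted_path: content} into flat records with parent links."""
    for path in sorted(nodes, key=path_key):
        parts = path.split(".") if path else []
        yield {
            "path": path,                                      # "" is the root
            "parent": ".".join(parts[:-1]) if parts else None,
            "depth": len(parts),
            "content": nodes[path],
        }


def write_jsonl(nodes, fh):
    """Write one JSON object per line to an open text file."""
    for record in to_records(nodes):
        fh.write(json.dumps(record) + "\n")


# Demo with made-up content; 1 means Yes and 2 means No.
nodes = {"2": {"result": "C"}, "": {"q": "Q0"}, "1.1": {"result": "A"}, "1": {"q": "Q1"}}
buffer = io.StringIO()
write_jsonl(nodes, buffer)
print(buffer.getvalue(), end="")
```

Because the records are flat, the depth of the tree does not matter for reading or writing the file.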
Bonus 1: If you can provide a way to filter the returned content, for example saving only variable3, variable5, and variable6 from the JSON response to the text file, there is a bonus for it.
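The filtering in Bonus 1 could be as small as the sketch below, assuming each response is a flat JSON object. The function name `filter_fields` and the sample values are hypothetical; the key names are the ones mentioned above.

```python
def filter_fields(content, keep):
    """Return only the requested keys, skipping any the response lacks."""
    return {key: content[key] for key in keep if key in content}


response = {"variable1": "a", "variable3": "c", "variable5": "e", "variable7": "g"}
filtered = filter_fields(response, ["variable3", "variable5", "variable6"])
print(filtered)
```

Keys missing from a response are silently skipped here; raising an error instead would be an equally reasonable choice.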
Bonus 2: If you can format the data and write a tool that lets me import the text file [url removed, login to view] and display it as a tree diagram like this example [url removed, login to view], where clicking a node pops up its data, that would be great. There is an extra bonus for this.
Bonus 3: If you can write a tree diagram tool that lets me load the text file [url removed, login to view], make changes, and save them back to the original file, there is another extra bonus.
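Both the viewer and the editor would first need to turn the flat file back into a nested tree. A minimal sketch, assuming each record carries a dotted `path` (1 for Yes, 2 for No) and a `content` field; these field names and the helper names are hypothetical, and the plain-text outline only stands in for a real clickable diagram.

```python
def build_tree(records):
    """Nest flat records (each with a dotted "path") into a tree of dicts."""
    root = {"path": "", "content": None, "children": {}}
    for record in records:
        node = root
        parts = record["path"].split(".") if record["path"] else []
        for i, part in enumerate(parts):
            child_path = ".".join(parts[: i + 1])
            # Create missing ancestors so records may arrive in any order.
            node = node["children"].setdefault(
                part, {"path": child_path, "content": None, "children": {}}
            )
        node["content"] = record["content"]
    return root


def render(node, indent=0):
    """Return the tree as a list of indented outline lines."""
    label = node["path"] or "root"
    lines = [f'{"  " * indent}{label}: {node["content"]}']
    for key in sorted(node["children"]):
        lines.extend(render(node["children"][key], indent + 1))
    return lines


records = [
    {"path": "1.1", "content": "A"},
    {"path": "", "content": "Q0"},
    {"path": "1", "content": "Q1"},
    {"path": "2", "content": "done"},
]
outline = render(build_tree(records))
print("\n".join(outline))
```

An editor could change `content` on any node and then walk the tree to write the records back out in the same flat form.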
Let me know your price:
Bonus 1: XX$
Bonus 2: XX$
Bonus 3: XX$
13 freelancers are bidding on average $199 for this job
Hi sir, I am a scraping expert and I have done many similar projects; please check my feedback and you will see. Can you tell me more details? Then I will provide demo data for you. Thanks, Kimi
We have already done web scraping in our project for a news site, so we can work on your project very quickly. Let me know when you want to discuss it with me...
I'd like to help you with this web scraper. I have a lot of experience writing Python crawlers and bots. I also have some experience with matplotlib for visualization. I am really interested in the project.