Closed

Improve a Java scraping program

I developed a Java program to scrape information from a website. The solution works in three steps: 1) Java Selenium drives headless Chrome through Chrome WebDriver to trigger authentication and authenticated requests; 2) the requests from Chrome are routed through Java BrowserMobProxy to capture three HTTP headers (Authorization, X-CSRF-TOKEN, and Cookie) and one query string (without these, the server starts responding with 512 after some requests); and 3) these four elements are used in HTTPS requests sent from Java directly to the website (i.e., without Selenium, Chrome, or BrowserMobProxy involved) to retrieve the desired information.
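Step 3 above could look roughly like the following. This is a minimal sketch, assuming Java 11+ and the built-in `java.net.http` client. The class names, the `example.com` URL, and all token values are hypothetical placeholders, not taken from the actual program.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class AuthenticatedRequestSketch {

    /** Holds the four captured values. The class and field names are hypothetical. */
    public static final class SessionTokens {
        public final String authorization;
        public final String csrfToken;
        public final String cookie;
        public final String queryString;

        public SessionTokens(String authorization, String csrfToken,
                             String cookie, String queryString) {
            this.authorization = authorization;
            this.csrfToken = csrfToken;
            this.cookie = cookie;
            this.queryString = queryString;
        }
    }

    /** Builds a GET request that carries the three headers and the query string. */
    public static HttpRequest buildRequest(String baseUrl, SessionTokens tokens) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "?" + tokens.queryString))
                .timeout(Duration.ofSeconds(30))
                .header("Authorization", tokens.authorization)
                .header("X-CSRF-TOKEN", tokens.csrfToken)
                .header("Cookie", tokens.cookie)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values; the real ones come from the authenticated session.
        SessionTokens tokens = new SessionTokens(
                "Bearer example-token", "example-csrf", "session=example", "key=example");
        HttpRequest request = buildRequest("https://example.com/api/items", tokens);

        // Only inspect the request here; sending it would use
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).
        System.out.println(request.uri());
        System.out.println(request.headers().map());
    }
}
```

Keeping the four values in one object makes it easy to swap in a fresh set whenever they are refreshed, without touching the request-building code.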

The program performs the basic task of extracting the information but has a few problems:

It depends on an external non-Java component: Chrome WebDriver

It depends on Java Selenium and Java BrowserMobProxy, two dependencies that I would like to remove

It is not optimized (too many refreshes and overly long sleep periods) relative to the limit at which the website (behind Cloudflare) starts responding with 429 errors. As a result, retrieving the information takes much longer than needed.

Deliverables

You will receive the program's current Java code and will need to solve the problems above. To do so, you will need to:

A. Find out how to authenticate and refresh the three headers and the query string without depending on Selenium, Chrome WebDriver, and BrowserMobProxy. As most of this data is likely generated in JavaScript, you will need knowledge of JavaScript and of how to either execute JavaScript from within Java or convert the JavaScript code to Java (the preferred solution).

B. Identify the limit at which the website (behind Cloudflare) starts responding with 429 errors, and tune the refresh frequency of the headers and the sleep periods to that limit. You will need to demonstrate the benefit of your changes by extracting the same information the program currently extracts and measuring how long it takes.
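One possible way to approach the tuning in B is an adaptive delay that backs off on 429 responses and slowly shortens again on successes. The sketch below is only an illustration under assumptions: the `AdaptivePacer` class, the doubling factor, the 50 ms step, and the simulated status codes are all hypothetical and would need to be calibrated against the real site.

```java
public class AdaptivePacer {
    private final long minDelayMs;
    private final long maxDelayMs;
    private long currentDelayMs;

    public AdaptivePacer(long minDelayMs, long maxDelayMs, long startDelayMs) {
        this.minDelayMs = minDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.currentDelayMs = startDelayMs;
    }

    /**
     * Updates the delay from the last HTTP status code and returns the pause
     * (in milliseconds) to apply before the next request.
     */
    public long nextDelayMs(int statusCode) {
        if (statusCode == 429) {
            // Rate limited: double the delay, capped at the maximum.
            currentDelayMs = Math.min(maxDelayMs, currentDelayMs * 2);
        } else if (statusCode >= 200 && statusCode < 300) {
            // Success: shorten the delay by a small fixed step, down to the minimum.
            currentDelayMs = Math.max(minDelayMs, currentDelayMs - 50);
        }
        // Any other status leaves the delay unchanged.
        return currentDelayMs;
    }

    public static void main(String[] args) throws InterruptedException {
        AdaptivePacer pacer = new AdaptivePacer(100, 5000, 400);
        int[] simulatedStatuses = {200, 200, 429, 200}; // stand-ins for real responses

        long start = System.nanoTime();
        for (int status : simulatedStatuses) {
            long delay = pacer.nextDelayMs(status);
            System.out.println("status=" + status + " nextDelayMs=" + delay);
            Thread.sleep(delay);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Total elapsed ms: " + elapsedMs);
    }
}
```

Timing the whole run with `System.nanoTime()`, as in `main`, gives a simple before-and-after figure for comparing the tuned program with the current one.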

Note: you will need to create your own login and password on the website. There are no additional requirements to register.

Skills: Java, JavaScript, Web Scraping, Software Architecture, Python


About the Employer:
( 1 review ) Băilești, Romania

Project ID: #26825638

8 freelancers are bidding on average $243 for this job

p4logics

Dear Sir, I am interested in your project. I have gone through your requirements. I'm an expert in web scraping and web automation using Java, Selenium, and jsoup, as well as data management and data mining. I assure, I will do my bes...

$250 USD in 7 days
(63 Reviews)
6.2
SunnyXCJ

Hi, I've just read your requirements carefully and understood what you mean. I have rich experience in Python/Java development and have completed many projects with your required skills. I'm confident I can complete your project perfe...

$200 USD in 3 days
(9 Reviews)
4.1
juneadkhan

Hello Sir! I am a web scraping expert and I think I'm a great fit for this project, because I have an interest in your project and can deliver on time, according to your specifications. Thanks

$140 USD in 7 days
(2 Reviews)
2.9
alexeygamil

Hi, I am a Python, web scraping, PHP, and HTML expert and I ONLY apply to jobs when I know I can do them. I am confident that I will deliver you the best solution possible and will exceed your expectations. I'm excited with fe...

$200 USD in 3 days
(2 Reviews)
2.8
serzhkavalchuk96

Hi, I have over 5 years of experience in Python. I’ve gone through your complete project description. I am interested in this project as it is exactly within the scope of my skills. My main skills are as follows: Python,...

$800 USD in 5 days
(2 Reviews)
2.4
asr112

I am a Java Software Engineer with 3+ years of experience. I have a good understanding of Java stacks like Spring Cloud, Spring Boot, JHipster, Spring Security, JPA, and Hibernate. I have expertise in Microservice ar...

$200 USD in 2 days
(0 Reviews)
0.0
shailap

Implementing web design and development principles to build stable software. Bringing mock-ups to life using HTML, CSS, JavaScript. Collaborating closely with the team to support projects during all phases of delivery...

$35 USD in 7 days
(0 Reviews)
0.0
engrfarooq04

Hi, hope you are doing well! Thanks for sharing your project requirements with us. I have experience in Python and algorithms, have excellent programming skills, and understand the requirement that you want to scrape the info...

$120 USD in 7 days
(0 Reviews)
1.1