Closed

Java expert: improve web page scraping solution

I developed a Java program to scrape information from a website. The solution works in three steps: 1) Java Selenium drives headless Chrome through Chrome WebDriver to trigger authentication and authenticated requests; 2) the requests from Chrome are routed through Java BrowserMobProxy to capture three HTTP headers (Authorization, X-CSRF-TOKEN, and Cookie) and one query string (without these, the server starts responding with 512 after some requests); and 3) these four elements are used in HTTPS requests sent from Java directly to the website (i.e. without Selenium, Chrome, or BrowserMobProxy involved) to retrieve the desired information.
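Step 3 above could look roughly like the following. This is only a minimal sketch, assuming Java 11+ and the standard `java.net.http` API; the class name `CapturedSession`, the method `buildGet`, and the example URL are hypothetical and are not taken from the current program.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical holder for the four captured elements (not from the existing code).
final class CapturedSession {
    private final String authorization; // value of the Authorization header
    private final String csrfToken;     // value of the X-CSRF-TOKEN header
    private final String cookie;        // value of the Cookie header
    private final String queryString;   // captured query string, assumed to be already URL-encoded

    CapturedSession(String authorization, String csrfToken, String cookie, String queryString) {
        this.authorization = authorization;
        this.csrfToken = csrfToken;
        this.cookie = cookie;
        this.queryString = queryString;
    }

    // Builds a GET request that replays the captured headers and query string.
    HttpRequest buildGet(String baseUrl) {
        // Append with "&" if the URL already carries a query, otherwise with "?".
        String separator = baseUrl.contains("?") ? "&" : "?";
        return HttpRequest.newBuilder(URI.create(baseUrl + separator + queryString))
                .header("Authorization", authorization)
                .header("X-CSRF-TOKEN", csrfToken)
                .header("Cookie", cookie)
                .GET()
                .build();
    }
}
```

The resulting request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Keeping the four values in one object means a refresh only has to replace that object, whatever mechanism ends up producing the values.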

This program performs the basic task of extracting the information but has a few problems:

It depends on an external non-Java component: Chrome WebDriver

It depends on Java Selenium and Java BrowserMobProxy, two dependencies that I would like to remove

It is not optimized: it refreshes too often and sleeps too long relative to the limit at which the website (behind Cloudflare) starts responding with 429 errors. As a result, retrieving the information takes much longer than necessary.

Deliverables

You will get the current program's Java code and will need to solve the problems above. To do so, you will need to:

A. Find out how to authenticate and refresh the 3 headers and the query string without depending on Selenium, Chrome WebDriver, and BrowserMobProxy. As most of this data is likely generated in JavaScript, you will need knowledge of JavaScript and of how to execute JavaScript from within Java or convert the JavaScript code to Java (the preferred solution).

B. Identify the limit at which the website (behind Cloudflare) starts responding with 429 errors, and tune the refresh frequency of the headers and the sleep periods to that limit. You will need to demonstrate the benefit of your changes by extracting the same information the program currently extracts and measuring how long it takes.
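One possible way to approach the tuning in B is an adaptive delay that backs off when a 429 arrives and slowly shortens while requests succeed. The sketch below is an assumption about how this could be done, not the existing program's logic; the class `AdaptivePacer`, its parameters, and the factors (doubling on 429, shrinking by 10% on success) are all hypothetical. It also assumes the server may send a `Retry-After` header in its delta-seconds form, which is not confirmed for this site.

```java
// Hypothetical pacing helper: all names and tuning factors are illustrative assumptions.
final class AdaptivePacer {
    private final long minDelayMs;
    private final long maxDelayMs;
    private long currentDelayMs;

    AdaptivePacer(long minDelayMs, long maxDelayMs, long initialDelayMs) {
        this.minDelayMs = minDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.currentDelayMs = Math.max(minDelayMs, Math.min(maxDelayMs, initialDelayMs));
    }

    // Returns how long to sleep before the next request, given the last response.
    // retryAfter is the raw Retry-After header value, or null if absent.
    long nextDelayMs(int statusCode, String retryAfter) {
        if (statusCode == 429) {
            // Rate limited: double the delay, capped at the maximum.
            currentDelayMs = Math.min(maxDelayMs, currentDelayMs * 2);
            // Never wait less than the server's own hint, if it gave one.
            return Math.max(currentDelayMs, parseSeconds(retryAfter) * 1000L);
        }
        // Any other status: shrink the delay by 10%, but not below the minimum.
        currentDelayMs = Math.max(minDelayMs, currentDelayMs - Math.max(1, currentDelayMs / 10));
        return currentDelayMs;
    }

    // Parses the delta-seconds form of Retry-After; the HTTP-date form is ignored here.
    private static long parseSeconds(String value) {
        if (value == null) {
            return 0L;
        }
        try {
            return Math.max(0L, Long.parseLong(value.trim()));
        } catch (NumberFormatException e) {
            return 0L;
        }
    }
}
```

In a scraping loop, the caller would pass each response's status code and `Retry-After` value to `nextDelayMs` and call `Thread.sleep` with the result. Logging the delay at which 429 responses stop appearing gives a rough estimate of the limit, and timing a full extraction run before and after the change provides the measurement asked for above.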

Note: you will need to create your own login/password on the website. There are no additional requirements to register.

Skills: Java, JavaScript, Software Architecture, Web Scraping, PHP


About the Employer:
( 1 review ) Băilești, Romania

Project ID: #26818705

9 freelancers are bidding on average $478 for this job

(93 Reviews)
5.6
eightsl

Hello, I'd like to take a look at this. Can you send me the website in question and the information you want to extract? I'll try to mimic the behavior of Selenium by sending the appropriate headers, then parse out th More

$400 USD in 7 days
(12 Reviews)
3.9
shynar88

~Dear client~ I am sure I can complete❤️ your Angular website perfectly. I have more than ⭐7+ years of experience in Website Design and developments already. Also I have just read your requirements carefully. For this More

$500 USD in 7 days
(9 Reviews)
4.4
inforajes

Hi, I am certified UiPath and Selenium Expert.I have 5 years of experience in Robotic Process Automation.I have worked with different clients with complex workflow. i read your description u need to automate a browser More

$500 USD in 1 day
(5 Reviews)
3.2
milandjokic

Hello. How are you? I have already read your description and i think i am qualified for this subject. I'm full-stack web developer. "He that has most time has none to lose" it's my creed. I'll solve your anyproblems in More

$500 USD in 7 days
(4 Reviews)
2.6
serhiilyskin

Hi, sir. This is Serhii from Ukraine. I've been working as a programmer for over 10 years, and I have very much experience in web & desktop app development. I have checked your requirements in detail, and I think I can More

$555 USD in 6 days
(1 Review)
2.0
lukyaanton

Hello, manager! Nice to meet you. I hope you are safe without the effect of covid-19. Should I use only Java? If u agree, I can use python for your project. I have rich experience with web scraping. I used python selen More

$500 USD in 5 days
(3 Reviews)
1.8
sgramesh1980

Greetings, We have  teams with PHP, Laravel, Magento, Codeigniter, Woo-Commerce, Angular(Prime Engine), Node, React, Flutter, Wordpress, Java, Android, IOS and Dot net.  We have a 2-8 + years team with well qualified More

$500 USD in 7 days
(0 Reviews)
0.0
haris35185

Hi Sir, High Quality & Fast Delivery is promised.I am professional [login to view URL] work Will be Done as you say as you want. I can start right now. 100% Satisfaction guaranteed with my work.I have gone throu More

$550 USD in 6 days
(0 Reviews)
0.0