Closed

Java scrapping improve expert

I developed a Java program to scrape information from a website. The solution works in three steps: 1) Java Selenium drives headless Chrome through Chrome WebDriver to trigger authentication and authenticated requests; 2) the requests from Chrome are routed through Java BrowserMobProxy to capture three HTTP headers (Authorization, X-CSRF-TOKEN, and Cookie) and one query string (without these, the server starts responding with 512 after some requests); and 3) these four elements are used in HTTPS requests sent from Java directly to the website (i.e. without Selenium, Chrome, or BrowserMobProxy involved) to retrieve the desired information.
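For orientation, step 3 could look roughly like the sketch below, which attaches the three captured headers and the query string to a direct request built with the JDK's own `java.net.http` API (Java 11+). The class name `CapturedSession`, its fields, and the query parameter name used in the usage note are hypothetical placeholders; the current program's actual request code may differ.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical holder for the four captured elements (names are assumptions).
final class CapturedSession {
    private final String authorization;
    private final String csrfToken;
    private final String cookie;
    private final String queryName;
    private final String queryValue;

    CapturedSession(String authorization, String csrfToken, String cookie,
                    String queryName, String queryValue) {
        this.authorization = authorization;
        this.csrfToken = csrfToken;
        this.cookie = cookie;
        this.queryName = queryName;
        this.queryValue = queryValue;
    }

    // Builds a GET request that carries the captured headers and query string.
    HttpRequest buildGet(String baseUrl) {
        // Append with '&' if the URL already has a query part, otherwise with '?'.
        String separator = baseUrl.contains("?") ? "&" : "?";
        String url = baseUrl + separator
                + URLEncoder.encode(queryName, StandardCharsets.UTF_8) + "="
                + URLEncoder.encode(queryValue, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", authorization)
                .header("X-CSRF-TOKEN", csrfToken)
                .header("Cookie", cookie)
                .GET()
                .build();
    }
}
```

A request built this way can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. For example, `new CapturedSession("Bearer abc", "tok", "sid=1", "client_request_id", "a b").buildGet("https://example.com/api/items")` targets `https://example.com/api/items?client_request_id=a+b`.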

The program performs the basic task of extracting the information but has a few problems:

It depends on an external non-Java component: Chrome WebDriver

It depends on Java Selenium and Java BrowserMobProxy, two dependencies that I would like to remove

It is not optimized (too many refreshes and overly long sleep periods) relative to the limit at which the website (behind Cloudflare) starts responding with 429 errors. As a result, retrieving the information takes much longer than needed.

Deliverables

You will receive the Java source code of the current program and will need to solve the problems above. To do so, you will need to:

A. Find out how to authenticate and refresh the 3 headers and the query string without depending on Selenium, Chrome WebDriver, and BrowserMobProxy. As most of this data is likely generated in JavaScript, you will need knowledge of JavaScript and of how to execute JavaScript from within Java or convert the JavaScript code to Java (the preferred solution).

B. Identify the limit at which the website (behind Cloudflare) starts responding with 429 errors, and tune the refresh frequency of the headers and the sleep periods to that limit. You will need to demonstrate the benefit of your changes by extracting the same information the program currently extracts and measuring how long it takes.
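One possible way to probe for that limit, sketched here as an assumption rather than a required design, is an additive-decrease, multiplicative-increase pacer: shorten the sleep slightly after each successful response and double it whenever a 429 arrives. The class name `AdaptivePacer` and all timing values are hypothetical.

```java
import java.time.Duration;

// Hypothetical pacer that adapts the sleep between requests to 429 responses.
final class AdaptivePacer {
    private final long minMillis;
    private final long maxMillis;
    private final long stepMillis;
    private long currentMillis;

    AdaptivePacer(long initialMillis, long minMillis, long maxMillis, long stepMillis) {
        if (minMillis <= 0 || minMillis > maxMillis || stepMillis <= 0) {
            throw new IllegalArgumentException("invalid pacing bounds");
        }
        this.minMillis = minMillis;
        this.maxMillis = maxMillis;
        this.stepMillis = stepMillis;
        // Clamp the starting delay into [minMillis, maxMillis].
        this.currentMillis = Math.max(minMillis, Math.min(maxMillis, initialMillis));
    }

    // Call after every response; returns the delay to wait before the next request.
    Duration onResponse(int statusCode) {
        if (statusCode == 429) {
            // Rate limited: back off quickly by doubling the delay, up to the cap.
            currentMillis = Math.min(maxMillis, currentMillis * 2);
        } else if (statusCode >= 200 && statusCode < 300) {
            // Success: creep toward the limit by shortening the delay a little.
            currentMillis = Math.max(minMillis, currentMillis - stepMillis);
        }
        // Any other status leaves the delay unchanged.
        return Duration.ofMillis(currentMillis);
    }

    // Sleeps for the current delay.
    void pause() throws InterruptedException {
        Thread.sleep(currentMillis);
    }
}
```

With `new AdaptivePacer(1000, 200, 8000, 100)`, a 200 response lowers the delay to 900 ms and a following 429 raises it to 1800 ms. Logging the delay at which 429 responses begin, together with the total run time, would give the before/after measurement asked for above.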

Note: you will need to create your own login/password in the webpage. No additional requirements exist to register.

Skills: Java, Web Scraping, JavaScript, Python, Software Architecture


About the Employer:
(1 review) Băilești, Romania

Project ID: #26951665

12 freelancers are bidding on average $179 for this job

schoudhary1553

Hi, Greetings! ✅ Checked your project details: Java scrapping improve expert ✅ Completed Time: In project deadline. We have worked on 650+ projects. I have 6+ years of experience in the same kind of projects. If ...

$240 USD in 5 days
(128 Reviews)
7.1
kkc264043kkc

I can replace the existing scraper with the Python requests library, removing the headache of JARs, Selenium, and ChromeDriver. It will be very lightweight. To resolve the 429 errors we need to use a proxy or a delay. These are my ...

$167 USD in 2 days
(55 Reviews)
6.1
StanislavRezer

Hi, Manager! I am a web scraping expert. I have experience with web scraping and an auto-send mail app, so I think I am well suited to this project. I can work 24/7. If you require, I can work 60 hours per week. If you hire me, y ...

$200 USD in 3 days
(29 Reviews)
5.6
suyashdhoot

Hi, I am a very experienced statistician, data scientist and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several comp ...

$250 USD in 7 days
(25 Reviews)
5.5
PoojaRautela417

Hi there, Let’s have a quick chat to discuss this project. I am an expert in Python, PHP, JavaScript, web scraping, and MySQL. I have the expertise for this project. You can check my portfolio here: https://www.freelancer.com/ ...

$250 USD in 3 days
(18 Reviews)
4.7
PKonstiantyn

Hello, I am very interested in your project “Java scrapping improve expert”. Web scraping is my best skill. I have read the job description and I am interested in this job. I have 8 years of experience in developing produ ...

$140 USD in 7 days
(14 Reviews)
4.7
umairali8062

Hi there, I am bidding on your project and I am good in this field. I can do this for you honestly and within the due time. I also have a few questions to discuss. Kindly contact me and we will discuss time and budge ...

$140 USD in 7 days
(17 Reviews)
4.3
priyanandrai

Hello, I hope your family is safe from Covid-19. I am a Java Full Stack Developer with more than 5 years of hands-on experience working on various websites and applications. I have an expert development team. All the resou ...

$140 USD in 7 days
(10 Reviews)
1.9
nutankumarftp

Dear Employer, Thanks for posting the project. I have gone through your description "need to create your own login/password in the webpage. No additional requirements exist to register." and I believe I'm capable to ...

$140 USD in 7 days
(0 Reviews)
0.0
rajatyadav25

Hi! I am happy to put my bid on your project. I have read your requirements carefully and I am confident in this project. I am a skillful and experienced web developer; I have tons of experience with Java/Python/MY ...

$140 USD in 7 days
(0 Reviews)
0.0
BabakKanan

Hi there, In order to scrape data from eToro, may I suggest an alternative approach to yours? Selenium inherently has limitations and that is why your solution is unnecessarily complex. I suggest Puppeteer (https://dev ...

$200 USD in 3 days
(0 Reviews)
0.0
cashraddha3105

Hi, I am a new freelancer. I have professional experience in [login to view URL], JavaScript, C#, web scraping, data mining, Python and Tableau. I am also a certified Chartered Accountant. I can deliver high quality project and ...

$140 USD in 5 days
(1 Review)
0.0