Java Scraping - Extracting Content From The Page



We are looking for a PoC app that shows how to extract the main content from an arbitrary HTML page, stripping out everything else (navigation, banners, sidebars, etc.).

This is similar to what Instapaper does with an arbitrary content page.

I have attached a list of random HTML pages covering a similar topic; the resulting application should intelligently extract only the main content from each page.
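
As a rough illustration of one possible direction (not a required approach), the extraction described above could start from a text-density heuristic: split the page into blocks, then keep blocks that contain a lot of text and few links. The sketch below is only that, a sketch. The class name `MainContentSketch`, the method `extractMainText`, and both thresholds are assumptions made up for this example. It uses only JDK regular expressions rather than a real HTML parser, and it does not decode HTML entities.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: keeps long, link-poor text blocks and drops the rest.
class MainContentSketch {
    // Assumed thresholds; they would need tuning against the attached sample pages.
    static final int MIN_TEXT_CHARS = 80;
    static final double MAX_LINK_DENSITY = 0.3;

    // Scripts, styles, and comments never carry readable content.
    private static final Pattern NON_CONTENT = Pattern.compile(
        "(?is)<(script|style|noscript)\\b[^>]*>.*?</\\1\\s*>|<!--.*?-->");
    // Tags treated as block boundaries when splitting the page.
    private static final Pattern BLOCK_BOUNDARY = Pattern.compile(
        "(?i)</?(p|div|td|tr|table|li|ul|ol|h[1-6]|br|section|article|nav|header"
        + "|footer|aside|blockquote|pre|form)\\b[^>]*>");
    private static final Pattern LINK = Pattern.compile("(?is)<a\\b[^>]*>(.*?)</a\\s*>");
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]+>");

    static String extractMainText(String html) {
        String cleaned = NON_CONTENT.matcher(html).replaceAll(" ");
        List<String> kept = new ArrayList<>();
        for (String block : BLOCK_BOUNDARY.split(cleaned)) {
            String text = toText(block);
            if (text.length() < MIN_TEXT_CHARS) {
                continue; // too short: likely a menu item, caption, or footer
            }
            // Count how many of the block's characters sit inside links.
            int linkChars = 0;
            Matcher m = LINK.matcher(block);
            while (m.find()) {
                linkChars += toText(m.group(1)).length();
            }
            if ((double) linkChars / text.length() <= MAX_LINK_DENSITY) {
                kept.add(text);
            }
        }
        return String.join("\n\n", kept);
    }

    // Removes remaining tags and collapses whitespace.
    private static String toText(String fragment) {
        return TAG.matcher(fragment).replaceAll(" ").replaceAll("\\s+", " ").trim();
    }
}
```

Calling `MainContentSketch.extractMainText(html)` on a page string returns the surviving blocks separated by blank lines. A production version would more likely build on a proper HTML parser and score nested elements, since regular expressions handle malformed markup poorly.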

!!! To be considered for the job, please outline the general direction you would take.

Skills: Java, Web Scraping


Project ID: #4188123

5 freelancers are bidding on average $220 for this job


Hi sir, please check PM, thx Kimi.

$250 USD in 5 days
(65 Reviews)

Scraping Experts Here. Check the message and contact us. Scraping samples are also attached.

$250 USD in 10 days
(15 Reviews)

Hi, ready to start your work. Eagerly awaiting your positive reply. Please check your inbox for further details. Thanks, Shaik.

$250 USD in 3 days
(22 Reviews)

I can complete your project perfectly; please check PMB for more details. Thanks.

$200 USD in 7 days
(18 Reviews)

I have something that will work for this.

$100 USD in 1 day
(1 Review)

1) Will the program download the content itself, or do you already have something doing the download? Just wanted to know. 2) The way to go about this is to use URLConnection to download, if you want to download the…

$300 USD in 3 days
(0 Reviews)