Web Scraping (PDFs)

Closed

Description

Hi there,

I am trying to develop a way to automate the daily retrieval of PDFs from a state government website and extract (scrape) specific information from each document. The procedure is as follows:

1. Go to URL [url removed, login to view]

2. Enter Docket Code: ORDER

3. Enter Case Type: CD

4. Enter Date Range (to be done daily)

5. Hit ‘Search’

6. Open first document by clicking hyperlink under ‘ID’ Column

a. Identify RELIEF SOUGHT

b. If RELIEF SOUGHT = ‘POOLING’ continue to step 7

c. Else, return to results and open next document, then repeat a/b

7. If DISMISSED return to step 6

8. Else, identify fields highlighted in example documents

9. Export results to an Excel spreadsheet, with one column for each field name marked in red on the example documents

10. Return to search results and continue searching through documents with criteria from step 6

I only need the PDFs where RELIEF SOUGHT is POOLING.
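The filtering rules in steps 6 to 8 can be sketched roughly as follows. This is a minimal illustration, not a finished scraper: it assumes the text of each PDF has already been extracted to a string by some PDF library (not shown), and it assumes the label appears as "RELIEF SOUGHT: ..." on one line. The example documents are not reproduced here, so the pattern and the function names (`relief_sought`, `is_dismissed`, `should_export`, `select_documents`) are hypothetical and would need adjusting to the real layouts.

```python
import re

# Assumed label layout: "RELIEF SOUGHT: POOLING". The real documents may
# format this differently, so the pattern is illustrative only.
RELIEF_RE = re.compile(r"RELIEF\s+SOUGHT\s*:?\s*(.+)", re.IGNORECASE)


def relief_sought(text):
    """Return the upper-cased RELIEF SOUGHT value, or None if not found."""
    match = RELIEF_RE.search(text)
    return match.group(1).strip().upper() if match else None


def is_dismissed(text):
    """Step 7: treat any standalone occurrence of DISMISSED as a dismissal."""
    return re.search(r"\bDISMISSED\b", text, re.IGNORECASE) is not None


def should_export(text):
    """Steps 6b to 8: keep only POOLING documents that were not dismissed."""
    if relief_sought(text) != "POOLING":
        return False
    return not is_dismissed(text)


def select_documents(docs):
    """docs is an iterable of (doc_id, extracted_text) pairs."""
    return [doc_id for doc_id, text in docs if should_export(text)]


if __name__ == "__main__":
    # Made-up sample texts, purely for illustration.
    sample = [
        ("a", "RELIEF SOUGHT: POOLING\nORDER GRANTED"),
        ("b", "RELIEF SOUGHT: SPACING"),
        ("c", "RELIEF SOUGHT: POOLING\nCAUSE DISMISSED"),
    ]
    print(select_documents(sample))
```

The strict equality check mirrors step 6b as written; if the real documents put extra text after POOLING on the same line, it would need to be loosened.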

I am looking to organize all this data in a program like Excel for my use. I'd like the data to be organized by Order Date, Cause CD No. and then a column for each piece of information highlighted and identified in red in the example documents.
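One possible way to produce that layout is to write the extracted rows to a CSV file, which Excel opens directly. The sketch below uses only Python's standard library. The first two column names come from the description above; the `Applicant` column and the sample values are hypothetical placeholders, since the highlighted fields in the example documents are not listed here.

```python
import csv
import io
from datetime import date


def write_rows(rows, out, extra_columns):
    """Write rows sorted by Order Date, then Cause CD No., to a CSV stream.

    rows: list of dicts with a datetime.date under "Order Date" and a
    string under "Cause CD No."; other keys must be in extra_columns.
    """
    fieldnames = ["Order Date", "Cause CD No."] + list(extra_columns)
    ordered = sorted(rows, key=lambda r: (r["Order Date"], r["Cause CD No."]))
    # restval="" leaves a blank cell when a document lacks a field.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    for row in ordered:
        writer.writerow({**row, "Order Date": row["Order Date"].isoformat()})


if __name__ == "__main__":
    # Made-up sample rows, purely for illustration.
    rows = [
        {"Order Date": date(2013, 5, 2), "Cause CD No.": "201300002", "Applicant": "B"},
        {"Order Date": date(2013, 5, 1), "Cause CD No.": "201300001"},
    ]
    buffer = io.StringIO()
    write_rows(rows, buffer, ["Applicant"])
    print(buffer.getvalue())
```

To write a real file instead of a string buffer, open it with `open(path, "w", newline="")` and pass the file object as `out`.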

I have provided two examples to show that the document may vary somewhat in formatting and the presentation of data.

Skills: Excel, PDF, Web Scraping


Project ID: #4441298

Awarded to:

chaituse

Hello sir, I can deliver required scraper with excellent quality.

$275 USD in 5 days
(17 Reviews)
4.2

13 freelancers are bidding on average $367 for this job

cheapexcell

Where are the examples ?

$257 USD in 7 days
(190 Reviews)
7.1
Dhruvika111

Dear minz08, Greetings! Please refer to your PM for bid details. Thanks, Dhruvika

$360 USD in 3 days
(163 Reviews)
7.2
fhasanbd

I can do this for you

$275 USD in 5 days
(203 Reviews)
7.0
datasolutionind

Let's Start...

$550 USD in 30 days
(66 Reviews)
6.0
muzammil21

we do not offer package we offer Only guaranteed results..100% quality work within time limit and ...Read more in PMB

$515 USD in 9 days
(26 Reviews)
5.4
sonarkaushik

Sir, I can do the project. Refer PMB. Looking for further discussions in this matter. with thanks and regards

$303 USD in 9 days
(52 Reviews)
5.8
afua23

I would really love to work on this for you. I have gone through the whole thing, been to the site, looked at the attached examples, and understood what you are looking for.

$385 USD in 3 days
(37 Reviews)
5.1
xautoit

I'm ready to get it done

$275 USD in 3 days
(3 Reviews)
3.7
renatofileto

Interested

$275 USD in 3 days
(2 Reviews)
3.1
thetidevw

Hi! I can do this very fast. I have done a similar project here, so I can reuse it.

$367 USD in 2 days
(1 Review)
2.7
tlyx

Looks like very few of the PDFs have all the required fields, correct?

$550 USD in 10 days
(1 Review)
2.4
photoshop

Ready to start.

$385 USD in 3 days
(0 Reviews)
0.0