PDF Data Extraction & Storage

This project was awarded to chitta for $555 USD.

Project Budget
$250 - $750 USD
Project Description

I have a set of ~1000 PDF files containing purchase order (PO) data. I need the data extracted from these files and stored in a relational database. The data is somewhat structured and should lend itself to extraction.

Deliverables & Details:

-Preferred technology: PHP & MySQL

-Application shall be configurable so that various environment variables, including but not limited to database and file system connection parameters, can be set without code changes.

-When executed, application shall scan configured file system directory & scrape data from multi-page PDF files.

-Extracted data shall be stored in a relational MySQL database.

-PO shipping address records shall not be duplicated in the relational database.

-All file processing results and exceptions shall be logged.

-Source code shall be committed to GitHub.

-Application will be demonstrated and tested in a provided continuous integration environment linked to the GitHub repository.

-Sample PDF data available upon qualified request.
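
To illustrate the configuration, de-duplication, and logging requirements above, here is a minimal sketch. It is written in Python with the standard-library sqlite3 module standing in for the preferred PHP & MySQL stack, purely so the example is self-contained. The environment variable names (PO_PDF_DIR, PO_DB_PATH), the table and column names, and the helper functions are all hypothetical and not part of the posting. PDF parsing itself is omitted, since the layout of the PO files is not described here.

```python
import hashlib
import logging
import os
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("po_import")


def load_config(env=os.environ):
    """Read settings from environment variables (variable names are assumptions)."""
    return {
        "pdf_dir": env.get("PO_PDF_DIR", "./pdfs"),
        "db_path": env.get("PO_DB_PATH", ":memory:"),
    }


def address_key(address):
    """Hash a case- and whitespace-normalized address so trivial variants collide."""
    normalized = " ".join(address.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def open_db(path):
    """Create the two illustrative tables; addr_key is UNIQUE to block duplicates."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shipping_address ("
        "id INTEGER PRIMARY KEY, addr_key TEXT UNIQUE NOT NULL, "
        "address TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchase_order ("
        "po_number TEXT PRIMARY KEY, "
        "address_id INTEGER NOT NULL REFERENCES shipping_address(id))"
    )
    return conn


def store_po(conn, po_number, address):
    """Store one PO, reusing an existing shipping address row when one matches."""
    key = address_key(address)
    # The insert is silently skipped if an address with the same key exists.
    conn.execute(
        "INSERT OR IGNORE INTO shipping_address (addr_key, address) VALUES (?, ?)",
        (key, address),
    )
    address_id = conn.execute(
        "SELECT id FROM shipping_address WHERE addr_key = ?", (key,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO purchase_order (po_number, address_id) VALUES (?, ?)",
        (po_number, address_id),
    )
    conn.commit()
    log.info("stored PO %s with shipping address id %d", po_number, address_id)
    return address_id
```

The design choice here is to let the database enforce uniqueness through a constraint on a normalized key, so that re-running the import over the same files does not create duplicate address rows. In MySQL a similar effect could likely be achieved with a UNIQUE index combined with INSERT IGNORE or INSERT ... ON DUPLICATE KEY UPDATE, though the exact schema would depend on the real PO data.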
