PDF Data Extraction & Storage

This project was awarded to chitta for $555 USD.

Project Budget
$250 - $750 USD
Project Description

I have a set of ~1000 PDF files containing purchase order (PO) data. I need this data extracted and stored in a relational database. The data is somewhat structured and should lend itself to extraction.

Deliverables & Details:
-Preferred technology: PHP & MySQL
-Application shall be configurable through environment variables, including but not limited to database and file system connection parameters.
-When executed, application shall scan configured file system directory & scrape data from multi-page PDF files.
-Extracted data shall be stored in a relational MySQL database.
-PO shipping address records shall not be duplicated in relational database.
-All file processing results and exceptions shall be logged.
-Source code shall be committed to GitHub.
-Application will be demonstrated and tested in a provided continuous integration environment linked to the GitHub repository.
-Sample PDF data available upon qualified request.
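
The deliverables above could be approached roughly as in the following sketch. It is an illustration under stated assumptions, not the awarded implementation: it is written in Python rather than the preferred PHP, it uses SQLite from the standard library as a stand-in for MySQL, and the environment variable names (PO_PDF_DIR, PO_DB_PATH), the table layout, and the `extract` callback that parses a PDF are all hypothetical. Shipping addresses are deduplicated by placing a UNIQUE key on a hash of the normalized address text.

```python
import hashlib
import logging
import os
import sqlite3
from pathlib import Path

log = logging.getLogger("po_import")


def load_config(env=os.environ):
    """Read connection parameters from environment variables (names are hypothetical)."""
    return {
        "pdf_dir": env.get("PO_PDF_DIR", "./pdfs"),
        "db_path": env.get("PO_DB_PATH", ":memory:"),
    }


def init_db(conn):
    # The UNIQUE key on address_hash is what prevents duplicate address rows.
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS shipping_address (
            id INTEGER PRIMARY KEY,
            address_hash TEXT NOT NULL UNIQUE,
            address TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS purchase_order (
            id INTEGER PRIMARY KEY,
            po_number TEXT NOT NULL,
            source_file TEXT NOT NULL,
            shipping_address_id INTEGER NOT NULL REFERENCES shipping_address(id)
        );
    """)


def address_id(conn, address):
    """Return the id for an address, inserting it only if it is not stored yet."""
    # Assumed normalization: uppercase and collapse runs of whitespace.
    normalized = " ".join(address.upper().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    conn.execute(
        "INSERT OR IGNORE INTO shipping_address (address_hash, address) VALUES (?, ?)",
        (digest, normalized),
    )
    row = conn.execute(
        "SELECT id FROM shipping_address WHERE address_hash = ?", (digest,)
    ).fetchone()
    return row[0]


def process_directory(conn, pdf_dir, extract):
    """Scan pdf_dir and store one PO per file; `extract` is a caller-supplied parser."""
    results = {"ok": 0, "failed": 0}
    for path in sorted(Path(pdf_dir).glob("*.pdf")):
        try:
            # `extract` is expected to return {"po_number": ..., "ship_to": ...}.
            po = extract(path)
            conn.execute(
                "INSERT INTO purchase_order (po_number, source_file, shipping_address_id)"
                " VALUES (?, ?, ?)",
                (po["po_number"], path.name, address_id(conn, po["ship_to"])),
            )
            conn.commit()
            results["ok"] += 1
            log.info("processed %s", path.name)
        except Exception:
            # Undo any partial writes for this file and record the exception.
            conn.rollback()
            results["failed"] += 1
            log.exception("failed to process %s", path.name)
    return results
```

Against MySQL, the same idea would rely on a UNIQUE index together with `INSERT IGNORE` or `INSERT ... ON DUPLICATE KEY UPDATE` in place of SQLite's `INSERT OR IGNORE`. The actual PDF parsing is left to the `extract` callback because it depends on the layout of the sample files.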
