Closed

Manual PDF Data Extraction

This project was awarded to yaarali for $15 USD.

Project Budget
$30 - $5000 USD
Total Bids
47
Project Description

I have approximately 650 one-page PDFs from which I need specific text. Each PDF is the first page of a scientific article, and all I want from each one is the Title, Author Names (if present), Authors' Affiliations (if present), Abstract (if present), **and Keywords (if present).**


There is no consistent layout across the files, so the text will have to be copied manually. Most, if not all, of the PDFs have selectable text, so typing should be minimal.

I have attached the zip file of the PDFs and an example output file to help you decide if you can do this project. **Please submit a sample output of two files, or one Excel file with two sheets, so I can verify you understand the task. Each file or sheet should contain these columns:**
1. **Filename, Author Number, Author, Author Affiliation (if present)**
2. **Filename, Title, Abstract (if present), Keywords (if present)**


Thanks!

Edits in bold:
1. Added "Keywords" to the desired fields.
2. Added a request for sample output files.


## Deliverables

Attached are the PDFs and example output file. I organized it in Excel with two sheets, but I will also accept two text files with an appropriate delimiter that does not appear in the text. Please state which output format you will provide when bidding.
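For bidders who choose the two-text-file option, here is a minimal Python sketch of one way to pick a delimiter that does not appear anywhere in the extracted text and then write the rows with it. The function names, the candidate delimiter list, and the sample rows (including the filename `paper_001.pdf`) are all hypothetical and are not taken from the attached example file.

```python
# Hypothetical helpers: choose a delimiter absent from every field,
# then serialize rows as delimited text (one record per line).

CANDIDATE_DELIMITERS = ["\t", "|", "~", "^"]  # assumed candidates, in order of preference


def pick_delimiter(rows, candidates=CANDIDATE_DELIMITERS):
    """Return the first candidate that appears in no field of any row."""
    for delim in candidates:
        if not any(delim in str(field) for row in rows for field in row):
            return delim
    raise ValueError("every candidate delimiter appears in the text")


def to_delimited(header, rows, delim):
    """Join fields with delim; line breaks inside a field become spaces."""
    lines = [delim.join(header)]
    for row in rows:
        cleaned = [str(f).replace("\r", " ").replace("\n", " ") for f in row]
        lines.append(delim.join(cleaned))
    return "\n".join(lines) + "\n"


# Made-up sample rows matching the two requested layouts.
AUTHOR_HEADER = ["Filename", "Author Number", "Author", "Author Affiliation"]
author_rows = [
    ("paper_001.pdf", 1, "A. Author", "Example University | Dept. of Physics"),
    ("paper_001.pdf", 2, "B. Author", ""),  # affiliation not present
]

TITLE_HEADER = ["Filename", "Title", "Abstract", "Keywords"]
title_rows = [
    ("paper_001.pdf", "A title\twith a tab", "First line\nsecond line", "alpha; beta"),
]

# Use one delimiter for both files so they can be read the same way.
delimiter = pick_delimiter(author_rows + title_rows)
authors_text = to_delimited(AUTHOR_HEADER, author_rows, delimiter)
titles_text = to_delimited(TITLE_HEADER, title_rows, delimiter)
```

Checking every field before choosing the delimiter avoids the need for quoting or escaping, which keeps the files easy to import into Excel or a database afterwards.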


**Edit: Updated Excel file to include Keywords from example files. Changes in Excel file are in red.**

* * *

This broadcast message was sent to all bidders on Friday Jul 29, 2011 7:49:49 PM:

I have updated the project requirements to also include "Keywords" on the Title, Abstract file. Please see the updated example Excel file. Also, please submit a sample output (pick 1 or 2 of the 650 at random, your choice) so I can verify you understand the task. Thanks!
