Closed

Resume Parser

This project received 28 bids from talented freelancers with an average bid price of $1007 USD.

Project Budget: $750-$1500 USD
Total Bids: 28
Project Description

We are looking to develop a web application for resume parsing, written in Java/Spring. If you would prefer to use a different stack, please let us know.

This parser will be used to parse thousands of UNSTRUCTURED resumes in HTML, Word (doc, docx), RTF, text and PDF formats.

Input: Resume files in the following formats: Word, PDF, text, TIF, HTML
Output: One XML file per resume, with every word from the resume placed in the correct XML tag.

The parser needs to be able to extract the following data from the resumes:
. first name
. last name
. address
. city
. state/province
. zip code
. country
. citizenship/immigration status
. email address
. resume job category
. resume title
. career objective or background
. years of professional experience
. employment history
. education history
. licenses and certifications
. foreign languages
. references
. skills keywords
. publications
. security clearances

The output of the parser should be an XML-tagged file, one XML file for each parsed resume. The output file name should be the same as the input file name, with the extension changing from [url removed, login to view] to [url removed, login to view]
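
To make the naming and tagging requirement concrete, here is a minimal sketch in Java. It is an illustration only: the class name ResumeXmlSketch, the tag names, and the "xml" target extension are assumptions (the exact extensions are not visible in the posting), and a real implementation would more likely use a proper XML library than string building.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: derives the output file name and writes parsed fields as XML.
final class ResumeXmlSketch {

    // Swaps the last extension of the input name for newExtension (assumed to be "xml").
    static String outputNameFor(String inputName, String newExtension) {
        int dot = inputName.lastIndexOf('.');
        String base = (dot > 0) ? inputName.substring(0, dot) : inputName;
        return base + "." + newExtension;
    }

    // Escapes the characters that are not allowed in XML element content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Emits one element per field; keys are used as tag names (hypothetical names).
    static String toXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<resume>\n");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("  <").append(e.getKey()).append('>')
              .append(escape(e.getValue()))
              .append("</").append(e.getKey()).append(">\n");
        }
        return sb.append("</resume>\n").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("first_name", "Jane");
        fields.put("email", "jane@example.com");
        System.out.println(outputNameFor("jane_doe.pdf", "xml"));
        System.out.print(toXml(fields));
    }
}
```

The extension is passed in as a parameter so the sketch does not hard-code a value the posting leaves unstated.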

All of the parsed fields will be uploaded into a MySQL database. The parser is required to do the database insertion as part of the parsing process.
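
As a rough illustration of the insertion step, the sketch below builds a parameterized INSERT statement and binds the parsed values through plain JDBC. The class name, the table name and the column names are all hypothetical, since the posting does not describe the database schema. Table and column names are concatenated into the SQL, so they are assumed to come from trusted constants, never from resume content.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch: inserts one parsed resume as a single row.
final class ResumeInsertSketch {

    // Builds "INSERT INTO t (a, b) VALUES (?, ?)" for the given columns.
    static String insertSql(String table, Collection<String> columns) {
        String cols = String.join(", ", columns);
        String marks = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + marks + ")";
    }

    // Binds every field value as a string parameter and runs the insert.
    // The Connection is assumed to point at the MySQL database from the posting.
    static int insert(Connection conn, String table, Map<String, String> fields)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(insertSql(table, fields.keySet()))) {
            int index = 1;
            for (String value : fields.values()) {
                ps.setString(index++, value);
            }
            return ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // Prints the generated statement only; no database connection is opened here.
        System.out.println(insertSql("resume", Arrays.asList("first_name", "last_name", "email")));
    }
}
```

Multi-valued fields such as employment history would probably need their own tables; this sketch covers only the flat, single-valued fields.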

We will supply a sample set of resumes, as many as you need to be successful.

Resumes are unstructured, so formats and content vary widely. The ability to score the parsing performance would be beneficial. It would be helpful to have a parsing report (i.e. the application should produce a log file) that indicates which resumes the parser thinks it did poorly on, so we can manually revisit the parsed resumes that have the highest probability of containing parsing errors.
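
One simple way to approximate such a score, offered only as a sketch, is to measure how many of the expected fields the parser managed to fill and to flag files below a threshold. The completeness heuristic, the class name ParseScoreSketch and the threshold are assumptions and are not part of the posting. A production scorer would likely also validate field contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: scores a parse by field completeness and flags weak results.
final class ParseScoreSketch {

    // Fraction of expected fields that were extracted with a non-blank value.
    static double score(Map<String, String> parsed, List<String> expectedFields) {
        if (expectedFields.isEmpty()) {
            return 1.0;
        }
        long filled = expectedFields.stream()
                .filter(name -> {
                    String value = parsed.get(name);
                    return value != null && !value.trim().isEmpty();
                })
                .count();
        return (double) filled / expectedFields.size();
    }

    // File names scoring below the threshold, worst first, for manual review.
    static List<String> flagForReview(Map<String, Double> scoresByFile, double threshold) {
        List<String> flagged = new ArrayList<>();
        scoresByFile.entrySet().stream()
                .filter(e -> e.getValue() < threshold)
                .sorted(Map.Entry.comparingByValue())
                .forEach(e -> flagged.add(e.getKey()));
        return flagged;
    }

    public static void main(String[] args) {
        Map<String, String> parsed = new HashMap<>();
        parsed.put("first_name", "Jane");
        parsed.put("email", "");
        List<String> expected = Arrays.asList("first_name", "last_name", "email", "city");
        System.out.println("score = " + score(parsed, expected));
    }
}
```

The flagged list could be written to the log file, giving reviewers an ordered queue of the resumes most likely to contain errors.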

We need to be able to integrate the web application parser with our existing PHP website.

• The application should contain at least 2 main modules:
[url removed, login to view] converter – This module translates each input file format to text format
[url removed, login to view] engine – This engine receives a text file and returns an XML file
The separation is needed so that additional file formats can be supported in the future.
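
The two-module split described above could be sketched in Java roughly as follows. All names here (ParserPipelineSketch, TextConverter, ParsingEngine, the registry keyed by file extension) are assumptions made for illustration. The demo engine extracts only an email address and stands in for the real parsing logic.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: converters turn files into text, the engine turns text into XML.
final class ParserPipelineSketch {

    interface TextConverter { String toText(byte[] content); }

    interface ParsingEngine { String toXml(String text); }

    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    private final Map<String, TextConverter> converters = new HashMap<>();
    private final ParsingEngine engine;

    ParserPipelineSketch(ParsingEngine engine) {
        this.engine = engine;
    }

    // Supporting a new file format only requires registering another converter.
    void register(String extension, TextConverter converter) {
        converters.put(extension.toLowerCase(Locale.ROOT), converter);
    }

    // Picks the converter by file extension, then hands the text to the engine.
    String parse(String fileName, byte[] content) {
        int dot = fileName.lastIndexOf('.');
        String ext = (dot >= 0) ? fileName.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
        TextConverter converter = converters.get(ext);
        if (converter == null) {
            throw new IllegalArgumentException("No converter registered for: " + ext);
        }
        return engine.toXml(converter.toText(content));
    }

    // Demo wiring: a plain-text converter plus a toy engine that finds one email.
    static ParserPipelineSketch demoPipeline() {
        ParserPipelineSketch pipeline = new ParserPipelineSketch(text -> {
            Matcher m = EMAIL.matcher(text);
            String email = m.find() ? m.group() : "";
            return "<resume><email>" + email + "</email></resume>";
        });
        pipeline.register("txt", bytes -> new String(bytes, StandardCharsets.UTF_8));
        return pipeline;
    }

    public static void main(String[] args) {
        byte[] content = "Contact: jane@example.com".getBytes(StandardCharsets.UTF_8);
        System.out.println(demoPipeline().parse("cv.txt", content));
    }
}
```

Because the engine only ever sees text, converters for Word, PDF or other formats can be added later without touching the parsing logic.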

Passing acceptance testing with several resumes will be required at project completion.

I expect there will be a lot more questions so feel free to ask.
