Resume Parser

Status: CLOSED
Bids: 28
Avg Bid (USD): $1007
Project Budget (USD): $750 - $1500

Project Description:
We are looking to develop a web application for resume parsing, written in Java/Spring. If you would prefer to use a different technology, please let us know.

This parser will be used to parse thousands of UNSTRUCTURED resumes in HTML, Word (DOC, DOCX), RTF, text and PDF formats.

Input: Resume files in the following formats: Word, PDF, text, TIF, HTML
Output: One XML file per resume, in which all the words from the resume are placed in the correct XML tag.

The parser needs to be able to extract the following data from the resumes:
. first name
. last name
. address
. city
. state/province
. zip code
. country
. citizenship/immigration status
. email address
. resume job category
. resume title
. career objective or background
. years of professional experience
. employment history
. education history
. licenses and certifications
. foreign languages
. references
. skills keywords
. publications
. security clearances
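
As an illustration of how the extracted fields above could end up in XML tags, here is a minimal sketch using the JDK's built-in DOM and Transformer APIs. The class name ResumeXmlWriter, the root tag "resume" and the tag names such as "firstName" are assumptions made for the example; the posting does not define an XML schema.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: the class, root tag and field tag names are assumptions.
final class ResumeXmlWriter {

    /**
     * Writes each extracted field as one child element of a <resume> root.
     * Map keys must be valid XML element names; values are escaped by the serializer.
     */
    static String toXml(Map<String, String> fields) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("resume");
            doc.appendChild(root);
            for (Map.Entry<String, String> field : fields.entrySet()) {
                Element element = doc.createElement(field.getKey());
                element.setTextContent(field.getValue());
                root.appendChild(element);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrap checked XML exceptions so callers get a single failure type.
            throw new IllegalStateException("Could not serialize resume fields", e);
        }
    }
}
```

Multi-valued fields such as employment history or education history would need nested elements rather than a flat map; that structure is left open here because the posting does not specify it.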

The output of the parser should be an XML-tagged file, one XML file for each parsed resume. The output file name should be the same as the input file name, with only the extension changed (for example, resumefile.xxx becomes resumefile.xml).
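
The naming rule can be sketched in a few lines of Java. The class and method names (OutputNames, xmlOutputFor) are hypothetical; only the rule itself comes from the description.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper illustrating the resumefile.xxx -> resumefile.xml rule.
final class OutputNames {

    /** Returns a sibling path with the last extension replaced by ".xml". */
    static Path xmlOutputFor(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // A leading dot (e.g. ".hidden") is treated as part of the name, not an extension.
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return input.resolveSibling(base + ".xml");
    }

    public static void main(String[] args) {
        System.out.println(xmlOutputFor(Paths.get("resumefile.doc")));
    }
}
```

One consequence worth raising with the employer: under this rule two inputs that differ only by extension (say resumefile.doc and resumefile.pdf) would map to the same output file.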

All of the parsed fields will be uploaded into a MySQL database. The parser is required to do the database insertion as part of the parsing process.
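
A minimal sketch of the insertion step using plain JDBC follows. The table name, the column names and the ResumeDao class are assumptions, since the posting does not describe the database schema, and a Spring project might use JdbcTemplate or JPA instead.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical data-access helper; the schema is not given in the posting.
final class ResumeDao {

    /** Builds a parameterized INSERT; table and column names must come from trusted code. */
    static String insertSql(String table, List<String> columns) {
        String columnList = String.join(", ", columns);
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + columnList + ") VALUES (" + placeholders + ")";
    }

    /** Inserts one parsed resume; values are bound as parameters, never concatenated. */
    static void insert(Connection conn, String table, Map<String, String> fields)
            throws SQLException {
        List<String> columns = new ArrayList<>(fields.keySet());
        try (PreparedStatement ps = conn.prepareStatement(insertSql(table, columns))) {
            for (int i = 0; i < columns.size(); i++) {
                ps.setString(i + 1, fields.get(columns.get(i))); // JDBC indexes start at 1
            }
            ps.executeUpdate();
        }
    }
}
```

Binding the values through PreparedStatement matters here because resume text is untrusted input.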

We will supply a sample set of resumes, as many as you need to be successful.

Resumes are unstructured, so formats and content vary widely. The ability to score the parsing performance would be beneficial. It would be helpful to have a parsing report (i.e., the application should produce a log file) that indicates which resumes the parser thinks it did poorly on, so that we can manually revisit the parsed resumes that have the highest probability of containing parsing errors.
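
One simple way to approximate such a score, offered only as an assumption and not as the employer's method, is the fraction of expected fields that the parser managed to fill. The sketch below computes that score and orders files worst-first for the report; ParseScorer and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical scoring heuristic: share of expected fields that are non-empty.
final class ParseScorer {

    /** Returns a value in [0, 1]; lower means more fields are missing. */
    static double score(Map<String, String> fields, List<String> expectedFields) {
        if (expectedFields.isEmpty()) {
            return 1.0;
        }
        int found = 0;
        for (String name : expectedFields) {
            String value = fields.get(name);
            if (value != null && !value.trim().isEmpty()) {
                found++;
            }
        }
        return (double) found / expectedFields.size();
    }

    /** Orders file names by ascending score so the likeliest failures come first. */
    static List<String> worstFirst(Map<String, Double> scoreByFile) {
        List<String> files = new ArrayList<>(scoreByFile.keySet());
        files.sort((a, b) -> Double.compare(scoreByFile.get(a), scoreByFile.get(b)));
        return files;
    }
}
```

A filled field is not necessarily a correct field, so this heuristic only flags resumes with missing data; a production score would probably also validate formats (email, zip code) and weight fields by importance.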

We need to be able to integrate the web application parser with our existing PHP website.

The application should contain at least 2 main modules:
1. File converter – Each file format will be translated to text format by this module.
2. Parsing engine – This engine should receive a text file and return an XML file.
The separation is needed in order to allow additional file formats in the future.
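
The two-module split could look roughly like this in Java. The interface and class names (FileConverter, ParsingEngine, ConverterRegistry) are assumptions chosen for the sketch; the point is that supporting a new file format only means registering one more converter, while the parsing engine stays unchanged.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Module 1 (hypothetical interface): turns one file format into plain text.
interface FileConverter {
    String toText(byte[] content);
}

// Module 2 (hypothetical interface): turns plain text into the XML output.
interface ParsingEngine {
    String toXml(String text);
}

// Looks up the converter by file extension, case-insensitively.
final class ConverterRegistry {
    private final Map<String, FileConverter> byExtension = new HashMap<>();

    void register(String extension, FileConverter converter) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), converter);
    }

    FileConverter forFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = (dot >= 0) ? fileName.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
        FileConverter converter = byExtension.get(ext);
        if (converter == null) {
            throw new IllegalArgumentException("No converter registered for: " + fileName);
        }
        return converter;
    }
}
```

A plain-text converter can be registered as a one-line lambda, for example `registry.register("txt", bytes -> new String(bytes, java.nio.charset.StandardCharsets.UTF_8))`; converters for Word, PDF or RTF would typically delegate to a document-extraction library.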

Passing acceptance testing with several resumes will be required at project completion.

I expect there will be a lot more questions so feel free to ask.

Skills required:
Data Processing, Java, Research, Software Architecture, XML
About the employer:
Verified


$750 in 7 days
$800 in 10 days
$750 in 5 days (initSoftware)
$1200 in 30 days (Methodz)
$850 in 10 days (oddSchool)
$2000 in 20 days
$1200 in 20 days (apostle13th)
$750 in 0 days
$1500 in 30 days
$1000 in 10 days