The task is to parse and analyze resumes.
Input file: PDF or XML
The program should convert any document to XML format first.
After converting, extract the data and analyze it.
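The extraction step could look roughly like the sketch below. The XML layout (a `resume` root with `position` elements carrying `title`, `start`, and `end` attributes and `item` children) is purely an assumption for illustration, as is the helper name `extract_positions`; the real pre-converted XML may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout, assumed for illustration only.
# The actual converter output may use different tags and attributes.
SAMPLE_XML = """
<resume>
  <position title="Software engineer" start="2000" end="2007">
    <item>wrote hello world program in JAVA</item>
    <item>learned JAVA tutorials</item>
  </position>
</resume>
"""


def extract_positions(xml_text):
    """Return one dict per <position> element found in the resume XML."""
    root = ET.fromstring(xml_text.strip())
    positions = []
    for pos in root.iter("position"):
        positions.append({
            "title": pos.get("title", ""),
            "start": int(pos.get("start")),
            "end": int(pos.get("end")),
            # Each <item> holds one line of the position description.
            "items": [(item.text or "").strip() for item in pos.findall("item")],
        })
    return positions


positions = extract_positions(SAMPLE_XML)
```

Each returned dict keeps the title, the date range, and the individual description lines, so the later analysis can look at what was actually done rather than only at the job title.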
The analysis needs to cover the following:
Let's say the resume received states that the candidate worked as a software engineer.
The program should understand the context and give the probable related experience.
It should also take into account the other positions the candidate has held, and report the related results along with a percentage probability of fitness for the position.
You will need to understand the context of the information to achieve better accuracy; simply matching the requirement keywords against the available data won’t work.
Consider the example:
Requirement: JAVA, 4 years of experience
Software engineer 2000-2007
- wrote hello world program in JAVA
- learned JAVA tutorials
- worked as business analyst
- helped team to achieve better accuracy
- served as receptionist
- helped other people with food
To the human eye, it's pretty evident that this person has used JAVA only for basic tasks. He hasn’t worked with JAVA for more than a month, yet the experience measured in years is more than 4. The algorithm shouldn’t accept this candidate. It would be great if the algorithm also showed why it did not pass this candidate.
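The example above can be turned into a small sketch of the desired output: a decision, a fitness percentage, and the reasons behind it. This is only a toy heuristic under stated assumptions, not the intended algorithm. The cue list `BASIC_CUES`, the weights 0.1 and 1.0, and the names `Position`, `item_weight`, and `assess` are all hypothetical, and the cue matching stands in for the real context analysis (NLP or a trained model) that the task calls for.

```python
from dataclasses import dataclass, field

# Assumed cue phrases that signal shallow exposure to a skill.
# A real system would replace this list with an NLP or ML model.
BASIC_CUES = ("hello world", "tutorial", "learned", "course")


@dataclass
class Position:
    title: str
    start_year: int
    end_year: int
    items: list = field(default_factory=list)


def item_weight(item, skill):
    """Weight one description line: 0 if unrelated, 0.1 if basic, 1 otherwise."""
    text = item.lower()
    if skill.lower() not in text:  # naive substring check, assumption only
        return 0.0
    if any(cue in text for cue in BASIC_CUES):
        return 0.1
    return 1.0


def assess(position, skill, required_years):
    """Estimate effective years of a skill and explain the decision."""
    duration = position.end_year - position.start_year
    weights = [item_weight(i, skill) for i in position.items]
    # Share of the position that counts as real work with the skill.
    share = sum(weights) / len(weights) if weights else 0.0
    effective_years = duration * share
    fitness = min(1.0, effective_years / required_years) * 100
    reasons = []
    for item, weight in zip(position.items, weights):
        if weight == 0.0:
            reasons.append(f"unrelated to {skill}: {item}")
        elif weight < 1.0:
            reasons.append(f"only basic {skill} exposure: {item}")
    return {
        "accepted": effective_years >= required_years,
        "fitness_percent": round(fitness, 1),
        "effective_years": round(effective_years, 2),
        "reasons": reasons,
    }


candidate = Position("Software engineer", 2000, 2007, [
    "wrote hello world program in JAVA",
    "learned JAVA tutorials",
    "worked as business analyst",
    "helped team to achieve better accuracy",
    "served as receptionist",
    "helped other people with food",
])
result = assess(candidate, "JAVA", required_years=4)
```

For the sample candidate, `result["accepted"]` is `False` and the fitness comes out at roughly 6 percent, even though the position spans 7 years. The `reasons` list names each line that was either unrelated to JAVA or only basic exposure, which is the kind of explanation the rejection should come with.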
The aim is to build a very good algorithm, or set of modules, that provides context analysis. So we can skip the conversion from PDF to XML and start directly on the algorithm that provides a better understanding of the resumes.
A sample resume and the pre-converted XML file will be provided.
This may require the use of NLP, summarization tools, context analysis, machine learning, and data science.