I am working on a natural language processing problem involving information retrieval from a number of similar PDF documents. After some research, I have decided to use UIMA for all the NLP work.
The problem is as follows:
I have a number of high court judgements whose text is formatted as follows:
IN THE HIGH COURT OF JUDICATURE AT THELMES
EXTRA ORIGINAL CIVIL JURISDICTION
MAP PETITION NO. 5643 OF 2009
Dr. Jose Costellas
Union of Thelmes and Ors.
MAP PETITION (L) NO. 5628 OF 2009
Dr. Dee Dee Smith
Union of Surrey.
MAP PETITION (L) NO. 5421 OF 2010
St. Williams Education Society & Association
All Thelmes Council For Education.
Mr. V.M. Smith A/d. Ms. Perry V. Tine for the Petitioners.
Mr. Dennis Trudy a/d. Gina Mason A/d. V.P. Gill i/b. Scholm Mason LLP for Respondent 1.
Mr. A.B.. Borris for Respondent 2.
Mr. E.P. Mccotter, Senior Associate A/d. Nancy Parizek i/b. Kay, Windsor & Cohen LLP for Respondent 3.
CORAM : Hon. D.Y. Meier &
27 DECEMBER 2011.
I need the following to be extracted from the text above:
High court: (Thelmes)
Jurisdiction : (Extra original civil)
Petitioner(s): (Dr. Jose Costellas, Dr. Dee Dee Smith, St. Williams Education Society & Association)
Respondent(s): (Union of Thelmes and Ors., Union of Surrey., All Thelmes Council For Education.)
Attorneys for petitioner: (Mr. V.M. Smith A/d. Ms. Perry V. Tine)
Attorneys for Respondents: (Dennis Trudy, Gina Mason, V.P. Gill; A.B.. Borris; E.P. Mccotter, Nancy Parizek)
Law firms involved: Scholm Mason LLP, Kay, Windsor & Cohen LLP
Judges: D.Y. Meier, A.A. Copola
Judgement date: 27 DECEMBER 2011
The number of cases lumped into one judgement can vary greatly. Also, the date format may differ slightly from document to document.
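To illustrate the varying date formats, here is a minimal sketch of how a date line could be normalised without regex, by trying a short list of candidate patterns in turn. The class name `JudgementDateParser` and the pattern list are assumptions for illustration only; the real documents may need other patterns.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: tries several date patterns until one matches. */
class JudgementDateParser {

    // Assumed candidate patterns; extend this list as new formats turn up.
    private static final List<DateTimeFormatter> FORMATS = List.of(
            build("d MMMM uuuu"),   // 27 DECEMBER 2011
            build("d MMM uuuu"),    // 27 Dec 2011
            build("MMMM d, uuuu"),  // December 27, 2011
            build("d/M/uuuu"),      // 27/12/2011
            build("d-M-uuuu"));     // 27-12-2011

    private static DateTimeFormatter build(String pattern) {
        // Case-insensitive so that "DECEMBER" and "December" both parse.
        return new DateTimeFormatterBuilder()
                .parseCaseInsensitive()
                .appendPattern(pattern)
                .toFormatter(Locale.ENGLISH);
    }

    /** Returns the parsed date, or empty if no candidate pattern fits. */
    static Optional<LocalDate> parse(String raw) {
        String text = raw.trim();
        // Drop trailing full stops such as the one in "27 DECEMBER 2011."
        while (text.endsWith(".")) {
            text = text.substring(0, text.length() - 1).trim();
        }
        for (DateTimeFormatter format : FORMATS) {
            try {
                return Optional.of(LocalDate.parse(text, format));
            } catch (DateTimeParseException e) {
                // Not this format; fall through to the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

Calling `JudgementDateParser.parse("27 DECEMBER 2011.")` would give a `LocalDate` that can be stored as a feature on a date annotation. Lines that match none of the patterns come back empty, so they can be logged rather than guessed at.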
I need the deliverable as a UIMA annotator with complete source code. NO REGEX, except perhaps a little to locate certain anchor words.
I should be able to use the UIMA annotator as part of an aggregate analysis engine.
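To make the "anchor words, no regex" idea concrete, here is a rough sketch in plain Java that walks the header lines and uses only `startsWith`, `endsWith`, `contains` and `indexOf`. All class and field names (`JudgementHeader`, `AnchorHeaderParser`) are hypothetical. It also assumes, as in the sample above, that the petitioner and respondent lines directly follow each petition number line. In the actual deliverable this logic would presumably live in the `process(JCas)` method of a `JCasAnnotator_ImplBase` subclass and create annotations instead of filling a plain object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical holder for the extracted fields. */
class JudgementHeader {
    String court;
    String jurisdiction;
    final List<String> petitioners = new ArrayList<>();
    final List<String> respondents = new ArrayList<>();
    final List<String> petitionerCounsel = new ArrayList<>();
    final List<String> respondentCounsel = new ArrayList<>();
    final List<String> lawFirms = new ArrayList<>();
}

/** Hypothetical anchor-word parser; uses plain string operations only. */
class AnchorHeaderParser {
    private static final String COURT_ANCHOR = "IN THE HIGH COURT OF JUDICATURE AT ";
    private static final String JURISDICTION_ANCHOR = " JURISDICTION";

    static JudgementHeader parse(List<String> lines) {
        JudgementHeader h = new JudgementHeader();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            String upper = line.toUpperCase(Locale.ROOT);
            if (upper.startsWith(COURT_ANCHOR)) {
                h.court = line.substring(COURT_ANCHOR.length()).trim();
            } else if (upper.endsWith(JURISDICTION_ANCHOR)) {
                h.jurisdiction = line.substring(0, line.length() - JURISDICTION_ANCHOR.length()).trim();
            } else if (upper.contains("PETITION") && upper.contains(" NO. ") && i + 2 < lines.size()) {
                // Assumption: the two lines after a petition number are
                // the petitioner and the respondent, in that order.
                h.petitioners.add(lines.get(i + 1).trim());
                h.respondents.add(lines.get(i + 2).trim());
                i += 2;
            } else if (upper.contains(" FOR THE PETITIONER")) {
                addCounsel(line, h.petitionerCounsel, h.lawFirms);
            } else if (upper.contains(" FOR RESPONDENT")) {
                addCounsel(line, h.respondentCounsel, h.lawFirms);
            }
        }
        return h;
    }

    /** Splits "X a/d. Y i/b. Firm for ..." into counsel names and a firm. */
    private static void addCounsel(String line, List<String> counsel, List<String> firms) {
        String names = line.substring(0, line.toLowerCase(Locale.ROOT).lastIndexOf(" for "));
        String firmAnchor = " i/b. ";
        int firmAt = names.toLowerCase(Locale.ROOT).indexOf(firmAnchor);
        if (firmAt >= 0) {
            firms.add(names.substring(firmAt + firmAnchor.length()).trim());
            names = names.substring(0, firmAt);
        }
        counsel.addAll(splitOn(names, " a/d. "));
    }

    /** Case-insensitive split on a literal separator, without regex. */
    static List<String> splitOn(String text, String separator) {
        List<String> parts = new ArrayList<>();
        String lower = text.toLowerCase(Locale.ROOT);
        String sep = separator.toLowerCase(Locale.ROOT);
        int start = 0;
        for (int at = lower.indexOf(sep); at >= 0; at = lower.indexOf(sep, start)) {
            parts.add(text.substring(start, at).trim());
            start = at + sep.length();
        }
        parts.add(text.substring(start).trim());
        return parts;
    }
}
```

The sketch keeps titles such as "Mr." and role suffixes such as ", Senior Associate" attached to the names, and it leaves the CORAM and date lines alone. Stripping those and handling judges would be further steps in the same style.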
I am a Java programmer with knowledge of artificial intelligence and algorithm design. I have built many proofs of concept (POCs) in my earlier work. I think I can perform this task with a high degree of proficiency.