Please see the attached PDF files.
I need a program, with complete source code, that reads PDF files and writes the extracted text to a text file.
The text in the output text files must appear exactly as it appears in the PDF files.
Please see the attached 'More Information to PDF Reader Program.txt' file. It explains how the PDF program has to run.
The PDFBox and iText libraries can read PDF files. You may also use any other language or library you are good at.
Information about the PDF files:
1) They contain characters in English and Kannada.
2) Kannada is an Indian vernacular language.
3) The Kannada characters are in the Unicode range U+0C80 to U+0CFF.
4) The PDF files are Identity-H encoded.
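
Because Identity-H encoded fonts can extract as garbage when the glyph-to-Unicode mapping is missing or incomplete, a quick sanity check on the extracted text may be useful before comparing it with the PDF by eye. The following is only a minimal sketch in Python using the standard library: the function names classify_char and script_summary are made up for illustration, and Kannada characters are recognised by their Unicode character names rather than by a hard-coded range. It does not read the PDF itself; it only inspects text that some library has already extracted.

```python
import unicodedata
from collections import Counter


def classify_char(ch: str) -> str:
    """Return 'replacement', 'ascii', 'kannada', or 'other' for one character."""
    if ch == "\ufffd":
        # U+FFFD usually means a glyph could not be mapped back to Unicode.
        return "replacement"
    if ch.isascii():
        return "ascii"
    # Assigned Kannada code points all have names starting with "KANNADA".
    if unicodedata.name(ch, "").startswith("KANNADA"):
        return "kannada"
    return "other"


def script_summary(text: str) -> Counter:
    """Count characters per category, ignoring whitespace."""
    return Counter(classify_char(ch) for ch in text if not ch.isspace())


if __name__ == "__main__":
    # Hypothetical extracted line: an English label followed by a Kannada word.
    sample = "Name: \u0c95\u0ca8\u0ccd\u0ca8\u0ca1"
    print(script_summary(sample))
```

A large 'other' or 'replacement' count on a page that should contain only English and Kannada would suggest that the extraction is not mapping the Identity-H glyphs correctly.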
At the following link, you can also look up the Unicode code points for the Kannada characters.
Please open the drop-down list and select Kannada.
Please send or post me sample output, so I can see whether the program extracts the text in the correct order.
For example: out of 30 pages, extract the information from page 1 and page 3 and organize it in the correct order, as described in the 'More information to pdf reader program.txt' file. I will let you know whether the order is correct or needs any modification. I will then award you the project.
I am looking for complete code and instructions on how to run it.
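
As a rough illustration of what such a sample could look like, here is a small Python sketch that writes the text of selected pages to a UTF-8 text file. Everything in it is an assumption for illustration: write_sample is a made-up name, extract_page_text stands for whatever PDF library call actually returns the text of a page, and the "===== Page N =====" separator is only a placeholder, since the required layout is defined in the attached text file.

```python
from pathlib import Path
from typing import Callable, Iterable


def write_sample(
    extract_page_text: Callable[[int], str],  # hypothetical: 1-based page number -> text
    pages: Iterable[int],
    out_path: str,
) -> None:
    """Write the text of the requested pages, in the given order, to a text file."""
    parts = []
    for page_no in pages:
        # Placeholder separator; the real layout comes from the requirements file.
        parts.append(f"===== Page {page_no} =====\n{extract_page_text(page_no)}\n")
    # UTF-8 keeps both the English and the Kannada characters intact.
    Path(out_path).write_text("".join(parts), encoding="utf-8")
```

Calling write_sample(extractor, [1, 3], "sample_output.txt") with a real extractor would produce a file covering only page 1 and page 3, which can then be compared against the PDF.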
Please let me know if you have any questions.