Please see attached PDF files.
I need a program, with complete source code, that reads PDF files and writes their text to a text file.
The text in the text file must appear exactly as it appears in the PDF files.
Please see the attached 'More Information to PDF Reader [url removed, login to view]' file, which explains how the program must run.
The PDFBox and iText libraries can read PDF files. You may also use any other language or library you are comfortable with.
Information about the PDF files:
1) They contain characters in English and Kannada.
2) Kannada is an Indian vernacular language.
3) The Kannada characters fall in the Unicode range U+0C80 to U+0CFF.
4) The PDF files are Identity-H encoded.
At the following link, you can also look up the Unicode code points for Kannada characters.
[url removed, login to view]
Please open the drop-down list and select Kannada.
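To make the character requirement concrete, here is a minimal sketch of how extracted text could be checked for Kannada versus English characters. It assumes the Kannada block of the Unicode Standard (U+0C80 to U+0CFF) and that the text has already been extracted by some PDF library. The names classify_char and script_counts are hypothetical helpers, not part of PDFBox or iText.

```python
from collections import Counter

# Kannada block as defined by the Unicode Standard (assumption stated above).
KANNADA_FIRST = 0x0C80
KANNADA_LAST = 0x0CFF


def classify_char(ch: str) -> str:
    """Return 'kannada', 'ascii', or 'other' for a single character."""
    code = ord(ch)
    if KANNADA_FIRST <= code <= KANNADA_LAST:
        return "kannada"
    if code < 0x80:
        return "ascii"
    return "other"


def script_counts(text: str) -> Counter:
    """Count non-whitespace characters per script category."""
    return Counter(classify_char(ch) for ch in text if not ch.isspace())


if __name__ == "__main__":
    # Hypothetical extracted line: an English label followed by a Kannada word.
    sample = "Name: \u0C95\u0CA8\u0CCD\u0CA8\u0CA1"
    print(script_counts(sample))
```

A high count in the 'other' category would suggest that the extractor returned raw glyph IDs instead of Unicode text, which is a common symptom with Identity-H encoded fonts that lack a usable ToUnicode map.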
Please send or post a sample output so I can see whether the program extracts the text in the correct order.
For example: out of 30 pages, extract the information from page 1 and page 3 and organize it in the correct order, as described in the 'More information to pdf reader [url removed, login to view]' file. I will let you know whether the order is correct or needs any modification. I will then award you the project.
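A sample output of this kind could be produced roughly as follows. This is only a sketch: it assumes the per-page text has already been extracted into a list of strings by whichever PDF library is used, and write_selected_pages, the page marker lines, and the output file name are all hypothetical. The ordering rules from the attached file are not reproduced here.

```python
from pathlib import Path


def write_selected_pages(page_texts, page_numbers, out_path):
    """Write the chosen pages (1-based) to a UTF-8 text file, in the order requested."""
    parts = []
    for number in page_numbers:
        if not 1 <= number <= len(page_texts):
            raise ValueError(f"page {number} is out of range")
        # The marker line is an illustrative convention, not a required format.
        parts.append(f"--- page {number} ---\n{page_texts[number - 1]}\n")
    # UTF-8 keeps both the English and the Kannada characters intact.
    Path(out_path).write_text("".join(parts), encoding="utf-8")


if __name__ == "__main__":
    # Placeholder page texts; a real run would get these from the PDF library.
    pages = ["first page", "second page", "\u0C95\u0CA8\u0CCD\u0CA8\u0CA1"]
    write_selected_pages(pages, [1, 3], "sample_output.txt")
```

Writing the file explicitly as UTF-8 matters here, because a platform default encoding may not be able to represent Kannada characters.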
I am looking for complete code and instructions on how to run it.
Please let me know if you have any questions.
Please see your PM box for a demo program I have already written that extracts PDF text. This may be a very easy project for me to complete for you. Thanks for your consideration!