YOU MUST HAVE THIS DONE ALREADY and READY TO GO!!!
We need this yesterday!
Most Important: We are trying to extract text from various file types such as DOC, PDF, PPT (notes pages), XLS, and others for use on our website.
Secondary Importance: We need to translate English text or RTF into other languages, such as Spanish and French.
Can you do the following:
Provide us with source code for independently operating functions that will extract the data (text) from PDF, DOC, PPT, and XLS files. We should focus on the 1997-2003 versions of the Office formats because, although the newest Office is backward-compatible, the newer file formats themselves may not be.
Here is our scenario:
User uploads a file (DOC, PDF, PPT, XLS… possibly an ebook) -> our file handler recognizes the file type -> the proper function is called to extract the text -> the text is sent to a text box for editing.
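To make the scenario concrete, here is a rough sketch of the "recognize file -> call the proper function" step. Python is only an assumption for illustration; we have not fixed a language. The names `detect_kind`, `register`, `extract_text`, and `EXTRACTORS` are hypothetical, and the real extraction functions are what we are asking you to supply. The sketch relies on two well-known signatures: PDF files begin with `%PDF-`, and the 1997-2003 Office formats (DOC, PPT, XLS) share the same OLE2 compound-file header, so the file extension is used to tell those three apart.

```python
from pathlib import Path
from typing import Callable, Dict

PDF_MAGIC = b"%PDF-"
# OLE2 compound-file header shared by 1997-2003 DOC, PPT, and XLS files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

# Hypothetical registry: file kind -> function that returns the extracted text.
EXTRACTORS: Dict[str, Callable[[str], str]] = {}


def detect_kind(path: str) -> str:
    """Return 'pdf', 'doc', 'ppt', 'xls', or 'unknown' for the given file."""
    p = Path(path)
    with p.open("rb") as fh:
        head = fh.read(8)
    ext = p.suffix.lower().lstrip(".")
    if head.startswith(PDF_MAGIC):
        return "pdf"
    # The OLE2 header alone cannot distinguish DOC from PPT from XLS,
    # so this sketch falls back on the extension for that decision.
    if head == OLE2_MAGIC and ext in ("doc", "ppt", "xls"):
        return ext
    return "unknown"


def register(kind: str):
    """Decorator that plugs an independent extractor function into the registry."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        EXTRACTORS[kind] = fn
        return fn
    return wrap


def extract_text(path: str) -> str:
    """Recognize the file, call the matching extractor, and return its text."""
    kind = detect_kind(path)
    extractor = EXTRACTORS.get(kind)
    if extractor is None:
        raise ValueError(f"no extractor registered for {kind!r} file: {path}")
    return extractor(path)
```

Each delivered function (one per format) would be registered under its kind, for example with `@register("pdf")`, and the upload handler would pass the result of `extract_text(path)` to the text box. Keeping the extractors behind a registry like this is one way to let each function operate independently of the others.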
Eventually we would like to incorporate OCR for protected sources or images such as PDF, XPS, TIF… but for now, just standard PDF, DOC, PPT, and XLS are necessary.
If you already have the source code ready to go, this is a chance to earn some extra money from it.
We ARE NOT looking to resell the code; we need it as a component of our comprehensive website.