This is a unique project for someone with strong expertise in audio segmentation and related areas.
The goal of this project is to have a program that can analyze a 15-second audio file (mp3) and extract the numbers that are spoken aloud in it. I want a string returned containing the numbers that were spoken in the audio. Processing must be relatively fast and reasonably accurate.
The task is tricky because there is background noise and garbled speech throughout the audio.
One thing that may help, however, is that the set of "voices" speaking the numbers doesn't vary greatly.
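To make the expected output concrete, here is a rough sketch of the final step only: turning recognized words into the returned string. It is not a solution to the recognition problem itself. It assumes some earlier stage has already produced a text transcript, and that the numbers are spoken one digit at a time in English; the names `DIGIT_WORDS` and `extract_digits` are made up for illustration.

```python
import re

# Hypothetical vocabulary: assumes single spoken digits in English.
# "oh" is treated as zero, which is an assumption about the speakers.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def extract_digits(transcript: str) -> str:
    """Return the digits found in a transcript as one string.

    Tokens that are not digit words or literal digits (noise markers,
    filler words, garbled fragments) are dropped.
    """
    # Split into lowercase words and individual literal digits.
    tokens = re.findall(r"[a-z]+|\d", transcript.lower())
    digits = []
    for token in tokens:
        if token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
    return "".join(digits)


if __name__ == "__main__":
    print(extract_digits("uh four, seven [noise] oh NINE"))
```

The point is just that the program should return something like a plain string of digits, with noise and filler words left out.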
I've attached samples of the audio.
If you think you can do it, please bid and I will show you where to get more audio for testing. In your bid, I also want to know:
1. What language will you implement it in?
2. What accuracy can you deliver?
3. How long will it take to process each audio clip?
This should give you enough information to make estimates and bid. I will provide more details in private.