
In Progress
We have already developed and fully tested a Windows C# Advanced Speech Analysis (ASA) engine that includes:

- Offline Speech-to-Text (Whisper-based)
- Phoneme-level analysis (Wav2Vec2 CTC)
- Pronunciation scoring (CTC forced alignment)
- Accent strength scoring (WavLM embedding model)
- Fully working evaluation pipeline

The system works correctly on Windows using C# + AI models.

We now require an experienced mobile developer to convert this engine into a production-ready Unity SDK for Android and iOS. This includes native model execution and Unity integration.

Scope of Work

The developer will:

Create Native Mobile Inference Layer

Android
- Build native .so library (NDK)
- Integrate:
  - ONNX Runtime (for phoneme model)
  - [login to view URL] (for STT)
  - TorchScript WavLM model (accent)
- ARM64 support required

iOS
- Build static .a library or XCFramework
- CoreML optional (if beneficial)
- ARM64 support required

Model Integration

We will provide:
- Whisper model (ggml format)
- Phoneme CTC model (ONNX INT8)
- Accent model (TorchScript)
- Reference embedding bank
- Vocabulary file

Developer must:
- Load models efficiently
- Run inference offline
- Optimize memory usage
- Ensure stable performance

Unity SDK Wrapper

Create a clean Unity C# interface.

Example API:

    ASAResult Analyze(
        float[] audioPCM16k,
        string expectedWord,
        string expectedIPA
    );

Return structured result:

    {
      "stt_text": "volleyball",
      "stt_match": true,
      "pronunciation_score": 82,
      "accent_score": 74,
      "phonemes": [
        { "ipa": "v", "score": 90 },
        { "ipa": "ɒ", "score": 65 }
      ]
    }

Unity plugin must support:
- Android (AAR)
- iOS (Xcode project / xcframework)
- Thread-safe inference
- Non-blocking calls

Functional Requirements

1. Offline Only
- No cloud calls
- No external APIs

2. Performance Targets
- Model load time: < 2 seconds
- Per utterance (1–3 sec audio): total latency < 1.5 seconds
- Memory under reasonable mobile limits

3. Audio Input
- 16kHz mono PCM float
- Silence trimming (VAD optional but recommended)

4. Scoring Logic (Already Designed)

Pipeline:
- Whisper STT → check expected word
- If match → run phoneme forced alignment
- Compute per‑phoneme scores
- Compute overall pronunciation score
- Compute accent similarity score
- Return structured result

Developer does NOT need to redesign scoring logic.

Deliverables
- Android .so library
- iOS .a or .xcframework
- Unity wrapper
- Example Unity demo scene
- Build documentation
- Performance test report

Required Skills
- C++ (strong)
- Android NDK
- iOS native development
- Unity native plugin development
- ONNX Runtime mobile
- Experience with [login to view URL]
- Experience with TorchScript / LibTorch
- Audio DSP basics

Preferred:
- Experience with on-device ML optimization
- Experience with quantized models

Budget & Timeline
- Timeline: 2 weeks
- Fixed price or milestone-based
- Must deliver stable SDK, not prototype
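For bidders, the control flow of the scoring pipeline described above could be sketched on the native side roughly as follows. This is only an illustration under stated assumptions: the `Models` hooks, the `normalize` helper, and the `analyze` function are hypothetical names, the mean used for the overall pronunciation score is a placeholder (the actual scoring logic is already designed and is not shown in this posting), and skipping the accent step on an STT mismatch is one reading of the pipeline, not a confirmed rule.

```cpp
#include <cctype>
#include <functional>
#include <string>
#include <vector>

// Field names mirror the JSON result shown in the posting.
struct PhonemeScore {
    std::string ipa;
    int score;
};

struct ASAResult {
    std::string stt_text;
    bool stt_match = false;
    int pronunciation_score = 0;
    int accent_score = 0;
    std::vector<PhonemeScore> phonemes;
};

// Hypothetical hooks standing in for the three model back ends
// (STT, phoneme forced alignment, accent embedding). Real inference
// code would be plugged in behind these.
struct Models {
    std::function<std::string(const std::vector<float>&)> transcribe;
    std::function<std::vector<PhonemeScore>(const std::vector<float>&,
                                            const std::string&)> align;
    std::function<int(const std::vector<float>&)> accent;
};

// Assumed matching rule: compare lowercased ASCII letters and digits only,
// so "Volleyball." matches "volleyball".
static std::string normalize(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        if (std::isalnum(c)) out.push_back(static_cast<char>(std::tolower(c)));
    }
    return out;
}

ASAResult analyze(const Models& m,
                  const std::vector<float>& pcm16k,
                  const std::string& expectedWord,
                  const std::string& expectedIPA) {
    ASAResult r;
    r.stt_text = m.transcribe(pcm16k);
    r.stt_match = normalize(r.stt_text) == normalize(expectedWord);
    if (!r.stt_match) return r;  // assumption: later stages run only on a match

    r.phonemes = m.align(pcm16k, expectedIPA);
    int sum = 0;
    for (const PhonemeScore& p : r.phonemes) sum += p.score;
    if (!r.phonemes.empty()) {
        // Placeholder aggregate: integer mean of the per-phoneme scores.
        r.pronunciation_score = sum / static_cast<int>(r.phonemes.size());
    }
    r.accent_score = m.accent(pcm16k);
    return r;
}
```

Keeping the models behind injected callbacks lets the flow be unit-tested with stubs on a desktop before any mobile inference runtime is linked in.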
Project ID: 40256612
13 proposals
Remote project
Active 1 mo ago

Greetings! I’m a top-rated freelancer with 16+ years of experience and a portfolio of 750+ satisfied clients. I specialize in delivering high-quality, professional Unity offline voice ask integration services tailored to your unique needs. Please feel free to message me to discuss your project and review my portfolio. I’d love to help bring your ideas to life! Looking forward to collaborating with you! Best regards, Revival
$2 USD in 40 days
0.0
13 freelancers are bidding on average $17 USD/hour for this job

Hello there, good morning! I am an expert mobile engineer with skills including Android, Mobile App Development, Unity 3D, iOS Development, iPhone, AI Development, Unity, Documentation, Audio Processing and Game Development. Please contact me to discuss this project further. Thank you
$50 USD in 22 days
1.8

Hello, I'm a Unity developer with over 10 years of experience in mobile SDK integration and AI models. We'll discuss the details in a chat. I understand the importance of converting your tested C# Advanced Speech Analysis engine into a production-ready Unity SDK for Android and iOS. I will ensure seamless integration for both platforms and optimize performance according to your targets. For implementation, I propose two approaches. Option A: Build the Android .so library using NDK and integrate ONNX Runtime for phoneme models, providing efficient model loading and offline inference. Option B: Create an iOS .a or XCFramework with optional CoreML support, ensuring stable performance and memory optimization. Which method do you prefer? I will deliver a clean Unity C# interface, example API, demo scene, and performance reports, ensuring everything aligns with your functional requirements. I am excited to create a professional and functional SDK that meets your expectations. Best, Yurii.
$20 USD in 49 days
1.5

Las Vegas, United States
Payment method verified
Member since Feb 5, 2026
$25-50 USD / hour