    1,995 nvda tts jobs found
    Python Real-Time Voice Agent
    6 days left
    Verified

    ...enabled); Claude’s text comes back out through ElevenLabs Flash v2.5 for TTS and streams to the caller in real time. What I need from you is a working reference implementation, instrumented and tuned so I can see latency at each hop and fine-tune Voice Activity Detection thresholds. The code must retry transient errors, log everything that matters, notify the caller gracefully on trouble, and, if the Sonnet tier fails mid-call, fall back to the Haiku model without dropping the line. Deliverables • Docker-ready Python project with clear README • End-to-end demo showing p50 latency < 2.5 s on a five-minute call • VAD parameters exposed via config file or CLI flag • Prometheus/Grafana-friendly metrics for STT, LLM, TTS, and network hops ...

    $509 Average bid
    174 bids

    ... integrations, and making the system fully functional. ⸻ Key Features Needing Implementation 1. Call / Communication System (Most Critical) * Outbound call creation via provider (Twilio / Plivo / similar) * Caller ID configuration using custom caller ID spoofing and real-time call control features * Real-time call control (answer, hangup, DTMF input handling) * Multi-language voice support (TTS integration) * Call recording storage & retrieval * Call flow / verification workflows (menu-based, user input, etc.) * Optional SMS integration (pre/post notifications) ⸻ 2. Real-Time Monitoring * Live call status updates (ringing, answered, ended) * WebSocket-based real-time dashboard updates * Admin ability to monitor active sessions/calls ⸻ 3. Payment System * Subscriptio...

    $521 Average bid
    122 bids

    ...Reels, or TikToks you've edited — TTS-style content is a strong plus Genuine understanding of retention editing (not just "making cuts") Reliable daily availability — this is a real workflow, not a side gig Clear English communication (Spanish a bonus but not required) What we offer: 5 USD per Short, flat rate With 6 videos/day × 6 days = 180 USD/week, paid through the platform Guaranteed volume from week one: no searching for the next gig Clear briefs, no creative guesswork — we know exactly what works Paid test video at full rate before we commit to a long-term engagement Long-term stability: our current editors have been with us for over a year How to apply: Send a short reply including: 3 links to Shorts, Reels, or TikToks you'v...

    $909 Average bid
    48 bids

    ...After Effects are second nature to you (both, not one) Your portfolio has Shorts, Reels, or TikToks — TTS-style content is gold You actually understand why retention editing works, not just how to cut You show up every day. This is a real workflow, not a side quest. Your written English is solid (Spanish is cool but not needed) Here's the deal: 5 USD per Short, flat 180 USD every week, paid through the platform Volume is guaranteed from day one — no hunting for the next gig Paid test video at full rate before we commit. No free work, ever. Stick around and it stays stable. Our current editors have been with us over a year. How to apply — read this part: Send 3 links to Shorts you've edited (TTS or viral-style wins) Confirm you can handle 6 v...

    $869 Average bid
    50 bids

    ...Poland and 3CX * Built initial dialplan logic for the 222 prefix routing Remaining work: * Fix SIP authentication: Outbound calls aren’t authenticating with the Polish provider (no CDR records), so this needs troubleshooting * 3CX integration & Caller ID: Finalize the bridge so 3CX accepts transfers and preserves the original caller ID * Campaign setup: Create the Press-1 survey campaign, upload TTS prompt, and configure call menu routing to 3CX * Data import: Prepare and upload the CSV list with duplicate checking enabled...

    $169 Average bid
    66 bids

    ...scalability and reliability for large call volumes. 3. Scope of Work • System Setup: o Deploy open source dialer software (e.g., Asterisk, FreeSWITCH, VICIdial, GoAutoDial) or propose alternatives. o Integrate with SIP trunks/VoIP providers. • Features Required: o Upload and manage contact lists (CSV/Excel). o Schedule and batch calls to avoid congestion. o Play greeting messages (audio file or AI TTS). o Real time dashboard for call status (answered, busy, failed). o Reporting and analytics (delivery success, call duration). o Opt out/Do Not Disturb compliance features. • User Experience: o Web based interface, mobile friendly. o Simple controls for non technical users. o Secure login and role based access. 4. Deliverables • Fully functional dialer system dep...

    $260 Average bid
    43 bids

    ...experienced (Flutter + AI + UX) to develop , an inclusive educational app for children (ASD, NDD, neurotypical). Objective Create an iOS and Android mobile app published on the stores (App Store + Google Play) from the first version (MVP), multilingual (10 languages), with: Adaptive AI (level, pace, difficulty) Montessori-style educational games AAC module (pictograms + TTS) Partial offline mode (~60%) Dashboard + PDF export Accessibility (GDPR, COPPA) Required tech stack Flutter (frontend) Firebase (backend) Python + TensorFlow Lite (AI) Stripe + In-App Purchases (mobile payments) Expected deliverables Complete Figma UI/UX Functional MVP Integration I...

    $2743 Average bid
    118 bids

    ...want the whole back-office piece to run itself. Here’s what has to happen after the call: the agent triggers a Zapier workflow that writes the appointment details straight into a Google Sheet. From there the sheet should keep itself current—creating new rows, updating existing ones, and generally handling the repetitive admin without me touching it. If you’ve wired ElevenLabs (or a comparable TTS/voice platform) into a scheduling flow before, and you know your way around Zapier filters, webhooks, and Google Sheets automations, you’ll feel right at home. I’ll provide the call script, access to my ElevenLabs account, and the specific column layout for the sheet; you bring the integration logic, any helper code, and a quick round of testing to prove e...

    $15 / hr Average bid
    127 bids

    ...pipeline: microphone input → STT → LLM (OpenAI / Grok / similar) → TTS → speaker output - Add half-duplex audio gating (mute mic while speaking to prevent feedback in plush) - Add simple memory toggle (on/off) - Optimize for low latency (target 1.2–1.8 seconds average) - Basic logging and error handling - Clean, well-documented code with setup instructions Already completed: - All hardware is physically assembled and should just need to be coded (I'm happy to buy any other necessary components) - Power system, speaker, and microphones are wired and working - You will only be working on software/firmware Required skills: - Strong Raspberry Pi experience (Zero 2 W preferred) - Python + Linux audio handling - Experience with STT / TTS / LLM pi...

    $2461 Average bid
    Featured Urgent
    40 bids

    ...back-and-forth so users never feel they are starting over. What the bot must handle • Course information, timetables, and key dates • Event details and real-time updates • A searchable staff directory with contact methods • Personal student info that sits behind login (e.g., enrollment status, fees) Essential capabilities • Accept text, convert voice to text (and optionally reply with TTS), and extract text from images via OCR so a photo of a timetable or poster is enough to trigger the right answer. • Natural-language understanding fine-tuned for academic terminology. • A memory layer that lets the conversation stay coherent over multiple interactions, whether a student returns an hour or a week later. • Clear fallback or...

    $104 Average bid
    15 bids

    ...typical health apps You’ll receive: – the approved voice-over script – app UI screen recordings and logos – a short mood board that shows colour palette, typography and the upbeat track we’ve licensed What I need back: 1. A polished vertical reel (1080 × 1920 MP4) optimised for Instagram Reels, TikTok and YouTube Shorts. 2. Hard-burned captions for silent autoplay, plus a voice-over (TTS). 3. Light motion graphics or kinetic text to reinforce each key benefit. 4. A punchy call-to-action ending on the App Store / Google Play badges. Please blend stock or original footage where it strengthens the story, keeping the tone friendly and inclusive. Clean, tight pacing is crucial; around 60 seconds works well, but I’m open...

    $12 Average bid
    4 bids

    I want to launch a conversational practice bot that feels like a friendly language partner rather than a rigid tutor. The core flow starts with the bot itself proposing everyday scenarios—ordering a cup of chai at a Connaught Place café, asking an auto-r...pipeline in Python or Node.js—so long as: • the scenario engine can expand easily with new role-plays, • Hinglish code-switch detection is accurate enough to nudge, not scold, • quizzes follow a true SRS algorithm (SM-2 or similar), and • voice and text remain in sync across web and mobile. Please include a brief outline of your proposed architecture, any pretrained language models or Hindi ASR/TTS services you would leverage, and a timeline for an MVP that covers at least three role-...

    $53 / hr Average bid
    85 bids

    ...link option is not available for your conversation, take screenshots of the full transcript (scrolling to capture all turns), upload them to a Google Drive folder, and paste the Drive folder link instead. Step 5: Record and Upload the Voice Conversation Since the conversation link only captures the transcript, we also need the actual voice recording to evaluate audio quality (pronunciation, accent, TTS). Follow these steps: 1. Before starting the Gemini Live conversation, set up a recording on a separate device: - Option A: Second device: Use another phone or laptop to record the audio from your primary device's speaker. - Option B: Google Meet: Start a Google Meet on a laptop (just yourself), turn on recording, then place your phone near the laptop mic and ...

    $3 - $4 / hr
    Local
    0 bids

    PROJECT: Voice Input Integration + TTS Narration Refinement This is a continuation of an existing project. The system is a King James Bible concordance engine that: * accepts user queries (currently via text) * retrieves scripture based on Cruden’s concordance logic * returns results in text and audio (TTS) --- OBJECTIVE Enhance the system by: 1. Adding voice input (speech-to-text) 2. Improving the quality of the voice output (TTS narration) --- 1. VOICE INPUT (PRIMARY FEATURE) Goal: Enable users to speak their query instead of typing. Requirements: * Add a microphone button next to the input field * When clicked, the system should: * activate browser-based speech recognition * capture the user’s speech * convert speech to text * popula...

    $31 Average bid
    40 bids

    ...seamlessly on iOS, Android, and the web. The assistant must handle both voice and text interactions fluidly: think high-quality speech recognition, natural-sounding TTS, and a chat interface that keeps context across devices. You’ll architect the back-end intelligence and then wire it to native mobile front-ends plus a responsive web app. Key deliverables • A scalable back-end (Python, Node, or comparable) exposing clean APIs for voice/text queries • Mobile clients for iOS (Swift/SwiftUI or React Native) and Android (Kotlin or React Native) plus a browser-based interface • On-device wake-word or tap-to-talk, real-time STT, and TTS • Secure user authentication and encrypted data sync across platforms • Deployment scripts and brief te...

    $2161 Average bid
    63 bids

    ...retrieves exact King James Bible verses (no AI interpretation) * returns results in order (Genesis → Revelation) * converts results into speech (TTS) The goal of this phase is to create a **simple web interface** so that non-technical users can use the system easily. --- CORE FUNCTIONALITY The interface should allow a user to: 1. Enter a question Example: “What does the Bible say about trust?” 2. Click a search button 3. See results displayed clearly: * Book * Chapter * Verse * Full verse text * Ordered from Genesis → Revelation 4. Click a “Play Audio” button: * This should read all returned verses aloud * Use the existing TTS functionality * Maintain correct format: “Genesis 2:24 – ...

    $55 Average bid
    70 bids

    ...accessibility specialist located in India to perform a complete accessibility audit of my website. The scope includes a deep dive into the site’s content and structure, its overall navigation and usability, and its adherence to recognized accessibility standards. You will combine automated testing (e.g., axe, WAVE) with thorough manual reviews, keyboard-only checks, and screen-reader sessions (JAWS/NVDA). I expect clear, actionable insights rather than generic scan results. Deliverables • A detailed audit report that maps every issue to the relevant WCAG 2.1 success criterion and Indian accessibility guidelines. • Severity ratings and practical remediation recommendations, prioritised so my dev team can tackle quick wins first. • Annotated screenshots o...

    $310 Average bid
    7 bids

    ...then the line is terminated. The bot should immediately send me a Telegram notification that includes the called number and a timestamp confirming they pressed 1. Anyone who does not press 1 or does not answer can simply be logged as “no response”. • I need the freedom to swap out both audio prompts at any time without touching the code—either by uploading new audio files or by pasting a fresh TTS script that the system converts on the fly. • I do not need calendaring or time-of-day scheduling; manual launch is fine. Tech is up to you, but Twilio, Vonage, Plivo or a similar voice API that supports parallel outbound calls and DTMF detection will likely fit best. Please build the Telegram side with the official Bot API so I can host everything on my...

    $405 Average bid
    89 bids

    ...link option is not available for your conversation, take screenshots of the full transcript (scrolling to capture all turns), upload them to a Google Drive folder, and paste the Drive folder link instead. Step 5: Record and Upload the Voice Conversation Since the conversation link only captures the transcript, we also need the actual voice recording to evaluate audio quality (pronunciation, accent, TTS). Follow these steps: 1. Before starting the Gemini Live conversation, set up a recording on a separate device: - Option A: Second device: Use another phone or laptop to record the audio from your primary device's speaker. - Option B: Google Meet: Start a Google Meet on a laptop (just yourself), turn on recording, then place your phone near the laptop mic and ...

    $15 / hr Average bid
    6 bids

    ...can interpret both typed and spoken input, decide whether to answer directly or trigger a search, and respond conversationally. • A responsive GUI in PyQt that works seamlessly on desktop browsers through WebAssembly or a comparable bridge, with a native desktop build for Windows/macOS/Linux. What I’ll look for in your delivery 1. Clean, well-documented Python code (PyQt, speech-recognition, TTS, NLP libraries of your choice). 2. A modular design that lets me later plug in deeper contextual search or mobile support without refactoring the core. 3. Setup scripts or Docker files so I can spin the project up locally and deploy to a web host. 4. A brief user guide plus inline developer comments. If you have previous examples of voice-enabled assistants, chatbot...

    $6 / hr Average bid
    29 bids

    ...text-to-speech (TTS) layer so that the returned Bible verses can be read aloud. Scope * Use the verses returned from the existing system * Convert the verse text into audio using a TTS API * Ensure the spoken output is clear, natural, and easy to listen to Requirements * Must read all returned verses in sequence * Must maintain correct Biblical order (Genesis → Revelation) * Must use a clean reading format, for example: “John 3:16 – [verse text]” * Must use a high-quality natural-sounding voice (not robotic) * Implementation must remain simple and backend-focused (no UI required) Important * No new GPT logic is required in this phase * No frontend or app development is required * Focus only on converting the existing verse results into speech ...

    $53 Average bid
    34 bids

    ...milliseconds, so please apply suitable indexing or search extensions where appropriate. • Simple test interface – A lightweight REST endpoint or command-line script that accepts a query string and returns matching verses in canonical Bible order, with optional pagination. • Documentation – Clear notes on the schema, the import process, and example queries so I can plug this layer into the next GPT/TTS stages without guesswork. Acceptance criteria: * The full Bible imports without errors * Sample searches for common words (e.g. “faith”, “grace”, “love”, “money”) return correct verses instantly * Queries return verses in canonical Bible order * The codebase runs locally with a simple readme command No UI, AI...

    $85 Average bid
    60 bids

    I’ve already taken my voice bot from concept to a fully working prototype, but two issues still stand in the way of a smooth experience: barge-in handling and an audible delay during audio playback. The speech recognition and response generation return almost instantly; the pause only appears between receiving the TTS stream and hearing it, and I’m currently unsure whether that delay remains constant or drifts. Because the codebase contains sensitive material I can’t share, I’m looking for someone who can jump on a live session (Zoom, Google Meet, or a tool of your choice), inspect logs and metrics on my machine in real time, and walk me through the quickest path to a fix. You should be comfortable profiling audio pipelines, tuning buffering, and advising on...

    $16 - $134
    0 bids

    ...browser-based product video generator installed on my server. 2. Source code and build/readme files. 3. A 15-second sample video for the link above, using the chosen style (live-action with overlays, informative voice-over, feature-focused). I need to review this demo before awarding the final milestone. Let me know time frame, libraries you plan to use, and any licensing needs for stock clips or TTS voices....

    $269 Average bid
    74 bids

    ...and follow-up questions are enough. Think of it as a smart SDR that can handle objections, schedule a meeting, or tag a lead as uninterested without drifting off-script. Minimum acceptance criteria • Secure uploader for CSV/Excel lead lists with mapping for name, number, language preference • Dialer module that auto-rotates numbers, respects DND, and retries on busy/no-answer • Human-like TTS and low-latency ASR supporting Hindi, English, and hybrid sentences • On-the-fly accent recognition, noise suppression, and live transcript display • Conversation logic builder so I can tweak responses myself without coding • Exportable call logs (audio, transcript, call outcome) in CSV/JSON I’m open to whichever speech engines, LLMs, or te...

    $100 Average bid
    9 bids

    ...Description I am looking for a freelancer who can develop a simple and easy-to-use Text-to-Speech (TTS) application or web-based tool for Azerbaijani language. The goal is to create a tool where I can enter text, click a button, and get natural-sounding speech output suitable for professional training and e-learning content. Voice quality is very important. I do not want a robotic voice. The freelancer may use Python or another suitable technology, but they must have real experience in speech generation, TTS systems, audio processing, and voice quality improvement. I can provide Azerbaijani voice samples and example text for testing. Main Requirements The freelancer should be able to: Develop a working TTS tool as: a simple web page, or a lightweight desktop ...

    $483 Average bid
    120 bids

    I have a production-ready, end-to-end voice bot running on Plivo with a streaming pipeline that chains Deepgram for STT, an LLM for intent generation, and a TTS engine for playback. Two problems are stopping me from going live: • Interruption handling (barge-in) – when a caller begins speaking, the TTS stream should halt instantly, but today the audio keeps playing. • Latency – the STT → LLM → TTS round-trip is a few seconds too slow; I need it trimmed to near real-time. • Overall flow optimisation – once the first two points are stable, I’d like a quick sanity check on buffer sizes, chunk timing and any other easy wins. I already have partial barge-in logic coded, yet it isn’t firing reliably, so I’m l...

    $59 Average bid
    12 bids

    ...trigger calls from our existing database (Google Sheets/SQL) and write back the "Lead Status" and "Call Summary" automatically. Technical Stack (Preferred): Orchestration: , Retell AI, or Bolna. LLM: GPT-4o or Claude 3.5 Sonnet (optimized for Malayalam). TTS (Voice): ElevenLabs (Malayalam) or Azure Neural Voice. STT (Hearing): Deepgram (Nova-2) or OpenAI Whisper. Automation: , Zapier, or Python-based Webhooks. Telephony: Twilio or Exotel. Responsibilities: Configure the Voice AI pipeline (STT -> LLM -> TTS). Draft and optimize the Malayalam System Prompt for loan sales/qualification. Handle Interruptions (Turn-taking) and "End-of-speech" detection. Set up the automation flow to pull/push data from our database. Conduct testi...

    $309 Average bid
    11 bids

    I’m enhancing a Voice AI agent and need professional support from trainers who are fluent in both Spanish and Portuguese. The engagement is part-time and on-site, scheduled flexibly around mutually convenient hours. Your background should include practical work with voice-driven technologies (speech recognition, TTS, conversational AI, or similar tools) so you can quickly understand how linguistic choices influence model performance. What I’ll rely on you for: refining pronunciation and intonation, spotting grammatical or cultural missteps in generated dialogue, and suggesting targeted data additions that help the agent sound natural to native speakers. If you have strong expertise in only one of the two languages, let me know and we ...

    $8 - $13 / hr
    Local
    0 bids

    ...images and supply other ARIA attributes or captions as the report indicates. You will receive the latest scan report plus wp-admin access (on a staging copy first, not the live site). Once your changes are in place, run a fresh scan and manual spot-checks with tools such as WAVE, Lighthouse or NVDA to confirm the issues are gone. Acceptance criteria 1. No errors or contrast warnings remain in the supplied scanner report. 2. Keyboard can reach and activate every interactive element visibly. 3. NVDA or VoiceOver reads menus and content in the intended order without missing labels. 4. All images carry meaningful alt attributes; decorative images are marked accordingly. When everything passes on staging I will migrate to production. Please outline how many hours you...

    $20 / hr Average bid
    204 bids

    **Project Title:** C# Application to Monitor Text File and Trigger Voice Notification **Project Description:** I need a C# application that monitors a text file in real time. **Re...value from the text file. **Important Notes:** * Do NOT trigger voice if the value has not changed. * Do NOT trigger voice for old lines (only newly added lines). * The monitoring should work in real time (auto-detect file updates). **Technical Preferences:** * Language: C# * Prefer using efficient file monitoring (e.g., FileSystemWatcher or similar). * Text-to-Speech can use Windows built-in TTS or any reliable library. **Optional (Nice to Have):** * Ability to configure the file path. * Ability to customize the voice or message format. --- Please provide a clean, well-structured, and reliable...

    $135 Average bid
    71 bids

    ...conversations—speech-to-text on the way in, text-to-speech on the way out—so learners can talk to it as if they were speaking to a tutor. Core goals • Accurate CA guidance: the bot should answer syllabus-level questions, explain tricky concepts, and suggest study plans. • Fluid voice exchange: latency below two seconds using a stack such as Whisper / Web Speech API for recognition and a neural TTS engine for replies. • Continual engagement: greet students each day, track brief study logs, and offer tailored prompts to keep them on pace. What I already have • A basic list of common CA queries and official reference links. • Wireframes for a minimal web UI (chat window + mic button). • A modest hosting plan on Vercel. Where I...

    $6 / hr Average bid
    23 bids

    ...Secure handling of account and transfer data; PCI-compliant storage of call recordings • Webhook or REST callback that posts the customer’s response (confirmed / cancelled / no response) back to our system • Configuration dashboard or editable JSON/YAML so we can tweak prompts without code changes • Comprehensive test cases and a short hand-off guide I am open to whichever speech-to-text and TTS engines you prefer (Google, Amazon, Azure, or an on-prem alternative) as long as latency is low and the voice sounds human. Future phases may expand the bot to balance inquiries or even loan applications, so please build with modularity in mind. Send a quick outline of your proposed tech stack, any relevant voice bot projects you have shipped, and an estimated...

    $82 Average bid
    12 bids

    ...result should feel like talking to a real coach, not watching a video. --- Scope of Work Depending on your expertise, you will work on parts of the system: - TTS integration (voice cloning, streaming capable) - real-time audio processing - avatar animation pipeline (face + lip sync + expressions) - GPU-based inference optimization - backend orchestration (API + pipeline) - real-time streaming / latency optimization - optional: behavior / animation logic --- Tech Approach (Important – Open Architecture) We have a target architecture in mind, but we are intentionally NOT locking the stack. Expected system components: - real-time TTS system (low latency, voice cloning) - avatar animation engine (high realism, portrait-based) - backend pipeline (Python / API-bas...

    $15362 Average bid
    88 bids

    PROJECT TITLE: Senior Full-Stack Architect (React Native + Node.js) for Premium AI-Fintech Application SKILLS REQUIRED: React Native, Expo, Node.js, Express JS, , API Integration, AI Models (LLM/TTS), CI/CD Deployment PROJECT DESCRIPTION: We are a forward-thinking fintech organization. We require an elite full-stack developer to architect, build, and deploy our next-generation wealth management application. This is a high-performance, dark-mode native application requiring seamless frontend-backend synchronization and real-time AI capabilities. 1. Frontend Architecture (React Native / Expo) Dynamic UI Engine: The app's interface must dynamically morph based on 4 distinct user profiles. Each requires different routing, deep nesting, and state management. Aesthetics: OLED Da...

    $2311 Average bid
    69 bids

    My construction-remodeling firm needs an AI agent that can speak naturally, text convincingly, and email professionally while tying everything together inside n8n. I already use ElevenLabs for lifelike voice, so the call flow must leverage its TTS / voice-cloning and feed straight into n8n automations. Core scope • Inbound calls: greet callers, answer basic service questions, capture lead details and, when asked, book or reschedule an appointment in our calendar system. • Follow-up outreach: automatic SMS and email sequences that go out after the first conversation, plus the ability to place an outbound call if a prospect hasn’t replied. • Multichannel syncing: every interaction—phone, text or email—should update the same contact record so my ...

    $215 Average bid
    100 bids

    ...Ongoing video production Platforms: YouTube Shorts, TikTok, Instagram Reels Niche: Investments, financial markets, economic news Video Requirements Duration: 20–60 seconds Visuals: Must match the narration precisely (charts, stock footage, graphics, animations — all relevant to the script) Voice-over: High-quality, clear, professional Natural-sounding (no robotic voice) Script will be provided (TTS is acceptable only if it sounds human-like) Style: Modern and dynamic Use of text animations, infographics, and visuals where appropriate Clean and engaging (not overloaded) Frequency: 2–3 videos per day Volume: Minimum 50 videos per month Format: Vertical video (9:16), MP4 Timeline: Ongoing monthly collaboration (long-term preferred) What I Provide Scripts or conten...

    $338 Average bid
    19 bids

    We are looking for US English and UK English voice-over artists for a Text-to-Speech (TTS) project with 4–5 hours of recording work. > Project Overview: Language: English (US & UK accents) Duration: 4–5 hours of total recording per artist Content: Simple, pre-written sentences (script will be provided) Usage: AI voice training / TTS systems Recording Platform: Require Studio professional recording only > Requirements: Native US/UK English Female accent Clear pronunciation and natural delivery Ability to maintain consistent tone for long recordings Studio recording only (no mobile/laptop with good mic) > Who Can Apply: Freelancers & professional Female voice artists > Payment: Payment based on completed hours (4–5 hours) Competitiv...

    $186 Average bid
    3 bids

    ...M4 (16GB) I need a COMPLETE working local solution for a Mac Mini M4 (16GB) that can generate highly realistic Dutch podcast audio with multiple speakers. Requirements: - Runs locally on Mac Mini M4, 16GB RAM - Dutch speech only - Voice cloning from MAX 10 minutes of clean voice samples per speaker - Output must sound human-level realistic, expressive, natural, and not recognizable as standard TTS - Reference point: quality should be in the direction of projects like Parkiet, but the final solution must be more robust, cleaner, more stable, and more production-ready - Multiple speakers in one project (separate rendering per speaker is acceptable if final workflow is simple) - Generation speed: max 1.5x audio duration - Clean studio-quality output: no artifacts, glitches, metall...

    $560 Average bid
    63 bids

    ...political consulting HQ, made up of members who pay a subscription, and donors. The agent must provide natural, Australian-accented responses and resolve inquiries regarding membership, candidate endorsement, and party objectives without human intervention unless an escalation is required. Technical Scope & Integration requirements: • Voice Engine: Implementation of a low-latency Text-to-Speech (TTS) engine with a professional Australian accent. • Intelligence Layer: Integration with the ChatGPT Language Model (LLM) to process State and Federal Constitutions and various federal, state and territory Fact Sheets to provide accurate regulatory advice.  • Telephony Stack: SIP-based integration with our current Access4 and Cisco Webex infrastructure. • D...

    $1106 Average bid
    37 bids

    ...trying to grow on Instagram, Facebook and YouTube, then explain the simple solution my agency offers. Please work with a realistic-human avatar solution—whether you prefer Synthesia, HeyGen, D-ID, or a comparable pipeline—so the character’s facial micro-expressions, eye focus and mouth movements feel natural on full-HD playback. The Hindi must sound locally authentic, without robotic intonation; if TTS is used, I want the voice refined in post so pauses and emphasis land conversationally. I will supply the exact script in Hindi as well as brand logos and a colour reference. Your job is to animate the avatar, integrate the audio, handle subtle background music, and deliver a polished file ready for posting on Reels, Shorts and Ads. Acceptance criteria: •...

    $55 Average bid
    7 bids

    I need a WordPress designer to customize an accessible educational blog focused on ADHD. The blog should cater to a wide audience, ensuring content is easy to navigate and understand. I’m using the “Unlimited” theme from Complete Themes. I’d like to integrate TTS, and some design elements, such as color coded post templates and icons. I’d also like to have accessibility fonts. The site is up and running, but it’s not visually engaging. I tried using plug ins, but I quickly realized they get snarled and may not be the best choice for everything. Essential Features: - Accessibility-focused design - User-friendly navigation - Mobile responsiveness - Integration of educational resources Ideal Skills and Experience: - Experience with WordPress desig...

    $434 Average bid
    223 bids

    We are hiring German voice actors to perform multi-character dialogue recordings to help train internal Text-to-Speech (TTS) models. You'll be voicing fictional characters with varying styles, accents, and personalities. This will be a SINGLE person recording project, but you will be expected to perform different voices. This is non-broadcast, non-commercial work for internal research and development purposes only. Your voice will not be used in public-facing or paid media. Project Scope: - Record 3 hours of audio performing dialogues between 2+ characters - Adjust voice, tone, and accent based on character descriptions - Ensure clear differentiation between characters - Deliver RAW, high-quality WAV files according to project specs - Complete up to 2 rounds of revisions (f...

    $342 Average bid
    9 bids

    We are hiring German voice actors to perform multi-character dialogue recordings to help train internal Text-to-Speech (TTS) models. You'll be voicing fictional characters with varying styles, accents, and personalities. This will be a SINGLE person recording project, but you will be expected to perform different voices. This is non-broadcast, non-commercial work for internal research and development purposes only. Your voice will not be used in public-facing or paid media. Project Scope: - Record 3 hours of audio performing dialogues between 2+ characters - Adjust voice, tone, and accent based on character descriptions - Ensure clear differentiation between characters - Deliver RAW, high-quality WAV files according to project specs - Complete up to 2 rounds of revisions (f...

    $250 - $750
    Local
    0 bids

    ...HTML/CSS/JS, Firebase). A "hands-free" mode allows users to validate items by voice, without touching the screen. The problem: Web Speech API on Android Chrome is unstable in noisy environments: - Sessions die silently after a period of silence - Short trigger words ("hop", "ok", "next") frequently missed or misrecognized - Erratic behavior across Android versions - Conflicts between speech synthesis (TTS) and recognition (STT) What I already have: - Working voice mode with short sessions + automatic restart - interimResults, maxAlternatives, expanded trigger word list - Accent stripping in transcript comparison What I'm looking for: Someone who has already solved these issues in production — not theory. Ideally with one of th...

    $25 Average bid
    78 bids

    ...(always-on after start), live voice conversation with AI agent. Wake word: default “Wrench” (or “Hey Wrench”); optional custom wake word set by user in settings (saved per user). Pause/resume: tech says “[Name], hold on” to mute responses (still listens for wake word); “[Name], come back” to resume. Two-layer AI architecture: Voice layer: real-time mic input, wake-word detection, speech-to-text, TTS output, conversation flow, interruptions, Bluetooth earbud audio routing. Research layer: triggered by voice layer; makes secure OAuth 2.0 calls to AutoData API for torque specs, wiring diagrams, procedures, part verification; displays live search progress/results on screen (e.g. “Searching AutoData…” + results overlay...

    $15 - $25 / hr
    Sealed NDA
    165 bids

    ...using Flutter (speech_to_text) 2. AI Reminder Parsing * Send text to OpenAI API * Extract: * Task * Date * Time * Repeat (daily/weekly/custom) 3. Reminder Storage * Use local SQLite database (NO backend) 4. Notifications * Use flutter_local_notifications * Must work even when app is closed or phone is locked 5. Voice Confirmation * App should say: “Reminder set successfully” using TTS --- **Screens Required:** * Splash Screen * Login / Signup * Home Screen (Mic Button) * Reminder List (Today / Upcoming / Completed) * Add/Edit Reminder * Settings --- **Technical Requirements:** * Flutter (latest stable version) * Clean and modular code * Use these packages: * speech_to_text * flutter_local_notifications * flutter_tts * sqflite * http...

    $101 Average bid
    55 bids

    ...software and its companion mobile app. The work must cover visual, hearing and motor-impairment scenarios so that every interaction is usable with screen readers, captions, alternative input devices and keyboard-only navigation. Scope of work • Review the current build on Windows/macOS plus iOS and Android. • Verify conformance against WCAG 2.1 AA, Section 508 and ADA requirements. • Use NVDA, JAWS, VoiceOver, TalkBack, switch control, caption checkers and colour-contrast tools to surface issues. • Deliver a clear report: defect description, severity, screenshots or short clips, exact reproduction steps and practical remediation advice. Acceptance criteria The engagement is complete when I receive the consolidated audit document, a prioritised defe...

    $12 / hr Average bid
    23 bids

    ...speakers to participate in a short voice recording project supporting language and speech technology research. Participants will record a set of short sentences using a mobile recording application. The collected recordings will help improve speech processing systems and Text-to-Speech (TTS) models. The task is simple and typically takes 30–45 minutes to complete. A detailed participant guide and instructions will be provided after hiring. Project Purpose The recordings will be used for Text-to-Speech (TTS) research and development. The collected audio will be used strictly for technology research and system training, and will not be broadcast or used in any public or paid media. Task Overview The recording task includes two parts: Sentence Recording • ...

    $19 Average bid
    3 bids