Text To Speech


Purchase this listing from Webvar in AWS Marketplace using your AWS account. In AWS Marketplace, you can quickly launch pre-configured software with just a few clicks. AWS handles billing and payments, and charges appear on your AWS bill.

Deepgram Voice AI- Multilingual Speech-to-Text (STT)

Deepgram is the enterprise Voice AI platform for building and scaling real-time voice applications on AWS. This model transcribes 10 languages at once, allowing for aggressive code switching during a conversation: English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch. Our APIs for Nova or Flux Speech to Text (STT) and Aura Text to Speech (TTS) are natively available in the new SageMaker BiDirectional Streaming API. Additional native touchpoints with Amazon Bedrock, Lex, and Amazon Connect make it simple to compose full voice experiences with the cloud services your teams already trust.

Sonic 2.0

Cartesia Sonic is the fastest enterprise text-to-speech model with as low as 40ms latency, offering human-quality voice generation for real-time conversations. Built on breakthrough State Space Model technology pioneered at Stanford, Sonic delivers ultra-realistic voices across 15 languages with perfect accuracy on complex phrases.

SSFM 2.0

Typecast uses the advanced Typecast SSFM, our next-gen AI voice model that delivers incredibly natural and expressive TTS technology.

VARCO TTS Lite

VARCO TTS LITE is a low-latency, high-throughput engine for large-scale real-time speech synthesis. It offers a wide range of voices optimized for in-game characters while keeping audio quality stable and consistent across sessions.

Gradium TTS

Gradium builds the technological backbone (models and infrastructure) to support all voice applications. We work at the finest level of detail to design natural, expressive, real-time voice interactions at scale.

Waves Text-to-Speech (TTS)

Waves TTS by Smallest.ai delivers real-time, multilingual text-to-speech for global, production-grade voice applications - powered by our state-of-the-art Lightning V2 engine, benchmarked for the fastest, most natural speech synthesis.

Listening.com - Listen to Papers, Books, and Websites

Listen to websites, papers, and books.

VARCO TTS Standard

VARCO TTS STANDARD is a generative speech model that delivers vivid, dynamic voice synthesis. Unlike conventional TTS, which produces the same output for the same input, the system uses sampling techniques so the same text can be rendered with different intonation, rhythm, and expression each time.

Deepgram Voice AI- French Speech-to-Text (STT)

Deepgram is the enterprise Voice AI platform for building and scaling real-time voice applications on AWS. This model transcribes French. Our APIs for Nova or Flux Speech to Text (STT) and Aura Text to Speech (TTS) are natively available in the new SageMaker BiDirectional Streaming API. Additional native touchpoints with Amazon Bedrock, Lex, and Amazon Connect make it simple to compose full voice experiences with the cloud services your teams already trust.

Deepdub GO - Hollywood Grade Generative AI-powered Localization

Deepdub GO is a cutting-edge virtual AI studio designed to streamline the post-production dubbing process. This platform empowers creators to produce high-quality localized content quickly and efficiently by leveraging proprietary emotion-based text-to-speech technologies and professional voice creation.

Botnoi Voice - Text-to-Speech API

Botnoi Voice's Text-to-Speech API. Featuring over 200 AI voices across more than 20 languages, including all ASEAN languages, English, Chinese, Japanese, and more. Perfect for voice-overs, dubbing, educational content, news reading, and presentations. Join over 3 million registered users worldwide and experience seamless text-to-speech conversion.

Real-Time Speech Translation API

Convert speech from one language to another (AI Interpreter)

AgentX Base

AgentX for 311 will allow your limited resources to focus on true emergency calls. Let AgentX address your non-emergency calls. AgentX leverages AWS AI technologies using a plethora of digital channels.

NVIDIA Riva

NVIDIA® Riva is GPU-accelerated multilingual speech and translation AI for building and deploying fully customizable, real-time conversational AI pipelines. Riva is part of the NVIDIA AI Enterprise software platform.

Adapt - AI Enabled Localization

Adapt combines the latest AI technology with a global network of native-speaking linguists to create high-quality foreign-language subs and dubs at a fraction of the cost of traditional methods, all via our cloud-native SaaS platform.

Deepgram Voice AI- English Speech-to-Text (STT)

Deepgram is the enterprise Voice AI platform for building and scaling real-time voice applications on AWS. Our APIs for Nova or Flux Speech to Text (STT) and Aura Text to Speech (TTS) are natively available in the new SageMaker BiDirectional Streaming API. Additional native touchpoints with Amazon Bedrock, Lex, and Amazon Connect make it simple to compose full voice experiences with the cloud services your teams already trust.

ssfm 2.0(public)

ssfm (speech synthesis foundation model) for TTS (text-to-speech)

Text to Speech Synthesizer

This solution takes text input and converts it into human-like speech.

BotWA AI Speech Analytics: Real Time Voice Data Insights for AWS

Transform customer conversations into actionable insights with our AI-driven speech analytics platform. This real-time voice transcription and sentiment analysis AI engine analyzes customer interactions across multiple languages including Arabic, English, French, Hindi, Spanish, and Urdu. Enhance customer experience with conversation intelligence, agent performance tracking, and compliance monitoring. Seamlessly integrates with AWS services for scalable call center analytics and contact center performance optimization.

AudioStack

AudioStack is the enterprise solution for AI-powered audio production. We sit at the intersection of tech, creative and audio, unlocking cost and time-efficient high-quality audio, addressable at scale.

TravelAssist

Powered by Gen AI and Conversational AI, TravelAssist is a prebuilt suite of self-service accelerators designed to transform travel experiences. With seamless integration into digital and voice channels, it enhances speed to market, boosts guest satisfaction, drives loyalty, and empowers employees. Leveraging Kore.ai AI for Service, AI for Work, and AI for Process, TravelAssist enables intelligent, real-time conversations, meeting travelers wherever they are in their journey.

Human Review Services by Objectways Technologies

Our annotators work out of SOC 2 compliant facilities, and we employ many security controls prescribed by AWS security to ensure customers' data is securely accessed via the worker portal. No personal electronic devices are allowed inside the work area.

VOC Analytics Setup Services

An initial process that includes discovery sessions, requirements gathering, and technical assistance to ensure secure integration with the VOC Analytics product listed in AWS Marketplace. It includes connectivity validation, infrastructure configuration, integration testing, functionality such as interaction transcription, and an initial 10-day follow-up by the consulting team. A production instance and assigned technical resources are required.

Text To Speech Model API Packaged by IOanyT Innovations - Ubuntu 22.04

This product has charges associated with it for pre-installed Text To Speech Model API - Ubuntu 22.04

Deepgram Voice AI- Spanish Speech-to-Text (STT)

Deepgram is the enterprise Voice AI platform for building and scaling real-time voice applications on AWS. This model transcribes Spanish. Our APIs for Nova or Flux Speech to Text (STT) and Aura Text to Speech (TTS) are natively available in the new SageMaker BiDirectional Streaming API. Additional native touchpoints with Amazon Bedrock, Lex, and Amazon Connect make it simple to compose full voice experiences with the cloud services your teams already trust.

Deepdub eTTS: Text-to-Speech Model for Ultra-Realistic Voices

Deepdub eTTS is a cutting-edge neural text-to-speech model delivering ultra-realistic, human-like voices in 100+ languages and accents. Built for AWS SageMaker JumpStart, it enables developers and enterprises to generate expressive speech with natural prosody, emotion, and clarity, directly within their AWS environment. Easily deployable via SageMaker endpoints, Deepdub eTTS supports both streaming and batch workflows, making it ideal for media localization, conversational AI, eLearning, accessibility, and more. With low-latency inference, fine control over tone and style, and seamless AWS integration, Deepdub eTTS empowers you to create lifelike, engaging audio experiences at scale, without compromising on performance or security.

Supertone Text-to-Speech (TTS)

Create lifelike voices in seconds. Supertone Play delivers emotionally expressive, multilingual TTS & voice-cloning via a high-performance API for games, films, audiobooks & virtual worlds.

Camb.ai Studio

CAMB.AI Studio is a comprehensive SaaS platform that enables enterprises to translate and localize their content hyper-realistically, whether video, audio, or text, into over 140 languages.

HuggingFace Transformer Ubuntu20 with maintenance support by Apps4Rent

This product has charges associated with it for technical support and maintenance provided by Apps4Rent. The usage charges are USD 0.10/hour.

Voice AI Platform

Smallest Atoms is a platform for deploying hyper-realistic AI agents that can talk to anyone on voice or text, in any language and any voice. Atoms act like AI employees that handle end-to-end workflows across support, sales, and operations, using your business context, tools, and data. They can automate conversations such as answering queries, collecting information, completing transactions, and triggering downstream actions, while continuously learning from mistakes. Atoms integrate into your stack via SDKs and APIs, making it easy for teams to embed real-time, natural conversations into web, mobile, and backend systems.