Audio


Purchase this listing from Webvar in AWS Marketplace using your AWS account. In AWS Marketplace, you can launch pre-configured software with just a few clicks. AWS handles billing and payments, and the charges appear on your AWS bill.

Voice AI for developers

Vapi lets developers build, test and deploy voice agents in minutes rather than months.

Jellyfin

This product has charges associated with it for seller support. Jellyfin is an open-source media server software that allows users to organize, manage, and stream their personal media collections, including movies, TV shows, music, and photos, across various devices. It provides a user-friendly interface and supports multiple platforms without subscription fees.

Deepgram Voice AI - Multilingual Speech-to-Text (STT)

Deepgram is the enterprise Voice AI platform for building and scaling real-time voice applications on AWS. This model transcribes the following 10 languages at once, allowing for aggressive code switching during the conversation: English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch. Our APIs for Nova or Flux Speech to Text (STT) and Aura Text to Speech (TTS) are natively available in the new SageMaker BiDirectional Streaming API. Additional native touchpoints with Amazon Bedrock, Lex, and Amazon Connect make it simple to compose full voice experiences with the cloud services your teams already trust.

Real Time voice Transcription in Call Centers

Automatic speech recognition (ASR) models transcribe customer calls in real time, integrate with expert assist models, save post-call transcription time, and enable improved call analytics.

VoiceVault

VoiceVault is a real-time, voice-native fraud detection solution built for scale. Built on advanced conversational intelligence and designed with real-time protection and compliance in mind, VoiceVault listens to live conversations to identify scams, social engineering, and abuse, then equips fraud and risk teams with the insight they need to take immediate action. Unlike existing tools which rely on recognizing suspicious phone numbers or other obvious trends, VoiceVault analyzes the context, tone, and intent of every conversation to detect fraud as it happens - empowering teams to prevent harm proactively. Financial institutions, gig platforms, insurers, and logistics providers rely on VoiceVault to protect their customers, reduce fraud losses, and build safer platforms.

Pipecat Cloud

Pipecat Cloud is enterprise voice AI built on open source. A managed service to host and scale conversational voice agents and multimodal AI, Pipecat Cloud is purpose-built for Pipecat, the most widely used realtime orchestration framework. It makes it easy for developers to build custom voice agent workflows at ultra-low latency.

Vodia Phone System

Vodia PBX is a powerful and versatile software solution that enables employees to stay connected on the go. It turns mobile phones, laptops, tablets, PCs and standard VoIP phones into robust communication tools, allowing for seamless and secure communication between employees. With Vodia PBX, your business can stay connected no matter where your employees are located.

Muvi One

Muvi One simplifies the complex process of launching a video streaming service and enables content owners to focus on delivering top-quality content to their audiences. Whether you are a media enterprise, an educational institution, or an independent content creator, Muvi One is the ultimate solution for your video streaming needs. Start your streaming journey today with Muvi One and unlock the potential of your video/audio content.

LAMA Connect & LAMA Mix: Next-Generation Audio Production Software

The LAMA Connect & LAMA Mix AMI provides a comprehensive solution for audio production and connectivity. Designed for broadcasters, audio engineers, and production teams, this package combines the power of LAMA Connect's advanced routing capabilities with LAMA Mix's professional-grade audio mixing features. Deploy seamlessly on AWS for unmatched scalability and flexibility.

Access Contact Center - TTY Call Management Subscription

Mindgrub Generative AI Music Deep Learning GPU-Optimized AMI

The AWS Deep Learning Base GPU AMI (Ubuntu 20.04) 20230727 with FFmpeg, the Anticipation and Riffusion tools, plus ready-to-go models and configuration from Hugging Face

AmiVoice API

Speech to text for developers

Veritone Illuminate

Veritone Illuminate, powered by aiWARE, is an early case assessment tool that minimizes discovery spend and time for legal teams by transcribing, searching, and analyzing audio, video, and text-based evidence early in a case to quickly identify, cull down, and focus on only the relevant data. Veritone Illuminate also enables law enforcement investigative teams to cost-effectively search, analyze, and explore large amounts of audio, video, and text-based evidence.

SAP Exception Management with Agentic AI

Automate SAP exception handling using Agentic AI on AWS. Enable intelligent error detection, root cause analysis, and auto-remediation to boost operational efficiency.

Vulavula Transcribe

Transcribe and interpret spoken language across African languages with Lelapa AI's Vulavula Multilingual ASR Model.

Music Recognition (Trainable Algorithm)

Automatic music genre recognition

Koldan (Monthly)

Our scalable dictation platform is built for high-security environments, offering seamless integration with Philips devices, multi-language support, and hotkey functionality. Dictate text into applications, and take advantage of enterprise-ready features like LDAP, OAuth, and SAML.

NVIDIA Parakeet TDT 0.6B v2

Parakeet-tdt-0.6b-v2 is a 600-million-parameter automatic speech recognition (ASR) model designed for high-quality English transcription, featuring support for punctuation, capitalization, and accurate timestamp prediction.

Speech Transcription - Egyptian Arabic

Trellis Data's Egyptian Arabic transcription is more than 1.6 times more accurate than Whisper. Leverage our expertise in fine-tuning AI for translation and transcription. Our specialists design and optimize models to handle bespoke languages and dialects, delivering accurate, context-aware results where off-the-shelf tools fall short.

FEEDAE

FEEDAE is an AI-powered conversational intelligence and call analysis platform that automatically transcribes, analyzes, and optimizes 100% of phone conversations in real time for call centers and sales teams.

Speech Transcription - Indonesian (Bahasa)

Leverage Trellis Data's expertise in fine-tuning AI for translation and transcription. Our specialists design and optimize models to handle bespoke languages and dialects, delivering accurate, context-aware results where off-the-shelf tools fall short.

TEAMSPEAK-SERVER-3.13 on Linux with support by Hanwei

This product has charges associated with it for seller support. TeamSpeak 3 is a voice-over-IP communication software primarily used by gamers for real-time voice chat and file sharing during online gaming sessions.

Audio & Video Call - a complete SDK for in-app chat, voice and video

commsease is committed to offering a globally leading CPaaS platform for messaging, audio and video calling services, and scenario-oriented solutions. Our Audio & Video Call is an easy-to-use RTC solution that supports 128 kbps stereo HD audio and 1080p video streaming. You can integrate it with a three-line snippet. We provide a scalable, feature-rich RTC SDK and solutions for in-app chat, voice, and video. Our business is committed to your success, supporting you with technical excellence and first-class service.

AssemblyAI

AssemblyAI builds AI systems that can understand human speech with superhuman abilities. Start building with $50 in usage credits during your 90-day free trial. Cancel any time. After your trial ends, you will automatically be enrolled in an AssemblyAI pay-as-you-go plan. Request a private offer for discounted pricing based on your usage profile.

Cyanite Lyrics Extractor - Lyrics 2 Text

The Cyanite Lyrics Extractor extracts the lyrics from a musical audio file using deep learning technology.

Eden AI - Buy credits

Eden AI provides a unique API connected to the best AI engines and combined with a powerful management platform. Eden AI allows users to choose from a variety of features such as text & NLP, speech & audio, OCR, machine translation and many more.

Gradium TTS

Gradium builds the technological backbone, both models and infrastructure, to support all voice applications. We work at the finest level of detail to design natural, expressive, real-time voice interactions at scale.

Swedish Whisper Media ASR

Production-grade Swedish Whisper ASR model for podcasts, YouTube, TV, and media transcription. Fine-tuned on 120 hours of human-labeled speech at 16kHz.

Speech Transcription - Modern Standard Arabic

Toku Contact Center

Orchestrate customer interactions across all channels with an AI-driven, all-in-one, cloud-based contact centre solution. Empower agents and supervisors to improve efficiency and reduce costs while enhancing customer experience. This product is available only through Private Offers. Please contact enterprise.sales@toku.co to register your interest.

...