Deepgram

Deepgram powers voice AI applications with accurate, low-latency speech recognition APIs.

Series C · $246M total · Founded 2015 · San Francisco, California · 255 employees

Deepgram is a foundational AI company that builds vertically integrated voice AI infrastructure, controlling model development, data labeling, synthetic data generation, and its own data centers. The platform offers three core capabilities: Listen (speech-to-text), Think (language understanding), and Speak (text-to-speech), with particular strength in handling real-world audio conditions such as background noise, overlapping speakers, accents, and technical terminology. Deepgram serves over 1,300 organizations, including NASA, Spotify, and Twilio, and differentiates on accuracy, latency, and pricing 2-5x lower than competitors, which it attributes to owning its entire stack.

Problem solved

Organizations need accurate, fast, affordable voice AI APIs that work reliably in noisy, real-world conditions without requiring extensive custom model training.

Target customer

Enterprise and mid-market companies building voice-enabled applications, contact centers, transcription services, and developers integrating voice AI into SaaS products.

Founders

Scott Stephenson

CEO & Co-Founder

PhD in particle physics from the University of Michigan; previously conducted dark matter detection research at an underground lab before founding Deepgram in 2015.

Adam Sypniewski

CTO & Co-Founder

Academic background with PhDs and postdocs in machine learning and signal processing.

Magesh Swaminathan

Co-Founder

Co-founder with academic and technical background.

Noah Shutty

Co-Founder

Co-founder with PhD and postdoc experience in AI/ML.

Funding history

Series A · $12M · 2016 · Led by Wing VC · NVIDIA, Y Combinator, Compound, SAP.iO

Series B · $47M · Date unknown · Led by Madrona Venture Group · Alkeon

Series B Extension · $47M · March 2023 · Led by Madrona Venture Group · Other investors unknown

Series C · $130M · January 13, 2026 · Lead and other investors unknown

Total raised: $246M

Industries

Data Collection and Labeling, Artificial Intelligence (AI), Natural Language Processing, Developer APIs, Speech Recognition

Pricing

Usage-based SaaS model that starts free with $200 in credit. Pay-as-you-go pricing starts at $0.0043/min for pre-recorded audio (Nova-3) and $0.0077/min for streaming. The Growth tier ($4,000+/year) and Enterprise tier ($15,000+/year) add pre-paid credits and discounts. The Voice Agent API is priced at $4.50/hour, with flexible deployment options.
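To make the per-minute and per-hour rates easier to compare, here is a minimal Python sketch of the arithmetic. It uses only the rates quoted above and ignores Growth and Enterprise discounts, which are not specified here. The function names estimate_cost and minutes_covered_by_credit are hypothetical helpers, not part of any Deepgram SDK, and the rates should be checked against current pricing.

```python
# Rates as quoted in the pricing text above (USD); assumed current, verify before use.
PRERECORDED_PER_MIN = 0.0043   # Nova-3, pre-recorded audio
STREAMING_PER_MIN = 0.0077     # Nova-3, streaming audio
VOICE_AGENT_PER_HOUR = 4.50    # Voice Agent API
FREE_CREDIT = 200.00           # starting credit


def estimate_cost(prerecorded_min=0.0, streaming_min=0.0, agent_hours=0.0):
    """Hypothetical helper: pay-as-you-go cost in USD, before any tier discounts."""
    return (
        prerecorded_min * PRERECORDED_PER_MIN
        + streaming_min * STREAMING_PER_MIN
        + agent_hours * VOICE_AGENT_PER_HOUR
    )


def minutes_covered_by_credit(rate_per_min, credit=FREE_CREDIT):
    """Hypothetical helper: minutes of audio a credit balance pays for at a given rate."""
    return credit / rate_per_min


if __name__ == "__main__":
    # Example workload: 10,000 pre-recorded minutes, 5,000 streaming minutes, 20 agent hours.
    monthly = estimate_cost(prerecorded_min=10_000, streaming_min=5_000, agent_hours=20)
    print(f"Estimated monthly cost: ${monthly:.2f}")
    print(f"Pre-recorded minutes covered by credit: {minutes_covered_by_credit(PRERECORDED_PER_MIN):,.0f}")
```

At these rates, the Voice Agent hours dominate the example workload, so conversational use cases are worth budgeting separately from plain transcription.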

Notable customers

NASA, Spotify, Twilio, Five9, UpdateAI, Elerian AI, IBM (partnership)

Integrations

IBM (partnership), Stripe, HubSpot, Salesforce

Tech stack

jQuery UI (JavaScript libraries), jQuery (JavaScript libraries), Preact (JavaScript libraries), LottieFiles (CMS), Prism (UI frameworks), Open Graph, HTTP/3, WordPress (Blogs), LinkedIn Insight Tag (Analytics), Google Analytics (Analytics), Crazy Egg (Analytics), Font Awesome (Font scripts), WP Rocket (Caching), PHP (Programming languages), Google Workspace (Email), Unpkg (CDN), Cloudflare (CDN), HubSpot (Marketing automation), MySQL (Databases), LinkedIn Ads (Advertising), Google Tag Manager (Tag managers), Salesforce (CRM), Yoast SEO Premium (SEO), Yoast SEO (SEO), Amazon Web Services (PaaS), WP Engine (PaaS), Complianz (A/B Testing), Google Optimize (A/B Testing), SendGrid (Email)

Website

deepgram.com/

Competitors

AssemblyAI

Focused API provider but lacks Deepgram's vertical integration of infrastructure and model development.

Google Cloud Speech-to-Text

Large incumbent with broader cloud services, but less specialized for production voice applications and with higher latency.

AWS Transcribe

Enterprise-focused, but with higher cost and slower performance than Deepgram in challenging audio conditions.

Otter.ai

Focused on transcription and note-taking for end users; Deepgram targets developers and enterprises building voice AI products.

Verbit

Human-powered transcription service with a different value proposition; not API-first for developers.

Descript

Focuses on audio/video editing and collaboration; not a foundational voice AI API platform.

OpenAI Whisper

Open-source model with lower accuracy and higher latency; Deepgram optimizes for production performance and reliability.

Speechmatics

Competitive on accuracy but lacks Deepgram's infrastructure control and pricing advantage.

Why this matters: Deepgram is notable for vertical integration in AI infrastructure, building models, managing data, and operating data centers in-house. The company credits this with significantly better accuracy and latency than API-only competitors at prices 2-5x lower. Reaching cash-flow positivity in January 2025, with over 50,000 years of audio processed, points to product-market fit at scale and positions Deepgram as a foundational infrastructure layer for voice AI applications across enterprise and developer markets.

Best for: Enterprises and developers building production voice AI applications that require high accuracy, low latency, and cost efficiency across diverse audio conditions.

Use cases

Contact Center Transcription & Analytics

Contact centers integrate Deepgram's Nova-3 model via API to transcribe calls in real time, enabling immediate sentiment analysis, topic detection, and speaker diarization. Five9 reported 2-4x better accuracy than competitors when transcribing alphanumeric data, improving compliance and quality assurance workflows.
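As a rough illustration of what integrating via API can look like, the sketch below builds, but does not send, an HTTP request that asks for a Nova-3 transcript with speaker diarization of a recorded call. The endpoint URL, the model and diarize query parameters, and the Token authorization scheme are assumptions and should be verified against Deepgram's API reference. The helper build_transcription_request, the audio URL, and the API key are placeholders. Real-time transcription uses a streaming connection, which is not shown here.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; verify against Deepgram's API reference before use.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(audio_url, api_key, model="nova-3", diarize=True):
    """Hypothetical helper: build (but do not send) a pre-recorded transcription request."""
    # Query parameters select the model and turn on speaker diarization (assumed names).
    query = urllib.parse.urlencode({"model": model, "diarize": str(diarize).lower()})
    # The JSON body points the service at a hosted audio file.
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{DEEPGRAM_LISTEN_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_transcription_request("https://example.com/call.wav", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    # Sending is deliberately left out; urllib.request.urlopen(req) would perform the call.
```

Keeping request construction separate from sending makes the parameters easy to inspect and unit-test without network access or a live key.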

Voice-Enabled SaaS Applications

Developers building voice features (voice commands, dictation, accessibility) use Deepgram's Listen, Think, and Speak APIs to avoid building custom models. The Flux model with integrated end-of-turn detection eliminates separate voice activity detection systems, reducing latency and complexity in conversational applications.

Real-Time Multilingual Transcription

Global enterprises and media platforms use Deepgram for real-time transcription across multiple languages and accents. NASA and Spotify use Deepgram's models to handle diverse audio sources, from technical terminology to multiple speakers, with accuracy exceeding 90%.

Speech-to-Speech Conversations

Deepgram's speech-to-speech model (launched in 2025) operates without intermediate text conversion, preserving tone and emotion. This new use case suits voice agents, customer service bots, and virtual assistants that require natural, contextual responses.

Alternatives

AssemblyAI

Pick AssemblyAI if you want a simpler, lighter-weight API without custom deployment options, though accuracy and latency may not match Deepgram.

Google Cloud Speech-to-Text

Pick Google if you're already deeply integrated in Google Cloud and prioritize ecosystem consistency over specialized voice AI performance.

AWS Transcribe

Pick AWS if you need transcription as part of broader AWS services and have flexible latency requirements, though costs are typically higher.

FAQ

What does Deepgram do?

Deepgram is a voice AI platform providing developer APIs for speech-to-text (Listen), language understanding (Think), and text-to-speech (Speak). The company builds all models from scratch and hosts its own data centers, enabling superior accuracy in real-world conditions like background noise, overlapping speakers, and accents. It serves over 1,300 organizations including NASA, Spotify, and Twilio.

How much does Deepgram cost?

Deepgram uses usage-based pricing and starts free with $200 in credit. Pay-as-you-go rates are $0.0043/min for pre-recorded audio and $0.0077/min for streaming (Nova-3 model). The Growth tier starts at $4,000/year with discounts, and the Enterprise tier starts at $15,000/year with custom models and dedicated support. The Voice Agent API is $4.50/hour.

What are alternatives to Deepgram?

Alternatives include AssemblyAI (simpler API), Google Cloud Speech-to-Text (broader ecosystem), AWS Transcribe (AWS-native), Otter.ai (end-user focused), OpenAI Whisper (open-source but lower accuracy), Speechmatics, and Verbit (human transcription). Deepgram differentiates through accuracy, latency, and vertical integration controlling both models and infrastructure.

Who uses Deepgram?

Named customers include NASA, Spotify, Twilio, and Five9, among more than 400 enterprise customers as of January 2025. Target users are developers and enterprises building voice-enabled applications, contact centers, transcription services, and SaaS products requiring speech AI. In total, over 1,300 organizations use Deepgram APIs.

How does Deepgram compare to AssemblyAI?

Both provide speech-to-text APIs, but Deepgram owns its infrastructure, data centers, and model development end-to-end, enabling 2-5x lower costs and superior accuracy in challenging conditions. AssemblyAI is lighter-weight and simpler but lacks customization, deployment flexibility, and Deepgram's performance advantages. Five9's testing showed Deepgram 2-4x more accurate than competitors for alphanumeric transcription.