Together AI

Together AI provides open-source generative AI infrastructure for enterprise-scale model deployment.
Series B · $534M total raised · Founded 2022 · San Francisco, California · 167 employees
Together AI is an AI Acceleration Cloud platform enabling developers and researchers to train, fine-tune, and deploy generative AI models at scale. The platform supports over 200 open-source models across all modalities (chat, image, audio, vision, code, embeddings) with 2-3x faster inference than hyperscalers, OpenAI-compatible endpoints, and infrastructure scaling from 16 to 100,000+ GPUs. Founded by Stanford professors and serial entrepreneurs, Together AI is positioned as the cloud infrastructure layer for the open-source AI ecosystem, with recent expansion into data transformation through the Refuel.ai acquisition.
Problem solved
Organizations need high-performance, cost-effective infrastructure to train, fine-tune, and deploy generative AI models without vendor lock-in or reliance on proprietary closed-source APIs.
Target customer
Enterprise developers, AI research teams, and organizations deploying custom AI models at scale requiring high-performance inference, fine-tuning, and infrastructure control.
Founders
Vipul Ved Prakash
CEO & Co-Founder
Serial entrepreneur who co-founded Cloudmark (anti-spam) and Topsy (acquired by Apple for $200M+ in 2013), and held a senior AI/ML director role at Apple until 2018.
Ce Zhang
CTO & Co-Founder
Researcher from ETH Zurich focused on data management for machine learning systems.
Chris Ré
Co-Founder
Stanford University professor whose lab produced foundational work on data-centric AI.
Percy Liang
Co-Founder
Stanford University professor leading the Center for Research on Foundation Models (CRFM).
Tri Dao
Chief Scientist
Brought on in summer 2023 to lead AI research and optimization efforts.
Funding history
Seed · $20M · May 2023 · Lead investor undisclosed
Series A · $106M · November 2023 · Lead investor undisclosed
Series A Extension · amount, date, and investors undisclosed
Series B · $305M · February 2025 · Led by General Catalyst, with Prosperity7, Salesforce Ventures, DAMAC Capital, NVIDIA, Kleiner Perkins, March Capital, Emergence Capital, Lux Capital, SE Ventures, Greycroft, Coatue, Definition, Cadenza Ventures, Long Journey Ventures, Brave Capital, and SK Telecom
Total raised: $534M
Pricing
Usage-based pricing across three buckets: Serverless Inference ($0.05-$7.00 per million tokens by model), Fine-Tuning ($0.48-$3.20 per million tokens), and GPU Cloud ($2.40/hour for dedicated infrastructure). New accounts receive $25 free credits; startup accelerator offers $15K-$50K in credits.
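As a rough illustration of how the usage-based tiers compose into a bill, the sketch below estimates a monthly cost from token volume and GPU hours. The per-unit rates come from the ranges quoted above; the workload figures (token counts, hours, chosen per-model rate) are hypothetical.

```python
# Back-of-the-envelope cost model for usage-based pricing.
# Rates are taken from the published ranges; workload figures are hypothetical.

def inference_cost(tokens: int, price_per_million: float) -> float:
    """Serverless inference: billed per million tokens."""
    return tokens / 1_000_000 * price_per_million

def gpu_cost(hours: float, hourly_rate: float = 2.40) -> float:
    """Dedicated GPU cloud: billed per hour ($2.40/hr baseline)."""
    return hours * hourly_rate

# Example: 50M tokens/month on a $0.60-per-million-token model,
# plus 100 hours of dedicated GPU time.
monthly = inference_cost(50_000_000, 0.60) + gpu_cost(100)
print(f"${monthly:.2f}")  # → $270.00
```

The same two functions bracket a best/worst case by plugging in the low and high ends of each range ($0.05 vs. $7.00 per million tokens), which is useful when comparing against a flat-rate proprietary API.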
Notable customers
Salesforce, Zoom, SK Telecom, Hedra, Cognition, Zomato, Krea, Cartesia, The Washington Post, Arcee AI
Integrations
OpenAI-compatible API endpoints, NVIDIA Blackwell GPUs, Refuel.ai (data transformation), Hypertec (GPU cluster co-development)
Tech stack
GSAP, Swiper, jQuery, core-js (JavaScript); Prism (UI framework); Open Graph; LottieFiles; Google Analytics (analytics); Google Workspace, SendGrid (email); Unpkg, jsDelivr, cdnjs, Cloudflare (CDN); Webflow (page builder)
Competitors
CoreWeave
GPU cloud provider; Together AI differentiates through integrated open-source model library and higher-level inference abstractions.
Fireworks AI
Serverless inference platform; Together AI offers broader infrastructure control and fine-tuning capabilities alongside inference.
Replicate
Model deployment platform focused on ease-of-use; Together AI targets enterprises needing scale and customization.
Google Vertex AI
Proprietary cloud solution; Together AI emphasizes open-source models and vendor independence.
Amazon SageMaker
Tied to the AWS ecosystem; Together AI provides open infrastructure optimized for open-source models.
Why this matters: Together AI represents the emerging infrastructure layer for open-source AI, backed by $534M in funding and led by Stanford researchers and serial entrepreneurs. The company's 200 MW capacity buildout, NVIDIA Blackwell deployment, and Refuel.ai acquisition signal a shift toward decentralized, open-source alternatives to proprietary AI clouds, making it a critical player shaping the future of enterprise AI infrastructure.
Best for: Enterprise teams and AI researchers building production generative AI applications requiring high-performance inference, fine-tuning flexibility, and infrastructure control without vendor lock-in.
Use cases
Custom Model Fine-Tuning at Scale
Organizations train proprietary variants of open-source models on their data using Together's fine-tuning infrastructure, maintaining model ownership and reducing API costs. Zomato built a customer support bot achieving 2x satisfaction improvement and 1,000+ messages/minute throughput.
Low-Latency Token-Heavy Inference
Applications requiring high throughput and low latency use Together Reasoning Clusters, which achieve 110 tokens/sec decoding and 41+ queries per second at 32 concurrent requests, critical for real-time conversational AI and code generation.
Multi-Modal AI Application Deployment
Teams build applications using 200+ open-source models across chat, image, audio, vision, code, and embeddings modalities via a unified OpenAI-compatible API, eliminating the need to integrate multiple vendor platforms.
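Because the endpoints are OpenAI-compatible, switching an existing client to Together typically means changing only the base URL and the model name. The sketch below builds a standard chat-completions request body using only the standard library; the base URL and model id are illustrative assumptions (check current docs), and no network call is made.

```python
import json

# Assumed OpenAI-compatible base URL; verify against current documentation.
TOGETHER_BASE_URL = "https://api.together.xyz/v1"

def build_chat_request(model: str, messages: list, max_tokens: int = 256) -> dict:
    """Build the JSON body for a POST to {base_url}/chat/completions.

    The schema mirrors OpenAI's chat-completions format, which is why
    existing OpenAI client code can usually be pointed here unchanged.
    """
    return {"model": model, "messages": messages, "max_tokens": max_tokens}

body = build_chat_request(
    "meta-llama/Llama-3-8b-chat-hf",  # hypothetical open-source model id
    [{"role": "user", "content": "Summarize open-source AI in one line."}],
)
print(json.dumps(body, indent=2))
```

In practice the body would be POSTed to `TOGETHER_BASE_URL + "/chat/completions"` with an `Authorization: Bearer <API_KEY>` header, or an OpenAI-compatible SDK client would be constructed with `base_url=TOGETHER_BASE_URL`.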
Alternatives
CoreWeave: Pure GPU cloud provider without an integrated model library; choose CoreWeave if you need bare-metal GPU access but already have your own orchestration.
Fireworks AI: Simpler serverless inference platform; choose Fireworks when rapid deployment matters more than raw performance and fine-tuning customization.
Replicate: User-friendly model deployment with no-code abstractions; choose Replicate when faster time-to-value matters more than infrastructure control.
FAQ
What does Together AI do?
Together AI is an AI Acceleration Cloud providing infrastructure to train, fine-tune, and deploy over 200 open-source generative AI models. It offers 2-3x faster inference than hyperscalers, OpenAI-compatible endpoints, and infrastructure scaling from 16 to 100,000+ GPUs, enabling enterprises to build and operate custom AI applications without vendor lock-in.
How much does Together AI cost?
Serverless inference ranges from $0.05-$7.00 per million tokens depending on the model. Fine-tuning costs $0.48-$3.20 per million tokens. GPU cloud starts at $2.40/hour for dedicated infrastructure. New accounts receive $25 free credits, and startups can access $15K-$50K credit programs.
What are alternatives to Together AI?
CoreWeave (GPU cloud infrastructure), Fireworks AI (serverless inference platform), Replicate (ease-of-use model deployment), Google Vertex AI (proprietary cloud), and Amazon SageMaker (AWS ecosystem solution).
Who uses Together AI?
Enterprise developers, AI research teams, and production AI applications. Notable customers include Salesforce, Zoom, SK Telecom, Zomato, Cognition, and The Washington Post.
How does Together AI compare to CoreWeave?
Together AI provides a higher-level platform with integrated open-source model library, fine-tuning, and inference APIs, while CoreWeave focuses on raw GPU cloud infrastructure. Together AI is better for teams wanting end-to-end model management; CoreWeave suits those needing bare metal GPU access with their own orchestration.
Tags
generative AI infrastructure · open-source models · model fine-tuning · inference optimization · GPU cloud · enterprise AI