d-Matrix
d-Matrix accelerates AI inference in data centers with energy-efficient specialized hardware.
d-Matrix develops specialized AI inference accelerator hardware for data centers, featuring the Corsair inference accelerator card, the JetStream I/O networking accelerator, and the SquadRack rack-scale system. Its digital in-memory compute (DIMC) architecture is claimed to deliver 3x better cost-performance and 10x faster token generation than traditional GPU-based inference solutions. d-Matrix targets the AI inference market, where most compute costs occur after model training, by optimizing for real-time latency, energy efficiency, and cost at scale.
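For intuition on the in-memory compute pitch: autoregressive token generation is usually memory-bandwidth bound, since every output token must stream the full set of model weights past the compute units. Moving compute into the memory arrays raises effective weight bandwidth, which is the lever this back-of-envelope roofline sketch illustrates (all model and bandwidth figures are illustrative assumptions, not d-Matrix or GPU specifications):

```python
# Back-of-envelope roofline for autoregressive LLM decoding.
# All bandwidth and model figures below are illustrative assumptions,
# not d-Matrix or GPU specifications.

def max_tokens_per_sec(params_billions: float, bytes_per_param: float,
                       weight_bandwidth_tb_s: float) -> float:
    """Decode is bandwidth-bound: each token reads every weight once,
    so tokens/sec <= effective weight bandwidth / model size in bytes."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return weight_bandwidth_tb_s * 1e12 / model_bytes

# Hypothetical 70B-parameter model stored in 8-bit weights.
hbm_bound = max_tokens_per_sec(70, 1.0, 3.35)   # HBM-class off-chip bandwidth
dimc_bound = max_tokens_per_sec(70, 1.0, 30.0)  # assumed higher on-die bandwidth
print(f"Off-chip-HBM ceiling:   {hbm_bound:6.1f} tokens/s per model replica")
print(f"In-memory-compute case: {dimc_bound:6.1f} tokens/s per model replica")
```

In this toy model the tokens-per-second ceiling scales linearly with effective weight bandwidth; treat it as a mental model for the architecture's claim, not a benchmark.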
Problem solved
Data centers face prohibitive costs and latency when serving real-time AI inference at scale, requiring more efficient architectures than general-purpose GPU solutions.
Target customer
Cloud infrastructure providers, hyperscalers, and data center operators deploying large-scale generative AI inference workloads at production scale.
Founders
Sid Sheth
Founder & CEO
Former SVP & GM of the Networking Business Unit at Inphi (grew it to $1B+), with 20+ years in semiconductors and networking at Intel, NetLogic Microsystems, and Aeluros; M.S.E.E. from Purdue University.
Sudeep Bhoja
Founder & CTO
Extensive engineering background in high-speed data interconnects and optical networking; pioneered digital signal processing technologies for data centers at Inphi.
Funding history
Series A
$44M
April 2022
Led by Nautilus Venture Partners
· Entrada Ventures, Playground Global, SK hynix
Series B
$110M
September 2023
Led by Temasek
· Playground Global, M12, Nautilus Venture Partners, Entrada Ventures, Industry Ventures, Ericsson Ventures, Marlan Holding, Mirae Asset, Cortes Capital, Archerman Capital, TGC Square, Lam Capital, Samsung Ventures
Series C
$275M
November 2025
Led by BullhoundCapital, Triatomic Capital, Temasek
· Qatar Investment Authority, EDBI, M12 (Microsoft's venture fund), Nautilus Venture Partners, Industry Ventures, Mirae Asset
Total raised:
$450M
Pricing
Not publicly available. Hardware sold through partnerships (e.g., Supermicro SquadRack configurations) with custom enterprise terms.
Notable customers
Not disclosed. A Supermicro partnership has been announced for commercial SquadRack availability, and the company reports revenue over the past two years, but no end customers have been named publicly.
Integrations
Arista, Broadcom, Supermicro, open-source frameworks (MLIR compiler stack), standard networking protocols
Tech stack
Alpine.js (JavaScript frameworks)
LazySizes (JavaScript libraries)
jQuery Migrate (JavaScript libraries)
jQuery (JavaScript libraries)
Open Graph
HTTP/3
WordPress (Blogs)
hCaptcha (Security)
reCAPTCHA (Security)
PHP (Programming languages)
Apple iCloud Mail (Webmail)
Microsoft 365 (Email)
Cloudflare (CDN)
MySQL (Databases)
Yoast SEO (SEO)
Azure (PaaS)
WP Engine (PaaS)
WPForms (WordPress plugins)
EWWW Image Optimizer (WordPress plugins)
Priority Hints (Performance)
Competitors
Groq
Also targets AI inference efficiency but uses an alternative tensor-streaming architecture; broader positioning with less emphasis on in-memory compute.
Graphcore
Develops IPU (Intelligence Processing Unit) accelerators for AI with a different architectural approach; a larger company with broader AI ambitions.
Cerebras
Builds wafer-scale AI chips optimized for training and inference; a different form factor and power profile than d-Matrix's modular approach.
Fractile
Competing inference accelerator vendor with different architectural optimizations.
NVIDIA
Dominant GPU market leader; d-Matrix aims to complement, not replace, NVIDIA, focusing on inference use cases where efficiency matters more than general-purpose compute flexibility.
Why this matters: d-Matrix has raised $450M, most recently at a $2B valuation, with backing from marquee investors (BullhoundCapital, Temasek, M12, Qatar Investment Authority), indicating strong confidence in specialized AI inference acceleration as a major market. The company's recent SquadRack announcement with industry leaders (Supermicro, Arista, Broadcom) suggests momentum toward industry standards and production deployments, positioning it as a credible challenger to GPU dominance in the rapidly growing AI inference market.
Best for: Cloud providers and enterprises running high-volume generative AI inference workloads where latency, power consumption, and cost-per-inference are critical constraints.
Use cases
Real-Time LLM Inference Serving
Data centers serving chatbot and generative AI APIs at production scale need sub-100ms latency for user-facing applications. d-Matrix claims Corsair delivers 10x faster token generation than GPU baselines, enabling interactive AI experiences without expensive GPU overprovisioning. This matters at hyperscale, where cost per inference directly impacts unit economics.
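A minimal latency-budget sketch under assumed numbers (the 10x decode speedup is the vendor's claim, not a measurement) shows why per-token latency dominates interactive response time:

```python
# Interactive LLM serving latency budget. All figures are assumptions.
TTFT_MS = 80.0            # time to first token, within a sub-100 ms target
PER_TOKEN_MS_GPU = 25.0   # assumed GPU decode latency per output token
PER_TOKEN_MS_FAST = 2.5   # assumed 10x faster decode (vendor claim)
RESPONSE_TOKENS = 200     # typical chatbot reply length (assumed)

for name, per_token_ms in [("GPU baseline", PER_TOKEN_MS_GPU),
                           ("10x faster decode", PER_TOKEN_MS_FAST)]:
    total_s = (TTFT_MS + RESPONSE_TOKENS * per_token_ms) / 1000
    print(f"{name:17s}: {total_s:4.2f} s for a {RESPONSE_TOKENS}-token reply")
```

Under these assumptions the full reply drops from roughly 5 s to under 1 s, which is the difference between a batch job and an interactive experience.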
Batched Inference at Scale
SquadRack's disaggregated rack-scale architecture is optimized for the batched inference workloads common in enterprise AI deployments. The claimed 3x better cost-performance and 3x higher energy efficiency reduce total cost of ownership versus traditional GPU clusters, making large-scale inference commercially viable for a broader set of use cases.
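A sketch of how a cost-performance multiple flows through to cost per token; every input below is a hypothetical placeholder, and the 3x throughput-at-equal-power factor is the company's claim, not a measurement:

```python
# Cost per million tokens under assumed hardware and energy prices.
# Every input is a hypothetical placeholder, not a quoted price or spec.
SECONDS_PER_YEAR = 365 * 24 * 3600

def usd_per_million_tokens(capex_usd, amort_years, tokens_per_sec,
                           watts, usd_per_kwh, utilization=0.6):
    tokens_per_year = tokens_per_sec * utilization * SECONDS_PER_YEAR
    capex_per_year = capex_usd / amort_years
    kwh_per_year = watts / 1000 * SECONDS_PER_YEAR / 3600
    return (capex_per_year + kwh_per_year * usd_per_kwh) / tokens_per_year * 1e6

gpu = usd_per_million_tokens(30_000, 4, 1_000, 700, 0.10)
# Claimed 3x throughput at equal power, i.e. 3x perf/W (vendor claim).
acc = usd_per_million_tokens(30_000, 4, 3_000, 700, 0.10)
print(f"GPU baseline:        ${gpu:.3f} per 1M tokens")
print(f"Claimed 3x scenario: ${acc:.3f} per 1M tokens")
```

At equal hardware price and power draw, a 3x throughput gain cuts cost per token roughly 3x; the actual ratio depends on real prices, utilization, and power, none of which are public.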
Energy-Constrained Data Centers
Power-limited data centers, or those in regions with high electricity costs, can cut infrastructure costs with d-Matrix's energy-efficient architecture. JetStream's custom I/O design reduces networking bottlenecks between accelerator nodes, maximizing throughput per watt.
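In a power-capped facility the binding constraint is tokens per joule rather than tokens per second. A small sketch under assumed node figures (the ~3x perf-per-watt factor is the vendor's claim):

```python
# Fleet throughput under a fixed facility power cap.
# Node power and throughput figures are assumptions, not published specs.
POWER_CAP_W = 500_000  # hypothetical power budget for one row of racks

nodes = {
    "GPU node (assumed)":         {"watts": 10_000, "tokens_per_sec": 5_000},
    "Accelerator node (assumed)": {"watts": 10_000, "tokens_per_sec": 15_000},
}

for name, spec in nodes.items():
    count = POWER_CAP_W // spec["watts"]     # nodes that fit under the cap
    fleet_tps = count * spec["tokens_per_sec"]
    print(f"{name}: {count} nodes -> {fleet_tps:,} tokens/s "
          f"({fleet_tps / POWER_CAP_W:.2f} tokens per joule)")
```

With the power budget fixed, fleet throughput scales directly with tokens per joule, so a perf-per-watt advantage translates one-for-one into capacity.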
Alternatives
NVIDIA GPU Clusters
General-purpose GPUs dominate but waste compute resources on inference workloads; choose GPUs if flexibility and training capability are priorities.
Custom Silicon (Google TPU, AWS Trainium)
Locked to specific cloud providers with proprietary software stacks; choose if you're already committed to that ecosystem.
CPU-Based Inference Engines
Lower cost but significantly slower latency; only viable for non-latency-sensitive batch processing.
FAQ
What does d-Matrix do?
d-Matrix designs specialized hardware accelerators for AI inference in data centers. The company's Corsair inference card, JetStream I/O accelerator, and SquadRack system are designed to serve generative AI models with lower latency and better energy efficiency than traditional GPU solutions. Its digital in-memory compute architecture is purpose-built for the inference phase of AI workloads.
How much does d-Matrix cost?
Pricing is not publicly disclosed. d-Matrix sells through partnerships with infrastructure providers like Supermicro and uses custom enterprise pricing. Contact the company directly for pricing on SquadRack deployments or Corsair cards.
What are alternatives to d-Matrix?
Alternatives include NVIDIA's GPU clusters (general-purpose but energy-intensive for inference), cloud vendor custom silicon like Google TPU and AWS Trainium (ecosystem-locked), Groq (alternative inference-focused accelerator), Graphcore IPUs, and Cerebras wafer-scale chips. Choice depends on cost sensitivity, latency requirements, and ecosystem preferences.
Who uses d-Matrix?
Target customers are cloud providers, hyperscalers, and enterprise data centers deploying production-scale generative AI inference. Specific end customers are not publicly disclosed, but partnerships with Supermicro, Arista, and Broadcom indicate an adoption path through infrastructure channels.
How does d-Matrix compare to NVIDIA?
d-Matrix specializes in inference efficiency while NVIDIA dominates general-purpose GPU computing. NVIDIA GPUs are more flexible, handling both training and inference, but spend compute and power inefficiently on inference workloads. d-Matrix aims to complement, not replace, NVIDIA's broader platform, claiming 3x better cost-performance and 10x faster token generation for inference-specific use cases.
Tags
AI inference
hardware accelerator
data center
semiconductors
energy efficiency
LLM serving
specialized silicon