d-Matrix
d-Matrix accelerates AI inference in data centers with energy-efficient specialized hardware.
d-Matrix develops specialized AI inference accelerator hardware for data centers, featuring the Corsair inference accelerator card, the JetStream I/O networking accelerator, and the SquadRack rack-scale system. Its digital in-memory compute (DIMC) architecture is claimed to deliver 3x better cost-performance and 10x faster token generation than traditional GPU-based inference solutions. d-Matrix targets the AI inference market, where most compute costs occur after model training, by optimizing for real-time latency, energy efficiency, and cost at scale.
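For intuition on the in-memory compute pitch: autoregressive token generation is usually memory-bandwidth bound, since every output token must stream the full set of model weights past the compute units. Moving compute into the memory arrays raises effective weight bandwidth, which is the lever this back-of-envelope roofline sketch illustrates (all model and bandwidth figures are illustrative assumptions, not d-Matrix or GPU specifications):

```python
# Back-of-envelope roofline for autoregressive LLM decoding.
# All bandwidth and model figures below are illustrative assumptions,
# not d-Matrix or GPU specifications.

def max_tokens_per_sec(params_billions: float, bytes_per_param: float,
                       weight_bandwidth_tb_s: float) -> float:
    """Decode is bandwidth-bound: each token reads every weight once,
    so tokens/sec <= effective weight bandwidth / model size in bytes."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return weight_bandwidth_tb_s * 1e12 / model_bytes

# Hypothetical 70B-parameter model stored in 8-bit weights.
hbm_bound = max_tokens_per_sec(70, 1.0, 3.35)   # HBM-class off-chip bandwidth
dimc_bound = max_tokens_per_sec(70, 1.0, 30.0)  # assumed higher on-die bandwidth
print(f"Off-chip-HBM ceiling:   {hbm_bound:6.1f} tokens/s per model replica")
print(f"In-memory-compute case: {dimc_bound:6.1f} tokens/s per model replica")
```

In this toy model the tokens-per-second ceiling scales linearly with effective weight bandwidth; treat it as a mental model for the architecture's claim, not a benchmark.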
Problem solved
Data centers face prohibitive costs and latency when serving real-time AI inference at scale, requiring more efficient architectures than general-purpose GPU solutions.
Target customer
Cloud infrastructure providers, hyperscalers, and data center operators deploying large-scale generative AI inference workloads at production scale.
Founders
Sid Sheth
Founder & CEO
Former SVP & GM of the Networking Business Unit at Inphi (grew it to $1B+), with 20+ years in semiconductors and networking at Intel, NetLogic Microsystems, and Aeluros; M.S.E.E. from Purdue University.
Sudeep Bhoja
Founder & CTO
Extensive engineering background in high-speed data interconnects and optical networking; pioneered digital signal processing technologies for data centers at Inphi.
Funding history
Series A
$44M
April 2022
Led by Nautilus Venture Partners
· Entrada Ventures, Playground Global, SK hynix
Series B
$110M
September 2023
Led by Temasek
· Playground Global, M12, Nautilus Venture Partners, Entrada Ventures, Industry Ventures, Ericsson Ventures, Marlan Holding, Mirae Asset, Cortes Capital, Archerman Capital, TGC Square, Lam Capital, Samsung Ventures
Series C
$275M
November 2025
Led by BullhoundCapital, Triatomic Capital, Temasek
· Qatar Investment Authority, EDBI, M12 (Microsoft's venture fund), Nautilus Venture Partners, Industry Ventures, Mirae Asset
Total raised:
$450M
Pricing
Not publicly available. Hardware sold through partnerships (e.g., Supermicro SquadRack configurations) with custom enterprise terms.
Notable customers
Not disclosed. A Supermicro partnership has been announced for commercial SquadRack availability, and the company reports revenue over the past two years, but no end customers have been named publicly.
Integrations
Arista, Broadcom, Supermicro, open-source frameworks (MLIR compiler stack), standard networking protocols
Tech stack
Alpine.js (JavaScript frameworks)
LazySizes (JavaScript libraries)
jQuery Migrate (JavaScript libraries)
jQuery (JavaScript libraries)
Open Graph
HTTP/3
WordPress (Blogs)
hCaptcha (Security)
reCAPTCHA (Security)
PHP (Programming languages)
Apple iCloud Mail (Webmail)
Microsoft 365 (Email)
Cloudflare (CDN)
MySQL (Databases)
Yoast SEO (SEO)
Azure (PaaS)
WP Engine (PaaS)
WPForms (WordPress plugins)
EWWW Image Optimizer (WordPress plugins)
Priority Hints (Performance)
Competitors
Groq
Also targets AI inference efficiency but uses an alternative tensor-streaming architecture; broader positioning with less emphasis on in-memory compute.
Graphcore
Develops IPU (Intelligence Processing Unit) accelerators for AI with a different architectural approach; a larger company with broader AI ambitions.
Cerebras
Builds wafer-scale AI chips optimized for training and inference; a different form factor and power profile than d-Matrix's modular approach.
Fractile
Competing inference accelerator vendor with different architectural optimizations.
NVIDIA
Dominant GPU market leader; d-Matrix aims to complement, not replace, NVIDIA, focusing on inference use cases where efficiency matters more than general-purpose compute flexibility.
Why this matters: d-Matrix has raised $450M, most recently at a $2B valuation, with backing from marquee investors (BullhoundCapital, Temasek, M12, Qatar Investment Authority), indicating strong confidence in specialized AI inference acceleration as a major market. The company's recent SquadRack announcement with industry leaders (Supermicro, Arista, Broadcom) suggests momentum toward industry standards and production deployments, positioning it as a credible challenger to GPU dominance in the rapidly growing AI inference market.
Best for: Cloud providers and enterprises running high-volume generative AI inference workloads where latency, power consumption, and cost-per-inference are critical constraints.
Use cases
Real-Time LLM Inference Serving
Data centers serving chatbot and generative AI APIs at production scale need sub-100ms latency for user-facing applications. d-Matrix claims Corsair delivers 10x faster token generation than GPU baselines, enabling interactive AI experiences without expensive GPU overprovisioning. This matters at hyperscale, where cost per inference directly impacts unit economics.
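A minimal latency-budget sketch under assumed numbers (the 10x decode speedup is the vendor's claim, not a measurement) shows why per-token latency dominates interactive response time:

```python
# Interactive LLM serving latency budget. All figures are assumptions.
TTFT_MS = 80.0            # time to first token, within a sub-100 ms target
PER_TOKEN_MS_GPU = 25.0   # assumed GPU decode latency per output token
PER_TOKEN_MS_FAST = 2.5   # assumed 10x faster decode (vendor claim)
RESPONSE_TOKENS = 200     # typical chatbot reply length (assumed)

for name, per_token_ms in [("GPU baseline", PER_TOKEN_MS_GPU),
                           ("10x faster decode", PER_TOKEN_MS_FAST)]:
    total_s = (TTFT_MS + RESPONSE_TOKENS * per_token_ms) / 1000
    print(f"{name:17s}: {total_s:4.2f} s for a {RESPONSE_TOKENS}-token reply")
```

Under these assumptions the full reply drops from roughly 5 s to under 1 s, which is the difference between a batch job and an interactive experience.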
Batched Inference at Scale
SquadRack's disaggregated rack-scale architecture is optimized for the batched inference workloads common in enterprise AI deployments. The claimed 3x better cost-performance and 3x higher energy efficiency reduce total cost of ownership versus traditional GPU clusters, making large-scale inference commercially viable for a broader set of use cases.
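A sketch of how a cost-performance multiple flows through to cost per token; every input below is a hypothetical placeholder, and the 3x throughput-at-equal-power factor is the company's claim, not a measurement:

```python
# Cost per million tokens under assumed hardware and energy prices.
# Every input is a hypothetical placeholder, not a quoted price or spec.
SECONDS_PER_YEAR = 365 * 24 * 3600

def usd_per_million_tokens(capex_usd, amort_years, tokens_per_sec,
                           watts, usd_per_kwh, utilization=0.6):
    tokens_per_year = tokens_per_sec * utilization * SECONDS_PER_YEAR
    capex_per_year = capex_usd / amort_years
    kwh_per_year = watts / 1000 * SECONDS_PER_YEAR / 3600
    return (capex_per_year + kwh_per_year * usd_per_kwh) / tokens_per_year * 1e6

gpu = usd_per_million_tokens(30_000, 4, 1_000, 700, 0.10)
# Claimed 3x throughput at equal power, i.e. 3x perf/W (vendor claim).
acc = usd_per_million_tokens(30_000, 4, 3_000, 700, 0.10)
print(f"GPU baseline:        ${gpu:.3f} per 1M tokens")
print(f"Claimed 3x scenario: ${acc:.3f} per 1M tokens")
```

At equal hardware price and power draw, a 3x throughput gain cuts cost per token roughly 3x; the actual ratio depends on real prices, utilization, and power, none of which are public.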
Energy-Constrained Data Centers
Power-limited data centers, or those in regions with high electricity costs, can cut infrastructure costs with d-Matrix's energy-efficient architecture. JetStream's custom I/O design reduces networking bottlenecks between accelerator nodes, maximizing throughput per watt.
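In a power-capped facility the binding constraint is tokens per joule rather than tokens per second. A small sketch under assumed node figures (the ~3x perf-per-watt factor is the vendor's claim):

```python
# Fleet throughput under a fixed facility power cap.
# Node power and throughput figures are assumptions, not published specs.
POWER_CAP_W = 500_000  # hypothetical power budget for one row of racks

nodes = {
    "GPU node (assumed)":         {"watts": 10_000, "tokens_per_sec": 5_000},
    "Accelerator node (assumed)": {"watts": 10_000, "tokens_per_sec": 15_000},
}

for name, spec in nodes.items():
    count = POWER_CAP_W // spec["watts"]     # nodes that fit under the cap
    fleet_tps = count * spec["tokens_per_sec"]
    print(f"{name}: {count} nodes -> {fleet_tps:,} tokens/s "
          f"({fleet_tps / POWER_CAP_W:.2f} tokens per joule)")
```

With the power budget fixed, fleet throughput scales directly with tokens per joule, so a perf-per-watt advantage translates one-for-one into capacity.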
Alternatives
NVIDIA GPU Clusters
General-purpose GPUs dominate but waste compute resources on inference workloads; choose GPUs if flexibility and training capability are priorities.
Custom Silicon (Google TPU, AWS Trainium)
Locked to specific cloud providers with proprietary software stacks; choose if you're already committed to that ecosystem.
CPU-Based Inference Engines
Lower cost but significantly slower latency; only viable for non-latency-sensitive batch processing.
FAQ
What does d-Matrix do?
d-Matrix designs specialized hardware accelerators for AI inference in data centers. The company's Corsair inference card, JetStream I/O accelerator, and SquadRack system are designed to serve generative AI models with lower latency and better energy efficiency than traditional GPU solutions. Its digital in-memory compute architecture is purpose-built for the inference phase of AI workloads.
How much does d-Matrix cost?
Pricing is not publicly disclosed. d-Matrix sells through partnerships with infrastructure providers like Supermicro and uses custom enterprise pricing. Contact the company directly for pricing on SquadRack deployments or Corsair cards.
What are alternatives to d-Matrix?
Alternatives include NVIDIA's GPU clusters (general-purpose but energy-intensive for inference), cloud vendor custom silicon like Google TPU and AWS Trainium (ecosystem-locked), Groq (alternative inference-focused accelerator), Graphcore IPUs, and Cerebras wafer-scale chips. Choice depends on cost sensitivity, latency requirements, and ecosystem preferences.
Who uses d-Matrix?
Target customers are cloud providers, hyperscalers, and enterprise data centers deploying production-scale generative AI inference. Specific end customers are not publicly disclosed, but partnerships with Supermicro, Arista, and Broadcom indicate an adoption path through infrastructure channels.
How does d-Matrix compare to NVIDIA?
d-Matrix specializes in inference efficiency while NVIDIA dominates general-purpose GPU computing. NVIDIA GPUs are more flexible, handling both training and inference, but spend compute and power inefficiently on inference workloads. d-Matrix aims to complement, not replace, NVIDIA's broader platform, claiming 3x better cost-performance and 10x faster token generation for inference-specific use cases.
Tags
AI inference
hardware accelerator
data center
semiconductors
energy efficiency
LLM serving
specialized silicon