Imply Data, Inc.

Imply helps enterprises analyze streaming data in real time at petabyte scale.
Series D · $215M total raised · Founded 2015 · Burlingame, California · 224 employees
Imply provides a real-time analytics platform built on Apache Druid, enabling organizations to analyze streaming and batch data with sub-second query response times. The platform serves enterprises that need interactive analytics at scale, processing millions of events per second across petabytes of data. Unlike traditional data warehouses that struggle with latency and cost at scale, Imply's analytics-in-motion approach delivers fast, scalable, cost-effective insights on event-driven data.
Problem solved
Organizations struggle with analytics solutions that are slow to query, expensive to scale, and late to reflect new data, which limits their ability to make real-time decisions on streaming data.
Target customer
Enterprise companies requiring real-time analytics on high-volume event data, including digital media, fintech, e-commerce, and ad-tech companies with 100B+ daily events.
Founders
Fangjin Yang
CEO & Co-Founder
Original Apache Druid author; previously senior engineering roles at Metamarkets and Cisco; BASc in Electrical Engineering and MASc in Computer Engineering from University of Waterloo.
Gian Merlino
CTO & Co-Founder
Original Apache Druid author and Apache Druid PMC Chair; previously led data ingestion at Metamarkets and held senior roles at Yahoo; B.S. in Computer Science from Caltech.
Vadim Ogievetsky
Co-Founder
Original Apache Druid author; met co-founders at Metamarkets where they developed Druid.
Funding history
Seed: $2M, October 2015, led by Khosla Ventures
Series B: $30M, December 2019, led by Andreessen Horowitz; other investors: Khosla Ventures, Geodesic Ventures
Series C: amount unknown, June 2021, led by Bessemer Venture Partners
Series D: $100M, May 2022, led by Thoma Bravo; other investors: OMERS, Bessemer Venture Partners, Andreessen Horowitz, Khosla Ventures
Total raised: $215M
Pricing
Not publicly available; contact Imply sales for pricing on the Imply Polaris (SaaS) and Imply Lumi offerings.
Notable customers
Netflix, Salesforce, Yahoo, Paytm, Roblox, PepsiCo, Charter Communications, Cisco, Twitter, Citrix, Atlassian, Confluent
Integrations
Apache Kafka, Apache Druid ecosystem, cloud platforms (multi-cloud support across North America, Europe, Asia Pacific)
Competitors
Snowflake
General-purpose cloud data warehouse; slower at interactive queries and slice-and-dice analytics at scale compared to Druid's specialized architecture.
Apache Pinot
Open-source competitor with commercial offering through StarTree; Druid differentiates with superior stream integration and lower latency guarantees.
Rockset
Closed-source cloud analytics; Imply differentiates with open-source foundation and superior cost-efficiency at petabyte scale.
Confluent
Event streaming platform; Imply adds the real-time analytics layer for querying and visualizing event data.
DataStax
Database platform; Imply specializes specifically in real-time analytics workloads with sub-second latency.
Why this matters: Imply is a unicorn ($1.1B valuation) that created an entirely new analytics category around 'analytics-in-motion' for streaming data. With founders who created Apache Druid and backing from top-tier VCs (Andreessen Horowitz, Bessemer, Khosla), the company is proving that specialized real-time analytics databases can outperform general-purpose data warehouses at massive scale while reducing costs, which is critical for enterprises drowning in event data.
Best for: Enterprise organizations that need to query and visualize streaming data at massive scale with millisecond latency, such as digital media platforms, fintech, and ad-tech companies.
Use cases
Real-time Video Quality Monitoring
Netflix ingests 2M events/second to monitor playback quality across millions of concurrent viewers. Imply queries 1.5 trillion rows in milliseconds, enabling immediate detection and resolution of quality issues that would otherwise impact viewer experience.
Cost Optimization at Petabyte Scale
Salesforce reduced storage by 47% and accelerated queries by 30% while ingesting 5B events daily. The platform's segmentation and compression enable storage-efficient analytics on massive datasets that would be prohibitively expensive in traditional data warehouses.
Infrastructure Cost Reduction
Paytm cut infrastructure costs by 50% and improved analytics performance 10x, freeing 12 engineering hours per week, by switching to Imply for real-time behavioral analytics.
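The use cases above quote throughput in mixed units: events per second for Netflix, events per day for Salesforce. As a rough sketch, the conversion below puts both on the same scale. It assumes each figure is a sustained average over a full day, which the source does not state, and the function names are illustrative only.

```python
# Convert between per-second and per-day event rates.
# Assumption: the quoted figures are sustained averages, not peaks.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second_to_per_day(events_per_second):
    """Scale a per-second rate up to a full day."""
    return events_per_second * SECONDS_PER_DAY


def per_day_to_per_second(events_per_day):
    """Scale a daily total down to an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


# Netflix figure from the text: 2M events/second.
netflix_per_day = per_second_to_per_day(2_000_000)

# Salesforce figure from the text: 5B events/day.
salesforce_per_second = per_day_to_per_second(5_000_000_000)

print(f"Netflix:    {netflix_per_day:,} events/day")
print(f"Salesforce: {salesforce_per_second:,.0f} events/second (average)")
```

Under that assumption, 2M events/second works out to 172.8 billion events per day, above the 100B+ daily threshold given under Target customer, while 5B events/day averages roughly 58,000 events per second.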
Alternatives
Snowflake
Best for batch analytics and reporting; significantly slower for interactive real-time queries at scale.
Apache Pinot with StarTree
Open-source alternative; Imply provides more robust enterprise features, superior stream integration, and commercial support.
ClickHouse
Open-source columnar database that is more general-purpose; Druid is specialized for event analytics, with streaming ingestion and sub-second query latency.
FAQ
What does Imply do?
Imply provides a real-time analytics platform built on Apache Druid that enables organizations to ingest and query streaming and batch data with sub-second response times. It's designed for analytics-in-motion—interactive, scalable, real-time analytics on high-volume event data. Founded by the original creators of Apache Druid, Imply packages the open-source database with enterprise features, cloud hosting (Polaris), and vertical AI analytics (Lumi).
How much does Imply cost?
Pricing is not publicly available. Imply offers multiple products: Imply Polaris (enterprise SaaS) and Imply Lumi (vertical AI analytics). Contact Imply sales for custom pricing based on data volume, query volume, and feature requirements.
What are alternatives to Imply?
Snowflake (general-purpose data warehouse, slower at real-time queries), Apache Pinot with StarTree (open-source alternative with commercial support), ClickHouse (open-source columnar DB, less specialized for streaming), Rockset (closed-source cloud analytics), and Aerospike (specialized database).
Who uses Imply?
Enterprise customers across digital media, fintech, e-commerce, and ad-tech with massive event volumes. Notable public customers include Netflix (2M events/sec), Salesforce (5B events/day), Yahoo (100B events/day), Paytm, Roblox, PepsiCo, Charter Communications, and Cisco.
How does Imply compare to Snowflake?
Snowflake excels at batch analytics and reporting but struggles with interactive, low-latency queries at scale. Imply is purpose-built for real-time analytics on streaming data, using specialized segmentation and compression to deliver sub-second query times and storage savings (Salesforce reduced storage by 47%). Snowflake is cheaper for batch workloads; Imply is faster and more cost-effective for streaming analytics.
Does Imply require SQL knowledge?
Imply supports SQL-based querying through its SQL interface, so analysts who already know SQL can use it directly. The platform originated PlyQL, a SQL-like query language, and fully supports standard SQL for data exploration and visualization.
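To make the SQL interface concrete, here is a minimal sketch of preparing a query for Apache Druid's HTTP SQL endpoint (POST to /druid/v2/sql with a JSON body containing a "query" field). The host and port, the table name playback_events, and the helper name are assumptions for illustration. Imply's hosted products may expose different endpoints and authentication, so treat this as a plain Apache Druid example rather than Imply's documented API.

```python
import json
import urllib.request


def build_druid_sql_request(base_url, sql):
    """Build (but do not send) a POST request for Druid's SQL endpoint."""
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical table and columns: count events per minute over the last hour.
SQL = """
SELECT TIME_FLOOR(__time, 'PT1M') AS "minute", COUNT(*) AS events
FROM playback_events
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR
GROUP BY 1
ORDER BY 1
"""

# Assumed local Druid router address; replace with a real deployment URL.
request = build_druid_sql_request("http://localhost:8888", SQL)

# Sending is left commented out so the sketch runs without a cluster:
# with urllib.request.urlopen(request) as response:
#     rows = json.loads(response.read())
```

Building the request separately from sending it keeps the example runnable offline and makes the payload easy to inspect before pointing it at a real cluster.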
Tags
real-time analytics, streaming data, Apache Druid, event analytics, petabyte scale, data platform, sub-second latency