Celebrating two years of Omni 🥳

The CDN for data

Bringing web best practices from Content Delivery Networks to business intelligence

[Diagram: How CDNs work. Image credit: Cloudflare]

The web is built on a series of cascading caches that optimize for speed by storing information close to the user, culminating in edge CDNs (Content Delivery Networks). This is how the latest TikTok videos load instantly. The same technique applies to data. Omni's new data platform combines the performance of extract-based tools with processing in the user's browser and intelligent caching in layers across the data warehouse, the application, and edge workers. In effect, the platform operates as a virtual database, leveraging the relational algebra defined by a semantic model. This is part of the opportunity in the data ecosystem and why we started Omni, the newest BI and data platform.

Business intelligence tools and data platforms face new demands driven by modern use cases: building data applications with composable analytics, as Gartner writes about. Omni is designed to take advantage of the latest technology and infrastructure with a deep understanding of these use cases. We are building on the shoulders of giants, deeply integrating with modern infrastructure and applying tried-and-true best practices from the data industry and from data-intensive applications.

We all want our data now. 🍴

The rise of cloud-native data warehouses and a wide range of new data applications has created new demand for data infrastructure. Data feedback loops, as Tomasz Tunguz wrote, are pushing the data warehouse to become the new backend. The industry now uses Snowflake and BigQuery to store large volumes of data, but struggles to balance cost and performance.

But data warehouses weren't originally architected for these new use cases. Their engines are designed for processing massive data volumes, not necessarily for the concurrency or latency constraints of modern data applications. Unistore, announced at the recent Snowflake Summit, highlights one way to address OLTP use cases alongside Snowflake's OLAP focus. From the other direction, Google launched AlloyDB, dramatically speeding up analytical workloads on top of their enterprise-grade PostgreSQL transactional database.

Infrastructure is meeting in the middle.

As a BI platform, Omni can do more. Omni sits next to the user and deeply understands their use case. It drives performance optimization in layers and takes advantage of the custom functionality that Snowflake and BigQuery provide (and others in the future). We also natively support classic BI query optimization: aggregate awareness, or leveraging pre-calculated aggregates to speed up query performance. This is, in essence, a cache inside the data warehouse. (Side note: aggregate awareness has conceptually been around for decades, but it can now be applied in a modern way that is simple and powerful.) Generating and managing that aggregate is easy in Omni. For complex or mission-critical workloads, customers can bring that definition into dbt Cloud, Airbyte, or Airflow to manage the ETL process while still getting the benefit of transparent query optimization.
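The core idea of aggregate awareness can be sketched in a few lines: when a query arrives, the engine checks whether a precomputed rollup covers the requested dimensions and measures, and routes to it if so. This is a minimal illustration with hypothetical table and column names, not Omni's actual implementation or API.

```python
# Hypothetical catalog: rollup name -> (dimensions grouped by, measures stored).
AGGREGATE_TABLES = {
    "daily_product_rollup": ({"day", "product"}, {"revenue", "count"}),
}

RAW_TABLE = "transactions"

def choose_table(dimensions: set, measures: set) -> str:
    """Pick the cheapest table that can answer the query exactly.

    A rollup is usable when the query's group-by dimensions are a subset
    of the rollup's dimensions (sums and counts can be re-aggregated
    upward) and every requested measure is present in the rollup.
    """
    for name, (dims, meas) in AGGREGATE_TABLES.items():
        if dimensions <= dims and measures <= meas:
            return name
    return RAW_TABLE

print(choose_table({"day"}, {"revenue"}))       # -> daily_product_rollup
print(choose_table({"customer"}, {"revenue"}))  # -> transactions
```

The user writes the same query either way; the routing is transparent, which is what makes the warehouse-side aggregate behave like a cache.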

Building the equivalent of a data CDN to optimize for analysis speed

When a user browses TikTok's popular videos, they hit a cache that dramatically speeds up end-to-end performance. A hierarchy of caches works to bring content closer to the user. Media companies have long leveraged Content Delivery Networks (CDNs) to cache content, speeding up delivery, lowering latency, and reducing network traffic.

Omni’s architecture behaves similarly. It leverages precomputed aggregates in the data warehouse, and uses relational algebra powered by a semantic model to perform intelligent optimization as close as possible to the data user. These aggregations can be accelerated in Omni’s in-memory, re-queryable cache, at the edge, and in the browser client.
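The layered behavior described above can be sketched as a cascading lookup: each layer is consulted in order of proximity to the user, a hit at any layer short-circuits the rest, and a miss falls through to the warehouse and populates the layers on the way back. The layer names here are illustrative assumptions, not Omni's real architecture terms.

```python
from collections import OrderedDict

class LayeredCache:
    """A toy cascade of caches, closest to the user first."""

    def __init__(self):
        self.layers = OrderedDict(browser={}, edge={}, application={})

    def get(self, key, warehouse_fetch):
        # Check each layer in order of proximity to the user.
        for name, store in self.layers.items():
            if key in store:
                return store[key], name  # served from a cache layer
        # Miss everywhere: go to the source of truth.
        value = warehouse_fetch(key)
        for store in self.layers.values():
            store[key] = value           # populate layers for next time
        return value, "warehouse"

cache = LayeredCache()
value, layer = cache.get("daily_revenue", lambda k: 42)
print(layer)  # warehouse (first request pays the full round trip)
value, layer = cache.get("daily_revenue", lambda k: 42)
print(layer)  # browser (subsequent requests are answered locally)
```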

[Diagram: Omni semantic model]

Classic aggregate awareness lets the BI platform optimize queries. Imagine a table that logs all transactions for a number of products, and a dashboard showing a time series of transaction counts and total revenue. A precomputed aggregate table [day, product, revenue, count] is faster and less resource-intensive to query than re-scanning the raw transaction table on every dashboard load.
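Building that aggregate is just a group-by over the raw log. A minimal sketch, with invented sample rows, of collapsing transactions into the [day, product, revenue, count] shape:

```python
from collections import defaultdict

# Invented sample of the raw transaction log.
transactions = [
    {"day": "2022-07-01", "product": "widget", "revenue": 10.0},
    {"day": "2022-07-01", "product": "widget", "revenue": 15.0},
    {"day": "2022-07-01", "product": "gadget", "revenue": 7.5},
]

# Group by (day, product), summing revenue and counting rows.
rollup = defaultdict(lambda: {"revenue": 0.0, "count": 0})
for t in transactions:
    row = rollup[(t["day"], t["product"])]
    row["revenue"] += t["revenue"]
    row["count"] += 1

# Three raw rows collapse into two aggregate rows; a dashboard summing
# revenue per day now scans the small rollup instead of the raw log.
print(dict(rollup))
```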

In Omni’s virtualized, re-queryable cache, that aggregate table can be loaded at the application layer and asynchronously pushed into the user’s browser. If the user wants to cross-filter and select a specific product or group of products, Omni can answer instantly from the aggregate table in the browser, without going back to the application server or data warehouse for the raw data. Omni is modernizing the OLAP cube and bringing it to the edge and the client.
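Answering a cross-filter from the cached aggregate is then a local re-aggregation, with no network round trip. A sketch, assuming the same [day, product, revenue, count] shape and invented data:

```python
# Invented aggregate rows already cached in the client.
aggregate = [
    {"day": "2022-07-01", "product": "widget", "revenue": 25.0, "count": 2},
    {"day": "2022-07-01", "product": "gadget", "revenue": 7.5, "count": 1},
    {"day": "2022-07-02", "product": "widget", "revenue": 12.0, "count": 1},
]

def cross_filter(rows, products):
    """Re-aggregate the cached rollup for a product selection, per day."""
    out = {}
    for r in rows:
        if r["product"] in products:
            day = out.setdefault(r["day"], {"revenue": 0.0, "count": 0})
            day["revenue"] += r["revenue"]
            day["count"] += r["count"]
    return out

print(cross_filter(aggregate, {"widget"}))
# {'2022-07-01': {'revenue': 25.0, 'count': 2},
#  '2022-07-02': {'revenue': 12.0, 'count': 1}}
```

Because the rollup already groups by product, any product selection can be answered exactly from the cache; only a filter on a dimension the rollup lacks would force a trip back to the raw data.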

This new architecture allows Omni’s BI platform to outperform extract-based desktop applications, on top of live, dynamic aggregates, while giving customers the full granularity, scale, and freshness of cloud-native data warehouses. BI shouldn’t force you to choose between extracts and a live connection.

Our mission is to help our customers create value from data, and performance is core to that. It has been repeatedly shown that even marginal friction dramatically lowers the likelihood of engagement. Omni’s intelligent data CDN and edge cubes optimize your data infrastructure.

Omni is the fastest BI and data platform without trading off scale for performance.

If this sounds like an interesting problem to solve, we are hiring for all engineering roles. Join us to build the business intelligence platform that combines the consistency of a shared data model with the freedom of SQL.