Lessons From Hard Problems | Data Intelligence Series | October 2, 2025

What Netflix Knows About Your Customers That Your BI Team Doesn't

Netflix turns 700 billion events per day into Gold tables that power recommendation AI. Most enterprise teams never make it past Silver. That is the Gold table gap.

The difference between a company that has data and a company that uses data is always the Gold layer. Raw and clean are necessary but not sufficient. Purpose-built, business-question-focused tables are where AI actually lives.

Netflix knows that you paused a documentary at the 23-minute mark on a Tuesday evening, rewound to the 19-minute mark, watched for four more minutes, and then switched to a comedy series. It knows that this behavioral pattern, combined with 47 other data signals from your viewing history, is a strong predictor that you will watch a specific type of content on Friday night. It used this knowledge to select the thumbnail it showed you last Friday, and you clicked it.

This is not magic. It is the Gold layer. And the gap between what Netflix does with behavioral data and what most enterprise analytics teams do with their equivalent data — customer interaction logs, transaction histories, product usage events — is not a model gap.

It is a data architecture gap.

Netflix's Data Architecture: The Medallion Model at 700 Billion Events Per Day

Every interaction a Netflix user has with the platform generates an event: a play, a pause, a search, a scroll, a rating, a browse. These raw events are Bronze-layer data — timestamped records of what happened, with no interpretation, no aggregation, and no business context.

Netflix ingests approximately 700 billion of these events every day.

The Silver layer transformation converts raw events into user-session-level data: structured records that represent coherent viewing sessions rather than individual interactions, with engineered features like session duration, content completion rate, platform used, and time-of-day context. This transformation requires significant data engineering: session boundary detection, event deduplication, anomaly filtering, and feature engineering that encodes domain knowledge about what makes a viewing session meaningful.
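
As a rough illustration of that Bronze-to-Silver step, the sketch below deduplicates raw events, detects session boundaries, and derives a few session-level features. It is not Netflix's implementation: the `Event` fields, the 30-minute inactivity gap, and the output columns are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: 30 minutes of inactivity closes a session (illustrative threshold).
SESSION_GAP = timedelta(minutes=30)


@dataclass(frozen=True)
class Event:
    """A hypothetical Bronze-layer event: what happened, with no interpretation."""
    event_id: str
    user_id: str
    event_type: str
    ts: datetime


def _summarize(evts):
    """Collapse one session's events into a single Silver record."""
    start, end = evts[0].ts, evts[-1].ts
    return {
        "user_id": evts[0].user_id,
        "session_start": start,
        "duration_s": (end - start).total_seconds(),
        "event_count": len(evts),
        "hour_of_day": start.hour,
    }


def build_sessions(events):
    """Turn raw events into session-level records."""
    # Event deduplication: keep the first occurrence of each event_id.
    seen, unique = set(), []
    for e in events:
        if e.event_id not in seen:
            seen.add(e.event_id)
            unique.append(e)
    unique.sort(key=lambda e: (e.user_id, e.ts))

    # Session boundary detection: a new user or a long gap starts a new session.
    sessions, current = [], []
    for e in unique:
        boundary = current and (
            e.user_id != current[-1].user_id
            or e.ts - current[-1].ts > SESSION_GAP
        )
        if boundary:
            sessions.append(_summarize(current))
            current = []
        current.append(e)
    if current:
        sessions.append(_summarize(current))
    return sessions
```

At real scale this logic would run in a distributed engine rather than a Python loop, but the shape of the work is the same: the domain knowledge lives in the gap threshold and in the features the summary chooses to keep.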

The Gold layer is where the distance from most enterprise analytics becomes stark. Netflix builds purpose-specific Gold tables for each AI application: a personalization features table for the recommendation engine, a churn signals table for retention modeling, a content performance table for acquisition decisions, an A/B test outcome table for product experimentation. Each Gold table is owned by a specific team, has explicit SLAs for freshness and quality, and is designed around the specific data contract the consuming model requires.
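
To make "purpose-specific" concrete, here is a minimal sketch of one such table: a churn signals table at one row per user per snapshot date, built from Silver session records. The input keys (`user_id`, `session_start`, `completion_rate`), the 7-day window, and the output columns are hypothetical and are not drawn from Netflix's actual schemas.

```python
from collections import defaultdict
from datetime import timedelta


def build_churn_signals(sessions, as_of):
    """Build a Gold table for a hypothetical churn model.

    Grain: one row per user per `as_of` snapshot.
    `sessions` is a list of Silver records (dicts) with the assumed keys
    user_id, session_start (datetime), and completion_rate (0.0 to 1.0).
    """
    by_user = defaultdict(list)
    for s in sessions:
        # Ignore sessions after the snapshot so the table never leaks the future.
        if s["session_start"] <= as_of:
            by_user[s["user_id"]].append(s)

    week_ago = as_of - timedelta(days=7)
    rows = []
    for user_id, user_sessions in sorted(by_user.items()):
        recent = [s for s in user_sessions if s["session_start"] > week_ago]
        last_seen = max(s["session_start"] for s in user_sessions)
        rows.append({
            "user_id": user_id,
            "as_of": as_of,
            "sessions_7d": len(recent),
            "avg_completion_7d": (
                sum(s["completion_rate"] for s in recent) / len(recent)
                if recent else None
            ),
            "days_since_last_session": (as_of - last_seen).days,
        })
    return rows
```

Every column here exists because the retention model needs it. A personalization table built from the same sessions would have a different grain and different features, which is why the two should not share a table.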

What Most Enterprise BI Teams Build Instead

  • A single 'master customer table' that tries to serve all analytics use cases simultaneously and therefore serves none of them optimally
  • Dashboards that query Silver-level data directly, performing ad-hoc aggregations that are inconsistent across reports and never validated as training inputs for AI
  • Data marts that duplicate Silver tables with minor transformations, adding storage cost without adding the semantic enrichment that Gold tables require
  • 'Data lakes' that are, in practice, Bronze layer repositories with a discovery interface — data that is technically accessible but practically unusable without significant additional transformation

Identifying Your Organization's Gold Table Gap

The Gold table gap is the space between the data your organization produces and the data your AI applications actually need. It exists in every organization that has not explicitly designed its data pipeline around the consumption requirements of its AI models.

Identifying the gap requires working backward from the AI application to the data requirements. What question is the AI answering? What data does it need to answer that question accurately? At what grain? With what features? With what freshness? Comparing that data specification to what currently exists in your pipeline reveals the gap. In most organizations, the gap is not that the raw data doesn't exist. It is that the raw data has never been transformed into the form the AI needs.
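
That comparison can be written down quite literally. The sketch below describes both the table the AI application needs and the table that exists today with the same small structure, then lists the differences. `TableSpec`, `find_gap`, and the example table and field names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class TableSpec:
    """Describes either a required Gold table or an existing table."""
    name: str
    grain: str            # e.g. "user_id, snapshot_date"
    fields: frozenset     # column names
    freshness: timedelta  # required, or currently achieved, refresh interval


def find_gap(required, existing):
    """List what separates the existing table from what the model needs."""
    gaps = []
    missing = sorted(required.fields - existing.fields)
    if missing:
        gaps.append("missing fields: " + ", ".join(missing))
    if required.grain != existing.grain:
        gaps.append(f"grain is '{existing.grain}', need '{required.grain}'")
    if existing.freshness > required.freshness:
        gaps.append(
            f"refreshed every {existing.freshness}, need {required.freshness}"
        )
    return gaps


# Hypothetical example: what a churn model needs vs. a master customer table.
needed = TableSpec(
    name="churn_signals",
    grain="user_id, snapshot_date",
    fields=frozenset({"user_id", "snapshot_date", "sessions_7d", "avg_completion_7d"}),
    freshness=timedelta(hours=1),
)
current = TableSpec(
    name="master_customer",
    grain="user_id",
    fields=frozenset({"user_id", "signup_date", "plan"}),
    freshness=timedelta(days=1),
)

for gap in find_gap(needed, current):
    print(gap)
```

Each line the function returns is a candidate work item. In this made-up case the raw data may well exist upstream; what is missing is the transformation that puts it at the right grain, with the right features, on the right schedule.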

Building Gold Tables: The Five Design Principles

  • Single purpose — each Gold table is designed to answer a specific business question for a specific consumer. A Gold table that tries to serve multiple AI applications is a Silver table with a Gold name.
  • Explicit data contract — the schema, grain, freshness, and quality standards of every Gold table are documented and governed as a formal contract between the data producer and the AI consumer
  • Feature engineering at the layer, not at the model — all feature transformations are implemented in the Gold table, not in the model training code, ensuring that training and inference use identical feature logic
  • Full lineage — every Gold table documents the Silver tables it was derived from, enabling rapid debugging when model outputs degrade
  • SLA-backed freshness — Gold tables have explicit, monitored freshness guarantees because AI models degrade when they receive stale features
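
Several of these principles can be expressed as a contract object that is checked on every refresh. The following is a minimal sketch under stated assumptions: `GoldContract`, `check_contract`, and the table, consumer, and column names are hypothetical, and a production setup would more likely rely on a data quality or catalog tool than on hand-written checks.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class GoldContract:
    """A hypothetical data contract between a producer and one AI consumer."""
    table: str
    consumer: str              # single purpose: exactly one consuming application
    grain: tuple               # key columns that must be unique per row
    schema: dict               # column name -> expected Python type
    max_staleness: timedelta   # SLA-backed freshness
    upstream: tuple            # lineage: the Silver tables this is derived from


def check_contract(contract, rows, last_refreshed, now):
    """Return a list of contract violations (an empty list means compliant)."""
    violations = []
    if now - last_refreshed > contract.max_staleness:
        violations.append(f"{contract.table}: stale beyond {contract.max_staleness}")

    seen_keys = set()
    for i, row in enumerate(rows):
        for col, expected in contract.schema.items():
            if col not in row:
                violations.append(f"row {i}: missing column '{col}'")
            elif not isinstance(row[col], expected):
                violations.append(f"row {i}: '{col}' is not {expected.__name__}")
        key = tuple(row.get(c) for c in contract.grain)
        if key in seen_keys:
            violations.append(f"row {i}: duplicate grain key {key}")
        seen_keys.add(key)
    return violations


contract = GoldContract(
    table="gold.churn_signals",
    consumer="retention_model",
    grain=("user_id", "snapshot_date"),
    schema={"user_id": str, "snapshot_date": str, "sessions_7d": int},
    max_staleness=timedelta(hours=6),
    upstream=("silver.sessions",),
)

now = datetime(2025, 10, 2, 12, 0, tzinfo=timezone.utc)
rows = [{"user_id": "u1", "snapshot_date": "2025-10-02", "sessions_7d": 3}]
print(check_contract(contract, rows, now - timedelta(hours=1), now))
```

Keeping the feature logic in the table and the expectations in the contract means that training and inference read the same columns, and that a stale or malformed refresh is caught before the model sees it.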

THE ENTERPRISE APPLICATION

Identify the three AI applications that would deliver the most business value for your organization. For each one, document the ideal Gold table it would consume: what are the required fields, at what grain, with what freshness, derived from which source data? Then compare that ideal Gold table to what currently exists. The gap you identify is your data engineering roadmap. Close it, and you will have solved 80% of the problem of making the AI work.

Netflix's recommendation engine is not smarter than your analytics team's models because Netflix hired better data scientists. It is smarter because Netflix spent fifteen years building the data infrastructure that produces the Gold tables the models need. The investment is in the pipeline, not the algorithm. It always has been.
