What Netflix Knows About Your Customers That Your BI Team Doesn't

The difference between a company that has data and a company that uses data is always the Gold layer. Raw and clean are necessary but not sufficient. Purpose-built, business-question-focused tables are where AI actually lives.

Netflix knows that you paused a documentary at the 23-minute mark on a Tuesday evening, rewound to the 19-minute mark, watched for four more minutes, and then switched to a comedy series. It knows that this behavioral pattern, combined with 47 other data signals from your viewing history, is a strong predictor that you will watch a specific type of content on Friday night. It used this knowledge to select the thumbnail it showed you last Friday, and you clicked it.

This is not magic. It is the Gold layer. And the gap between what Netflix does with behavioral data and what most enterprise analytics teams do with their equivalent data — customer interaction logs, transaction histories, product usage events — is not a model gap.

It is a data architecture gap.

Netflix's Data Architecture: The Medallion Model at 700 Billion Events Per Day

Every interaction a Netflix user has with the platform generates an event: a play, a pause, a search, a scroll, a rating, a browse. These raw events are Bronze-layer data — timestamped records of what happened, with no interpretation, no aggregation, and no business context.

Netflix ingests approximately 700 billion of these events every day.

The Silver layer transformation converts raw events into user-session-level data: structured records that represent coherent viewing sessions rather than individual interactions, with engineered features like session duration, content completion rate, platform used, and time-of-day context. This transformation requires significant data engineering: session boundary detection, event deduplication, anomaly filtering, and feature engineering that encodes domain knowledge about what makes a viewing session meaningful.
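To make the Bronze-to-Silver step concrete, here is a minimal sketch of event deduplication and session boundary detection. It is not Netflix's implementation. The `Event` fields, the 30-minute inactivity threshold, and the output columns are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import groupby

# Assumption: 30 minutes of inactivity ends a session (illustrative threshold).
SESSION_GAP = timedelta(minutes=30)


@dataclass(frozen=True)
class Event:
    """A hypothetical Bronze-layer record: what happened and when, nothing more."""
    event_id: str
    user_id: str
    event_type: str
    ts: datetime


def deduplicate(events):
    """Keep the first occurrence of each event_id, preserving input order."""
    seen = set()
    unique = []
    for e in events:
        if e.event_id not in seen:
            seen.add(e.event_id)
            unique.append(e)
    return unique


def summarize(user_id, events):
    """Collapse one session's events into a single Silver-layer row."""
    start, end = events[0].ts, events[-1].ts
    return {
        "user_id": user_id,
        "session_start": start,
        "duration_seconds": (end - start).total_seconds(),
        "event_count": len(events),
        "hour_of_day": start.hour,
    }


def sessionize(events, gap=SESSION_GAP):
    """Group events into per-user sessions, splitting on inactivity gaps."""
    ordered = sorted(deduplicate(events), key=lambda e: (e.user_id, e.ts))
    sessions = []
    for user_id, user_events in groupby(ordered, key=lambda e: e.user_id):
        current = []
        for e in user_events:
            # A gap longer than the threshold closes the current session.
            if current and e.ts - current[-1].ts > gap:
                sessions.append(summarize(user_id, current))
                current = []
            current.append(e)
        if current:
            sessions.append(summarize(user_id, current))
    return sessions
```

At real scale this logic would run in a distributed engine rather than in a Python loop, but the shape of the transformation is the same: many raw rows in, one interpretable row per session out.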

The Gold layer is where the distance from most enterprise analytics becomes stark. Netflix builds purpose-specific Gold tables for each AI application: a personalization features table for the recommendation engine, a churn signals table for retention modeling, a content performance table for acquisition decisions, an A/B test outcome table for product experimentation. Each Gold table is owned by a specific team, has explicit SLAs for freshness and quality, and is designed around the specific data contract the consuming model requires.
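As a rough illustration of what "purpose-specific" means, the sketch below derives a churn signals table from session-level rows. The input columns (`user_id`, `session_start`, `completion_rate`), the seven-day window, and the three output features are hypothetical. A real retention model would define its own.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def build_churn_signals(sessions, as_of, window_days=7):
    """Return one row per user: the grain a hypothetical churn model consumes.

    `sessions` is a list of Silver-layer dicts with the assumed keys
    user_id, session_start (datetime), and completion_rate (float).
    """
    window_start = as_of - timedelta(days=window_days)
    by_user = defaultdict(list)
    for s in sessions:
        # Ignore anything after the snapshot time to avoid leaking the future.
        if s["session_start"] <= as_of:
            by_user[s["user_id"]].append(s)

    rows = []
    for user_id, user_sessions in sorted(by_user.items()):
        recent = [s for s in user_sessions if s["session_start"] > window_start]
        last_seen = max(s["session_start"] for s in user_sessions)
        rows.append({
            "user_id": user_id,
            "as_of": as_of,
            "sessions_in_window": len(recent),
            "avg_completion_in_window": (
                sum(s["completion_rate"] for s in recent) / len(recent)
                if recent else None
            ),
            "days_since_last_session": (as_of - last_seen).days,
        })
    return rows
```

The table answers one question for one consumer: is this user disengaging? A recommendation engine would need a different grain and different features, which is why it would get its own table instead of extra columns on this one.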

What Most Enterprise BI Teams Build Instead

A single 'master customer table' that tries to serve all analytics use cases simultaneously and therefore serves none of them optimally
Dashboards that query Silver-level data directly, performing ad-hoc aggregations that are inconsistent across reports and never validated as training inputs for AI
Data marts that duplicate Silver tables with minor transformations, adding storage cost without adding the semantic enrichment that Gold tables require
'Data lakes' that are, in practice, Bronze layer repositories with a discovery interface — data that is technically accessible but practically unusable without significant additional transformation

Identifying Your Organization's Gold Table Gap

The Gold table gap is the space between the data your organization produces and the data your AI applications actually need. It exists in every organization that has not explicitly designed its data pipeline around the consumption requirements of its AI models.

Identifying the gap requires working backward from the AI application to the data requirements. What question is the AI answering? What data does it need to answer that question accurately? At what grain? With what features? With what freshness? Comparing that data specification to what currently exists in your pipeline reveals the gap. In most organizations, the gap is not that the raw data doesn't exist. It is that the raw data has never been transformed into the form the AI needs.
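The comparison described above can be written down as a small check. This is a sketch under stated assumptions: `TableSpec` and `find_gaps` are hypothetical names, and a real specification would likely also cover data types, history depth, and quality thresholds.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class TableSpec:
    """What a table provides, or what an AI application requires."""
    grain: tuple            # columns that identify one row
    fields: frozenset       # available or required columns
    max_staleness: timedelta


def find_gaps(required, existing):
    """List the ways `existing` falls short of `required`."""
    if existing is None:
        return ["no table exists for this application"]
    gaps = []
    if required.grain != existing.grain:
        gaps.append(f"grain: need {required.grain}, have {existing.grain}")
    missing = sorted(required.fields - existing.fields)
    if missing:
        gaps.append(f"missing fields: {', '.join(missing)}")
    if existing.max_staleness > required.max_staleness:
        gaps.append(
            f"freshness: need data no older than {required.max_staleness}, "
            f"table may be {existing.max_staleness} old"
        )
    return gaps
```

Running this against a typical master customer table tends to surface all three kinds of gap at once: the wrong grain, missing engineered features, and a refresh schedule set for dashboards instead of models.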

Building Gold Tables: The Five Design Principles

Single purpose: each Gold table is designed to answer a specific business question for a specific consumer. A Gold table that tries to serve multiple AI applications is a Silver table with a Gold name.
Explicit data contract: the schema, grain, freshness, and quality standards of every Gold table are documented and governed as a formal contract between the data producer and the AI consumer.
Feature engineering at the layer, not at the model: all feature transformations are implemented in the Gold table, not in the model training code, ensuring that training and inference use identical feature logic.
Full lineage: every Gold table documents the Silver tables it was derived from, enabling rapid debugging when model outputs degrade.
SLA-backed freshness: Gold tables have explicit, monitored freshness guarantees because AI models degrade when they receive stale features.
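Several of these principles can be enforced in code instead of in a wiki page. The sketch below shows one possible shape for a contract that records schema, grain, freshness SLA, and upstream lineage, with a check to run before a model reads the table. `GoldContract` and `validate` are hypothetical names, not part of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class GoldContract:
    name: str
    schema: dict              # column name -> expected Python type
    grain: tuple              # columns that must uniquely identify a row
    freshness_sla: timedelta  # maximum allowed age of the last refresh
    upstream: tuple           # Silver tables this table derives from (lineage)


def validate(contract, rows, last_refreshed, now):
    """Return a list of contract violations; an empty list means the table passes."""
    violations = []
    if now - last_refreshed > contract.freshness_sla:
        violations.append(
            f"{contract.name}: stale, last refreshed {last_refreshed.isoformat()}"
        )
    seen_keys = set()
    for i, row in enumerate(rows):
        for column, expected_type in contract.schema.items():
            if column not in row:
                violations.append(f"row {i}: missing column {column}")
            elif not isinstance(row[column], expected_type):
                violations.append(f"row {i}: {column} is not {expected_type.__name__}")
        # Grain check: no two rows may share the same key.
        key = tuple(row.get(column) for column in contract.grain)
        if key in seen_keys:
            violations.append(f"row {i}: duplicate grain key {key}")
        seen_keys.add(key)
    return violations
```

Calling the same check from both the training pipeline and the serving path is one way to keep the two reading from an identical, validated feature definition.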

THE ENTERPRISE APPLICATION

Identify the three AI applications that would deliver the most business value for your organization. For each one, document the ideal Gold table it would consume: which fields it requires, at what grain, with what freshness, and derived from which source data. Then compare that ideal Gold table to what currently exists. The gap you identify is your data engineering roadmap. Close it, and you will have solved the bulk of the problem of making the AI work.

Netflix's recommendation engine is not smarter than your analytics team's models because Netflix hired better data scientists. It is smarter because Netflix spent fifteen years building the data infrastructure that produces the Gold tables its models need. The investment is in the pipeline, not the algorithm. It always has been.
