Data Observability
Initially came up with the idea of data downtime while at Gainsight
Why did New Relic or Datadog not have data monitoring features?
Primary users of APM and infra monitoring are engineering teams
Data teams have different pains
What do you measure in data observability?
Spoke to 150 organizations, from small startups to large companies; 5 pillars of data observability - freshness/timeliness, volume, distribution, schema (fields added or removed), lineage (upstream and downstream assets)
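A rough sketch of what two of the pillars (schema and freshness) could look like as checks. The function names, sample fields, and the 24-hour threshold are assumptions made for illustration; the notes do not describe an implementation.

```python
from datetime import datetime, timedelta, timezone


def schema_changes(previous, current):
    """Return fields added and removed between two schema snapshots."""
    prev, cur = set(previous), set(current)
    return {"added": sorted(cur - prev), "removed": sorted(prev - cur)}


def is_fresh(last_updated, now, max_age):
    """True when the table was last updated within max_age of now."""
    return now - last_updated <= max_age


# Hypothetical snapshots of one table's columns on two consecutive days.
yesterday_schema = ["id", "email", "plan"]
today_schema = ["id", "email", "plan_tier", "created_at"]
print(schema_changes(yesterday_schema, today_schema))

# Hypothetical load times; the 24-hour limit is an assumed threshold.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
print(is_fresh(last_load, now, timedelta(hours=24)))
```

Volume, distribution, and lineage would need their own checks; lineage in particular requires knowing which assets read from which, which this sketch does not attempt.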
Everything is becoming more data-driven - 2-3 data warehouses, BI tools, data lakes
How were potential customers solving the issue?
Two main ways - one was manually validating data; 2-3 sets of eyes on data before people used it
One person is looking at it; counting rows, checking that the number of rows today is similar to yesterday
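The manual row-count comparison described above can be sketched as a small check. This is only an illustration: the function name and the 20% tolerance are assumptions, not something the customers in these notes were said to use.

```python
def row_count_looks_normal(today_rows, yesterday_rows, tolerance=0.2):
    """Compare today's row count to yesterday's.

    Returns True when the relative change is within `tolerance`
    (an assumed threshold; 0.2 means 20%).
    """
    if yesterday_rows == 0:
        # No baseline to compare against; only an empty table matches.
        return today_rows == 0
    change = abs(today_rows - yesterday_rows) / yesterday_rows
    return change <= tolerance


# Hypothetical counts for one table.
print(row_count_looks_normal(10_500, 10_000))  # 5% change
print(row_count_looks_normal(4_000, 10_000))   # 60% drop
```

A fixed day-over-day tolerance is the simplest form of a volume check; tables with weekly or seasonal patterns would need a different baseline.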
Spend a lot of time with customers in a way that is not biased
Showed a CTO this mockup and he said "this is a great product but terrible slides" LOL
If they had a magic wand and could solve it tomorrow, what would it be?