Data Observability

  • Initially came up with the idea of data downtime while at Gainsight

  • Why did New Relic or Datadog not have data monitoring features?

    • Primary users of APM and infra monitoring are engineering teams

    • Data teams have different pains

  • What do you measure in data observability?

    • Spoke to 150 organizations, from small startups to large companies; 5 pillars of data observability: freshness/timeliness, volume, distribution, schema (fields added or removed), lineage (upstream and downstream assets)

  • Everything is becoming more data-driven: 2-3 data warehouses, BI tools, data lakes

  • How were potential customers solving the issue?

    • Two main ways: one was manually validating data, with 2-3 sets of eyes on the data before people used it

    • One person looks at it: counting rows, checking that today's row count is similar to yesterday's

  • Spend a lot of time with customers in a way that is not biased

    • Showed a CTO this mockup and he said, "this is a great product but terrible slides" LOL

    • If they had a magic wand and could solve it tomorrow, what would it be?
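
A minimal sketch of what automating three of the pillars above might look like: volume (the manual "row count similar today to yesterday" check), freshness, and schema (fields added or removed). These notes do not describe any implementation, so everything here is an illustrative assumption: the function names `volume_ok`, `is_fresh`, and `schema_diff` are hypothetical, and the 20% tolerance is an arbitrary placeholder, not a recommended threshold.

```python
from datetime import datetime, timedelta


def volume_ok(today_rows: int, yesterday_rows: int, tolerance: float = 0.2) -> bool:
    """Volume: is today's row count within `tolerance` of yesterday's?

    The 0.2 default is an assumed placeholder, not a recommended value.
    """
    if yesterday_rows == 0:
        # No baseline to compare against; only an empty table counts as "similar".
        return today_rows == 0
    change = abs(today_rows - yesterday_rows) / yesterday_rows
    return change <= tolerance


def is_fresh(last_updated: datetime, now: datetime, max_age: timedelta) -> bool:
    """Freshness: was the table updated within the last `max_age`?"""
    return now - last_updated <= max_age


def schema_diff(previous_fields, current_fields):
    """Schema: return (added, removed) field names between two snapshots."""
    previous, current = set(previous_fields), set(current_fields)
    return sorted(current - previous), sorted(previous - current)


if __name__ == "__main__":
    print(volume_ok(980, 1000))   # small day-over-day change
    print(volume_ok(400, 1000))   # large drop in rows
    print(schema_diff(["id", "email"], ["id", "email_address"]))
```

Distribution and lineage are left out on purpose: they need value-level statistics and a dependency graph of upstream and downstream assets, which do not fit a short sketch.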