Rockset

  • Series B Raise

  • Rockset focus

    • Indexing for fast queries on large amounts of semi-structured data

    • CTO and cofounder (Dhruba Borthakur) - worked on HDFS at Yahoo; founder of RocksDB, the storage engine built at Facebook

    • Rockset is a distributed system for data processing

    • Converged Indexing

    • Rockset Use Cases

      • Prior big-data systems were mostly batch processing

      • Hadoop - batch processing focused

      • Spark - Stream processing

      • Rockset - large data sets with low query latency and low data latency (data is ready to query quickly)

        • SQL system, aggregations, joins

        • Queries have to be fast

        • High QPS - could be your user facing analytical database

        • Low data latency - avoids pipelines and ETL before data gets loaded into a queryable system

      • Focus: analytical applications, not workload reporting - e.g. a fleet management company deciding what needs to be loaded onto a truck as it enters a loading zone, or an online gaming system with tons of players that needs leaderboards built from the most recent data

    • Rockset is a real-time indexing database on live data, avoiding ETL/pipelines

  • Design

    • Analytical database - no transactions

  • Aggregator Leaf Tailer

    • Data arrives as streams coming from other databases

    • Tailers pull data from streams and write it to leaf nodes; a higher volume of writes means more tailers

    • Leaf nodes store and index the data, making it ready to query; more data to store means more leaf nodes

    • A two-level aggregator tier serves SQL applications; lots of queries means adding more aggregators

    • Different from the Lambda architecture - writes are separated from reads; latency must stay low even under burst traffic, since a user-facing database needs low latency

    • Facebook's newsfeed is an analytical app - it needs to look at a ton of data and quickly decide what to show you

    • Scale tailers, leafs, and aggregators

    • MemSQL, Scuba, and spam-reduction systems use it; the LinkedIn feed uses it as well
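
The tailer/leaf/aggregator flow above can be sketched as three independently scalable tiers. This is a minimal illustrative model, not Rockset's actual API; all class and field names are invented:

```python
# Minimal sketch of the Aggregator-Leaf-Tailer (ALT) pattern.
# Names here are illustrative, not Rockset's actual implementation.

NUM_LEAVES = 4

class Leaf:
    """Stores and indexes documents for one shard."""
    def __init__(self):
        self.docs = []

    def write(self, doc):
        self.docs.append(doc)

    def query(self, predicate):
        return [d for d in self.docs if predicate(d)]

class Tailer:
    """Tails an input stream and routes each doc to a leaf shard."""
    def __init__(self, leaves):
        self.leaves = leaves

    def ingest(self, stream):
        for doc in stream:
            shard = hash(doc["id"]) % len(self.leaves)
            self.leaves[shard].write(doc)

class Aggregator:
    """Fans a query out to every leaf and merges the partial results."""
    def __init__(self, leaves):
        self.leaves = leaves

    def query(self, predicate):
        results = []
        for leaf in self.leaves:
            results.extend(leaf.query(predicate))
        return results

leaves = [Leaf() for _ in range(NUM_LEAVES)]
Tailer(leaves).ingest([{"id": i, "score": i * 10} for i in range(8)])
hits = Aggregator(leaves).query(lambda d: d["score"] >= 50)
```

Each tier scales on its own axis: more write volume adds tailers, more stored data adds leaves, more query traffic adds aggregators.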

  • Benefits of ALT

    • Keeps latency consistent; people really wanted predictably fast queries, which is why key-value stores got popular, whereas SQL query execution times are generally inconsistent

    • Complex queries on the fly

    • Separates reads from writes (CQRS) - write compute is separated from read compute

  • Converged Indexing

    • NoSQL-style ingestion; dump in JSON or XML and it builds indexes on everything in your data: a row-based index (like Postgres), a column store (like Snowflake), and an inverted index (like Elasticsearch)

    • No need to maintain indexes

    • Do you maintain consistency across indexes?

      • We have atomic updates but no ACID transactions

    • Leaf nodes house converged indexing

    • Storage - Rockset stores the data in three formats mapped onto key-values: R fields are rows, C fields are columns, S fields are the inverted (search) index

    • Support for arrays and nested documents
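
The R/C/S encoding above can be sketched as one document fanning out into three kinds of key-value entries. The key layout below is illustrative only - Rockset's real on-disk format is not shown in these notes:

```python
def converged_index_entries(doc_id, doc):
    """Sketch: one JSON doc becomes row (R), column (C), and
    inverted-index (S) key-value entries. Key layout is invented."""
    entries = {}
    for field, value in doc.items():
        entries[("R", doc_id, field)] = value        # row store: doc -> value
        entries[("C", field, doc_id)] = value        # column store: field -> value
        entries[("S", field, value, doc_id)] = None  # inverted index: value -> doc
    return entries

kv = converged_index_entries(42, {"name": "truck-7", "zone": 3})
# Row lookup: all fields of doc 42 share the ("R", 42, ...) prefix.
# Column scan: all values of "zone" share the ("C", "zone", ...) prefix.
# Search: docs where zone == 3 share the ("S", "zone", 3, ...) prefix.
```

Because all three encodings are just keys with different prefixes, they can live in one RocksDB keyspace and be updated together on a write.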

  • Q: Are all the indexes stored in the same tablespace?

    • RocksDB - otherwise you'd be leaving a lot of efficiency on the table; RocksDB supports page-level compression, and the index prefix (e.g. the R) only needs to be a byte

    • RocksDB has delta encoding of keys - roughly 1 byte of overhead per 4K block

    • CockroachDB also uses RocksDB

    • Most people go with a column store for large data sets, trying to reduce size

    • Inverted index - different encoding

  • Challenges with Converged Index?

    • Disk usage can grow, since every field is indexed multiple ways

    • Want atomic writes for consistency - all the indexes need to be updated when a write happens

  • Rockset uses Doc Sharding

    • All indices for a doc are stored on one machine - the entire document and its index entries live together

    • A query needs to fan out to all machines in the database

    • Sharding - if one machine is slow, the whole query may struggle

    • Complex SQL queries, however, can run in parallel across tons of machines - this drastically reduces query latency

    • Elasticsearch also does doc sharding
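
Doc sharding can be sketched as a placement rule: a document and all of its index entries hash to one shard, which is why every query must visit every shard. A minimal illustrative sketch (shard count and field names invented):

```python
# Sketch of doc sharding: a document and all of its index entries land on
# one shard, so any query has to fan out to every shard.

NUM_SHARDS = 3
shards = [[] for _ in range(NUM_SHARDS)]

def shard_of(doc_id):
    return hash(doc_id) % NUM_SHARDS

def write(doc):
    # The doc plus its row/column/inverted entries stay together on one shard.
    shards[shard_of(doc["id"])].append(doc)

for i in range(9):
    write({"id": i, "score": i})

def query(predicate):
    # Fan out: every shard scans (in parallel, in a real system); merge results.
    out = []
    for shard in shards:
        out.extend(d for d in shard if predicate(d))
    return out

hits = query(lambda d: d["score"] >= 6)
```

The trade-off from the notes shows up directly: one slow shard delays the merged result, but a complex query gets every shard's CPU working on it at once.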

  • RocksDB LSM

    • Log-structured merge tree

    • A single write path - random writes are amortized into sequential writes on storage; background compaction; sparse fields are handled and can still be indexed
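
The write-amortization idea can be sketched with a toy LSM: random puts land in a memtable, which flushes as one sorted sequential run, and compaction later merges runs in the background. A deliberately tiny sketch, not RocksDB's actual structure:

```python
# Toy LSM tree: memtable absorbs random writes, flushes become
# sorted sequential runs, compaction merges runs. Illustrative only.
import bisect

MEMTABLE_LIMIT = 3

class TinyLSM:
    def __init__(self):
        self.memtable = {}   # in-memory, absorbs random writes
        self.runs = []       # sorted immutable runs ("SST files")

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= MEMTABLE_LIMIT:
            self._flush()

    def _flush(self):
        # A batch of random writes becomes one sequential, sorted run.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def compact(self):
        # Background compaction: merge all runs; newer runs win.
        merged = {}
        for run in self.runs:            # older runs first
            merged.update(dict(run))
        self.runs = [sorted(merged.items())]

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run first
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return run[i][1]
        return None

db = TinyLSM()
for k, v in [("a", 1), ("b", 2), ("c", 3), ("a", 9)]:
    db.put(k, v)
# "a", "b", "c" were flushed as one sorted run; the new "a" -> 9
# sits in the memtable and shadows the older value.
```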

  • Smart Schemas

    • JSON but SQL queries

    • Automatically generates a schema based on the fields present at ingestion time; the schema is resolved when you make the query

    • Associates a type with every value inside the column

    • Have joins, aggregations, sorts and SQL types

  • How does SELECT * work here?

    • You get back NULLs for type mismatches (e.g. if age holds both strings and ints, the string values may come back NULL)
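
The "type with every value" idea and the NULL-on-mismatch behavior can be sketched together. This is an illustrative model of the concept, not Rockset's storage format:

```python
# Sketch of "schema per value": every stored value carries a type tag,
# and a query-time cast returns NULL (None) on type mismatch.

def tag(value):
    return (type(value).__name__, value)  # e.g. ("int", 29)

rows = [tag(v) for v in [29, "unknown", 31]]  # mixed-type "age" field

def as_int(tagged):
    ty, v = tagged
    return v if ty == "int" else None  # mismatched values come back NULL

ages = [as_int(r) for r in rows]
# ages == [29, None, 31]
```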

  • What % of customers use SQL vs. JSON?

    • Most people use SQL

    • Query Lambdas - named queries exposed to developers who don't really know SQL

  • Challenges with Smart Schemas

    • Challenge 1: Additional CPU usage for processing queries

      • A relational database may store the schema separately from the data

      • Schemaless - a schema (type) is stored with every value

    • Challenge 2: Requires increased disk space

      • Uses a little more disk than strict relational tables

      • Type hoisting - if all the values for a given field have the same type, the type is set once at the beginning of the column - makes it a bit more efficient
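
Type hoisting can be sketched as a column encoder that stores the type once when every value agrees, and falls back to per-value tags otherwise. An illustrative sketch, not Rockset's actual encoding:

```python
# Sketch of type hoisting: a uniform-type column stores its type once;
# a mixed-type column falls back to a tag per value.

def encode_field(values):
    types = {type(v).__name__ for v in values}
    if len(types) == 1:
        # Hoisted: one type tag covers the whole column.
        return {"type": types.pop(), "values": values}
    # Mixed types: tag every value individually.
    return {"type": "mixed",
            "values": [(type(v).__name__, v) for v in values]}

hoisted = encode_field([1, 2, 3])
mixed = encode_field([1, "two", 3])
```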

  • Cloud Native and Scaling

    • Cost of 1 CPU for 100 minutes == cost of 100 CPUs for 1 minute

    • With cloud - dynamically provision for current demand

    • Challenge is in the software

    • Tailers are pure CPU - can be scaled up and down

    • Uses Kubernetes and AWS for scaling

    • Leaf nodes can also be scaled up

    • Spin up more aggregators for more queries

    • Shared storage

    • RocksDB-Cloud - a layer on top of RocksDB - every time a file gets produced it gets pushed to S3; durability comes from S3

    • Performance - uses zero-copy clones on S3
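
The push-to-S3 and zero-copy-clone ideas can be sketched conceptually: immutable files go to object storage on flush, and a clone is just a new instance referencing the same files. This is a hypothetical model of the concept, not the RocksDB-Cloud API:

```python
# Conceptual sketch of the RocksDB-Cloud idea. Names are invented;
# object_store stands in for S3.

object_store = {}  # path -> bytes

class CloudLSMInstance:
    def __init__(self, files=None):
        self.files = list(files or [])  # paths into the object store

    def flush_file(self, path, data):
        object_store[path] = data       # durability comes from the store
        self.files.append(path)

    def zero_copy_clone(self):
        # No data is copied: the clone shares the parent's immutable files.
        return CloudLSMInstance(self.files)

primary = CloudLSMInstance()
primary.flush_file("sst/000001.sst", b"sorted run 1")
replica = primary.zero_copy_clone()
```

Because flushed files are immutable, many readers can safely share them; a clone only starts diverging when it writes files of its own.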

  • Are you running the open source RocksDB?

    • We are running open source RocksDB

  • Compaction

    • It needs to be tuned

    • It typically runs on the node that runs RocksDB; Rockset has done remote compaction - the node makes an RPC to a compaction tier, enabled by the shared-storage architecture

  • ALT provides SQL on JSON

    • No need to manage indexes

  • Focus is always on latency rather than efficiency - spin up a bunch of machines to attack a query rather than trying to make it maximally efficient

  • Rockset Competitors

    • Apache Druid (Imply)

    • Similar architecture to MemSQL

  • Real time data is growing significantly

  • Ad optimization, A/B testing, fraud detection, real-time customer 360

  • Gaming, fleet tracking, real time recommendations

  • Rockset chooses the index based on the query type

  • Ingest compute and query compute can be scaled separately

  • Data volumes - 10s to 100s of TB; it tends to be data in active use, not petabytes - operating in real time doesn't require petabytes of data

  • Rockset has support for joins; no support for joins in Elasticsearch

  • To get around the lack of joins in Elasticsearch, people add a transformation step (joining with Spark) and then push to Elastic; this tends to duplicate storage and adds another tool/service to keep up

  • Ingesting data

    • Elastic - needs Beats, Logstash, and ingestion pipelines; needs to reindex as new documents come in

    • Rockset

      • Minimizes data latency; produce data and be able to query it within a second

  • Demo