Rockset

  • Series B Raise

  • Rockset focus

    • Indexing for fast queries on large amounts of semi-structured data

    • CTO and cofounder (Dhruba Borthakur) - worked on HDFS at Yahoo; founder of RocksDB, the storage engine built at Facebook

    • Rockset is a distributed system for data processing

    • Converged Indexing

    • Rockset Use Cases

      • Prior big-data systems were mostly batch processing

      • Hadoop - batch processing focused

      • Spark - Stream processing

      • Rockset - large data sets with low query latency and low data latency (data is ready to query quickly)

        • SQL system, aggregations, joins

        • Queries have to be fast

        • High QPS - could be your user facing analytical database

        • Low data latency - avoids pipelines and ETL before data gets loaded into a queryable system

      • Focus: analytical applications, not workload reporting - e.g. a fleet management company deciding what needs to be loaded onto a truck as it enters a loading zone, or an online gaming system with tons of players that needs leaderboards built from the most recent data

    • Rockset is a real-time indexing database on live data, avoiding ETL/pipelines

  • Design

    • Analytical database - no transactions

  • Aggregator Leaf Tailer

    • Data arrives as streams coming from other databases

    • Tailers pull data from streams and write it to leaf nodes; a higher volume of writes means more tailers

    • Leaf nodes store and index the data, making it ready to query; more data to store means more leaf nodes

    • A two-level aggregator tier serves SQL applications; lots of queries means adding more aggregators

    • Different from the Lambda architecture - writes are separated from reads; latency must stay low even under burst traffic, since a user-facing database needs low latency

    • Facebook's newsfeed is an analytical app - it needs to look at a ton of data and quickly decide what to show you

    • Scale tailers, leafs, and aggregators

    • MemSQL, Scuba, and spam-reduction systems use it; the LinkedIn feed uses it as well
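
The tailer/leaf/aggregator flow above can be sketched as three independently scalable tiers. This is a minimal illustrative model, not Rockset's actual API; all class and field names are invented:

```python
# Minimal sketch of the Aggregator-Leaf-Tailer (ALT) pattern.
# Names here are illustrative, not Rockset's actual implementation.

NUM_LEAVES = 4

class Leaf:
    """Stores and indexes documents for one shard."""
    def __init__(self):
        self.docs = []

    def write(self, doc):
        self.docs.append(doc)

    def query(self, predicate):
        return [d for d in self.docs if predicate(d)]

class Tailer:
    """Tails an input stream and routes each doc to a leaf shard."""
    def __init__(self, leaves):
        self.leaves = leaves

    def ingest(self, stream):
        for doc in stream:
            shard = hash(doc["id"]) % len(self.leaves)
            self.leaves[shard].write(doc)

class Aggregator:
    """Fans a query out to every leaf and merges the partial results."""
    def __init__(self, leaves):
        self.leaves = leaves

    def query(self, predicate):
        results = []
        for leaf in self.leaves:
            results.extend(leaf.query(predicate))
        return results

leaves = [Leaf() for _ in range(NUM_LEAVES)]
Tailer(leaves).ingest([{"id": i, "score": i * 10} for i in range(8)])
hits = Aggregator(leaves).query(lambda d: d["score"] >= 50)
```

Each tier scales on its own axis: more write volume adds tailers, more stored data adds leaves, more query traffic adds aggregators.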

  • Benefits of ALT

    • Keeps latency consistent; people really wanted predictably fast queries, which is why key-value stores got popular, whereas SQL query execution times are generally inconsistent

    • Complex queries on the fly

    • Separates reads from writes (CQRS) - write compute is separated from read compute

  • Converged Indexing

    • NoSQL-style ingestion; dump in JSON or XML and it builds indexes on everything in your data: a row-based index (like Postgres), a column store (like Snowflake), and an inverted index (like Elasticsearch)

    • No need to maintain indexes

    • Do you maintain consistency across indexes?

      • We have atomic updates but no ACID transactions

    • Leaf nodes house converged indexing

    • Storage - Rockset stores the data in three formats mapped onto key-values: R fields are rows, C fields are columns, S fields are the inverted (search) index

    • Support for arrays and nested documents
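
The R/C/S encoding above can be sketched as one document fanning out into three kinds of key-value entries. The key layout below is illustrative only - Rockset's real on-disk format is not shown in these notes:

```python
def converged_index_entries(doc_id, doc):
    """Sketch: one JSON doc becomes row (R), column (C), and
    inverted-index (S) key-value entries. Key layout is invented."""
    entries = {}
    for field, value in doc.items():
        entries[("R", doc_id, field)] = value        # row store: doc -> value
        entries[("C", field, doc_id)] = value        # column store: field -> value
        entries[("S", field, value, doc_id)] = None  # inverted index: value -> doc
    return entries

kv = converged_index_entries(42, {"name": "truck-7", "zone": 3})
# Row lookup: all fields of doc 42 share the ("R", 42, ...) prefix.
# Column scan: all values of "zone" share the ("C", "zone", ...) prefix.
# Search: docs where zone == 3 share the ("S", "zone", 3, ...) prefix.
```

Because all three encodings are just keys with different prefixes, they can live in one RocksDB keyspace and be updated together on a write.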

  • Q: Are all the indexes stored in the same tablespace?

    • RocksDB - otherwise you'd be leaving a lot of efficiency on the table; RocksDB supports page-level compression, and the index prefix (e.g. the R) only needs to be a byte

    • RocksDB has delta encoding of keys - roughly 1 byte of overhead per 4K block

    • CockroachDB also uses RocksDB

    • Most people go with a column store for large data sets, trying to reduce size

    • Inverted index - different encoding

  • Challenges with Converged Index?

    • Disk usage can grow, since every field is indexed multiple ways

    • Want atomic writes for consistency - all the indexes need to be updated when a write happens

  • Rockset uses Doc Sharding

    • All indices for a doc are stored on one machine - the entire document and its index entries live together

    • A query needs to fan out to all machines in the database

    • Sharding - if one machine is slow, the whole query may struggle

    • Complex SQL queries, however, can run in parallel across tons of machines - this drastically reduces query latency

    • Elasticsearch also does doc sharding
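
Doc sharding can be sketched as a placement rule: a document and all of its index entries hash to one shard, which is why every query must visit every shard. A minimal illustrative sketch (shard count and field names invented):

```python
# Sketch of doc sharding: a document and all of its index entries land on
# one shard, so any query has to fan out to every shard.

NUM_SHARDS = 3
shards = [[] for _ in range(NUM_SHARDS)]

def shard_of(doc_id):
    return hash(doc_id) % NUM_SHARDS

def write(doc):
    # The doc plus its row/column/inverted entries stay together on one shard.
    shards[shard_of(doc["id"])].append(doc)

for i in range(9):
    write({"id": i, "score": i})

def query(predicate):
    # Fan out: every shard scans (in parallel, in a real system); merge results.
    out = []
    for shard in shards:
        out.extend(d for d in shard if predicate(d))
    return out

hits = query(lambda d: d["score"] >= 6)
```

The trade-off from the notes shows up directly: one slow shard delays the merged result, but a complex query gets every shard's CPU working on it at once.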

  • RocksDB LSM

    • Log-structured merge tree

    • A single write path - random writes are amortized into sequential writes on storage; background compaction; sparse fields are handled and can still be indexed
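
The write-amortization idea can be sketched with a toy LSM: random puts land in a memtable, which flushes as one sorted sequential run, and compaction later merges runs in the background. A deliberately tiny sketch, not RocksDB's actual structure:

```python
# Toy LSM tree: memtable absorbs random writes, flushes become
# sorted sequential runs, compaction merges runs. Illustrative only.
import bisect

MEMTABLE_LIMIT = 3

class TinyLSM:
    def __init__(self):
        self.memtable = {}   # in-memory, absorbs random writes
        self.runs = []       # sorted immutable runs ("SST files")

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= MEMTABLE_LIMIT:
            self._flush()

    def _flush(self):
        # A batch of random writes becomes one sequential, sorted run.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def compact(self):
        # Background compaction: merge all runs; newer runs win.
        merged = {}
        for run in self.runs:            # older runs first
            merged.update(dict(run))
        self.runs = [sorted(merged.items())]

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run first
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return run[i][1]
        return None

db = TinyLSM()
for k, v in [("a", 1), ("b", 2), ("c", 3), ("a", 9)]:
    db.put(k, v)
# "a", "b", "c" were flushed as one sorted run; the new "a" -> 9
# sits in the memtable and shadows the older value.
```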

  • Smart Schemas

    • JSON but SQL queries

    • Automatically generates a schema based on the fields present at ingestion time; the schema is resolved when you make the query

    • Associates a type with every value inside the column

    • Have joins, aggregations, sorts and SQL types

  • How does SELECT * work here?

    • You get back NULLs for type mismatches (e.g. if age holds both strings and ints, the string values may come back NULL)
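
The "type with every value" idea and the NULL-on-mismatch behavior can be sketched together. This is an illustrative model of the concept, not Rockset's storage format:

```python
# Sketch of "schema per value": every stored value carries a type tag,
# and a query-time cast returns NULL (None) on type mismatch.

def tag(value):
    return (type(value).__name__, value)  # e.g. ("int", 29)

rows = [tag(v) for v in [29, "unknown", 31]]  # mixed-type "age" field

def as_int(tagged):
    ty, v = tagged
    return v if ty == "int" else None  # mismatched values come back NULL

ages = [as_int(r) for r in rows]
# ages == [29, None, 31]
```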

  • What % of customers use SQL vs. JSON?

    • Most people use SQL

    • Query Lambdas - named queries exposed to developers who don't really know SQL

  • Challenges with Smart Schemas

    • Challenge 1: Additional CPU usage for processing queries

      • A relational database may store the schema separately from the data

      • Schemaless - a schema (type) is stored with every value

    • Challenge 2: Requires increased disk space

      • Uses a little more disk than strict relational tables

      • Type hoisting - if all the values for a given field have the same type, the type is set once at the beginning of the column - makes it a bit more efficient
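
Type hoisting can be sketched as a column encoder that stores the type once when every value agrees, and falls back to per-value tags otherwise. An illustrative sketch, not Rockset's actual encoding:

```python
# Sketch of type hoisting: a uniform-type column stores its type once;
# a mixed-type column falls back to a tag per value.

def encode_field(values):
    types = {type(v).__name__ for v in values}
    if len(types) == 1:
        # Hoisted: one type tag covers the whole column.
        return {"type": types.pop(), "values": values}
    # Mixed types: tag every value individually.
    return {"type": "mixed",
            "values": [(type(v).__name__, v) for v in values]}

hoisted = encode_field([1, 2, 3])
mixed = encode_field([1, "two", 3])
```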

  • Cloud Native and Scaling

    • Cost of 1 CPU for 100 minutes == cost of 100 CPUs for 1 minute

    • With cloud - dynamically provision for current demand

    • Challenge is in the software

    • Tailers are pure CPU - can be scaled up and down

    • Uses Kubernetes and AWS for scaling

    • Leaf nodes can also be scaled up

    • Spin up more aggregators for more queries

    • Shared storage

    • RocksDB-Cloud - a layer on top of RocksDB - every time a file gets produced it gets pushed to S3; durability comes from S3

    • Performance - uses zero-copy clones on S3
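
The push-to-S3 and zero-copy-clone ideas can be sketched conceptually: immutable files go to object storage on flush, and a clone is just a new instance referencing the same files. This is a hypothetical model of the concept, not the RocksDB-Cloud API:

```python
# Conceptual sketch of the RocksDB-Cloud idea. Names are invented;
# object_store stands in for S3.

object_store = {}  # path -> bytes

class CloudLSMInstance:
    def __init__(self, files=None):
        self.files = list(files or [])  # paths into the object store

    def flush_file(self, path, data):
        object_store[path] = data       # durability comes from the store
        self.files.append(path)

    def zero_copy_clone(self):
        # No data is copied: the clone shares the parent's immutable files.
        return CloudLSMInstance(self.files)

primary = CloudLSMInstance()
primary.flush_file("sst/000001.sst", b"sorted run 1")
replica = primary.zero_copy_clone()
```

Because flushed files are immutable, many readers can safely share them; a clone only starts diverging when it writes files of its own.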

  • Are you running the open source RocksDB?

    • We are running open source RocksDB

  • Compaction

    • It needs to be tuned

    • It typically runs on the node that runs RocksDB; Rockset has done remote compaction - the node makes an RPC to a compaction tier, enabled by the shared-storage architecture

  • ALT provides SQL on JSON

    • No need to manage indexes

  • Focus is always on latency rather than efficiency - spin up a bunch of machines to attack a query rather than trying to make it maximally efficient

  • Rockset Competitors

    • Apache Druid (Imply)

    • Similar architecture to MemSQL

  • Real time data is growing significantly

  • Ad optimization, A/B testing, fraud detection, real-time customer 360

  • Gaming, fleet tracking, real time recommendations

  • Rockset chooses the index based on the query type

  • Ingest compute and query compute can be scaled separately

  • Data volumes - 10s to 100s of TB; it tends to be data in active use, not petabytes - operating in real time doesn't require petabytes of data

  • Rockset has support for joins; no support for joins in Elasticsearch

  • To get around the lack of joins in Elasticsearch, people add a transformation step (joining with Spark) and then push to Elastic; this tends to duplicate storage and adds another tool/service to keep up

  • Ingesting data

    • Elastic - needs Beats, Logstash, and ingestion pipelines; needs to reindex as new documents come in

    • Rockset

      • Minimizes data latency; produce data and be able to query it within a second

  • Demo