Rockset
Rockset focus
Indexing for fast queries on large amounts of semi-structured data
CTO and Cofounder (Dhruba Borthakur) - worked on HDFS at Yahoo, creator of RocksDB - a storage engine built at Facebook
Rockset is a distributed system for data processing
Converged Indexing
Rockset Use Cases
Mostly batch processing
Hadoop - batch processing focused
Spark - Stream processing
Rockset - large data sets, super-low query latency, and low data latency (data is ready to query quickly)
SQL system, aggregations, joins
Queries have to be fast
High QPS - could be your user facing analytical database
Low data latency - avoid pipelining and ETL before it gets loaded into a queryable system
Focus: analytical applications, not reporting workloads; e.g. a fleet management company with trucks coming into a loading zone - what needs to be loaded onto each truck; an online gaming system with tons of players that needs leaderboards - the most recent data has to show up in the leaderboards
Rockset is real-time indexing database on live data and avoiding ETL/pipelines
Design
Analytical database - no transactions
Aggregator Leaf Tailer
It has streams coming from other databases
Tailers pull data from streams and write to leaf nodes; a higher volume of writes means more tailers
Leaf nodes store the data and make it queryable - more data to store means more leaf nodes
A two-level aggregator tier serves SQL applications - lots of queries means more aggregators
Different from the lambda architecture - writes are separated from the reads; latency must stay low even under burst traffic, since this is a user-facing database
Facebook's newsfeed is an analytical app - it needs to look at a ton of data and decide what to show you, quickly
Scale tailers, leafs, and aggregators
MemSQL, Scuba, and Facebook's spam reduction systems use this architecture; LinkedIn's feed uses it as well
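The three ALT tiers can be sketched in miniature - a toy model where each tier scales independently. Class names, routing, and interfaces here are illustrative assumptions, not Rockset internals:

```python
# Toy sketch of the Aggregator-Leaf-Tailer (ALT) architecture.
# Each tier scales independently: more write volume -> more tailers,
# more data -> more leaves, more queries -> more aggregators.

class Leaf:
    """Stores indexed documents and answers scan requests."""
    def __init__(self):
        self.docs = {}
    def write(self, record):
        self.docs[record["_id"]] = record
    def scan(self, predicate):
        return [d for d in self.docs.values() if predicate(d)]

class Tailer:
    """Pulls new records from an upstream stream and writes to leaves."""
    def __init__(self, leaves):
        self.leaves = leaves
    def ingest(self, record):
        # route the record to the leaf that owns its document
        leaf = self.leaves[hash(record["_id"]) % len(self.leaves)]
        leaf.write(record)

class Aggregator:
    """Fans a query out to all leaves and merges partial results."""
    def __init__(self, leaves):
        self.leaves = leaves
    def query(self, predicate):
        results = []
        for leaf in self.leaves:  # in practice, parallel RPCs
            results.extend(leaf.scan(predicate))
        return results

leaves = [Leaf() for _ in range(3)]
tailer = Tailer(leaves)
agg = Aggregator(leaves)
for i in range(10):
    tailer.ingest({"_id": i, "v": i * i})
print(len(agg.query(lambda d: d["v"] > 20)))  # 5
```

Writes flow only through tailers and reads only through aggregators, which is why the two sides can be provisioned separately.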
Benefits of ALT
Allows latency to be consistent; people really wanted consistently fast queries, which is why key-value stores got popular; SQL query execution times are generally inconsistent
Complex queries on the fly
Separates reads and writes (CQRS) - write compute is separated from read compute
Converged Indexing
NoSQL database; dump in JSON, XML, etc. - it builds indexes on everything in your data: a row-based index (like Postgres), a column store (like Snowflake), and an inverted index (like Elasticsearch)
No need to maintain indexes
Do you maintain consistency across indexes?
We have atomic updates but no ACID transactions
Leaf nodes house converged indexing
Storage - Rockset stores the data in three formats, all mapped onto key-value pairs; R keys hold the row index, C keys the column index, S keys the inverted index
Support for arrays and nested documents
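One document fanning out into R, C, and S key-value entries can be sketched like this - the key layouts are illustrative guesses, not Rockset's real encoding:

```python
# Sketch of converged indexing: one JSON document is shredded into
# row (R), column (C), and inverted-index (S) key-value entries.

def converged_index_entries(doc_id, doc):
    entries = []
    for field, value in doc.items():
        # R: row index - clustered by document, like a Postgres heap
        entries.append((("R", doc_id, field), value))
        # C: column store - clustered by field, like Snowflake
        entries.append((("C", field, doc_id), value))
        # S: inverted index - clustered by value, like Elasticsearch
        entries.append((("S", field, value, doc_id), None))
    return entries

doc = {"name": "igor", "city": "nyc"}
for key, val in sorted(converged_index_entries(42, doc)):
    print(key, val)
```

Because all three layouts live in one key space, the query planner can pick whichever clustering (by doc, by field, or by value) suits the query.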
Q: The index is stored in the same tablespace, right? It's all in the same tablespace?
RocksDB - aren't you leaving a lot of efficiency on the table? RocksDB supports page-level compression; the index prefix (the R/C/S marker) only needs to be a byte
RocksDB also has key delta encoding - roughly 1 byte of overhead per 4K bytes
Cockroach uses RocksDB
Most people go for a column store for large data sets, trying to reduce size
Inverted index - different encoding
Challenges with Converged Index?
Disk size could grow
Want atomic writes for consistency - need to update all indexes after a write happens
Rockset uses Doc Sharding
All indices for a doc are stored on one machine - the entire document and its indices live together
Query needs to fan out to all machines in database
Downside of sharding - if one machine is slow, the whole query may be slow
Complex SQL queries, however, can run in parallel across tons of machines - drastically reduces query latency
Elasticsearch does doc-sharding
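Doc sharding can be sketched as follows - the shard is chosen by document id, so every index entry for a document is co-located (shard count and key shapes are assumptions for illustration):

```python
# Sketch of doc sharding: all R/C/S entries for a document land on the
# shard chosen by its doc id, so one machine holds the document's row,
# column, and inverted-index entries together. Queries then fan out to
# all shards, but each shard works in parallel.

NUM_SHARDS = 4

def shard_for(doc_id):
    return hash(doc_id) % NUM_SHARDS

def place_entries(doc_id, fields):
    # never split by field or term - everything follows the doc id
    shard = shard_for(doc_id)
    return {(kind, field): shard for kind in "RCS" for field in fields}

placement = place_entries("doc-7", ["name", "city"])
assert len(set(placement.values())) == 1  # all entries co-located
```

The alternative (term sharding) would cluster entries by value instead, avoiding fan-out for single-term lookups but making multi-field document updates non-atomic across machines.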
RocksDB LSM
Log-structured merge tree
A single write path - random writes are amortized into sequential writes on storage; background compaction; handles sparse fields and is able to index them
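A minimal LSM-tree sketch of that write path - purely illustrative, nothing like RocksDB's actual implementation:

```python
# Minimal LSM sketch: writes land in an in-memory memtable; when full,
# it is flushed as a sorted, immutable run (one sequential write);
# background compaction merges runs back into fewer, larger runs.

class TinyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable, self.runs = {}, []
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value          # random write absorbed in RAM
        if len(self.memtable) >= self.limit:
            self.flush()

    def flush(self):
        self.runs.append(sorted(self.memtable.items()))  # sequential write
        self.memtable = {}

    def compact(self):
        merged = {}
        for run in self.runs:               # older runs first, newer win
            merged.update(dict(run))
        self.runs = [sorted(merged.items())]

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):     # newest run wins
            d = dict(run)
            if key in d:
                return d[key]
        return None

db = TinyLSM()
for i in range(10):
    db.put(f"k{i}", i)
db.compact()
print(db.get("k3"), len(db.runs))  # 3 1
```

Sparse fields fit naturally here: a key is only written when the field is present, so absent fields cost nothing.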
Smart Schemas
JSON but SQL queries
Automatically generates a schema based on the fields present at ingestion time; the schema is resolved when you make the query
Associate type with every value inside the column
Have joins, aggregations, sorts and SQL types
How does SELECT * work here?
You will get back NULLs for type mismatches (e.g. if age holds both strings and ints, the strings may come back as NULL)
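That per-value typing can be sketched like this - a toy model of tagging every value with its type and returning NULL (None) on a typed read mismatch; the function names are invented for illustration:

```python
# Sketch of "schemaless" typed values: each value carries a type tag,
# and a typed read returns NULL (None) on mismatch - mirroring how
# SELECT may return NULLs when a field mixes strings and ints.

docs = [
    {"name": "ana", "age": 29},
    {"name": "bo",  "age": "twenty"},   # same field, different type
]

def typed(value):
    return (type(value).__name__, value)  # tag each value with its type

def read_as(doc, field, want_type):
    tag, value = typed(doc.get(field))
    return value if tag == want_type else None  # NULL on type mismatch

ages = [read_as(d, "age", "int") for d in docs]
print(ages)  # [29, None]
```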
What % of customers use SQL vs. JSON?
Most people use SQL
Query Lambdas - expose saved queries to developers that don't really know SQL
Challenges with Smart Schemas
Challenge 1: Additional CPU usage for processing queries
Relational database may have separate schema and data storage
Schemaless - schema for every value
Challenge 2: Requires increased disk space
Uses a little more disk than strict relational tables
Type hoisting - if all the values for a given field have the same type, set the type once at the beginning instead of per value - makes it a bit more efficient
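Type hoisting can be sketched as a column encoder - store one type tag for a homogeneous column, fall back to per-value tags otherwise (a toy model, not Rockset's encoding):

```python
# Sketch of type hoisting: if every value in a field's column shares a
# type, store the type tag once in the column header instead of per value.

def encode_column(values):
    types = {type(v).__name__ for v in values}
    if len(types) == 1:
        # homogeneous: hoist the single type to the column header
        return {"type": types.pop(), "values": values}
    # mixed types: fall back to tagging every value individually
    return {"type": "mixed",
            "values": [(type(v).__name__, v) for v in values]}

print(encode_column([1, 2, 3]))        # {'type': 'int', 'values': [1, 2, 3]}
print(encode_column([1, "two", 3.0]))  # per-value tags
```

This recovers most of the space a strictly-typed relational column would use, while still allowing mixed-type fields.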
Cloud Native and Scaling
Cost of 1 CPU for 100 minutes == cost of 100 CPUs for 1 minute
With cloud - dynamically provision for current demand
Challenge is in the software
Tailers are pure compute - can be scaled up and down
Use kubernetes and AWS for scaling
Leaf nodes - also can be scaled up
Spin up more aggregators for more query throughput
Shared storage
RocksDB-Cloud - a layer on top of RocksDB - every time a file gets produced it gets pushed to S3; durability comes from S3
Performance - we use zero-copy clones on S3
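The RocksDB-Cloud idea can be sketched with a fake object store - file names, classes, and the manifest mechanism here are invented for illustration:

```python
# Sketch of the RocksDB-Cloud model: each locally produced file is also
# uploaded to object storage (a dict standing in for S3), so durability
# comes from S3 and local disk is just a cache. A "zero-copy clone"
# creates a new instance referencing the same S3 objects, copying nothing.

class FakeS3:
    def __init__(self):
        self.objects = {}
    def put(self, key, data):
        self.objects[key] = data

class CloudDB:
    def __init__(self, s3, manifest=None):
        self.s3 = s3
        self.manifest = list(manifest or [])  # which S3 files this DB owns

    def flush_file(self, name, data):
        self.s3.put(name, data)               # durability from S3
        self.manifest.append(name)

    def zero_copy_clone(self):
        # new DB shares the same S3 objects; no bytes are copied
        return CloudDB(self.s3, self.manifest)

s3 = FakeS3()
db = CloudDB(s3)
db.flush_file("000001.sst", b"rows...")
clone = db.zero_copy_clone()
print(clone.manifest, len(s3.objects))  # ['000001.sst'] 1
```

With files durable in S3, compute nodes become disposable, which is what makes the scaling story above work.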
Are you running the open-source RocksDB?
We are running open-source RocksDB
Compaction
It needs to be tuned
It typically runs on the node that runs RocksDB; we have done remote compaction - the node makes an RPC to a compaction tier; enabled by the shared storage architecture
ALT provides SQL on JSON
no need to manage indexes
Focus is always on latency and not efficiency - we will spin up a bunch of machines to attack a query rather than try to make it the most efficient
Rockset Competitors
Apache Druid (Imply)
Similar architecture to MemSQL
Real time data is growing significantly
Ad optimization, A/B testing, fraud detection, real-time customer 360
Gaming, fleet tracking, real time recommendations
Rockset chooses the index based on the query type
Ingest compute and query compute can be scaled separately
Data volumes - 10s to 100s of TB; this tends to be data in active use, not petabytes; customers don't need petabytes of data to operate in real time
Rockset has support for joins; no support for joins in elasticsearch
To get around the lack of joins in Elasticsearch, people add a transformation step (join using Spark) and then push to Elastic; this tends to duplicate storage and adds an additional tool/service to keep up
Ingesting data
Elastic - needs Beats, Logstash, and ingestion pipelines; needs to reindex as new documents come in
Rockset
Minimizes data latency; produce data and able to query it in a second
Demo