Postgres
Join Planning
If the query plan is good, then query performance will be acceptable
A query might have 20 or more joins in it
Postgres - lots of great open-source tooling around it
pgAdmin - simple Postgres admin interface
Tons of plugin libraries
Performance
Multi-version concurrency control - many concurrent queries can run without blocking each other
Non-blocking indexes - create and rebuild indexes without blocking reads and writes on the table
Partial Indexes - can index a subset of the data (see the sketch after this list)
Datatypes - boolean, ranges (e.g. time ranges), arrays, XML, JSONB, money, blobs
Indexes - help with performance - B-tree, GIN, GiST, BRIN, Hash
Extensions - cron jobs, stored procedures, fuzzy matching, PostGIS
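A minimal sketch of two of the features above - partial indexing and JSONB indexing - using a hypothetical events table:

    -- Partial index: index only the subset of rows queries actually filter on
    CREATE TABLE events (id bigint, payload jsonb, archived boolean);
    CREATE INDEX events_active_idx ON events (id) WHERE NOT archived;

    -- GIN index on a JSONB column to speed up containment queries
    CREATE INDEX events_payload_idx ON events USING GIN (payload);
    SELECT * FROM events WHERE payload @> '{"type": "login"}';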
Episode 496: Bruce Momjian on Multi-Version Concurrency Control in Postgres (MVCC)
Stonebraker built Postgres for extensibility, and it has come in handy a ton
I of ACID - Isolation
You interact with the database as if you were the only one using it
Static data is the easiest to work with; volatile data with multiple users is hard
ACID capabilities make it easier to interact with the database
Isolation - I won't see other people's changes while I'm doing my own work
Complex stuff gets pushed into the database; all of the shared state is easier to manage inside the database than having applications share everything with each other
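A two-session psql sketch of what that isolation guarantee looks like, assuming a hypothetical accounts table:

    -- Session A
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    -- not yet committed

    -- Session B, meanwhile
    SELECT balance FROM accounts WHERE id = 1;
    -- still sees the old balance; A's uncommitted change is invisible

    -- Session A
    COMMIT;
    -- session B's next query now sees the new balance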
Locking as a solution?
One giant lock - prevent anyone else from getting in while I'm in; that doesn't work well; concurrency is terrible
1970s-90s databases - the approach was: let's not lock everything; make the locks granular. You would lock one table at a time; while I was in it, no one else could get in
Then they got to the point where they could lock pages; data is broken down into pages - I lock and modify only the pages I touch
Some databases got to row-level locking - lock only the row I'm modifying
First problem - a huge amount of locking; it doesn't really solve the concurrency problem, it just chops it into smaller pieces
Second problem - a worse one: lock escalation. The database might not know what your intent was; if you locked a lot of rows it might lock the whole page - escalate the lock - and you can imagine multiple people doing these things together and running into each other
Early 90s and prior, lock escalation was a huge issue; as a job got bigger it started spilling out into other places
Informix - 1990s; I would run it at 8:30pm; you needed to guess how big to make the lock table to make sure you didn't overflow it
So hard to track these locks
Databases became this ugly thing that was hard to deal with
If all we have are reads then everyone can share; if everyone has read and write access, it's an issue
When you have a single copy of the row, as soon as one writer modifies it the old value is not in the table anymore
All of a sudden, not only were the writers affected but also the readers
Multi-version concurrency control
Different method of doing database updates
The space-saving way was to keep one copy of the row; the way this paper solved it was to create multiple versions of a single row. This seemed ludicrous - you could have 5 copies of the same row in the database
The problem is someone coming to read data while someone else is updating it. If we create a new version of the row with the new data, we can let readers read the old version while the newer version is being created (maybe committed, maybe not). That lets the database hand out snapshots to users; the problem of readers being blocked by writers goes away
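You can actually watch those row versions via the hidden xmin/xmax system columns - a sketch, using a hypothetical table t:

    CREATE TABLE t (id int, val text);
    INSERT INTO t VALUES (1, 'old');

    BEGIN;
    UPDATE t SET val = 'new' WHERE id = 1;
    -- xmin shows the transaction that created this (new) row version
    SELECT xmin, xmax, * FROM t;
    COMMIT;
    -- the old version still exists physically until pruning/vacuum removes it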
Snapshot
A record that gets created when you start your query; once you take that snapshot at the beginning, it lets the database distinguish which of the multiple versions of a row should be visible to you
The snapshot says which of the 5 versions of a row is visible to my transaction; maybe one of them, maybe none
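On Postgres 13+ you can inspect your snapshot directly (older releases use txid_current_snapshot(); the output shown is illustrative):

    -- Returns xmin:xmax:xip_list - the boundaries and in-progress
    -- transactions that determine which row versions you can see
    SELECT pg_current_snapshot();
    -- e.g. 748:752:749,751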
Transaction IDs
Read-only queries like SELECT typically don't get a transaction ID assigned
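A sketch on Postgres 13+ showing that a transaction gets no ID until it writes (reusing the t table from the sketch above):

    BEGIN;
    SELECT 1;                                  -- read-only so far
    SELECT pg_current_xact_id_if_assigned();   -- NULL: no xid assigned yet
    INSERT INTO t VALUES (2, 'x');             -- first write...
    SELECT pg_current_xact_id_if_assigned();   -- ...now an xid is assigned
    COMMIT;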
Visibility
Who has visibility into the data at what time
If someone writes a 1 and someone else writes a 2 at the same time, what do I see?
It depends on when the query was started
Someone who started a transaction before me or after me potentially could see a different set of values
High-write-volume systems need to split apart the visibility concept and let different people see different things - that's what MVCC helps you do
What is the algorithm for determining which row version to read or write if there are multiple versions of the same row?
When you get your snapshot at the beginning, you are guaranteed to see all transactions committed before your snapshot
Corollary: any work in progress, or work that starts after your snapshot, will not be visible to you
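Postgres 13+ exposes that visibility check directly (the xid value here is illustrative):

    -- Is transaction 748's work visible to my current snapshot?
    -- True only if 748 committed before my snapshot was taken
    SELECT pg_visible_in_snapshot('748'::xid8, pg_current_snapshot());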
Readers never block writers; writers never block readers
Readers clearly don't block other readers
What about two writes at the same time?
Writers do block other writers
We have to know whether the previous transaction completes or not, because we need to update the most recent version of the row
We have to see the newest version of that row; when you try to update a row where a new version has been inserted but not committed, we block the INSERT or UPDATE until that transaction commits or aborts - we take a lock on it
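A two-session sketch of writers blocking writers, with the hypothetical accounts table again:

    -- Session A
    BEGIN;
    UPDATE accounts SET balance = 500 WHERE id = 1;   -- takes a row lock

    -- Session B
    UPDATE accounts SET balance = 600 WHERE id = 1;
    -- blocks here until A commits or aborts, because B must
    -- update the newest committed version of the row

    -- Session A
    COMMIT;   -- session B now proceeds against the new row version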
In some distributed databases you might have situations where you don't lock
Transaction Isolation levels
Read committed - every new statement gets a new snapshot; every query in a multi-statement transaction sees the latest committed data
Repeatable read - all statements in a transaction get one snapshot; all queries in the transaction see a consistent view
Serializable - as you run through a multi-statement transaction you may read rows as they are updated (but not update them yourself); the database checks whether anything was modified underneath you while your multi-statement transaction was running (syntax sketched below)
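The syntax for picking a level (read committed is the default), again with the hypothetical accounts table:

    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT count(*) FROM accounts;     -- snapshot taken at first query
    -- every later statement in this transaction sees the same snapshot
    SELECT sum(balance) FROM accounts;
    COMMIT;

    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- may abort at COMMIT with a serialization failure if concurrent
    -- transactions would produce a result no serial order could
    COMMIT;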
Dirty Read idea
Discard ACID requirements
I want to see data while it's being modified; I don't care about my snapshot
It used to be (pre-MVCC) that you would just run the query and add caveats about the accuracy of the numbers
MVCC allows readers to read freely - most of the time the result is pretty accurate, but obviously it's based on the snapshot at the time of the query
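Worth noting: Postgres accepts the READ UNCOMMITTED syntax but, because of MVCC, treats it as READ COMMITTED - there is no true dirty read:

    -- Accepted, but behaves as READ COMMITTED:
    BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT * FROM accounts;   -- never sees other sessions' uncommitted rows
    COMMIT;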
Atomicity - important to Postgres users; you wouldn't want to see changing data as transactions are processing
Extensibility of Postgres - how to add new data type with MVCC
A lot of databases have tried to add extensibility after the fact; Postgres was designed initially for it
Has a lot of system tables (catalogs) that store indexing methods, aggregates, stored-procedure languages, and custom data types
Define an input function, how long the type will be, and an output function, plus casting functions to move between data types - not super hard to set up; Postgres fits the new type into rows and puts transaction IDs on the front like any other
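A skeleton of what that setup looks like - the type and function names are hypothetical, and the input/output functions themselves would be C functions registered with CREATE FUNCTION:

    -- Hypothetical fixed-length type
    CREATE TYPE complex (
        INPUT          = complex_in,    -- text -> internal form
        OUTPUT         = complex_out,   -- internal form -> text
        INTERNALLENGTH = 16             -- "how long it will be"
    );

    -- Optional: a cast to move between data types
    -- CREATE CAST (complex AS point) WITH FUNCTION complex_to_point(complex);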
Do most Enterprise databases support MVCC?
Oracle does - take old row and put it into an undo segment - pointers to different versions
Microsoft has it as an option - tried to add MVCC; hard to retroactively add it; a lot of SQL Server code was used to the locking behavior
DB2 has it available but not default
Cassandra uses something similar maybe; maybe for MySQL and MariaDB too
Postgres is a bit unusual - just leaves the rows in the table
When did Postgres add MVCC?
We added it in 2001/02
Postgres had the idea of time travel - see the database as of last week; you would keep old versions, allowing access to old versions of the row
Clean up of rows can be hard in high write systems
Clean Up process
Pruning - a lightweight operation that removes old versions of a row; as you do a sequential scan, you see the transaction IDs and which versions were expired by which transactions; if no one can see a specific row version anymore, the system restructures the page and frees up the space
Indexes - splitting a page is a generally expensive operation; often there were a lot of dead rows in the index page and you didn't actually need to split - cleaning those out first could have great benefits for Postgres
Autovacuum - wakes up every minute; frees space in the background that queries won't really be working with
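A sketch of watching and triggering cleanup by hand (table name hypothetical):

    -- How many dead row versions are waiting to be cleaned up?
    SELECT relname, n_dead_tup, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'accounts';

    -- Trigger cleanup manually instead of waiting for autovacuum
    VACUUM (VERBOSE) accounts;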
Garbage Collection is similar
C - Postgres is written in C; you allocate everything and free everything yourself
Perl - the language counts references; frees memory when the reference count hits 0
Java case - allocate stuff on the fly; garbage collector looks at all objects and says which ones are visible and which ones have been thrown away
Oracle - more similar to Perl - undo segment where old rows go to; manage references to that a bit differently
Storage Space Re-use
Page pruning and autovacuum - take data that is no longer useful and free it up; a page that was 90% full is now less full; if pages at the end of the table are empty, the table can be truncated to get rid of them
Same with indexes too
We don't free up all potential space to the operating system - if you delete every other row, we might not free up any of it; there is a manual command, VACUUM FULL, that can compress the table down, but it locks the table, so it's tough to run - a heavyweight maintenance operation
Can't unsplit a B-tree page easily - you'd have to lock it, create a new copy, and delete the old one
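The corresponding commands (table name hypothetical; VACUUM FULL takes an exclusive lock, as noted above):

    VACUUM accounts;        -- frees space inside pages for reuse; may
                            -- truncate empty pages at the end of the table
    VACUUM FULL accounts;   -- rewrites the whole table compactly, but
                            -- locks it against reads and writes meanwhile
    REINDEX TABLE accounts; -- rebuilds bloated indexes (also takes locks)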
SQL developers
Generally need to know how these MVCC mechanics work
Write amplification
Struggle with Postgres
Because of MVCC, we issue somewhat more writes than other systems; we keep the old and new versions within the same page/table and age them out; garbage collection issues its own writes to storage
Hint bits - tell us whether a transaction committed or aborted; we update them at various times
Freeze operations to deal with transaction ID wraparound
It's a bit more write-heavy than other databases
Its getting better but hard to solve without adding other issues
MVCC comes with costs - hint bits, cleanup, freezing for wraparound
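A sketch of checking how close tables are to needing a freeze (limit and table name are illustrative):

    -- age(relfrozenxid) = how many transactions since this table was
    -- last fully frozen; autovacuum forces a freeze as this grows large
    SELECT relname, age(relfrozenxid)
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY age(relfrozenxid) DESC
    LIMIT 5;

    VACUUM (FREEZE) accounts;   -- freeze a table manually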