Chapter 7: Transactions

Problem:
- Database, application, network could fail at any time
- Race conditions can occur
Transactions: method for grouping several reads & writes together
- Either succeeds (commit) or fails (abort/rollback)
- Don’t have to deal with partial failures
Created to simplify the programming models for apps communicating with dbs
Not everything needs transactions
Almost all relational dbs and some others have them today, tracing back to IBM System R (first SQL db)
- With some new dbs & models (nosql/graph, distributed, partitioned), sometimes transactions were abandoned
The Meaning of ACID
- Atomicity
  - In multi-threading, one thread executing atomic operation means no way another thread could see half-finished result of that operation
    - In the ACID context, this is really Isolation, not Atomicity!
  - In ACID, atomicity refers to a how a batch of writes either all succeed or all fail (rollback/abort if fails halfway due to network, disk full, integrity constraint, etc) - safe to retry on failure
  - “Abortability” or “all-or-nothing” would be a better name
- Consistency
  - Super overloaded term (replica consistency, eventual consistency, consistent hashing, CAP theorem)
  - In ACID, consistency = application specific notion of db being in a good state
    - The data’s invariants remain true (e.g. balanced credits/debits in a financial db)
    - The app dev still has to specify the invariants and write transactions that satisfy them
    - Sometimes db can help with e.g. foreign key & uniqueness constraints
    - Consistency in ACID is a property of the application - C doesn’t really belong in ACID
- Isolation
  - Concurrently executing transactions are isolated from one another and cannot step on each other’s toes
  - Serializability = each transaction can pretend it’s the only one running on the db
    - They may actually run concurrently, but result is same as if they had executed serially
  - Most dbs don’t implement serializability as it’s a performance hit
  - Snapshot isolation is a weaker but more often used feature
- Durability
  - Once a transaction has committed successfully, data it has written won’t be forgotten even if there is a hardware fault or crash
  - Usually involves non-volatile storage like hard drive or SSD and write-ahead logs
  - In replicated db, may mean that some critical number of nodes received data (defines commit completion)
  - No such thing as perfect durability (e.g. all hard disks and backups destroyed)
Single-Object and Multi-Object Operations
- Multi-object writes
  - Objects = rows/documents/records
  - Atomicity = all-or-nothing guarantee, Isolation = transactions isolated from each other
  - Lack of isolation may result in a user seeing no 0 unread despite there being one
  - Lack of atomicity could result in inconsistent state
  - Multi-object transactions need a way to track all the operations for the same transaction
    - Could be TCP connection identifier, but not ideal as it couples the connection and the transaction
    - Better to use a transaction manager of sorts
- Single-object writes
  - e.g. updating a 20kB JSON document
    - network disconnect after 10kB written
    - power fails when partially updated
    - read occurs while write in progress
  - For one node, single-object, dbs almost always try to handle these issues to guarantee atomicity & isolation
  - Common too is increment over read-modify-write and compare-and-set (change as long as it hasn’t been changed by someone else)
  - Note that these aren’t really transactions - transactions involve grouping multiple operations across multiple objects into one execution unit
- The need for multi-object transactions
  - Applications can go reasonably far with only single object insert/update/delete and single-object operations
  - Cases where multi-object coordination is needed:
    - Inserting several records that refer to one another with foreign key refs
    - Updating denormalized data in document dbs
    - Updating primary and (maybe multiple) secondary indexes together
- Handling errors and aborts
  - ACID dbs behave like “if atomicity, isolation, or durability is at risk, abort transaction”
  - Leaderless replication dbs do “best effort” more often - puts responsibility on app dbs to recover from errors
  - The whole point of aborts is to enable safe retries, so shame Django/ActiveRecord just bubble up aborts as errors
  - You don’t always want to retry though:
    - If transaction actually succeeded but there was network error
    - DB overload causing error, retries just increase load
    - If error is due to permanent error (e.g. constraint violation), pointless to retry
    - Undesirable to run side effects of transaction (eg email) twice
Weak Isolation Levels
- Can run N transactions in parallel as long as they don’t touch the same data
- Hard to catch concurrency issues with testing
- Common for dbs to not have full serializability (pretend nothing runs concurrently) because of performance issues - often just implement “weak isolation” which only protect against some concurrency issues
  - Even many financial-targeted ACID dbs only have weak isolation, and has caused major money loss
- Read Committed
  - 2 guarantees:
    - Only read data that has been committed (no dirty reads)
    - Only overwrite data that has been committed (no dirty writes)
  - No dirty reads
    - Only read any (new) data after all data committed
    - Prevents users/other transactions from seeing the db in a partially updated state
    - Prevents reads of uncommitted and then rolled-back data from aborted transactions
  - No dirty writes
    - Dirty write: earlier write is part of an uncommitted transaction and later transaction overwrites uncommitted value
    - Note that when the second write happens after the first commit, that’s NOT a dirty write, but it’s still prone to the race counter condition
  - Implementing Read Committed
    - Very popular - default in Oracle, Postgres, MemSQL, etc
    - Use row-level locks: when a transaction wants to modify an object, first acquire a lock on that object. Hold lock until committed or aborted.
    - Preventing dirty reads: could use the same lock as the transaction for reads, but not good in practice - long-running write transactions block all reads
- Snapshot Isolation and Repeatable Read
- Preventing Lost Updates
- Write Skew and Phantoms
Serializability