• Problem:
    • Database, application, network could fail at any time
    • Race conditions can occur
  • Transactions: method for grouping several reads & writes together
    • Either succeeds (commit) or fails (abort/rollback)
    • Don’t have to deal with partial failures
  • Created to simplify the programming models for apps communicating with dbs
  • Not everything needs transactions
  • Almost all relational dbs and some others have them today, tracing back to IBM System R (first SQL db)
    • With some new dbs & models (nosql/graph, distributed, partitioned), sometimes transactions were abandoned
  • The Meaning of ACID
    • Atomicity

      • In multi-threading, one thread executing atomic operation means no way another thread could see half-finished result of that operation
        • In the ACID context, this is really Isolation, not Atomicity!
      • In ACID, atomicity refers to a how a batch of writes either all succeed or all fail (rollback/abort if fails halfway due to network, disk full, integrity constraint, etc) - safe to retry on failure
      • “Abortability” or “all-or-nothing” would be a better name
    • Consistency

      • Super overloaded term (replica consistency, eventual consistency, consistent hashing, CAP theorem)
      • In ACID, consistency = application specific notion of db being in a good state
        • The data’s invariants remain true (e.g. balanced credits/debits in a financial db)
        • The app dev still has to specify the invariants and write transactions that satisfy them
        • Sometimes db can help with e.g. foreign key & uniqueness constraints
        • Consistency in ACID is a property of the application - C doesn’t really belong in ACID
    • Isolation

      Untitled

      • Concurrently executing transactions are isolated from one another and cannot step on each other’s toes
      • Serializability = each transaction can pretend it’s the only one running on the db
        • They may actually run concurrently, but result is same as if they had executed serially
      • Most dbs don’t implement serializability as it’s a performance hit
      • Snapshot isolation is a weaker but more often used feature
    • Durability

      • Once a transaction has committed successfully, data it has written won’t be forgotten even if there is a hardware fault or crash
      • Usually involves non-volatile storage like hard drive or SSD and write-ahead logs
      • In replicated db, may mean that some critical number of nodes received data (defines commit completion)
      • No such thing as perfect durability (e.g. all hard disks and backups destroyed)
  • Single-Object and Multi-Object Operations
    • Multi-object writes

      • Objects = rows/documents/records
      • Atomicity = all-or-nothing guarantee, Isolation = transactions isolated from each other
      • Lack of isolation may result in a user seeing no 0 unread despite there being one

      Untitled

      • Lack of atomicity could result in inconsistent state

      Untitled

      • Multi-object transactions need a way to track all the operations for the same transaction
        • Could be TCP connection identifier, but not ideal as it couples the connection and the transaction
        • Better to use a transaction manager of sorts
    • Single-object writes

      • e.g. updating a 20kB JSON document
        • network disconnect after 10kB written
        • power fails when partially updated
        • read occurs while write in progress
      • For one node, single-object, dbs almost always try to handle these issues to guarantee atomicity & isolation
      • Common too is increment over read-modify-write and compare-and-set (change as long as it hasn’t been changed by someone else)
      • Note that these aren’t really transactions - transactions involve grouping multiple operations across multiple objects into one execution unit
    • The need for multi-object transactions

      • Applications can go reasonably far with only single object insert/update/delete and single-object operations
      • Cases where multi-object coordination is needed:
        • Inserting several records that refer to one another with foreign key refs
        • Updating denormalized data in document dbs
        • Updating primary and (maybe multiple) secondary indexes together
    • Handling errors and aborts

      • ACID dbs behave like “if atomicity, isolation, or durability is at risk, abort transaction”
      • Leaderless replication dbs do “best effort” more often - puts responsibility on app dbs to recover from errors
      • The whole point of aborts is to enable safe retries, so shame Django/ActiveRecord just bubble up aborts as errors
      • You don’t always want to retry though:
        • If transaction actually succeeded but there was network error
        • DB overload causing error, retries just increase load
        • If error is due to permanent error (e.g. constraint violation), pointless to retry
        • Undesirable to run side effects of transaction (eg email) twice
  • Weak Isolation Levels
    • Can run N transactions in parallel as long as they don’t touch the same data
    • Hard to catch concurrency issues with testing
    • Common for dbs to not have full serializability (pretend nothing runs concurrently) because of performance issues - often just implement “weak isolation” which only protect against some concurrency issues
      • Even many financial-targeted ACID dbs only have weak isolation, and has caused major money loss
    • Read Committed
      • 2 guarantees:

        • Only read data that has been committed (no dirty reads)
        • Only overwrite data that has been committed (no dirty writes)
      • No dirty reads

        • Only read any (new) data after all data committed

        Untitled

        • Prevents users/other transactions from seeing the db in a partially updated state
        • Prevents reads of uncommitted and then rolled-back data from aborted transactions
      • No dirty writes

        Untitled

        • Dirty write: earlier write is part of an uncommitted transaction and later transaction overwrites uncommitted value
        • Note that when the second write happens after the first commit, that’s NOT a dirty write, but it’s still prone to the race counter condition

        Untitled

      • Implementing Read Committed

        • Very popular - default in Oracle, Postgres, MemSQL, etc
        • Use row-level locks: when a transaction wants to modify an object, first acquire a lock on that object. Hold lock until committed or aborted.
        • Preventing dirty reads: could use the same lock as the transaction for reads, but not good in practice - long-running write transactions block all reads
    • Snapshot Isolation and Repeatable Read
    • Preventing Lost Updates
    • Write Skew and Phantoms
  • Serializability