Data Consistency Patterns in Cloud Native Applications Ryan Knight @knight_cloud
Agenda • What is Data Consistency? • Data Consistency in Microservices • Application Tier Consistency • Strong Consistency with Distributed Databases • Linearizable Consistency Patterns
No One Solution “Do you want your data right or right now?” - Pat Helland PACELC Theorem -> More than CAP • In the absence of network partitions the trade-off is between latency and consistency - Daniel Abadi Understand what types of concurrency problems exist Evaluate trade-offs in the differing approaches Minimize Development Complexity
What is Data Consistency?
Consistency Challenges • Dirty Reads - reading uncommitted writes • Read Skew / Non-Repeatable Reads • Read your own Writes • Lost Updates • Write Skew
Write Skew Two concurrent transactions each determine what they are writing based on reading a data set which overlaps what the other is writing begriffs.com
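Below is a minimal Scala sketch of write skew, using a hypothetical on-call example that is not from the deck: both transactions validate the invariant ("at least one doctor stays on call") against the same snapshot, both writes succeed individually, and the combined result breaks the rule.

```scala
// Write skew sketch: two concurrent transactions each read the same snapshot,
// decide their write based on it, and neither write conflicts with the other,
// yet the merged result violates the invariant.
object WriteSkewDemo extends App {
  // Shared snapshot both transactions read before either commits: doctor -> onCall
  val snapshot = Map("alice" -> true, "bob" -> true)

  // Each transaction checks the invariant against the snapshot it read.
  def requestLeave(doctor: String, snap: Map[String, Boolean]): Option[(String, Boolean)] = {
    val onCall = snap.count(_._2)
    if (onCall >= 2) Some(doctor -> false) else None // "safe" to go off call
  }

  // Both checks pass because both saw two doctors on call.
  val writes = List(requestLeave("alice", snapshot), requestLeave("bob", snapshot)).flatten
  val committed = snapshot ++ writes
  println(committed)                  // Map(alice -> false, bob -> false)
  println(committed.count(_._2) >= 1) // false: invariant violated, classic write skew
}
```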
Consistency in ACID Transactions ACID - Atomic, Consistent, Isolated and Durable • Many different levels of ACID Atomicity • All or Nothing - it all happened or it didn't. • You don't see things in the in-between state while being processed. • Ability to abort an operation and roll back Isolation • Concurrently executing transactions are isolated from each other. • Lets the application developer pretend there are not two operations happening in parallel. Durable - Writes Stick Consistent • Enforce Invariants • Data must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof THIS IS NOT THE CONSISTENCY WE ARE LOOKING FOR
Consistency Models - credit to Peter Bailis and Aphyr, jepsen.io/consistency
Linearizable and Serializable Consistency Serializability - multi-operation, multi-object, arbitrary total order Linearizability - single-operation, single-object, real-time order Linearizability plus Serializability provides Strict Serializability Peter Bailis - Linearizability versus Serializability
What is Serializability? Serializability Consistency • Transaction isolation. • Concurrency issues when one transaction reads data that is concurrently modified by another transaction. Or when two transactions try to simultaneously modify the same data. • Database guarantees that two transactions have the same effect as if they were run serially. Or they have the illusion of running one at a time without any concurrency.
Levels of Serializable Isolation Repeatable Reads - Read and Write Locks • Prevents Phantom Reads • Write skew still possible Read Committed - Write Locks that prevent: • Dirty reads - only see data once the transaction is committed • Dirty writes - only overwrite data that has been committed Read Uncommitted
Linearizability Eventual vs. Strong Consistency is talking about Linearizability Guarantees that the order of reads and writes to a single register or row will always appear the same on all nodes. Appearance that there is only one copy of the data. It doesn’t group operations into transactions. It doesn’t address problems like dirty reads, phantom reads, etc.. Guarantees read-your-write behavior
Linearizable Consistency in CAP Theorem CAP Theorem is about “atomic consistency” • Atomic consistency refers only to a property of a single request/response operation sequence. • Linearizability
AP w/ Session Based Consistency (yellow nodes in the consistency models diagram) Causal consistency Monotonic reads / writes Strong consistency with a single process only Not isolated
AP Consistency (blue nodes in the consistency models diagram) Sacrifice consistency for higher availability and partition tolerance Maintains availability even when the network is down Monotonic atomic view Read committed / uncommitted
CP Consistency Strict Serializable - Combines Serializable plus Linearizability Provides Highest Level of Consistency Moves Complexity of Transactions out of Microservices into the Database True ACID Transactions Atomicity allows for rollback of transactions Complete Isolation of Transactions High Levels of Safety Guarantees
Data Consistency in Microservices
From Monolith to Microservices Data Consistency was easy in a monolith application - single source of truth w/ ACID transactions With the move to microservices, each service became a bounded context that owns and manages its data Data Consistency became very difficult w/ microservices
Consistency Challenges with Data in Microservices • Traditional ACID transactions did not scale • Data orchestration between multiple services - as the number of microservices increases, the number of interactions increases • Stateful or Stateless • Data rehydration for things like service failures and rolling updates
Eventual Consistency CAP Theorem • Forces a choice between Global Scale and Strong Consistency Eventual Consistency • Sacrificed consistency for availability and partition tolerance. • Really a Necessary Evil • Last Write Wins - What if I can't lose a write? • Write now and figure it out later Pushed complexity of managing consistency to the application tier
Return of Strong Consistency Rise of Databases providing strong consistency and global scale Possible to push complexity of consistency back to the database Not a panacea for data consistency challenges
Distributed System Design Heart of distributed system design is a requirement for a consistent, performant, and reliable way of managing data.
Application Tier Consistency
Advantages of Application Tier Consistency Low Read / Write Latency High-Throughput Read your Writes - Same session only Requires application to enforce session stickiness

Disadvantages of Application Tier Consistency No Isolation and limited atomicity Consistency problems are far harder to solve in the application tier Increased Complexity
Use Cases of Application Tier Consistency Music Playlists Shopping Carts Social Media Posts
Patterns for Application Tier Consistency Sticky Sessions • Session Consistency • Differing Levels of Linearizability • Example - Akka Clustering
Sticky Sessions Whether or not read-your-write, session and monotonic consistency can be achieved depends in general on the "stickiness" of clients to the server that executes the distributed protocol for them. If this is the same server every time then it is relatively easy to guarantee read-your-writes and monotonic reads. - Werner Vogels 2007
Akka Clustering Pin Session to an Actor - Sticky Session Akka Clustering Libraries • Cluster Sharding • Cluster Singleton • Cluster Proxy Akka Persistence Akka Distributed Data w/ Conflict Free Replicated Data Types (CRDTs)
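A minimal sketch of the sticky-session idea with Akka Cluster Sharding (Akka 2.6 typed API). The CartSession entity, its message protocol, and the entity id are hypothetical; cluster configuration, persistence, and passivation are omitted. All messages for one entity id land on the same actor, which is what pins the session.

```scala
// Sticky session via Cluster Sharding: one actor per entity id, single writer.
import akka.actor.typed.{ActorSystem, Behavior}
import akka.actor.typed.scaladsl.Behaviors
import akka.cluster.sharding.typed.scaladsl.{ClusterSharding, Entity, EntityTypeKey}

object CartSession {
  sealed trait Command
  final case class AddItem(item: String) extends Command

  // Messages for a given entityId are routed to the same actor instance,
  // which serialises updates: this is the "pin session to an actor" pattern.
  val TypeKey: EntityTypeKey[Command] = EntityTypeKey[Command]("CartSession")

  def apply(entityId: String): Behavior[Command] = cart(Nil)

  private def cart(items: List[String]): Behavior[Command] =
    Behaviors.receiveMessage {
      case AddItem(item) => cart(item :: items) // single-writer, ordered updates
    }
}

object ShardingSetup {
  def init(system: ActorSystem[_]): Unit = {
    val sharding = ClusterSharding(system)
    sharding.init(Entity(CartSession.TypeKey)(ctx => CartSession(ctx.entityId)))
    // Callers address the session by id; sharding locates (or creates) the actor.
    val session = sharding.entityRefFor(CartSession.TypeKey, "session-42")
    session ! CartSession.AddItem("book")
  }
}
```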
What are CRDT’s? CRDT - Conflict Free Replicated Data Types Data types that guarantee convergence to the same value without any synchronization mechanism Consistency without Consensus Avoid distributed locks, two-phase commit,  etc. Data Structure that tells how to build the value Sacrifice linearizability (guaranteed ordering ) while remaining correct
Akka Distributed Data CRDT’s Monotonic Sequences - Sequence that always increases or always decreases Monotonic Sequences are eventually consistent without any need for coordination protocols
 GCounter, PNCounter - Grow Only Counter / Positive Negative Counter GSet, ORSet - Grow Only Set, Observe Remove Set ORMap, PNCounterMap, LWWMap Flag, LWWRegister
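To make the convergence property concrete, here is a from-scratch GCounter sketch in plain Scala (deliberately not Akka's akka.cluster.ddata.GCounter, whose API differs): each node increments only its own slot, and merge takes the per-node maximum, so replicas converge no matter the order in which they gossip.

```scala
// Minimal grow-only counter CRDT: merge is commutative, associative, idempotent.
final case class GCounter(counts: Map[String, Long] = Map.empty) {
  def increment(nodeId: String, by: Long = 1L): GCounter =
    copy(counts = counts.updated(nodeId, counts.getOrElse(nodeId, 0L) + by))

  def value: Long = counts.values.sum

  // Per-node maximum; replicas can exchange state in any order and still agree.
  def merge(other: GCounter): GCounter =
    GCounter((counts.keySet ++ other.counts.keySet).map { node =>
      node -> math.max(counts.getOrElse(node, 0L), other.counts.getOrElse(node, 0L))
    }.toMap)
}

object GCounterDemo extends App {
  val a = GCounter().increment("node-a").increment("node-a") // node A saw 2 increments
  val b = GCounter().increment("node-b")                     // node B saw 1 increment
  println(a.merge(b).value == b.merge(a).value)              // true: merge order is irrelevant
  println(a.merge(b).value)                                  // 3
}
```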
Akka Cluster Strengths Strong Consistency within a Single Actor Monotonic Read / Writes High Availability High Throughput and Low Latency Can be AP with a Split Brain Resolver Reduced latency because no db roundtrip
Akka Cluster Weaknesses Akka Distributed Data limited to CRDT’s Akka Distributed Data has a limited data size • All entries are held in memory • Limit of 100,000 top level elements • Replication via gossip with large data gets slower No Isolation of Data Consistency of Akka Persistence depends on backing data store
Strong Consistency with Global Distributed Databases
Advantages of Strict Serializable Consistency Decrease Application Tier Complexity Reduce Cognitive Overhead Increased Developer Productivity Increased Focus on Business Value Strong Isolation Most implementations also provide strong atomicity
Use Cases for Global Transactions Processing financial transactions Fulfilling and managing orders Anytime there needs to be coordination of complex transactions across multiple data sources.
Strengths and Weaknesses of Strict Serializable Consistency Read your Writes - Across sessions Prevent Phantom Reads, Write Skew, etc. Higher Read / Write Latency Lower Throughput
Disadvantages Transactions are hard. Distributed transactions are harder. Distributed transactions over the WAN are final boss hardness. I'm all for new DBMSs but people should tread carefully. - Andy Pavlo on Twitter
Not All Transactions are the Same Distributed Multi-Version Concurrency Control (MVCC) / Snapshots Differences in Transaction Protocol • Global Ordering Done in a Single Phase vs. Multi-Phase • Pre or Post Commit Transaction Resolution Different levels of consistency Maximum scope of a transaction • Single Record vs. Multiple Records Transactions can be regional or global
Differing Interpretations of Consistency and ACID Consistency and ACID Spectrum • Weakest end: weak isolation level, transaction scope limited to a single row, eventually consistent • Strongest end: strongest isolation level, transaction scope distributed across partitions, serializable consistency
Lots of Options Google Spanner - 2 Phase Commit with dependency on proprietary atomic clocks CockroachDB & YugaByte - Open Source versions of Spanner with 2 Phase Commits and Hybrid Clocks Fauna - Single Phase Commit with no hard dependency on clocks FoundationDB - Serializable Snapshot Isolation AWS DynamoDB Transactions - Multiple objects with limits, single region only AWS Aurora - Multi-Master coming soon • Low Latency Read Replicas • Fault-tolerant - replicates six copies of your data across three Availability Zones
Google Spanner External consistency, an isolation level even stricter than serializability Relational Integrity Constraints 99.999% availability SLA Uses global commit timestamps to guarantee ordering of transactions using the TrueTime API Multiple Shards with 2PC Single Shard Avoids 2PC for Writes / Read-only Transactions also avoid 2PC No Downtime upgrades - Maintenance done by moving data between nodes Downside is cost and some limitations to the SQL model and schema design

CockroachDB Open source version of Spanner Hybrid Logical Clock similar to a vector clock for ordering of transactions Challenges with clock skew - waits up to 250 ms on reads Provides linearizability on single keys and overlapping keys Transactions that span a disjoint set of keys only get serializability, not linearizability Some edge cases cause anomalies called “causal reverse” - Jepsen analysis
YugaByte Also uses a Hybrid Logical Clock Currently supports snapshot isolation Serializable isolation level is a work in progress Distributed Transactions to multiple partitions require a provisional record https://docs.yugabyte.com/latest/architecture/transactions/distributed-txns/
Fauna DB Distributed Global Scale OLTP Database with Global Transactions Cloud or On-Prem Temporality Multi-Tenancy Advanced Security Model w/ Row Level Security Document Model Multiple Indexes per table (class) similar to materialized views
Fauna Consensus Algorithm Transactions can include multiple rows - not restricted to data in a single row or shard Transaction resolution based on the Calvin protocol - pre-ordering of transactions before commit Global transaction ordering provides serializable consistency Distributed log based algorithm scales throughput with cluster size by partitioning the log
Fauna Upsides Single-Phase Consensus Algorithm provides lowest possible global latency Low Latency Snapshot Reads No difference between multi-partition and single-partition transactions Powerful Query Language - Complex Transactions in a single query

Fauna Downsides Proprietary Query Language Higher Write Latency with Global Transactions Writes always pay the cost of multi-partition transactions 

Saga Pattern
Distributed Saga Overview Central Coordinator • Manages Complex Transaction Logic • Uses Event Sourcing to store state • State managed in a distributed log Split work into idempotent executors / requests Requires compensating requests for dealing with failures / aborting transactions Effectively Once instead of Exactly Once
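A minimal saga coordinator sketch in Scala, with hypothetical step names and no event-sourced log, durability, or retries; it only shows the core loop of running idempotent steps in order and applying compensating requests in reverse order when a step fails.

```scala
// Toy saga: execute steps; on failure, compensate the already-completed steps.
import scala.util.{Failure, Success, Try}

final case class SagaStep(name: String, execute: () => Unit, compensate: () => Unit)

object SagaCoordinator {
  @annotation.tailrec
  def run(remaining: List[SagaStep], completed: List[SagaStep] = Nil): Either[String, Unit] =
    remaining match {
      case Nil => Right(())
      case step :: rest =>
        Try(step.execute()) match {
          case Success(_) => run(rest, step :: completed)
          case Failure(err) =>
            // Abort: `completed` already holds steps most-recent-first, so this
            // undoes them in reverse order (compensation, not a rollback).
            completed.foreach(s => Try(s.compensate()))
            Left(s"saga aborted at '${step.name}': ${err.getMessage}")
        }
    }
}

object OrderSaga extends App {
  val steps = List(
    SagaStep("reserve-inventory", () => println("reserved"), () => println("released")),
    SagaStep("charge-card", () => sys.error("card declined"), () => println("refunded")),
    SagaStep("create-shipment", () => println("shipped"), () => println("cancelled"))
  )
  println(SagaCoordinator.run(steps)) // Left(saga aborted at 'charge-card': card declined)
}
```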
Distributed Saga Strengths Fault Tolerant / HA Composable executors Isolation of complex code into coordinator Atomicity/Abortability if created by the developer
Distributed Saga Weaknesses No Consistency Weak isolation No Guaranteed Atomicity - Unsafe partially committed states Complexity with versioning of saga logic Increased application complexity Rollback and recovery logic required in application tier Idempotency impossible for some services
Linearizable Consistency Patterns
Cassandra LWT Design Approach Cassandra Lightweight Transactions (LWT) Use a single partition as a record locking mechanism Use a CAS type operation • Compare and Swap • Compare and Set a new value Cassandra Batch is not a traditional DB Batch • Only Atomic within a single partition
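A sketch of the compare-and-set approach using a CQL lightweight transaction from Scala via the DataStax Java driver 4.x. The ks.accounts table, the balance values, and the locally running cluster with default driver configuration are assumptions for illustration; error handling and prepared statements are omitted.

```scala
// Cassandra LWT as CAS: conditional writes run Paxos within the row's partition.
import com.datastax.oss.driver.api.core.CqlSession

object LwtExample extends App {
  val session = CqlSession.builder().build() // assumes a local node and default config

  // INSERT ... IF NOT EXISTS acts as "compare and set if absent".
  val created = session
    .execute("INSERT INTO ks.accounts (id, balance) VALUES ('a1', 100) IF NOT EXISTS")
    .wasApplied()

  // UPDATE ... IF <condition> acts as "compare and swap" on the current value.
  val debited = session
    .execute("UPDATE ks.accounts SET balance = 60 WHERE id = 'a1' IF balance = 100")
    .wasApplied() // false if another writer changed the balance first (CAS failed)

  println(s"created=$created debited=$debited")
  session.close()
}
```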
Upside of Cassandra LWT Linearizable Consistency within a single partition All the benefits of Cassandra • Great read / write performance • High Availability / Fault Tolerant • Tunable Consistency If transactions are only needed for a small portion of the application then LWT’s are useful
Downsides of Cassandra LWT Transaction only applies to a single partition High-Latency with multi-phase commit Does not provide isolation of transaction Expensive consensus algorithm - 4 roundtrips via Paxos
Global Scale Database Transactions In-Depth
Classic 2 Phase Commit / Non-Deterministic • 2 Phase Commit Rolling Dice on Effects being Applied • Transaction Coordinator Walks across all involved partitions throughout the cluster • Prepare Phase - Determining Effects and Contention • Commit Phase - Tells Partitions if they succeeded or failed • 2 Rounds of Global Consensus Writes - Non-Deterministic
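A toy two-phase-commit coordinator in Scala (in-memory participants, no timeouts, logging, or recovery) just to make the two rounds above concrete: a prepare round collecting votes, then a commit-or-abort round.

```scala
// Toy 2PC: phase one gathers votes, phase two tells partitions the outcome.
trait Participant {
  def prepare(txId: String): Boolean // vote: can this partition apply its effects?
  def commit(txId: String): Unit
  def abort(txId: String): Unit
}

object TwoPhaseCommit {
  def run(txId: String, participants: Seq[Participant]): Boolean = {
    // Phase 1: every involved partition must vote yes.
    val allPrepared = participants.forall(_.prepare(txId))
    // Phase 2: tell the partitions whether they succeeded or must roll back.
    if (allPrepared) participants.foreach(_.commit(txId))
    else participants.foreach(_.abort(txId))
    allPrepared
  }
}
```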
Calvin Protocol / Deterministic Commit / Single Global Consensus Round Three Phases • Query Speculative Execution / Calculate Effects • Non-Deterministic Query => Deterministic Transaction => Pure Function over state • Global Log Commit / Ordering Phase • Deterministic Transaction Results Applied Writes - Deterministic
Deterministic vs. Non-Deterministic Non-Deterministic / Classic 2 Phase Commit • Phase One • Transaction Effects Determined => Intents Written • Determining Transaction success or failure requires global coordination • Phase Two • Tell partitions if they succeeded or need to abort • 2 Phase Commit Rolling Dice on Effects being Applied Deterministic Commit / Single Global Consensus Round • When the transaction is committed to the log the outcome is pre-determined • Pre-Commit Speculative Calculation of the Effects • Transaction Committed => Transaction is Ordered => Effects Determined and Applied • Transaction is a Pure Function over the State of the Database • No Rollback of Effects
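A tiny Scala sketch of the deterministic idea: once a transaction is a pure function over database state and has a position in the agreed log, every replica can apply the log independently and reach the same result, with no second "did it commit?" round. The account transfer and all names are illustrative, not any particular database's API.

```scala
// Deterministic commit: apply an agreed, ordered log of pure transactions.
object DeterministicReplay {
  type DbState = Map[String, Long]
  type DeterministicTxn = DbState => DbState // pure function: same input, same output

  // The ordering phase produces one agreed log; applying it is just a fold.
  def applyLog(initial: DbState, log: Vector[DeterministicTxn]): DbState =
    log.foldLeft(initial)((state, txn) => txn(state))
}

object ReplayDemo extends App {
  import DeterministicReplay._

  // Effects are decided by the state at the transaction's log position.
  val transfer: DeterministicTxn = s =>
    if (s.getOrElse("a", 0L) >= 10L)
      s.updated("a", s("a") - 10L).updated("b", s.getOrElse("b", 0L) + 10L)
    else s

  val initial: DbState = Map("a" -> 100L, "b" -> 0L)
  // Every replica that applies the same ordered log reaches the same state.
  println(DeterministicReplay.applyLog(initial, Vector(transfer, transfer)))
  // Map(a -> 80, b -> 20)
}
```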
Consistency Without Clocks: Inspired by Calvin Calvin inspired • Single, global consensus domain per cluster • All transactions handled identically on each replica • Transaction batching maximizes throughput (In contrast) Spanner inspired • Multiple consensus domains, one per shard • Per shard consensus ==> challenges in multi-shard transactions • Wall clock timestamp approach to solve this problem • Introduces “window of uncertainty” • Consistency guarantee is lost any time clock skew exceeds the “uncertainty” threshold • Spanner addresses this with unique Google hardware and API • Others use a software-only solution
Read Consistency with Clocks Spanner / Spanner Inspired • Problem: consistent reads with concurrent writes • Reads using 2PC have high latency • Alternative is to choose a stable snapshot time for reads - How stale do you want your data? • Spanner guarantees stable read timestamps through clock synchronization
Read Consistency Without Clocks Calvin Protocol • Read-only transactions avoid the transaction pipeline • Log based total ordering allows partitions to guarantee stable reads • Local partition waits until data is consistent to perform reads at a given snapshot • Wait only happens when a node is behind on applying transactions
Calvin Transaction Processing [two diagram slides showing a Client and replicated Coordinator / Log / Data Storage nodes running in AWS]
Thank You Ryan Knight @knight_cloud
