CAP theorem and distributed systems: introduction
Nikanovich Yury
Distributed Systems
A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages.
Scalability
Scalability is the ability of a system, network, or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth.
Two particularly relevant aspects:
• Performance
• Availability
Performance aspects
• Short response time / low latency for a given piece of work
• High throughput (rate of processing work)
• Low utilization of computing resources
Availability
Availability = uptime / (uptime + downtime)

Availability %            Downtime per year
90% ("one nine")          More than one month
99.9% ("three nines")     Less than 9 hours
99.9999% ("six nines")    31 seconds
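The nines in the table follow directly from the formula. A quick sketch of the arithmetic (the 365.25-day year is an assumption; the slide's 31 seconds uses a 365-day year):

```python
# Downtime budget per year implied by an availability percentage.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def downtime_per_year(availability_pct):
    return SECONDS_PER_YEAR * (1 - availability_pct / 100)

for pct in (90, 99.9, 99.9999):
    secs = downtime_per_year(pct)
    print(f"{pct}% -> {secs / 86400:.2f} days ({secs:.0f} seconds)")
# 90%      -> ~36.5 days ("one nine":    more than one month)
# 99.9%    -> ~0.37 days ("three nines": less than 9 hours)
# 99.9999% -> ~32 seconds ("six nines")
```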
Fault tolerance
Fault tolerance is the ability of a system to behave in a well-defined manner once faults occur. Failures are the norm.
Replication
Replication is making copies of the same data on multiple machines.
Replication
[Diagram: Master-Slave Replication: a master and slaves A-C behind a load balancer, with write and read operations; Multi-Master Replication: nodes A-D replicating to each other]
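To make the master-slave picture concrete, here is a minimal, hypothetical sketch of single-leader replication; the class names and the synchronous copy loop are illustrative, not any specific database's behavior:

```python
# Minimal master-slave (single-leader) replication sketch.
# Real systems add replication logs, acknowledgements, and failover.

class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Master(Replica):
    def __init__(self, name, slaves):
        super().__init__(name)
        self.slaves = slaves

    def write(self, key, value):
        self.apply(key, value)       # commit locally first
        for s in self.slaves:        # then propagate to the followers
            s.apply(key, value)      # (synchronously, for simplicity)

slaves = [Replica("slave-a"), Replica("slave-b")]
master = Master("master", slaves)
master.write("x", 1)
print([r.data for r in slaves])      # reads can be served by any replica
```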
Consistency
More consistency  <-------------------------------->  Faster reads/writes
Strong · Probabilistic · CRDTs · Red-Blue · Per-key sequential · Causal · Eventual
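One concrete point on this spectrum is CRDTs, which converge without coordination. A minimal, illustrative grow-only counter (G-Counter) sketch, not tied to any particular library:

```python
# G-Counter CRDT sketch: each node increments its own slot; merging takes
# the element-wise max, so replicas converge regardless of message order.

class GCounter:
    def __init__(self, node_id, num_nodes):
        self.node_id = node_id
        self.counts = [0] * num_nodes

    def increment(self):
        self.counts[self.node_id] += 1

    def merge(self, other):
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

    def value(self):
        return sum(self.counts)

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment()
b.increment()
a.merge(b); b.merge(a)               # exchange state in any order
assert a.value() == b.value() == 3   # both replicas converge to the same total
```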
Why strong consistency is hard to achieve
Nodes:
• each node executes a program concurrently
• knowledge is local
• global state is potentially out of date
• nodes can fail and recover from failure independently
• messages can be delayed or lost
• clocks are not synchronized across nodes
Links:
• asynchronous system model, no timing assumptions
• no bound on message transmission delay
• useful clocks do not exist
Distributed Databases
ACID
• Atomic
• Consistent
• Isolated
• Durable
BASE
• Basic availability
• Soft state
• Eventual consistency
CAP theorem
It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
• Consistency: all nodes see the same data at the same time
• Availability: every request received by a non-failing [database] node in the system must result in a [non-error] response
• Partition tolerance: the system continues to operate despite arbitrary partitioning due to network failures
CAP theorem
[Diagram: Consistency, Availability, Partition tolerance: pick 2]
CAP theorem and the Internet
[Diagram: Consistency, Availability, Partition tolerance]
CP systems
[Diagram: two replicas without a partition, both available]
CP systems (Partition)
[Diagram: during a partition, one side becomes unavailable while the other stays available]
CP systems
Protocols:
• Strict quorum protocols (Paxos, Raft, ZAB)
• 2PC
Storages:
• MongoDB
• HBase
• ZooKeeper
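As a rough illustration of why strict quorums trade availability for consistency, here is a hypothetical majority-quorum sketch (it is not Paxos, Raft, or ZAB, only the quorum rule they rely on): an operation succeeds only when a majority of replicas is reachable, so the minority side of a partition rejects requests.

```python
# Strict (majority) quorum sketch: consistent, but unavailable on the
# minority side of a partition (CP behaviour).

class QuorumError(Exception):
    pass

def write(replicas, reachable, key, value):
    majority = len(replicas) // 2 + 1
    if len(reachable) < majority:
        raise QuorumError("minority partition: write rejected to preserve consistency")
    for r in reachable:
        r[key] = value

replicas = [dict(), dict(), dict()]
write(replicas, replicas, "x", 1)          # no partition: all 3 reachable, write succeeds
try:
    write(replicas, replicas[:1], "x", 2)  # partition: only 1 of 3 reachable, write refused
except QuorumError as e:
    print(e)
```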
AP systems
[Diagram: two replicas without a partition, both available with consistent reads/writes]
AP systems (Partition)
[Diagram: during a partition, both sides remain available, but reads/writes may be inconsistent]
AP systems
Protocols:
• Partial quorum protocols
Storages:
• CouchDB
• Cassandra
• Amazon Dynamo
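In contrast, Dynamo-style partial quorums keep accepting reads and writes as long as some replica is reachable. A hypothetical sketch (the N, W, R values and functions are illustrative, not Cassandra's or Dynamo's actual API): a write waits for W acknowledgements and a read queries R replicas; choosing R + W <= N stays available during partitions but can return stale data.

```python
# Partial quorum sketch: available during partitions, possibly stale (AP).
N, W, R = 3, 1, 1                        # R + W <= N: no overlap guarantee
replicas = [{} for _ in range(N)]

def write(key, value, reachable):
    acked = 0
    for r in reachable:
        r[key] = value
        acked += 1
        if acked == W:
            return True
    return acked >= W

def read(key, reachable):
    for r in reachable[:R]:
        if key in r:
            return r[key]
    return None                          # latest write not visible here

write("x", 1, replicas[:1])              # only one replica reachable: still accepted
print(read("x", replicas[1:]))           # reading other replicas: None (stale read)
```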
CAP fifteen years later
• Partitions are rare; there is little reason to forfeit C or A when the system is not partitioned.
• The choice between C and A can occur many times within the same system, at very fine granularity.
• All three properties are more continuous than binary.
• Most software doesn't neatly fit the CP/AP definitions.
Thank You! Questions?
