1 Building Microservices with Kafka
By Colin McCabe
2 About Me
3 Network Services
Services
● Expose functionality over the network
● Manage state and business logic
Important aspects
● Availability
● Maintainability
● Consistency
● Extensibility
4 Microservices vs Monolithic Services
Microservices
● Multiple components
● Loose coupling
● Organized around capabilities
Monolithic Services
● “One big app”
● Usually a single process
● No separation of concerns
5 Why Microservices?
● Microservices
  ○ Bounded contexts
  ○ Easier to test
  ○ Easier to scale to multiple servers
  ○ Easier to scale to multiple teams
  ○ More robust
● Monolithic Services
  ○ Easier to get started with
6 Synchronous Communication vs. Asynchronous
● Synchronous
  ○ Request / Response
  ○ REST
  ○ gRPC
  ○ Apache Thrift
● Asynchronous
  ○ Message queue
  ○ Kafka
  ○ ZeroMQ
[Diagram: Service A and Service B, drawn once for each style]
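
The contrast between the two styles can be sketched in plain Java. This is a toy model, not REST, gRPC, or Kafka: an in-memory BlockingQueue stands in for the message queue, and the names SyncVsAsync, handle, callSync, sendAsync, and drain are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Toy model only: an in-memory queue stands in for a real message broker.
final class SyncVsAsync {
    // Service B's business logic (hypothetical).
    static String handle(String request) {
        return "handled:" + request;
    }

    // Synchronous style: Service A calls B and waits for the response.
    static String callSync(String request) {
        return handle(request);
    }

    // Asynchronous style: Service A enqueues a message and returns at once.
    static void sendAsync(BlockingQueue<String> queue, String request) {
        queue.add(request);
    }

    // Service B drains the queue later, on its own schedule.
    static List<String> drain(BlockingQueue<String> queue) {
        List<String> results = new ArrayList<>();
        String message;
        while ((message = queue.poll()) != null) {
            results.add(handle(message));
        }
        return results;
    }
}
```

In the synchronous case Service A cannot make progress until B answers. In the asynchronous case A only depends on the queue being available, so B can be slow or temporarily down without blocking A.
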
7 A synchronous microservice example
[Diagram: User, MySQL, Frontend, HDFS, Metrics, Emailer]
8 An asynchronous microservice example
[Diagram: User, MySQL, Frontend, HDFS, Metrics, Emailer]
9 An asynchronous microservice example
[Diagram: User, MySQL, Frontend, HDFS, Metrics, Emailer, plus an unnamed component marked ???????]
10 Apache Kafka
● A distributed streaming platform
● That lets you publish and subscribe to streams of records
● … in a fault-tolerant, real-time way
  ○ https://kafka.apache.org/intro
● Open source
  ○ https://www.confluent.io/download/
11 Kafka History
● Kafka was built at LinkedIn around 2010
● https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
● Multi-platform: clients in Java, Scala, C, C++, Python, Go, C#, …
12 Kafka Concepts: the 10,000-foot view
● Messages
  ○ Key
  ○ Value
  ○ Timestamp
● Producer
● Consumer
● Connector
● Stream Processor
13 Topics
● A topic is a category or feed name
● Divided into partitions
● Can have multiple consumers and producers per topic
● Can view older messages
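
One way to picture how a topic is divided into partitions is to map each message key to a partition number. The sketch below is an assumption-laden toy: Kafka's real default partitioner hashes the serialized key bytes with a different hash function, and the names PartitionSketch and partitionFor are hypothetical.

```java
// Toy model only: String.hashCode() is a stand-in for the hash that
// Kafka's default partitioner applies to the serialized key.
final class PartitionSketch {
    // Maps a key to a partition index in the range [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result non-negative even for negative hashes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

The point of the sketch is that the mapping is deterministic: as long as the partition count does not change, all messages with the same key land in the same partition, which is what preserves per-key ordering.
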
14 Log-Compacted Topics
● Regular topics
  ○ A stream of messages: key/value pairs
  ○ As new messages are added, old ones may be deleted to make space.
● Log-Compacted Topics
  ○ Retain the last known value for each key
  ○ Can easily fetch the value associated with a key
  ○ Acts a little bit like a table
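
The "last known value for each key" rule can be illustrated with a few lines of plain Java. This is only a model of the end result, not of how the broker performs compaction in the background; CompactionSketch and compact are hypothetical names, and each record is represented as a two-element {key, value} array.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: shows the outcome of compaction, not the broker's mechanism.
final class CompactionSketch {
    // Returns the latest value per key from a log of {key, value} records.
    static Map<String, String> compact(List<String[]> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] record : log) {
            // A later record for the same key overwrites the earlier one.
            latest.put(record[0], record[1]);
        }
        return latest;
    }
}
```

Reading the compacted result by key is what makes such a topic behave a little like a table.
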
15 Consumer Groups
16 Kafka and Durability
● Old messages can be replayed
● Kafka persists messages to disk
● Durability levels are configurable
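
As one illustration of configurable durability, the producer's acks setting controls how many acknowledgements a write needs. The sketch below only builds java.util.Properties objects and never contacts a broker. "bootstrap.servers" and "acks" are real producer settings; the class and method names are hypothetical, and the serializer settings a real producer also needs are omitted.

```java
import java.util.Properties;

// Sketch only: builds plain Properties; nothing here talks to a broker.
final class DurabilityConfigSketch {
    // Stronger durability: the write must be acknowledged by all in-sync replicas.
    static Properties durable(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("acks", "all");
        return props;
    }

    // Lower latency, weaker durability: only the partition leader acknowledges.
    static Properties leaderOnly(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("acks", "1");
        return props;
    }
}
```

Either object could be extended with serializers and passed to a producer. The trade-off is latency against the risk of losing a write if the leader fails before replication.
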
17 Kafka for Microservices
18 Kafka Connect
docs.confluent.io/current/connect/
Connector Instance
● Responsible for copying data between Kafka and an external system
Connector Task
Connector Plugin
19 Kafka Connect: Batteries Included
20 Kafka Streams
● docs.confluent.io/current/streams/
● Process data, not just transport it
● Makes stream processing simpler and easier
● Applications are fault-tolerant and elastic: scaling and load balancing are handled by Kafka.
● The inputs and outputs are just Kafka topics.
● A library, not a framework.
21 Kafka Streams
● Tables and streams are duals
  ○ A stream can be viewed as the changelog for a table
  ○ A table is just a cache of the latest value associated with a key in a stream
● An aggregate like a join or a count can be viewed as a table
[Diagram: a KStream and a KTable, each holding keys K1 to K3 with values V1 to V3]
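
The stream/table duality can be sketched without any Kafka dependency. In this toy model a count aggregate is kept as a table, every update to that table is appended to a changelog stream, and replaying the changelog rebuilds the same table. DualitySketch, count, and replay are hypothetical names; none of this is the Kafka Streams API.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: plain collections stand in for KStream and KTable.
final class DualitySketch {
    // Consumes a stream of words, maintains a count table, and appends
    // every table update to the supplied changelog stream.
    static Map<String, Long> count(List<String> words,
                                   List<Map.Entry<String, Long>> changelog) {
        Map<String, Long> table = new LinkedHashMap<>();
        for (String word : words) {
            Long updated = table.merge(word, 1L, Long::sum);
            changelog.add(new AbstractMap.SimpleEntry<String, Long>(word, updated));
        }
        return table;
    }

    // Rebuilds the table by replaying the changelog: the last value per key wins.
    static Map<String, Long> replay(List<Map.Entry<String, Long>> changelog) {
        Map<String, Long> table = new LinkedHashMap<>();
        for (Map.Entry<String, Long> update : changelog) {
            table.put(update.getKey(), update.getValue());
        }
        return table;
    }
}
```

The changelog is the stream view and the map is the table view of the same data; either one can be derived from the other.
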
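22 Wordcount using Kafka Streams
    import java.util.Arrays;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.KTable;

    // conf is assumed to be defined elsewhere (application id, bootstrap servers, String default serdes).
    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, String> lines = builder.stream("lines");
    KTable<String, Long> wordCounts = lines
        // Split each line into lowercase words on runs of non-word characters.
        .flatMapValues(line -> Arrays.asList(line.toLowerCase().split("\\W+")))
        .groupBy((key, word) -> word)
        .count("counts");
    // Write the running counts to the "counts" topic.
    wordCounts.to(Serdes.String(), Serdes.Long(), "counts");
    KafkaStreams streams = new KafkaStreams(builder, conf);
    streams.start();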
23 Exactly-Once Semantics
● Reliable delivery: a very hard problem in distributed systems.
  ○ At most once
  ○ At least once
  ○ Exactly once
24 Achieving Exactly-Once Semantics
● Idempotence: exactly-once, in-order semantics per partition
● Transactions: atomic writes across multiple partitions
● Streams support
● https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
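
The idempotence idea can be sketched as a log that remembers the highest sequence number it has accepted from each producer and ignores retries of records it already holds. This is a simplified toy for a single partition, not Kafka's protocol: it does not check for gaps in the sequence, and the names IdempotentLogSketch, append, and contents are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: one partition's log with per-producer sequence tracking.
final class IdempotentLogSketch {
    private final List<String> log = new ArrayList<>();
    private final Map<Long, Integer> lastSequence = new HashMap<>();

    // Appends the value unless this producer already wrote this sequence number.
    // Returns true if appended, false if recognized as a duplicate retry.
    boolean append(long producerId, int sequence, String value) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return false; // a retry of a record that is already in the log
        }
        lastSequence.put(producerId, sequence);
        log.add(value);
        return true;
    }

    // Returns a copy of the log contents in append order.
    List<String> contents() {
        return new ArrayList<>(log);
    }
}
```

With this check in place, a producer that retries after a lost acknowledgement turns at-least-once sending into exactly-once appends within the partition.
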
25 Conclusion
● The loose coupling, deployability, and testability of microservices make them a great way to scale.
● Apache Kafka is an incredibly useful building block for many different microservices.
● Kafka is reliable and does the heavy lifting.
● Kafka is more than just a pipe: Kafka Streams can process data in real time.
Libraries, not frameworks. Deploy your way.
26 Thank You!
