1C O N F I D E N T I A L KSQL The streaming SQL engine for Apache Kafka® tom.green@confluent.io
Confluent Partner Briefing Agenda & Housekeeping ● Introduction to KSQL Presentation ● KSQL Demonstration ● Q & A - please ask your questions via that chat box Follow up materials ● The slides and recording will be emailed
Confluent Partner Briefing Process streams of data in real time, as they occur. 11010 1 01011 1 00110 1 10001 0 3 Apache Kafka is a Distributed Streaming Platform Publish and subscribe to streams of data similar to a message queue or enterprise messaging system. 11010 1 01011 1 00110 1 10001 0 Store streams of data in a fault tolerant way. 11010 1 01011 1 00110 1 10001 0
5C O N F I D E N T I A L KSQL The streaming SQL engine for Apache Kafka® to write real-time applications in SQL
7C O N F I D E N T I A L Lower the bar to enter the world of streaming User Population CodingSophistication Core developers who use Java/Scala Core developers who don’t use Java/Scala Data engineers, architects, DevOps/SRE BI analysts streams
8C O N F I D E N T I A L KSQL CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8; streams Lowering the bar: KSQL vs. Kafka Streams Lower the bar to enter the world of streaming vs.
9C O N F I D E N T I A L KSQL ● You write only SQL. No Java, Python, or other boilerplate to wrap around it! ● Create KSQL user defined functions in Java when needed. CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8;
10C O N F I D E N T I A L New user experience: interactive stream processing
11C O N F I D E N T I A L KSQL can be used interactively + programmatically ksql> 1 UI POST /query 2 CLI 3 REST 4 Headless
12C O N F I D E N T I A L All you need is Kafka and KSQL 1.Build & package 2. Submit job required for fault-tolerance ksql> SELECT * FROM myStream Without KSQL With KSQL processing storage
13 ● SQL ● Interactive service ● ... KSQL Kafka Streams API Stream Processing in Kafka Producer/Consumer ● Stateful stream processing ● Window operations ● DSL ○ map(), filter(), join(), aggregate(), … ... Kafka Consumer Group Protocol ● Elasticity ● Fault Tolerance ...
14C O N F I D E N T I A L Stream/Table Duality
15C O N F I D E N T I A L Alice + €50 The Stream-Table Duality Stream (payments) Table (balance) time Alice €50 Bob + €18 Alice €50 Alice €50 Bob €18 Alice + €25 Alice €50 Bob €18 Alice €75 Bob €18 Alice – €60 Alice €75 Bob €18 Alice €15 Bob €18
16C O N F I D E N T I A L KSQL Examples
17C O N F I D E N T I A L Data exploration KSQL example use cases Data enrichment Streaming ETL Filter, cleanse, mask Real-time monitoring Anomaly detection
18C O N F I D E N T I A L Example: Retail KSQL joins the two streams in real-time Stream of shipments that arrive Stream of purchases from online and physical stores
19C O N F I D E N T I A L Example: IoT, Automotive, Connected Cars KSQL joins the two streams in real-time Kafka Connect streams data in Cars send telemetry data via Kafka API Kafka Streams application to notify customers
20C O N F I D E N T I A L KSQL for Real-Time Monitoring ● Log data monitoring ● Tracking and alerting ● Syslog data ● Sensor / IoT data ● Application metrics CREATE STREAM syslog_invalid_users AS SELECT host, message FROM syslog WHERE message LIKE '%Invalid user%'; http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
21C O N F I D E N T I A L KSQL for Anomaly Detection ● Identify patterns or anomalies in real- time data, surfaced in milliseconds CREATE TABLE possible_fraud AS SELECT card_number, COUNT(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING COUNT(*) > 3;
22C O N F I D E N T I A L KSQL for Streaming ETL ● Joining, filtering, and aggregating streams of event data CREATE STREAM vip_actions AS SELECT user_id, page, action FROM clickstream c LEFT JOIN users u ON c.user_id = u.user_id WHERE u.level = 'Platinum';
23C O N F I D E N T I A L KSQL for Data Transformation ● Easily make derivations of existing topics CREATE STREAM pageviews_avro WITH (PARTITIONS=6, VALUE_FORMAT='AVRO') AS SELECT * FROM pageviews_json PARTITION BY user_id;
26C O N F I D E N T I A L Demo
27C O N F I D E N T I A L Example: CDC from DB via Kafka to Elastic KSQL processes table changes in real-time Kafka Connect streams data in Kafka Connect streams data out ratings
28C O N F I D E N T I A L Deployment Scalability and HA
29 ● SQL ● Interactive service ● ... KSQL Kafka Streams API Stream Processing in Kafka Producer/Consumer ● Stateful stream processing ● Window operations ● DSL ○ map(), filter(), join(), aggregate(), … ... Kafka Consumer Group Protocol ● Elasticity ● Fault Tolerance ...
30C O N F I D E N T I A L How to run KSQL #1 Interactive KSQL, for development & testing ksql> POST /query Kafka Cluster (data) KSQL Cluster (processing) KSQL does not run on Kafka brokers! ...
31C O N F I D E N T I A L How to run KSQL #2 Headless KSQL, for production Kafka Cluster (data) servers started with same .sql file KSQL Cluster (processing) ... interaction for UI, CLI, REST is disabled
32 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic ● Interactive Service ○ REST API ○ Command Topic
33 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic
34 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic
35 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic
36 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic
37 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic Fault Tolerance:
38 KSQL Deployment KSQL Server KSQL Server Command Topic Fault Tolerance:
39 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic Scale Out:
40 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic KSQL Server Scale Out:
41C O N F I D E N T I A L How to run KSQL KSQL Server (JVM process) …and many more… DEB, RPM, ZIP, TAR downloads http://confluent.io/ksql Docker images confluentinc/cp-ksql-server confluentinc/cp-ksql-cli
43C O N F I D E N T I A L Resources and Next Steps confluentinc/ksql http://confluent.io/ksql http://cnfl.io/slack
44C O N F I D E N T I A L

Introduction to KSQL: Streaming SQL for Apache Kafka®

  • 1.
    1C O NF I D E N T I A L KSQL The streaming SQL engine for Apache Kafka® tom.green@confluent.io
  • 2.
    Confluent Partner Briefing Agenda& Housekeeping ● Introduction to KSQL Presentation ● KSQL Demonstration ● Q & A - please ask your questions via that chat box Follow up materials ● The slides and recording will be emailed
  • 3.
    Confluent Partner Briefing Process streamsof data in real time, as they occur. 11010 1 01011 1 00110 1 10001 0 3 Apache Kafka is a Distributed Streaming Platform Publish and subscribe to streams of data similar to a message queue or enterprise messaging system. 11010 1 01011 1 00110 1 10001 0 Store streams of data in a fault tolerant way. 11010 1 01011 1 00110 1 10001 0
  • 4.
    5C O NF I D E N T I A L KSQL The streaming SQL engine for Apache Kafka® to write real-time applications in SQL
  • 5.
    7C O NF I D E N T I A L Lower the bar to enter the world of streaming User Population CodingSophistication Core developers who use Java/Scala Core developers who don’t use Java/Scala Data engineers, architects, DevOps/SRE BI analysts streams
  • 6.
    8C O NF I D E N T I A L KSQL CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8; streams Lowering the bar: KSQL vs. Kafka Streams Lower the bar to enter the world of streaming vs.
  • 7.
    9C O NF I D E N T I A L KSQL ● You write only SQL. No Java, Python, or other boilerplate to wrap around it! ● Create KSQL user defined functions in Java when needed. CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8;
  • 8.
    10C O NF I D E N T I A L New user experience: interactive stream processing
  • 9.
    11C O NF I D E N T I A L KSQL can be used interactively + programmatically ksql> 1 UI POST /query 2 CLI 3 REST 4 Headless
  • 10.
    12C O NF I D E N T I A L All you need is Kafka and KSQL 1.Build & package 2. Submit job required for fault-tolerance ksql> SELECT * FROM myStream Without KSQL With KSQL processing storage
  • 11.
    13 ● SQL ● Interactiveservice ● ... KSQL Kafka Streams API Stream Processing in Kafka Producer/Consumer ● Stateful stream processing ● Window operations ● DSL ○ map(), filter(), join(), aggregate(), … ... Kafka Consumer Group Protocol ● Elasticity ● Fault Tolerance ...
  • 12.
    14C O NF I D E N T I A L Stream/Table Duality
  • 13.
    15C O NF I D E N T I A L Alice + €50 The Stream-Table Duality Stream (payments) Table (balance) time Alice €50 Bob + €18 Alice €50 Alice €50 Bob €18 Alice + €25 Alice €50 Bob €18 Alice €75 Bob €18 Alice – €60 Alice €75 Bob €18 Alice €15 Bob €18
  • 14.
    16C O NF I D E N T I A L KSQL Examples
  • 15.
    17C O NF I D E N T I A L Data exploration KSQL example use cases Data enrichment Streaming ETL Filter, cleanse, mask Real-time monitoring Anomaly detection
  • 16.
    18C O NF I D E N T I A L Example: Retail KSQL joins the two streams in real-time Stream of shipments that arrive Stream of purchases from online and physical stores
  • 17.
    19C O NF I D E N T I A L Example: IoT, Automotive, Connected Cars KSQL joins the two streams in real-time Kafka Connect streams data in Cars send telemetry data via Kafka API Kafka Streams application to notify customers
  • 18.
    20C O NF I D E N T I A L KSQL for Real-Time Monitoring ● Log data monitoring ● Tracking and alerting ● Syslog data ● Sensor / IoT data ● Application metrics CREATE STREAM syslog_invalid_users AS SELECT host, message FROM syslog WHERE message LIKE '%Invalid user%'; http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
  • 19.
    21C O NF I D E N T I A L KSQL for Anomaly Detection ● Identify patterns or anomalies in real- time data, surfaced in milliseconds CREATE TABLE possible_fraud AS SELECT card_number, COUNT(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING COUNT(*) > 3;
  • 20.
    22C O NF I D E N T I A L KSQL for Streaming ETL ● Joining, filtering, and aggregating streams of event data CREATE STREAM vip_actions AS SELECT user_id, page, action FROM clickstream c LEFT JOIN users u ON c.user_id = u.user_id WHERE u.level = 'Platinum';
  • 21.
    23C O NF I D E N T I A L KSQL for Data Transformation ● Easily make derivations of existing topics CREATE STREAM pageviews_avro WITH (PARTITIONS=6, VALUE_FORMAT='AVRO') AS SELECT * FROM pageviews_json PARTITION BY user_id;
  • 22.
    26C O NF I D E N T I A L Demo
  • 23.
    27C O NF I D E N T I A L Example: CDC from DB via Kafka to Elastic KSQL processes table changes in real-time Kafka Connect streams data in Kafka Connect streams data out ratings
  • 24.
    28C O NF I D E N T I A L Deployment Scalability and HA
  • 25.
    29 ● SQL ● Interactiveservice ● ... KSQL Kafka Streams API Stream Processing in Kafka Producer/Consumer ● Stateful stream processing ● Window operations ● DSL ○ map(), filter(), join(), aggregate(), … ... Kafka Consumer Group Protocol ● Elasticity ● Fault Tolerance ...
  • 26.
    30C O NF I D E N T I A L How to run KSQL #1 Interactive KSQL, for development & testing ksql> POST /query Kafka Cluster (data) KSQL Cluster (processing) KSQL does not run on Kafka brokers! ...
  • 27.
    31C O NF I D E N T I A L How to run KSQL #2 Headless KSQL, for production Kafka Cluster (data) servers started with same .sql file KSQL Cluster (processing) ... interaction for UI, CLI, REST is disabled
  • 28.
    32 KSQL Deployment KSQL ServerKSQL Server KSQL Server Command Topic ● Interactive Service ○ REST API ○ Command Topic
  • 29.
    33 KSQL Deployment KSQL ServerKSQL Server KSQL Server Command Topic
  • 30.
    34 KSQL Deployment KSQL ServerKSQL Server KSQL Server Command Topic
  • 31.
    35 KSQL Deployment KSQL ServerKSQL Server KSQL Server Command Topic
  • 32.
    36 KSQL Deployment KSQL ServerKSQL Server KSQL Server Command Topic
  • 33.
    37 KSQL Deployment KSQL ServerKSQL Server KSQL Server Command Topic Fault Tolerance:
  • 34.
    38 KSQL Deployment KSQL ServerKSQL Server Command Topic Fault Tolerance:
  • 35.
    39 KSQL Deployment KSQL ServerKSQL Server KSQL Server Command Topic Scale Out:
  • 36.
    40 KSQL Deployment KSQL ServerKSQL Server KSQL Server Command Topic KSQL Server Scale Out:
  • 37.
    41C O NF I D E N T I A L How to run KSQL KSQL Server (JVM process) …and many more… DEB, RPM, ZIP, TAR downloads http://confluent.io/ksql Docker images confluentinc/cp-ksql-server confluentinc/cp-ksql-cli
  • 38.
    43C O NF I D E N T I A L Resources and Next Steps confluentinc/ksql http://confluent.io/ksql http://cnfl.io/slack
  • 39.
    44C O NF I D E N T I A L