
We upgraded our Kafka version from 0.9 to 2.0.

I need help finding the right client library for Spark Structured Streaming. This dependency:

"org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.3.0" 

doesn't work.

This is the error it throws:

11:46:18.061 [stream execution thread for [id = e393ea37-8009-4ce0-b996-94f767994fb8, runId = bc15eb7d-876d-4e01-8ee5-22205ec7fdcb]] DEBUG org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-2, groupId=spark-kafka-source-8ce7f26f-e342-4b0d-85f1-a9f641b79629-1052905425-driver-0] Completed connection to node -1. Fetching API versions.
11:46:18.061 [stream execution thread for [id = e393ea37-8009-4ce0-b996-94f767994fb8, runId = bc15eb7d-876d-4e01-8ee5-22205ec7fdcb]] DEBUG org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-2, groupId=spark-kafka-source-8ce7f26f-e342-4b0d-85f1-a9f641b79629-1052905425-driver-0] Initiating API versions fetch from node -1.
11:46:18.452 [stream execution thread for [id = e393ea37-8009-4ce0-b996-94f767994fb8, runId = bc15eb7d-876d-4e01-8ee5-22205ec7fdcb]] DEBUG org.apache.kafka.common.network.Selector - [Consumer clientId=consumer-2, groupId=spark-kafka-source-8ce7f26f-e342-4b0d-85f1-a9f641b79629-1052905425-driver-0] Connection with kafka-muhammad-45e0.aivencloud.com/18.203.67.147 disconnected
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:119)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:335)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:296)
    at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:560)

1 Answer


Please provide more information (how about errors?!). Kafka brokers are engineered to be backward compatible with older clients, which helps projects (like Spark) that lag significantly behind the latest APIs.

The artifactId you're referencing, spark-streaming-kafka-0-10, is for Spark Streaming (the DStream API). To use Spark Structured Streaming, you'll want spark-sql-kafka-0-10_2.11.
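
In sbt, that dependency would look roughly like the line below. This is a sketch that assumes the Spark version 2.3.0 from the question and a build whose scalaVersion is 2.11.x, so that `%%` appends the `_2.11` suffix; adjust both to match your cluster.

```scala
// build.sbt (sketch): Kafka source/sink for Spark Structured Streaming.
// Assumes Spark 2.3.0 (taken from the question) and scalaVersion 2.11.x.
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.3.0"
```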


3 Comments

Added the logs. Also, I am using both spark-sql-kafka-0-10_2.11 and spark-streaming-kafka-0-10. I'm looking for a library that supports Kafka 2.0.
Like I said, Kafka 2.0 can be used as the broker version for the Spark libraries you're specifying. The EOF error you pasted looks like the Kafka broker closing your connection after an idle period. I can't really help more unless you provide the code that generates that error.
No worries Chris, I wasn't passing the SSL properties correctly. The SSL and truststore properties have to be prefixed with "kafka." for the Structured Streaming source. I was initially using "ssl.truststore.location" instead of "kafka.ssl.truststore.location".
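
To illustrate the fix from the last comment, here is a minimal sketch of adding the "kafka." prefix to Kafka client properties before handing them to the Structured Streaming source. The names `KafkaSourceOptions` and `withKafkaPrefix` are hypothetical helpers, not part of Spark or Kafka, and the property values in the usage comment are placeholders rather than the original poster's settings.

```scala
// Hypothetical helper (not a Spark or Kafka API): prefixes plain Kafka client
// property names with "kafka." so the Structured Streaming Kafka source
// forwards them to the underlying consumer.
object KafkaSourceOptions {
  private val Prefix = "kafka."

  // Keys that already carry the prefix are left unchanged.
  def withKafkaPrefix(clientProps: Map[String, String]): Map[String, String] =
    clientProps.map { case (key, value) =>
      val prefixedKey = if (key.startsWith(Prefix)) key else Prefix + key
      prefixedKey -> value
    }
}

// Usage sketch, assuming a SparkSession named `spark` and placeholder values:
//
//   val sslProps = Map(
//     "security.protocol"       -> "SSL",
//     "ssl.truststore.location" -> "/path/to/truststore.jks",
//     "ssl.truststore.password" -> "<password>"
//   )
//
//   val df = spark.readStream
//     .format("kafka")
//     .option("kafka.bootstrap.servers", "<host:port>")
//     .option("subscribe", "<topic>")
//     .options(KafkaSourceOptions.withKafkaPrefix(sslProps))
//     .load()
```

Source-level options such as "subscribe" are Spark options rather than Kafka client properties, so they are passed without the prefix.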
