
I'm trying to read a Kafka topic via Spark Structured Streaming inside spark-shell, but it seems that I don't get any lines from Kafka.

Kafka alone works fine (tested with console-consumer and console-producer):

~/opt/bd/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic testtopic --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
first thrid fifth seventh eight bla blal2 das ist testmaschine hallo kleiner blsllslsd

This is the code I'm running in the spark-shell:

val ds1 = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:2181")
  .option("subscribe", "testtopic")
  .option("startingOffsets", "earliest")
  .load()

ds1.writeStream.format("console").start

I expect to get the messages that are already stored for this topic in Kafka, and that all messages will be printed in the Spark shell. But nothing is printed. Where is my mistake? I'm using Spark 2.0.2 and Kafka 0.10.2.

1 Answer

You need to change the port for the Kafka bootstrap servers, like this:

val ds1 = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "testtopic")
  .option("startingOffsets", "earliest")
  .load()

ds1.writeStream.format("console").start

You pointed `kafka.bootstrap.servers` at ZooKeeper's port (2181); it must point at the Kafka broker itself, which listens on 9092 by default. Then you will be able to get values from readStream.
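If you're unsure which port your broker actually listens on, you can check the broker config and re-test the topic with the new consumer against that port. A sketch (the install path below matches the one in the question; the property names `listeners`/`port` are standard Kafka broker settings, but adjust paths to your setup):

```shell
# The port Spark must use is the broker's listener port,
# not ZooKeeper's clientPort (2181):
grep -E '^(listeners|port)=' ~/opt/bd/kafka/config/server.properties

# Quick end-to-end check with the new consumer, against the broker port:
~/opt/bd/kafka/bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic testtopic --from-beginning
```

If the console consumer prints your messages with `--bootstrap-server localhost:9092`, the same `host:port` will work for Spark's `kafka.bootstrap.servers` option.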

I hope it helps!

1 Comment

Now it works. Thanks a lot. I must not use the ZooKeeper port, but the port specified for the broker.
