
I am getting the following error when I run my Spark job:

org.apache.spark.sql.AnalysisException: Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets;;

I am not sure if the issue is caused by the lack of a watermark, which I don't know how to apply in this context. The following is the aggregation operation applied:

def aggregateByValue(): DataFrame = {
  df.withColumn("Value", expr("(BookingClass, Value)"))
    .groupBy("AirlineCode", "Origin", "Destination", "PoS", "TravelDate",
             "StartSaleDate", "EndSaleDate", "avsFlag")
    .agg(collect_list("Value").as("ValueSeq"))
    .drop("Value")
}

Usage:

val theGroupedDF = theDF
  .multiplyYieldByHundred
  .explodeDates
  .aggregateByValue

val query = theGroupedDF.writeStream
  .outputMode("append")
  .format("console")
  .start()

query.awaitTermination()

2 Answers


Changing the outputMode from "append" to "complete" solved the issue:

val query = theGroupedDF.writeStream
  .outputMode("complete")
  .format("console")
  .start()

query.awaitTermination()



Adding a watermark would also solve the problem, while keeping the append output mode:

val theGroupedDF = theDF
  .multiplyYieldByHundred
  .explodeDates
  .aggregateByValue
  // the two lines below add the watermark
  .withColumn("timestamp", current_timestamp())
  .withWatermark("timestamp", "10 minutes")
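With the watermarked DataFrame, the append-mode query from the question can then run unchanged. A minimal sketch putting the two pieces together, assuming the same console sink and that theDF and its custom methods from the question are in scope:

```scala
// Sketch only: assumes an active SparkSession and a streaming
// DataFrame `theDF` exposing the methods used in the question.
import org.apache.spark.sql.functions.current_timestamp

val watermarkedDF = theDF
  .multiplyYieldByHundred
  .explodeDates
  .aggregateByValue
  .withColumn("timestamp", current_timestamp())
  .withWatermark("timestamp", "10 minutes")

// Append mode is now accepted: the watermark tells Spark when a
// group's aggregate is final, so finalized rows can be emitted
// once and never updated again.
val query = watermarkedDF.writeStream
  .outputMode("append")
  .format("console")
  .start()

query.awaitTermination()
```

Note that `withWatermark` must be applied on the event-time column used by the aggregation to bound state; using `current_timestamp()` as a stand-in works, but a real event-time column from the data is usually preferable.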

