1,001 questions
0 votes
1 answer
59 views
PySpark/Spark connection to Geomesa Cassandra DB
I'm trying to make a PySpark connection to a Cassandra DB indexed with GeoMesa. While researching this, I noticed that it uses the GeoTools Spark runtime, since there is no optimized runtime for Cassandra. I'...
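Since the excerpt is cut off, here is a minimal sketch of how a GeoMesa-indexed layer is typically read through Spark SQL from PySpark. The runtime jar path, the Cassandra datastore option keys, and the feature-type name are assumptions for illustration, not taken from the question.

```python
from pyspark.sql import SparkSession

# Sketch: the GeoMesa Cassandra spark-runtime jar (GeoTools flavour) must be
# on the classpath; the path below is an assumed location.
spark = (
    SparkSession.builder
    .appName("geomesa-cassandra-read")
    .config("spark.jars", "/path/to/geomesa-cassandra-spark-runtime.jar")  # assumed path
    .getOrCreate()
)

# GeoMesa exposes a "geomesa" Spark SQL data source; the exact option keys
# for the Cassandra datastore below are assumptions.
df = (
    spark.read.format("geomesa")
    .option("cassandra.contact.point", "127.0.0.1:9042")  # assumed key/value
    .option("cassandra.keyspace", "geomesa")               # assumed key/value
    .option("geomesa.feature", "my_feature")               # assumed feature type
    .load()
)
df.printSchema()
```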
0 votes
1 answer
151 views
Executors for PySpark app always finish with "state KILLED exitStatus 143"
I ran into this problem while running spark-submit --master spark://localhost:7077 \ --packages com.datastax.spark:spark-cassandra-connector_2.12:3.5.1, \ org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.1 \ -...
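The trailing comma and the space split the --packages list in that command, so the second coordinate is treated as the application. A minimal sketch of passing the same coordinates as one comma-separated string from inside the app; the app name and memory setting are assumptions:

```python
from pyspark.sql import SparkSession

# Sketch: spark.jars.packages must be a single comma-separated string, no spaces.
spark = (
    SparkSession.builder
    .appName("kafka-to-cassandra")            # assumed name
    .master("spark://localhost:7077")
    .config(
        "spark.jars.packages",
        "com.datastax.spark:spark-cassandra-connector_2.12:3.5.1,"
        "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.1",
    )
    .config("spark.executor.memory", "2g")    # assumed value; exit 143 often means killed executors
    .getOrCreate()
)
```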
0 votes
2 answers
103 views
Unable to connect to Cassandra from Apache Spark: com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer
[Cassandra is running in Docker on Windows] and I am running Spark from WSL2: spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.12:3.5.1 [this is the spark-shell started with the command above] and ...
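A hedged sketch of the connection configuration for this layout: Cassandra published by Docker on the Windows host, Spark running inside WSL2. The host address, keyspace, and table names are assumptions.

```python
from pyspark.sql import SparkSession

# Sketch: from WSL2, "localhost" may not reach a container published on the
# Windows side; the address below is an assumed Windows-host IP.
spark = (
    SparkSession.builder
    .appName("cassandra-read")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:3.5.1")
    .config("spark.cassandra.connection.host", "172.17.112.1")  # assumed host IP
    .config("spark.cassandra.connection.port", "9042")
    .getOrCreate()
)

df = (
    spark.read.format("org.apache.spark.sql.cassandra")
    .options(table="my_table", keyspace="my_keyspace")  # assumed names
    .load()
)
df.show(5)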
0 votes
1 answer
55 views
Can't connect/write stream from spark container to table in cassandra container
I am composing these services in separate Docker containers, all on the same Confluent network: broker: image: confluentinc/cp-server:7.4.0 hostname: broker container_name: broker ...
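When the Spark and Cassandra containers share a Docker network, the Cassandra service name usually serves as the connection host. A minimal sketch of a streaming write via foreachBatch; the service name, Kafka listener, topic, keyspace, and table names are assumptions.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("stream-to-cassandra")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:3.5.1")
    .config("spark.cassandra.connection.host", "cassandra")  # assumed service name on the network
    .getOrCreate()
)

def write_to_cassandra(batch_df, batch_id):
    # Each micro-batch is written with the batch DataFrame API.
    (batch_df.write
        .format("org.apache.spark.sql.cassandra")
        .options(table="events", keyspace="demo")  # assumed names; table must exist
        .mode("append")
        .save())

stream_df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:29092")  # assumed internal listener
    .option("subscribe", "events")                      # assumed topic
    .load()
)

query = (
    stream_df.selectExpr("CAST(key AS STRING) AS id", "CAST(value AS STRING) AS payload")
    .writeStream
    .foreachBatch(write_to_cassandra)
    .option("checkpointLocation", "/tmp/checkpoints/events")  # assumed path
    .start()
)
query.awaitTermination()
```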
1 vote
1 answer
154 views
ERROR SparkContext: Failed to add file java.io.FileNotFoundException: Jar to Spark not Found
Please help me fix the above error based on the code I used. The proccesing_data.py script processes data with Spark Streaming: import logging from pyspark.sql import SparkSession from ...
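"Failed to add file" usually means the path handed to spark.jars (or --jars) does not exist from the driver's point of view. A minimal sketch using an absolute path; the jar location is an assumption.

```python
import logging
from pyspark.sql import SparkSession

# Sketch: the jar path is assumed; it must be readable on the machine (or in
# the container) that runs the driver, and an absolute path avoids surprises
# with the working directory. Alternatively, spark.jars.packages lets Spark
# download the connector instead of pointing at a local file.
spark = (
    SparkSession.builder
    .appName("processing_data")
    .config("spark.jars",
            "/opt/spark/jars/spark-cassandra-connector-assembly_2.12-3.5.1.jar")  # assumed path
    .getOrCreate()
)
logging.info("Spark session created, version %s", spark.version)
```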
0 votes
1 answer
64 views
PySpark connection to Cassandra returns "Py4JJavaError: An error occurred while calling o54.start"
I'm trying to make a connection from PySpark to Cassandra in a virtual environment, and the services are installed via Docker. I've been using the --packages method to resolve the dependencies, but it seems it doesn'...
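When PySpark is launched from a plain Python script in a virtualenv, the --packages argument has to reach the JVM before the session starts. One hedged way is the PYSPARK_SUBMIT_ARGS environment variable; the connector version and host are assumptions taken from the neighbouring questions.

```python
import os

# Must be set before the SparkSession/JVM is created; the trailing
# "pyspark-shell" token is required when launching from a Python script.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--packages com.datastax.spark:spark-cassandra-connector_2.12:3.5.1 "
    "pyspark-shell"
)

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("pyspark-cassandra")
    .config("spark.cassandra.connection.host", "127.0.0.1")  # assumed host
    .getOrCreate()
)
```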
2 votes
0 answers
37 views
PySpark connector slow during joins
I'm building an application that uses PySpark to join an Oracle table with a Cassandra table. The Cassandra table holds hundreds of millions of rows, while the Oracle table has only a few thousand. ...
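With a small Oracle side and a huge Cassandra side, a full scan of the Cassandra table is usually the bottleneck; the connector's Catalyst extensions can turn such a join into per-partition lookups ("direct join") when the join keys match the Cassandra partition key. A hedged sketch; the JDBC details, table names, and join column are assumptions, and the Oracle JDBC driver would also need to be on the classpath.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("oracle-cassandra-join")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:3.5.1")
    # Enables the connector's Catalyst rules, including the direct join optimization.
    .config("spark.sql.extensions",
            "com.datastax.spark.connector.CassandraSparkExtensions")
    .config("spark.cassandra.connection.host", "127.0.0.1")  # assumed
    .getOrCreate()
)

# Small side: a few thousand Oracle rows (connection details are assumptions).
oracle_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1")
    .option("dbtable", "customers")
    .option("user", "scott").option("password", "tiger")
    .load()
)

# Large side: the Cassandra table; joining on its partition key lets the
# connector push lookups down instead of scanning the whole table.
cassandra_df = (
    spark.read.format("org.apache.spark.sql.cassandra")
    .options(table="orders", keyspace="sales")  # assumed names
    .load()
)

joined = oracle_df.join(cassandra_df, on="customer_id")  # assumed partition key column
joined.explain()  # check the physical plan for a Cassandra direct-join node
```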
0 votes
0 answers
98 views
Spark is not inserting data into Cassandra when using `writeStream`
I'm trying to build a streaming pipeline that calls an API via Airflow, processes the data with Kafka, and inserts it into Cassandra using Spark. I'm struggling when inserting data ...
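Two common reasons nothing lands in Cassandra are the driver exiting before any batch runs (missing awaitTermination) and a stale or missing checkpoint. A compact, self-contained sketch of the sink wiring, using Spark's built-in "rate" source to stand in for the Kafka stage; the host, checkpoint path, keyspace, and table names are assumptions.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("writestream-to-cassandra")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:3.5.1")
    .config("spark.cassandra.connection.host", "cassandra")  # assumed host
    .getOrCreate()
)

# "rate" emits (timestamp, value) rows and stands in for the Kafka source here.
stream_df = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

def write_batch(batch_df, batch_id):
    (batch_df.withColumnRenamed("timestamp", "ts")
        .write.format("org.apache.spark.sql.cassandra")
        .options(table="api_data", keyspace="pipeline")  # assumed names; columns must match the table
        .mode("append")
        .save())

query = (
    stream_df.writeStream
    .foreachBatch(write_batch)
    .option("checkpointLocation", "/tmp/chk/api_data")  # assumed path; keep it stable between runs
    .start()
)
query.awaitTermination()  # without this, a plain script exits before any batch is written
```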
1 vote
1 answer
64 views
Ignoring codec because it collides with previously generated codec
I am trying to register a custom codec (for a map) like below: val session: CqlSession = CassandraConnector.apply(spark.sparkContext).openSession() val codecRegistry: MutableCodecRegistry = session....
2 votes
0 answers
96 views
Register a custom codec in the Cassandra connector
I am using spark-cassandra-connector_2.11, version 2.5.2, in my Scala application and want to register a custom map codec, but I am facing issues. Is there any way to register it? I did the same thing on ...
0 votes
1 answer
92 views
How To Use Spark Submit Operator With Cassandra Remote Server In Apache Airflow
I'm running Airflow in a Docker container on a Windows PC. I have some problems with the Apache Airflow Spark submit operator. I want to write data to a remote Cassandra server. When I was using df.write....
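A hedged sketch of a SparkSubmitOperator task that ships the connector and points it at the remote Cassandra host. The DAG id, application path, Airflow connection id, and Cassandra address are assumptions.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="write_to_remote_cassandra",   # assumed DAG id
    start_date=datetime(2024, 1, 1),
    schedule=None,                        # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    submit = SparkSubmitOperator(
        task_id="spark_write_cassandra",
        application="/opt/airflow/dags/scripts/write_cassandra.py",   # assumed path inside the container
        conn_id="spark_default",                                      # assumed Airflow Spark connection
        packages="com.datastax.spark:spark-cassandra-connector_2.12:3.5.1",
        conf={
            "spark.cassandra.connection.host": "203.0.113.10",  # assumed remote Cassandra address
            "spark.cassandra.connection.port": "9042",
        },
    )
```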
0 votes
1 answer
317 views
PySpark app returns "NoClassDefFoundError: com/datastax/spark/connector/util/Logging"
I had this error: py4j.protocol.Py4JJavaError: An error occurred while calling o59.start. : java.lang.NoClassDefFoundError: com/datastax/spark/connector/util/Logging at java.base/java.lang....
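com/datastax/spark/connector/util/Logging typically goes missing when only part of the connector ends up on the classpath (for example a single jar without its transitive dependencies). A minimal sketch of letting Spark resolve the full dependency tree; the version is an assumption, and the alternative "assembly" coordinates should be verified for the version in use.

```python
from pyspark.sql import SparkSession

# Sketch: resolving the connector via spark.jars.packages pulls its
# dependencies too, which usually avoids NoClassDefFoundError for
# connector-internal classes.
spark = (
    SparkSession.builder
    .appName("cassandra-stream")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:3.5.1")
    # Alternative (assumed coordinates): the self-contained assembly jar,
    # "com.datastax.spark:spark-cassandra-connector-assembly_2.12:3.5.1"
    .getOrCreate()
)
```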
0 votes
1 answer
74 views
PySpark Cassandra connector generates tombstones during writing
I understand that when inserting data, tombstones might be created because of null values in the DataFrame's columns. To mitigate this issue and minimize tombstones, insertion queries ...
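The connector has a write setting that skips null columns on insert instead of writing them (writing a null is what produces the tombstone). A minimal sketch setting it session-wide; the host, keyspace, and table names are assumptions.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("write-without-tombstones")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:3.5.1")
    .config("spark.cassandra.connection.host", "127.0.0.1")   # assumed
    # Skip null DataFrame columns on insert so no tombstone is written for them.
    .config("spark.cassandra.output.ignoreNulls", "true")
    .getOrCreate()
)

df = spark.createDataFrame(
    [(1, "alice", None), (2, None, "bob@example.com")],
    ["id", "name", "email"],
)

(df.write
    .format("org.apache.spark.sql.cassandra")
    .options(table="users", keyspace="demo")  # assumed names
    .mode("append")
    .save())
```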
0 votes
3 answers
953 views
Simple Python app failed to load, getting "ClassNotFoundException: Failed to find data source: org.apache.spark.sql.cassandra"
Context: I am struggling to install the Cassandra-Spark connector. My goal is to use it with Spark SQL, since Cassandra has strong limitations on queries. I have: ...
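The org.apache.spark.sql.cassandra data source only resolves if the connector is on the classpath before the session exists; a session started without it cannot load the class afterwards. A hedged sketch, including the optional catalog route for plain Spark SQL; the host, catalog name, keyspace, and table names are assumptions.

```python
from pyspark.sql import SparkSession

# Sketch: configure the package *before* getOrCreate(); an already-running
# session will not pick it up.
spark = (
    SparkSession.builder
    .appName("spark-sql-over-cassandra")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:3.5.1")
    .config("spark.cassandra.connection.host", "127.0.0.1")  # assumed
    # Optional (connector 3.x): expose keyspaces as a Spark SQL catalog.
    .config("spark.sql.catalog.cass",
            "com.datastax.spark.connector.datasource.CassandraCatalog")
    .getOrCreate()
)

# Either the data source...
df = (spark.read.format("org.apache.spark.sql.cassandra")
      .options(table="my_table", keyspace="my_keyspace")  # assumed names
      .load())

# ...or plain Spark SQL through the catalog.
spark.sql("SELECT * FROM cass.my_keyspace.my_table LIMIT 10").show()
```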
0 votes
1 answer
315 views
Unable to connect to Spark
Running the Python code does not connect to Spark, and it does not create a database in Cassandra either. I have confirmed the services are up in Docker and accessible from the PC. I placed the .jar files ...
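A minimal sketch of pointing the session at both the standalone master and manually downloaded jars; the master URL, jar paths, and Cassandra address are all assumptions.

```python
from pyspark.sql import SparkSession

# Sketch: every listed jar must be readable by the driver; Spark ships them
# to the executors. Master URL, paths, and host are assumed.
spark = (
    SparkSession.builder
    .appName("spark-cassandra-from-jars")
    .master("spark://localhost:7077")  # assumed standalone master
    .config("spark.jars", ",".join([
        "/opt/jars/spark-cassandra-connector_2.12-3.5.1.jar",         # assumed paths
        "/opt/jars/spark-cassandra-connector-driver_2.12-3.5.1.jar",
    ]))
    .config("spark.cassandra.connection.host", "127.0.0.1")           # assumed
    .getOrCreate()
)
print(spark.version)  # confirms the session actually reached the master
```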