I am a MacOS user and I just downloaded Apache Spark. I then put it in /usr/local/spark. Here is what inside my .bash_profile:
export SPARK_HOME="/usr/local/spark" export PYSPARK_PYTHON=python3 export PATH=$PATH:$SPARK_HOME/bin #export PYSPARK_DRIVER_PYTHON="jupyter" #export PYSPARK_DRIVER_PYTHON_OPTS="notebook" The problem is, when type pyspark to enter the pyspark shell, then type these two lines:
spark = SparkSession.builder.appName("preprocessing").config("spark-master", "local").getOrCreate() df = spark.read.format("csv").option("header","true").option("inferSchema", "true").option("delimiter",",").load("src/census-income.data") An error occurs:
2018-10-02 19:55:24 ERROR PoolWatchThread:118 - Error in trying to obtain a connection. Retrying in 7000ms java.sql.SQLException: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.setReadOnly(Unknown Source) at com.jolbox.bonecp.ConnectionHandle.setReadOnly(ConnectionHandle.java:1324) at com.jolbox.bonecp.ConnectionHandle.<init>(ConnectionHandle.java:262) at com.jolbox.bonecp.PoolWatchThread.fillConnections(PoolWatchThread.java:115) at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:82) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: ERROR 25505: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) at org.apache.derby.impl.sql.conn.GenericAuthorizer.setReadOnlyConnection(Unknown Source) at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.setReadOnly(Unknown Source) ... 8 more - Spark version: 2.3.2
- Python version: 3.7.0