
I am a macOS user and I just downloaded Apache Spark and put it in /usr/local/spark. Here is what is inside my .bash_profile:

export SPARK_HOME="/usr/local/spark"
export PYSPARK_PYTHON=python3
export PATH=$PATH:$SPARK_HOME/bin
#export PYSPARK_DRIVER_PYTHON="jupyter"
#export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

The problem is that when I type pyspark to enter the PySpark shell and then enter these two lines:

spark = SparkSession.builder.appName("preprocessing").config("spark-master", "local").getOrCreate()
df = spark.read.format("csv").option("header","true").option("inferSchema", "true").option("delimiter",",").load("src/census-income.data")

An error occurs:

2018-10-02 19:55:24 ERROR PoolWatchThread:118 - Error in trying to obtain a connection. Retrying in 7000ms
java.sql.SQLException: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection.
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.setReadOnly(Unknown Source)
    at com.jolbox.bonecp.ConnectionHandle.setReadOnly(ConnectionHandle.java:1324)
    at com.jolbox.bonecp.ConnectionHandle.<init>(ConnectionHandle.java:262)
    at com.jolbox.bonecp.PoolWatchThread.fillConnections(PoolWatchThread.java:115)
    at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:82)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: ERROR 25505: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.sql.conn.GenericAuthorizer.setReadOnlyConnection(Unknown Source)
    at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.setReadOnly(Unknown Source)
    ... 8 more

  • Spark version: 2.3.2
  • Python version: 3.7.0

2 Answers


Can you try deleting the file metastore_db/dbex.lck from the current directory (SPARK_HOME)?
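
If you would rather script the cleanup, here is a minimal sketch. It assumes the lock is stale, meaning no other Spark or Hive session is still using that metastore_db directory. The helper name remove_derby_locks is hypothetical, and the directory you pass should be the one that actually contains metastore_db (usually wherever pyspark was launched). Embedded Derby normally keeps two lock files there, db.lck and dbex.lck, so the sketch removes any *.lck file it finds.

```python
from pathlib import Path


def remove_derby_locks(work_dir="."):
    """Delete Derby lock files in work_dir/metastore_db.

    Assumption: no running Spark/Hive session still owns these locks.
    Returns the names of the files that were removed.
    """
    metastore = Path(work_dir) / "metastore_db"
    removed = []
    if not metastore.is_dir():
        # Nothing to clean up if Spark never created a metastore here.
        return removed
    for lock_file in sorted(metastore.glob("*.lck")):
        lock_file.unlink()
        removed.append(lock_file.name)
    return removed


if __name__ == "__main__":
    # Run from the directory where pyspark was started.
    print(remove_derby_locks("."))
```

Quit every running pyspark shell before calling it, then start pyspark again.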

Source: https://github.com/bpn1/ingestion/wiki/Troubleshooting



Spark is trying to load the file from HDFS. Apparently you don't have Hadoop installed, so Spark fails to connect to HDFS. If you want to load from the local file system, you have to specify that explicitly with the file:// scheme:

file:///src/census-income.data
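
Note that a file:/// URI carries an absolute path, so the URI above points at /src at the filesystem root. Since the question uses the relative path src/census-income.data, one way to build the matching URI is to resolve it against the current working directory first. The sketch below does that with pathlib; the helper name local_file_uri is hypothetical, and it assumes the shell was started in the directory that contains src/.

```python
from pathlib import Path


def local_file_uri(relative_path):
    """Turn a path relative to the current directory into a file:// URI."""
    # resolve() makes the path absolute; as_uri() requires an absolute path.
    return Path(relative_path).resolve().as_uri()


if __name__ == "__main__":
    uri = local_file_uri("src/census-income.data")
    print(uri)
    # Inside the pyspark shell, the URI can then be passed to the reader, e.g.:
    # df = spark.read.csv(uri, header=True, inferSchema=True)
```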

1 Comment

Thank you, I had been stuck on this for about two weeks and this resolved my problem!
