
I am trying to connect to a locally hosted MySQL database. I tried two systems: R and Python.

Here is a screenshot of my .profile:

[screenshot of .profile]

and of my .bash_profile:

[screenshot of .bash_profile]

Here is what I did:

  • Tried to connect PySpark to local MySQL; I get the error below (PySpark traceback):

    Py4JJavaError                             Traceback (most recent call last)
     in ()
          5 sparkClassPath = os.getenv('SPARK_CLASSPATH', '/Users/me/mysql-connector-java-8.0.11/mysql-connector-java-8.0.11.jar')
          6 sqlContext = SQLContext(SparkContext.getOrCreate())
    ----> 7 sqlContext.read.format("jdbc").options(url="jdbc:mysql://127.0.0.1:4040",driver = "com.mysql.jdbc.Driver", dbtable = "product",user="root",password='').load()

    ~/Documents/spark/spark-2.2.1-bin-hadoop2.7/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
        163                 return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
        164             else:
    --> 165                 return self._df(self._jreader.load())
        166
        167     @since(1.4)

    ~/Documents/spark/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
       1131         answer = self.gateway_client.send_command(command)
       1132         return_value = get_return_value(
    -> 1133             answer, self.gateway_client, self.target_id, self.name)
       1134
       1135         for temp_arg in temp_args:

    ~/Documents/spark/spark-2.2.1-bin-hadoop2.7/python/pyspark/sql/utils.py in deco(*a, **kw)
         61     def deco(*a, **kw):
         62         try:
    ---> 63             return f(*a, **kw)
         64         except py4j.protocol.Py4JJavaError as e:
         65             s = e.java_exception.toString()

    ~/Documents/spark/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
        317                 raise Py4JJavaError(
        318                     "An error occurred while calling {0}{1}{2}.\n".
    --> 319                     format(target_id, ".", name), value)
        320             else:
        321                 raise Py4JError(

    Py4JJavaError: An error occurred while calling o93.load.
    : java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:38)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:78)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:78)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:78)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)
        at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:307)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
  • Tried to connect sparklyr to local MySQL; I get the error below (sparklyr error trace):

[screenshot of the sparklyr error trace]

Question: How do I connect to MySQL from either PySpark or sparklyr?


1 Answer


SPARK_CLASSPATH was deprecated several years ago and is no longer supported. Assuming the jar contains a valid driver version, use either:

  • spark.jars configuration option.

or the combination of:

  • spark.driver.extraClassPath
  • spark.executor.extraClassPath

options.
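For example, the options above can be set when building the Spark session in PySpark. A minimal sketch, assuming the jar path from the question, a database named mydb, and that MySQL listens on its default port 3306 (the 4040 in the question's URL is Spark's UI port, not MySQL's); with Connector/J 8.x the driver class is com.mysql.cj.jdbc.Driver:

```python
# Sketch: read a MySQL table from PySpark with the connector jar on the classpath.
# The jar path, database name, and credentials are assumptions -- adjust for your setup.

jar_path = "/Users/me/mysql-connector-java-8.0.11/mysql-connector-java-8.0.11.jar"
jdbc_url = "jdbc:mysql://127.0.0.1:3306/mydb"  # MySQL's default port is 3306, not 4040
driver_class = "com.mysql.cj.jdbc.Driver"      # Connector/J 8.x driver class name


def read_product_table():
    """Build a SparkSession with the jar on the classpath and read the product table."""
    # Imported lazily so the settings above can be inspected without a Spark install.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("mysql-jdbc-example")
        .config("spark.jars", jar_path)                     # ship the jar with the app
        .config("spark.driver.extraClassPath", jar_path)    # or put it on both the
        .config("spark.executor.extraClassPath", jar_path)  # driver and executor paths
        .getOrCreate()
    )

    return (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("driver", driver_class)
        .option("dbtable", "product")
        .option("user", "root")
        .option("password", "")
        .load()
    )
```

The same jar can instead be supplied at launch time, e.g. `pyspark --jars /path/to/mysql-connector-java-8.0.11.jar`, which populates `spark.jars` for you.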


1 Comment

thanks user9807949. How would I know which version is compatible? I am so terrified of the configuration. I am trying to do this on Mac High Sierra 10.13.4
