Your request makes sense only if the resulting data can fit into the driver's main memory (i.e. if you can safely call collect()); then again, if that is the case, arguably you have little reason to use Spark at all.
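If you are not sure whether the data fit, a minimal safeguard is to cap what you pull to the driver with limit() before collecting. This is just a sketch: MAX_ROWS is a hypothetical, user-chosen threshold (not anything Spark provides), and df stands for your features DataFrame, like the toy one built below.

# Sketch only: MAX_ROWS is a hypothetical cap, tuned to the driver's
# available memory; df is your features DataFrame.
MAX_ROWS = 100000
dd = df.limit(MAX_ROWS).collect()  # pulls at most MAX_ROWS rows to the driver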
Anyway, given this assumption, here is a general way to convert a single-column features Spark DataFrame (Rows of DenseVector) into a NumPy array, using toy data:
spark.version
# u'2.2.0'

from pyspark.ml.linalg import Vectors
import numpy as np

# toy data: a single 'features' column of DenseVector rows
df = spark.createDataFrame([(Vectors.dense([0, 45, 63, 0, 0, 0, 0]),),
                            (Vectors.dense([0, 0, 0, 85, 0, 69, 0]),),
                            (Vectors.dense([0, 89, 56, 0, 0, 0, 0]),),
                           ], ['features'])

dd = df.collect()  # brings all rows to the driver as a list of Row objects
dd
# [Row(features=DenseVector([0.0, 45.0, 63.0, 0.0, 0.0, 0.0, 0.0])),
#  Row(features=DenseVector([0.0, 0.0, 0.0, 85.0, 0.0, 69.0, 0.0])),
#  Row(features=DenseVector([0.0, 89.0, 56.0, 0.0, 0.0, 0.0, 0.0]))]

np.asarray([x[0] for x in dd])  # x[0] is the DenseVector inside each Row
# array([[  0.,  45.,  63.,   0.,   0.,   0.,   0.],
#        [  0.,   0.,   0.,  85.,   0.,  69.,   0.],
#        [  0.,  89.,  56.,   0.,   0.,   0.,   0.]])
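As a side note, if your features column may also contain SparseVector rows (an assumption beyond the toy data above), a sketch that handles both vector types is to call toArray() explicitly on each vector, since both DenseVector and SparseVector implement it:

np.asarray([x[0].toArray() for x in dd])  # toArray() works for dense and sparse vectors alike

In Spark 3.0+ there is also a built-in pyspark.ml.functions.vector_to_array for converting a vector column into an array column, but it is not available in the 2.2.0 version shown here.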