
Suppose I have a list of key-value pairs:

kvs = [('x', 0), ('a', 1)] 

Now I'd like to create a Spark Row from kvs with the same order of keys as in kvs.
How do I do this in Python?

3 Comments
  • convert it to a dict and use Row(**kvs) Commented Oct 1, 2017 at 11:02
  • It does not preserve the order of the pairs. Commented Oct 1, 2017 at 11:10
  • you can use OrderedDict stackoverflow.com/questions/38253385/… Commented Oct 1, 2017 at 11:10

2 Answers


I haven't run this yet, but please check it; I'll edit the answer if it fails when I run it.

from pyspark.sql import Row

kvs = [('x', 0), ('a', 1)]
h = {}
for k, v in kvs:      # build a dict from the pairs
    h[k] = v
row = Row(**h)

2 Comments

Thanks, but it does not preserve the order of the pairs in kvs.
Check how to preserve order using OrderedDict stackoverflow.com/questions/38253385/…
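
For reference, a minimal sketch of the OrderedDict route from the linked question. Whether Row(**kwargs) keeps this order depends on the Spark version: Spark 3.0+ no longer sorts keyword fields alphabetically, while older releases do, so on Spark 2.x this alone may not be enough.

from collections import OrderedDict
from pyspark.sql import Row

kvs = [('x', 0), ('a', 1)]
row = Row(**OrderedDict(kvs))   # insertion order follows kvs
print(row)   # Row(x=0, a=1) on Spark 3.0+; Row(a=1, x=0) on older releases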

You can:

from pyspark.sql import Row

Row(*[k for k, _ in kvs])(*[v for _, v in kvs])
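
For example, with the kvs above this yields a Row whose fields follow the list order. A quick local check (Row here is just a tuple subclass, so no SparkSession is required):

from pyspark.sql import Row

kvs = [('x', 0), ('a', 1)]
MyRow = Row(*[k for k, _ in kvs])    # a Row "class" with fields ('x', 'a')
row = MyRow(*[v for _, v in kvs])    # an instance with values in the same order
print(row)            # Row(x=0, a=1)
print(row.x, row.a)   # 0 1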

That said, in my opinion it is better to avoid Row altogether. Other than being a convenient class for representing local values fetched from the JVM backend, it has no special meaning in Spark. In almost every context:

tuple(v for _, v in kvs) 

is a perfectly valid replacement for Row.
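
A minimal sketch of that plain-tuple route, assuming a local SparkSession named spark: pass the tuples together with explicit column names to createDataFrame, and no Row is needed at all.

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").getOrCreate()

kvs = [('x', 0), ('a', 1)]
data = [tuple(v for _, v in kvs)]   # [(0, 1)]
columns = [k for k, _ in kvs]       # ['x', 'a']

df = spark.createDataFrame(data, columns)
df.show()   # columns appear as x, a (in that order), with a single row 0, 1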

