I am translating a Scala / Spark deep learning model into Python / PySpark. After reading the DataFrame, all columns are interpreted as string type, and I need to cast them to float. Doing this one column at a time is easy; I think it would be something like this:
format_number(result['V1'].cast('float'),2).alias('V1')
but there are 31 columns, so how do I cast them all at once? The columns are "V1" through "V28", plus "Time", "Amount", and "Class".
The Scala solution to this is:
// cast all the columns to Double type.
val df = raw.select(((1 to 28).map(i => "V" + i) ++ Array("Time", "Amount", "Class")).map(s => col(s).cast("Double")): _*)
How do I do the same in PySpark?
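My best guess at the PySpark equivalent, assuming the DataFrame is called raw as in the Scala snippet, is something like this:

from pyspark.sql.functions import col

# Build the list of column names: V1 through V28 plus the three remaining columns.
cols = ["V" + str(i) for i in range(1, 29)] + ["Time", "Amount", "Class"]

# Cast every column to double in a single select, mirroring the Scala snippet above.
df = raw.select([col(c).cast("double") for c in cols])

Here select is given a list of Column expressions, which PySpark accepts just like varargs. Is that the idiomatic way to do it?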