Showing posts with the label Apache Spark

How To Use A Scala UDF In PySpark?

I want to be able to use a Scala function as a UDF in PySpark: package com.test object ScalaPySpark… Read more
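A minimal sketch of the usual pattern, assuming the Scala side compiles an object com.test.ScalaPySpark with a registerUdf method (method and UDF names are hypothetical) into a JAR passed via --jars; the Python side reaches it through Spark's internal py4j gateway and then calls the registered function with expr:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    # Assumption: the JAR built from package com.test exposes an object
    # ScalaPySpark with a method registerUdf(SparkSession) that calls
    # spark.udf.register("scalaSquare", ...) on the Scala side.
    spark._jvm.com.test.ScalaPySpark.registerUdf(spark._jsparkSession)

    df = spark.range(5)
    # Once registered in the session, the Scala UDF is callable from SQL/expr.
    df.select(F.expr("scalaSquare(id)").alias("squared")).show()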

PySpark: cx_Oracle.InterfaceError: Not A Query

I need to perform an UPDATE query in a Spark job. I am trying the code below but facing issues: import cx_O… Read more
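"not a query" is what cx_Oracle raises when you try to fetch rows from a statement that returns no result set (for example calling fetchall() or pandas.read_sql on an UPDATE). A rough sketch of running the UPDATE per partition instead, with the connection details, table, and column names as placeholders:

    from pyspark.sql import SparkSession
    import cx_Oracle

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, "DONE"), (2, "FAILED")], ["id", "status"])

    def update_partition(rows):
        # Assumption: credentials, DSN, and target_table are placeholders.
        conn = cx_Oracle.connect("user", "password", "host:1521/service")
        cursor = conn.cursor()
        for row in rows:
            # UPDATE returns no rows, so execute + commit; do not fetch.
            cursor.execute(
                "UPDATE target_table SET status = :1 WHERE id = :2",
                (row["status"], row["id"]),
            )
        conn.commit()
        cursor.close()
        conn.close()

    # foreachPartition opens one connection per partition instead of one per row.
    df.foreachPartition(update_partition)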

How To Apply The Describe Function After Grouping A PySpark DataFrame?

I want to find the cleanest way to apply the describe function to a grouped DataFrame (this questio… Read more
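describe() only exists on a DataFrame, not on GroupedData, so one common workaround (a sketch, with invented column names) is to rebuild the same statistics with agg after grouping:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame(
        [("a", 1.0), ("a", 3.0), ("b", 2.0)], ["grp", "val"]
    )

    # Reproduce describe()'s count/mean/stddev/min/max, but per group.
    stats = df.groupBy("grp").agg(
        F.count("val").alias("count"),
        F.mean("val").alias("mean"),
        F.stddev("val").alias("stddev"),
        F.min("val").alias("min"),
        F.max("val").alias("max"),
    )
    stats.show()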

PySpark: Need To Show A Count Of Null/Empty Values For Each Column In A DataFrame

I have a Spark DataFrame and need to count the null/empty values in each column. I need to sho… Read more
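One common approach (a sketch, with made-up column names and data): for every column, count the rows where the value is null or an empty string, producing a one-row summary DataFrame.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame(
        [("x", None), ("", "y"), (None, "z")], ["col_a", "col_b"]
    )

    # For each column, count the rows that are null or an empty string;
    # count() ignores the nulls produced when the when() condition is false.
    null_counts = df.select([
        F.count(F.when(F.col(c).isNull() | (F.col(c) == ""), c)).alias(c)
        for c in df.columns
    ])
    null_counts.show()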

PySpark: Parse A Fixed-Width Text File

Trying to parse a fixed-width text file. My text file looks like the following, and I need a row i… Read more
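A sketch of the usual approach, assuming invented field positions and a placeholder path: read the file as a single "value" column with spark.read.text, then slice each field out with substring (positions are 1-based).

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    # Assumption: each line is 10 chars of id, 20 of name, 8 of date.
    lines = spark.read.text("/path/to/fixed_width.txt")

    parsed = lines.select(
        F.substring("value", 1, 10).alias("id"),     # positions 1-10
        F.substring("value", 11, 20).alias("name"),  # positions 11-30
        F.substring("value", 31, 8).alias("date"),   # positions 31-38
    )
    parsed.show(truncate=False)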

What Type Should The Dense Vector Be When Using A UDF Function In PySpark?

I want to change a List to a Vector in PySpark, and then use this column in a machine learning model for … Read more
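When a UDF builds a DenseVector from an array column, the return type to declare is VectorUDT (from pyspark.ml.linalg, matching the ML-pipeline models). A minimal sketch with an invented column name:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.ml.linalg import Vectors, VectorUDT

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([([1.0, 2.0, 3.0],)], ["features_list"])

    # Declare VectorUDT() as the UDF's return type so Spark treats the
    # column as an ML vector rather than a plain array of doubles.
    to_vector = F.udf(lambda xs: Vectors.dense(xs), VectorUDT())

    df_vec = df.select(to_vector("features_list").alias("features"))
    df_vec.printSchema()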