
How to Use Pandas UDF Functionality in PySpark

I have a Spark DataFrame with two columns, which looks like: +-------------------------------------------------------------+------------------------------------+ |docId

Solution 1:

A simple pandas UDF can achieve this:

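import base64 as bs  # assumption: "bs" refers to the standard base64 module
import pandas as pd
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import StringType

@pandas_udf(returnType=StringType())
def convert_id(ids: pd.Series) -> pd.Series:
    # Strip hyphens, parse the hex digits into bytes, then Ascii85-encode each ID.
    return ids.map(lambda x: bs.a85encode(bytes.fromhex(str(x).replace("-", ""))).decode("ascii"))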
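
To sanity-check the per-value conversion without a Spark session, the same logic can be run on a single string. This is only a minimal sketch: `encode_doc_id` and the sample ID are hypothetical, and it assumes that `bs` in the answer refers to the standard `base64` module.

```python
import base64


def encode_doc_id(doc_id: str) -> str:
    """Ascii85-encode a hyphenated hex ID (hypothetical helper mirroring convert_id)."""
    raw = bytes.fromhex(doc_id.replace("-", ""))  # drop hyphens, parse hex digits
    return base64.a85encode(raw).decode("ascii")  # bytes -> plain str


sample = "123e4567-e89b-12d3-a456-426614174000"  # made-up ID for illustration
encoded = encode_doc_id(sample)

# Decoding should give back the original hex digits.
round_trip = base64.a85decode(encoded).hex()
print(encoded, round_trip == sample.replace("-", ""))
```

In Spark, the UDF would typically be applied with something like `df.withColumn("docId", convert_id(df["docId"]))`, assuming the column is named `docId` as the question suggests.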
