How To Use Pandas UDF Functionality In PySpark
I have a Spark DataFrame with two columns, which looks like:

+-------------------------------------------------------------+------------------------------------+
|docId
Solution 1:
A simple pandas UDF can achieve this:
import base64

from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import StringType

@pandas_udf(returnType=StringType())
def convert_id(id):
    # Strip the dashes from each UUID string, hex-decode it,
    # then Ascii85-encode the raw bytes and decode to str
    converted = id.map(
        lambda x: base64.a85encode(bytes.fromhex(str(x).replace("-", ""))).decode("ascii")
    )
    return converted
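Because a pandas UDF receives each batch of column values as a pandas Series, the core transformation can be checked locally without a Spark session. A minimal sketch, assuming the column holds UUID strings (the sample UUID below is made up for illustration):

```python
import base64
import pandas as pd

def convert_series(ids: pd.Series) -> pd.Series:
    # Same logic as the UDF body: strip dashes, hex-decode, Ascii85-encode
    return ids.map(
        lambda x: base64.a85encode(bytes.fromhex(str(x).replace("-", ""))).decode("ascii")
    )

ids = pd.Series(["12345678-1234-5678-1234-567812345678"])
encoded = convert_series(ids)[0]
print(encoded)  # 20-char Ascii85 form of the 16-byte UUID
```

Ascii85 encodes every 4 bytes as 5 characters, so a 16-byte UUID becomes a 20-character string; `base64.a85decode` on the result recovers the original bytes.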