
Correct Way Of Writing Two Floats Into A Regular Txt

I am running a big job in cluster mode. However, I am only interested in two float numbers, which I want to read somehow once the job succeeds. Here is what I am trying: from pyspa

Solution 1:

At first glance there is nothing particularly wrong with your code (you should use a context manager in a case like this instead of closing the file manually, but that is not the point). If this script is passed to spark-submit, the file will be written to a directory local to the driver.
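As a minimal sketch of the context-manager advice above: the helper names write_floats and read_floats, the file name result.txt, and the sample values are assumptions for illustration, not taken from the question's code.

```python
import os
import tempfile


def write_floats(path, first, second):
    """Write two floats to a plain text file, one per line."""
    # The with-block closes the file even if write() raises,
    # so no manual close() call is needed.
    with open(path, "w") as handle:
        # repr() keeps enough digits for the float to round-trip exactly.
        handle.write(f"{first!r}\n{second!r}\n")


def read_floats(path):
    """Read back the two floats written by write_floats."""
    with open(path) as handle:
        first, second = (float(token) for token in handle.read().split())
    return first, second


# Hypothetical output location; on a cluster this is local to the driver.
out_path = os.path.join(tempfile.gettempdir(), "result.txt")
write_floats(out_path, 0.25, 3.5)
print(read_floats(out_path))
```

Writing one value per line keeps the file trivial to parse later with plain Python or shell tools.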

If you submit your code in cluster mode, the driver runs on an arbitrary worker node in your cluster, so that is where the file ends up. If you are in doubt, you can always log os.getcwd() and socket.gethostname() to figure out which machine is used and what the working directory is.
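The logging suggestion above could look roughly like this. The function name describe_location and the log message format are illustrative assumptions; only os.getcwd() and socket.gethostname() come from the answer.

```python
import logging
import os
import socket


def describe_location():
    """Return the host name and working directory of the current process."""
    return {"host": socket.gethostname(), "cwd": os.getcwd()}


logging.basicConfig(level=logging.INFO)

# Run this in the driver script, next to the code that writes the file.
location = describe_location()
logging.info("driver host=%s cwd=%s", location["host"], location["cwd"])
```

In cluster mode this message should appear in the driver's log on whichever node was chosen, not in the terminal where spark-submit was invoked.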

Finally, you cannot use standard Python IO tools to write to HDFS. There are a few tools which can achieve that, including the native dask/hdfs3.
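A rough sketch of writing the two values with hdfs3, assuming the hdfs3 package and its native libhdfs3 dependency are installed. The helper names, the NameNode host and port, and the target path are placeholders, not values from the question.

```python
def format_result(first, second):
    """Encode two floats as UTF-8 text, one value per line."""
    return f"{first!r}\n{second!r}\n".encode("utf-8")


def write_result_to_hdfs(first, second, path, host, port):
    """Write the two floats to an HDFS file at `path`."""
    # Imported lazily so the module still loads where hdfs3 is missing.
    from hdfs3 import HDFileSystem

    hdfs = HDFileSystem(host=host, port=port)
    # hdfs3 file objects work in binary mode and support the with-statement.
    with hdfs.open(path, "wb") as handle:
        handle.write(format_result(first, second))


# Hypothetical call; replace host, port and path with your cluster's values:
# write_result_to_hdfs(0.25, 3.5, "/tmp/result.txt", "namenode.example.com", 8020)
```

Once the file is in HDFS it can be read from any machine with access to the cluster, regardless of which node ran the driver.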

