
How To Gzip Files In Tmp Folder

Using an AWS Lambda function, I download a zipped file from S3 and unzip it; for now I do this with extractall. Upon unzipping, all files are saved in the /tmp/ folder. s3.download_file(
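
For context, a minimal sketch of that setup might look like the following; the bucket, key, and file names are placeholders.

import zipfile

import boto3

s3 = boto3.client("s3")

bucket = "my-bucket"            # placeholder bucket name
key = "archive.zip"             # placeholder object key
local_zip = "/tmp/archive.zip"  # /tmp is the only writable path in Lambda

# Download the zipped object, then extract every member into /tmp.
s3.download_file(bucket, key, local_zip)
with zipfile.ZipFile(local_zip) as archive:
    archive.extractall("/tmp")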

Solution 1:

Depending on the sizes of the files, I would skip writing the .gz file(s) to disk and stream them straight to S3 instead. Perhaps something based on s3fs (or boto) and gzip.

import contextlib
import gzip

import s3fs

AWS_S3 = s3fs.S3FileSystem(anon=False) # AWS env must be set up correctly

source_file_path = "/tmp/your_file.txt"
s3_file_path = "my-bucket/your_file.txt.gz"

with contextlib.ExitStack() as stack:
    # Local source file, opened for binary reads.
    source_file = stack.enter_context(open(source_file_path, mode="rb"))
    # Destination object on S3, opened for binary writes via s3fs.
    destination_file = stack.enter_context(AWS_S3.open(s3_file_path, mode="wb"))
    # Gzip wrapper that compresses whatever is written to it straight into the S3 object.
    destination_file_gz = stack.enter_context(gzip.GzipFile(fileobj=destination_file, mode="wb"))
    while True:
        chunk = source_file.read(1024)
        if not chunk:
            break
        destination_file_gz.write(chunk)

Note: I have not tested this, so if it does not work, let me know.
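
Since the answer mentions boto as an alternative to s3fs, here is an equally untested sketch of the same idea with boto3: compress into an in-memory buffer, then upload it in one call. The bucket and key names are placeholders.

import gzip
import io

import boto3

s3 = boto3.client("s3")

source_file_path = "/tmp/your_file.txt"  # local file produced by the earlier extraction
bucket = "my-bucket"                     # placeholder bucket name
key = "your_file.txt.gz"                 # placeholder destination key

# Compress the file into an in-memory buffer.
buffer = io.BytesIO()
with open(source_file_path, "rb") as source_file, \
        gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
    while True:
        chunk = source_file.read(1024)
        if not chunk:
            break
        gz.write(chunk)

# Rewind the buffer and upload the compressed bytes in a single call.
buffer.seek(0)
s3.upload_fileobj(buffer, bucket, key)

Unlike the s3fs version, this holds the whole compressed file in memory, which is fine for files that fit comfortably in the Lambda's memory but not for very large ones.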
