Gzipfile Not Supported By S3?
I am trying to iterate through some file paths so that I can gzip each file individually. Each item in testList is a string (path) like this: /tmp/File. After gzipping them, I want to upload each file to an S3 bucket.
Solution 1:
Assuming each file fits into memory, you can simply compress the data in memory and package it in a BytesIO for the S3 API to read.
import boto3
import gzip
import io

s3_resource = boto3.resource("s3")
bucket = s3_resource.Bucket("testunzipping")

for i in testList:
    # Strip the directory prefix so only the file name is used as the S3 key.
    fileName = i.replace("/tmp/DataPump_10000838/", "")
    with open(i, "rb") as f_in:
        gzipped_content = gzip.compress(f_in.read())
    bucket.upload_fileobj(
        io.BytesIO(gzipped_content),
        fileName,
        ExtraArgs={"ContentType": "text/plain", "ContentEncoding": "gzip"},
    )
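Setting ContentEncoding to gzip (while keeping ContentType as the underlying media type) records in the object's metadata that the stored bytes are gzip-compressed, so HTTP clients that honor the Content-Encoding header can decompress the object transparently when it is served.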
If that's not the case, you can use a tempfile to compress the data to disk first:
import boto3
import gzip
import shutil
import tempfile

s3_resource = boto3.resource("s3")
bucket = s3_resource.Bucket("testunzipping")

for i in testList:
    fileName = i.replace("/tmp/DataPump_10000838/", "")
    with tempfile.TemporaryFile() as tmpf:
        # Stream the source file through GzipFile into the temporary file on disk.
        with open(i, "rb") as f_in, gzip.GzipFile(mode="wb", fileobj=tmpf) as gzf:
            shutil.copyfileobj(f_in, gzf)
        # Rewind so upload_fileobj reads the compressed data from the start.
        tmpf.seek(0)
        bucket.upload_fileobj(
            tmpf,
            fileName,
            ExtraArgs={"ContentType": "text/plain", "ContentEncoding": "gzip"},
        )
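If you want to sanity-check an upload, here is a minimal sketch that downloads one object with boto3 and decompresses it with gzip.decompress. It assumes the same testunzipping bucket; "example.txt" is a placeholder key, so substitute one of the fileName values uploaded above.

import boto3
import gzip

s3_client = boto3.client("s3")

# Fetch the object; the Body is a stream of the raw gzip bytes stored in S3.
obj = s3_client.get_object(Bucket="testunzipping", Key="example.txt")
compressed = obj["Body"].read()

# Decompress to recover the original file contents.
original = gzip.decompress(compressed)
print(original[:100])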