Description
Question
Hello team,
I am using pyiceberg to load data into an Iceberg table stored in Amazon S3.
While doing this, I am hitting an explicit deny from an AWS Service Control Policy (SCP) that blocks unencrypted multipart uploads. I cannot modify the SCP.
Error excerpt:
OSError: When initiating multiple part upload for key 'iceberg/DEV/dataset/test_4_matdoc/metadata/...' in bucket 'pt-s3-project-bucketname': AWS Error ACCESS_DENIED during CreateMultipartUpload operation: User: arn:aws:sts::... is not authorized to perform: s3:PutObject with an explicit deny in a service control policy
This happens during calls like:
    def load_iceberg_table(table, arrow_table):
        catalog = load_catalog("glue", **{"type": "glue"})
        iceberg_table: Table = catalog.load_table(f"{DATABASE}.{table}")
        try:
            logger.info("Appending data to Iceberg table...")
            iceberg_table.append(df=arrow_table)
            logger.info("Successfully appended data to Iceberg table.")
        except ClientError as e:
            logger.error(f"Iceberg append ClientError: {e}")
            raise
        except Exception as e:
            logger.error(f"Unexpected Iceberg error: {e}")
            raise
From my understanding, pyiceberg uses S3 multipart upload under the hood, but I haven’t found a documented way to configure SSE-KMS or SSE-S3 parameters for these writes.
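To make the request concrete, here is a minimal sketch of the encryption parameters I believe the SCP expects on each upload request. This is not a pyiceberg API: the helper name `sse_upload_args` and the key placeholder are hypothetical, and the parameter names are the ones the S3 API (and boto3) use for `CreateMultipartUpload` and `PutObject`.

```python
from typing import Dict, Optional


def sse_upload_args(kms_key_id: Optional[str] = None) -> Dict[str, str]:
    """Build S3 request parameters that enable server-side encryption.

    Hypothetical helper for illustration only; not part of pyiceberg.
    """
    if kms_key_id is None:
        # SSE-S3: encryption with S3-managed keys
        return {"ServerSideEncryption": "AES256"}
    # SSE-KMS: encryption with a customer-specified KMS key
    return {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": kms_key_id}


# With plain boto3, these would be passed to the call that the SCP denies, e.g.:
#   boto3.client("s3").create_multipart_upload(
#       Bucket="<bucket>", Key="<key>", **sse_upload_args("<kms-key-arn>")
#   )
```

What I am missing is a way to get equivalent parameters attached to the uploads that pyiceberg performs internally.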
Question:
Is there currently a way to pass S3 upload parameters (like ServerSideEncryption, SSEKMSKeyId) via load_catalog, append, or FileIO configuration?
If not, could this be added as a feature so that environments with encryption-required SCPs can still use pyiceberg without policy changes?
Thanks!