Bucket name sometimes missing from the S3 URL, causing failures for the same bucket and code that previously succeeded for operations like GetObject and PutObject. #4187
Labels
bug — This issue is a confirmed bug.
p2 — This is a standard priority issue.
response-requested — Waiting on additional information or feedback.
s3
Describe the bug
S3 access is failing for the same bucket and code that was previously successful. The debug trace shows that the URL used during the failure does not include the bucket name, either as a host prefix or in the path.
Success
2024-06-27 21:21:12,116 botocore.regions [DEBUG] Calling endpoint provider with parameters: {'Bucket': 'xxxx', 'Region': 'us-east-1', 'UseFIPS': False, 'UseDualStack': False, 'ForcePathStyle': False, 'Accelerate': False, 'UseGlobalEndpoint': True, 'Key': 'xxxx/xxxx.xlsx', 'DisableMultiRegionAccessPoints': False, 'UseArnRegion': True}
2024-06-27 21:21:12,116 botocore.regions [DEBUG] Endpoint provider result: https://xxxx.s3.amazonaws.com
Failure
2024-07-02 18:22:11,094 botocore.regions [DEBUG] Calling endpoint provider with parameters: {'Region': 'us-east-1', 'UseFIPS': False, 'UseDualStack': False, 'ForcePathStyle': False, 'Accelerate': False, 'UseGlobalEndpoint': True, 'DisableMultiRegionAccessPoints': False, 'UseArnRegion': True}
2024-07-02 18:22:11,095 botocore.regions [DEBUG] Endpoint provider result: https://s3.amazonaws.com
The URL in the GetObject call shows the same behavior, which appears to cause the Access Denied error.
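The difference between the two resolved endpoints can be checked mechanically. The sketch below is a hypothetical diagnostic (the helper name is an assumption; the URLs are taken from the traces above, with 'xxxx' being the redacted bucket name): it tests whether a resolved endpoint carries the bucket either as a virtual-hosted-style host prefix or as the leading path segment.

```python
from urllib.parse import urlparse

def endpoint_includes_bucket(url: str, bucket: str) -> bool:
    """Return True if the bucket appears as a virtual-hosted-style host
    prefix or as the first path segment of the endpoint URL."""
    parsed = urlparse(url)
    # Virtual-hosted style, e.g. https://bucket.s3.amazonaws.com
    if parsed.hostname and parsed.hostname.startswith(bucket + "."):
        return True
    # Path style, e.g. https://s3.amazonaws.com/bucket/key
    first_segment = parsed.path.lstrip("/").split("/", 1)[0]
    return first_segment == bucket

# Endpoints from the debug traces above:
print(endpoint_includes_bucket("https://xxxx.s3.amazonaws.com", "xxxx"))  # success trace -> True
print(endpoint_includes_bucket("https://s3.amazonaws.com", "xxxx"))       # failure trace -> False
```

Applied to the traces above, the success-trace endpoint contains the bucket and the failure-trace endpoint does not, which matches the Access Denied symptom.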
Expected Behavior
Successfully download object.
Current Behavior
The call fails with Access Denied after previously succeeding with the same code.
Reproduction Steps
# Note: the issue occurrence is unpredictable.
import logging

import boto3

boto3.set_stream_logger('', logging.DEBUG)

# Initialize S3 client
s3 = boto3.client('s3')

INBOUND_S3_BUCKET = "xxxx"
INBOUND_FILE_PATH = 'xxx/xxxx.xlsx'
obj = s3.get_object(Bucket=INBOUND_S3_BUCKET, Key=INBOUND_FILE_PATH)
Possible Solution
Unknown
Additional Information/Context
No response
SDK version used
1.34.137
Environment details (OS name and version, etc.)
Linux, databricks