503 when trying to upload with EC storage policy #2112

@M-Pixel

Description
ISSUE TYPE
  • Bug Report
COMPONENT NAME

oioswift (maybe?)

SDS VERSION
openio 7.0.1
CONFIGURATION
# OpenIO managed
[OPENIO]
# endpoints
conscience=10.147.19.4:6000
zookeeper=10.147.19.2:6005,10.147.19.3:6005,10.147.19.4:6005
proxy=10.147.19.2:6006
event-agent=beanstalk://10.147.19.2:6014
ecd=10.147.19.2:6017

udp_allowed=yes

ns.meta1_digits=2
ns.storage_policy=ECLIBEC144D1
ns.chunk_size=104857600
ns.service_update_policy=meta2=KEEP|3|1|;rdir=KEEP|1|1|;

iam.connection=redis+sentinel://10.147.19.2:6012,10.147.19.3:6012,10.147.19.4:6012?sentinel_name=OPENIO-master-1
container_hierarchy.connection=redis+sentinel://10.147.19.2:6012,10.147.19.3:6012,10.147.19.4:6012?sentinel_name=OPENIO-master-1
bucket_db.connection=redis+sentinel://10.147.19.2:6012,10.147.19.3:6012,10.147.19.4:6012?sentinel_name=OPENIO-master-1

sqliterepo.repo.soft_max=1000
sqliterepo.repo.hard_max=1000
sqliterepo.cache.kbytes_per_db=4096
OS / ENVIRONMENT
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
SUMMARY

When uploading a file large enough to fall under an erasure-coding storage policy, a 503 error is returned.

I confirmed that files small enough to fall under a simple replication storage policy do not hit the same issue, and that the problem is not specific to a particular EC implementation (ISA-L vs. libEC).

I could not find any relevant diagnostic information in the logs I checked. Before going through the effort of producing an exact repro, I was hoping for guidance on how to pinpoint the problem more precisely.

I do have one suspicion: my cluster currently has only 3 RAWX services. Perhaps the inability to place ECLIBEC63D1's fragments (6 data + 3 parity, i.e. 9 in total) on distinct RAWX services causes a timeout that surfaces as the 503?
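The suspicion above can be checked with back-of-the-envelope arithmetic. The sketch below assumes the policy name `ECLIBEC63D1` encodes 6 data + 3 parity fragments (a guess based on common OpenIO naming, not verified against the cluster):

```python
# Hypothetical sanity check: does the cluster have enough rawx
# services to place every fragment of an EC write on a distinct one?
# Assumption: ECLIBEC63D1 = 6 data fragments + 3 parity fragments.

def ec_fragments(data: int, parity: int) -> int:
    """Total fragments an EC write must place."""
    return data + parity

def placement_feasible(data: int, parity: int, rawx_count: int) -> bool:
    """True only if each fragment can land on a distinct rawx service."""
    return rawx_count >= ec_fragments(data, parity)

# ECLIBEC63D1 with the 3 rawx currently deployed:
print(ec_fragments(6, 3))           # 9 fragments needed per object
print(placement_feasible(6, 3, 3))  # False: 3 rawx cannot hold 9 distinct fragments
```

If this is the cause, a replication policy (e.g. 3 copies on 3 rawx) would still succeed, which matches the observation that small files upload fine.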