
run-stp-tests.sh test hangs with large open file limit number #989

Closed
glaubitz opened this issue Jun 18, 2024 · 4 comments

@glaubitz commented Jun 18, 2024

On Debian, systemd is configured with a very large default open file limit of 1073741816.

This causes the run-stp-tests.sh script to time out:

Running command tests...
Performing 5.1-lpadmin.sh: FAIL
Performing 5.2-lpc.sh: PASS
Performing 5.3-lpq.sh: FAIL
Performing 5.4-lpstat.sh: FAIL
Performing 5.5-lp.sh: FAIL
Performing 5.6-lpr.sh: FAIL
Performing 5.7-lprm.sh: FAIL
Performing 5.8-cancel.sh: FAIL
Performing 5.9-lpinfo.sh: FAIL
Performing restart test: ./run-stp-tests.sh: 811: kill: No such process

E: Build killed with signal TERM after 600 minutes of inactivity

This has been reported in Debian as: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1073046
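
For context, here is a minimal C sketch (not CUPS code) that prints the RLIMIT_NOFILE soft and hard limits a process inherits; on an affected Debian system both values come out around 1073741816:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
  struct rlimit rl;

  /* Query this process's open file descriptor limit. */
  if (getrlimit(RLIMIT_NOFILE, &rl))
  {
    perror("getrlimit");
    return (1);
  }

  printf("RLIMIT_NOFILE: soft=%llu hard=%llu\n",
         (unsigned long long)rl.rlim_cur,
         (unsigned long long)rl.rlim_max);
  return (0);
}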

@zdohnal (Member) commented Jun 18, 2024

I was able to reproduce the issue by setting ulimit -n to the Debian default number - I'm looking into it further.

@michaelrsweet (Member) commented

We used to have an issue with the cupsd startup attempting to close all open files, but we added a limit of 1024 files to close on startup. Since this is for the restart test, though, I'm not sure what might be happening...
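
For illustration, a hedged sketch of the general daemon pattern being described (not the actual cupsd code; close_inherited_fds is a hypothetical name). Without the cap, an open file limit in the billions makes this loop run effectively forever:

#include <unistd.h>
#include <sys/resource.h>

/*
 * Close inherited file descriptors above stderr, but never iterate past
 * a fixed cap so a huge RLIMIT_NOFILE cannot stall startup.
 */

static void
close_inherited_fds(void)
{
  struct rlimit rl;
  int           maxfd = 1024;   /* cap mirroring the 1024 mentioned above */

  if (!getrlimit(RLIMIT_NOFILE, &rl) && rl.rlim_cur < (rlim_t)maxfd)
    maxfd = (int)rl.rlim_cur;

  for (int fd = 3; fd < maxfd; fd ++)
    close(fd);
}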

@zdohnal (Member) commented Jun 20, 2024

I haven't confirmed it with my gdb run, but IMO the Debian reporter's analysis is correct - calloc() returns NULL because we try to allocate too much memory: the allocation size comes from MaxFDs, which is taken from the system's maximum number of open files.

The fix might be simple (probably limiting MaxFDs to some sane number), but unfortunately I'm occupied with work around CentOS Stream 10 at the moment... I hope to look into it before my vacation.
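
A minimal sketch of that failure mode, using a hypothetical per-descriptor bookkeeping struct (fd_entry_t is invented for illustration): when the allocation is sized by the raw descriptor limit, calloc() is asked for roughly 16 GiB and returns NULL on most systems:

#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

typedef struct                  /* hypothetical per-descriptor bookkeeping */
{
  int  used;
  void *data;
} fd_entry_t;

int main(void)
{
  struct rlimit rl;
  size_t        max_fds;
  fd_entry_t    *entries;

  getrlimit(RLIMIT_NOFILE, &rl);
  max_fds = (size_t)rl.rlim_cur;  /* 1073741816 with the Debian default */

  /* 1073741816 entries x 16 bytes is ~16 GiB, far beyond what most
     systems will grant, so calloc() returns NULL here. */
  entries = calloc(max_fds, sizeof(fd_entry_t));
  if (!entries)
  {
    fprintf(stderr, "calloc for %zu fds failed\n", max_fds);
    return (1);
  }

  free(entries);
  return (0);
}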

@michaelrsweet (Member) commented

I've updated the upper limit for MaxFDs to 65535, since more than that doesn't really matter (you can't have more than 65535 active TCP connections from a single system - a protocol limitation):

[master a66f419] Limit the maximum number of file descriptors to 64k-1 (Issue #989)

[2.4.x beb8440] Limit the maximum number of file descriptors to 64k-1 (Issue #989)

Please let me know if this solves your issue...
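
A minimal sketch of that kind of clamp, assuming a MaxFDs global like the one discussed above (the variable and function names here are stand-ins, not the exact CUPS code):

#include <sys/resource.h>

#define MAX_FDS_LIMIT 65535  /* 64k-1: the TCP port space bounds useful connections */

static int MaxFDs;           /* stand-in for the cupsd global */

static void
set_max_fds(void)
{
  struct rlimit rl;

  MaxFDs = MAX_FDS_LIMIT;

  /* Honor a smaller soft descriptor limit, but never exceed 64k-1. */
  if (!getrlimit(RLIMIT_NOFILE, &rl) && rl.rlim_cur != RLIM_INFINITY &&
      rl.rlim_cur < (rlim_t)MAX_FDS_LIMIT)
    MaxFDs = (int)rl.rlim_cur;
}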

@michaelrsweet michaelrsweet self-assigned this Aug 14, 2024
@michaelrsweet michaelrsweet added the priority-low and platform issue labels Aug 14, 2024
@michaelrsweet michaelrsweet added this to the v2.4.x milestone Aug 14, 2024