Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: leak in reaper connect function with terminationSignal #2878

Open
xsteadfastx opened this issue Nov 6, 2024 · 4 comments
Open

[Bug]: leak in reaper connect function with terminationSignal #2878

xsteadfastx opened this issue Nov 6, 2024 · 4 comments
Labels
bug An issue with the library

Comments

@xsteadfastx
Copy link

xsteadfastx commented Nov 6, 2024

Testcontainers version

0.34.0

Using the latest Testcontainers version?

Yes

Host OS

linux

Host arch

amd64

Go version

1.23.2

Docker version

Client: Docker Engine - Community
 Version:           26.1.4
 API version:       1.45
 Go version:        go1.21.11
 Git commit:        5650f9b
 Built:             Wed Jun  5 11:29:19 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          26.1.4
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       de5c9cf
  Built:            Wed Jun  5 11:29:19 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.33
  GitCommit:        d2d58213f83a351ca8f528a95fbd145f5654e957
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker info

Client: Docker Engine - Community
 Version:    26.1.4
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
WARNING: No swap limit support
    Version:  v2.27.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 5
 Server Version: 26.1.4
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d2d58213f83a351ca8f528a95fbd145f5654e957
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  apparmor
 seccomp
   Profile: builtin
 Kernel Version: 5.4.0-186-generic
 Operating System: Ubuntu 20.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.32GiB
 Name: troy
 ID: 2WIM:7GHM:XDAV:VDAR:XRVJ:WGK7:YU4F:JHWE:J25C:CPE2:YE5U:5EMH
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

What happened?

im testing with goleak.VerifyTestMain(m) and with the new version of testcontainers there is a goroutine leak in reaper. it looks like the terminationSignal chan wont gets closed so the goroutine hangs. but its just a wild guess here.

Relevant log output

goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 99 in state chan receive, with github.com/testcontainers/testcontainers-go.(*Reaper).connect.func1 on top of the stack:
github.com/testcontainers/testcontainers-go.(*Reaper).connect.func1()
        /home/marv/wip/hemingway/vendor/github.com/testcontainers/testcontainers-go/reaper.go:527 +0x1d3
created by github.com/testcontainers/testcontainers-go.(*Reaper).connect in goroutine 22
        /home/marv/wip/hemingway/vendor/github.com/testcontainers/testcontainers-go/reaper.go:522 +0x311
 Goroutine 195 in state chan receive, with github.com/testcontainers/testcontainers-go.(*Reaper).connect.func1 on top of the stack:
github.com/testcontainers/testcontainers-go.(*Reaper).connect.func1()
        /home/marv/wip/hemingway/vendor/github.com/testcontainers/testcontainers-go/reaper.go:527 +0x1d3
created by github.com/testcontainers/testcontainers-go.(*Reaper).connect in goroutine 141
        /home/marv/wip/hemingway/vendor/github.com/testcontainers/testcontainers-go/reaper.go:522 +0x311
 Goroutine 206 in state chan receive, with github.com/testcontainers/testcontainers-go.(*Reaper).connect.func1 on top of the stack:
github.com/testcontainers/testcontainers-go.(*Reaper).connect.func1()
        /home/marv/wip/hemingway/vendor/github.com/testcontainers/testcontainers-go/reaper.go:527 +0x1d3
created by github.com/testcontainers/testcontainers-go.(*Reaper).connect in goroutine 166
        /home/marv/wip/hemingway/vendor/github.com/testcontainers/testcontainers-go/reaper.go:522 +0x311
]

Additional information

No response

@xsteadfastx xsteadfastx added the bug An issue with the library label Nov 6, 2024
@xsteadfastx
Copy link
Author

checked again.. with version 0.33.0 it works without the leak.

@xsteadfastx
Copy link
Author

xsteadfastx commented Nov 6, 2024

just a wild guess: it just sends true to termSignal on an error. before it always did that. (just some naive guessing obervations) ;-)

im talking about this code here:

		defer func() {
			if err != nil {
				termSignal <- true
			}
		}()

if i change it to

		// Cleanup on error.
		defer func() {
			if termSignal  != nil {
				termSignal <- true
			}
		}()

it doesnt leak anymore.

it also doesnt make sense to me, that it should send true to termSignal if an error but it already returns before that defer is set. so its never called and just leaks into infinity.

		termSignal, err := r.Connect()
		if err != nil {
			return nil, fmt.Errorf("reaper connect: %w", err)
		}

		// Cleanup on error.
		defer func() {
			if err != nil {
				termSignal <- true
			}
		}()

@xsteadfastx
Copy link
Author

i read through the main branch: for CreateContainer there changed some logic here. But not for ReuseOrCreateContainer and CreateNetwork.

@mdelapenya
Copy link
Member

Hi @xsteadfastx thanks for reporting this. I see the different behaviours you described here:

  1. there is a cleanupTermSignal that is deferred in the CreateContainer method
// cleanupTermSignal triggers the termination signal if it was created and an error occurred.
func (c *DockerContainer) cleanupTermSignal(err error) {
	if c.terminationSignal != nil && err != nil {
		c.terminationSignal <- true
	}
}
  1. The ReuseOrCreateContianer, and the CreateNetwork methods are not using that, but instead:
	if err != nil {
		termSignal <- true
	}

I think we must apply the behaviour in 1) consistently, as you suggested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An issue with the library
Projects
None yet
Development

No branches or pull requests

2 participants