[🐛 Bug]: "--headless=new" with Chromedriver 128 in container results in SessionNotCreatedException #14457

Open
1dEraNCeSIv0 opened this issue Aug 29, 2024 · 14 comments

Comments

@1dEraNCeSIv0

1dEraNCeSIv0 commented Aug 29, 2024

What happened?

Recently we upgraded our Jenkins to the latest version and most of our (Java-based) Selenium tests started failing in their pipelines with a SessionNotCreatedException caused by a timeout in org.openqa.selenium.remote.http.AddSeleniumUserAgent.

Upon further investigation we found that the following combination of circumstances causes consistent failure:

  • Use chromium 128 / chromedriver 128
  • Use the new headless mode as --headless=new
  • Run from a docker container

I've browsed the issues here to check whether it's been reported before. It looks similar to this issue and might have the same cause.

For now our workaround is to downgrade the chromium / chromedriver version our Jenkins runs with. We could also switch our tests to --headless=old but I see that as a fix of last resort. I'd much rather Selenium and new chromedriver versions work together out of the box, even in new headless mode.
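The fallback between the two headless modes mentioned above could be switched from the build rather than hard-coded in the tests. A minimal sketch, assuming a hypothetical system property named test.headless (not part of the reproducer); the returned string is what would be handed to ChromeOptions.addArguments:

```java
import java.util.Locale;

/** Hypothetical helper: maps a configuration value to a Chrome headless flag. */
final class HeadlessFlag {

  private HeadlessFlag() {}

  /**
   * Returns the Chrome argument for the requested mode.
   * Only "new" and "old" (the two values discussed in this issue) are accepted;
   * a missing value falls back to "new".
   */
  static String forMode(String mode) {
    String normalized = mode == null ? "new" : mode.trim().toLowerCase(Locale.ROOT);
    switch (normalized) {
      case "new":
        return "--headless=new";
      case "old":
        return "--headless=old";
      default:
        throw new IllegalArgumentException("Unknown headless mode: " + mode);
    }
  }

  public static void main(String[] args) {
    // "test.headless" is an assumed property name, e.g. -Dtest.headless=old
    String flag = forMode(System.getProperty("test.headless"));
    System.out.println(flag); // this value would go to ChromeOptions.addArguments(flag)
  }
}
```

This only makes the workaround easy to toggle per pipeline; it does not address the underlying timeout.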

How can we reproduce the issue?

See this repository for a minimal reproducing example. For instructions on how to reproduce the issue please see the readme.

Relevant log output

> Task :test

SeleniumTest > headlessNew() FAILED
    org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
    Host info: host: '83b1fb817f96', ip: '172.17.0.1'
        at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:545)
        at app//org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:234)
        at app//org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:163)
        at app//org.openqa.selenium.chromium.ChromiumDriver.<init>(ChromiumDriver.java:114)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:88)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:83)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:72)
        at app//SeleniumTest.createChromeDriver(SeleniumTest.java:26)
        at app//SeleniumTest.headlessNew(SeleniumTest.java:9)

        Caused by:
        org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
        Build info: version: '4.23.0', revision: '4df0a231af'
        System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.10.0-18-amd64', java.version: '21.0.4'
        Driver info: driver.version: ChromeDriver
            at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:399)
            at app//org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
            at app//org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
            at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:355)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:89)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:75)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:61)
            at app//org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:162)
            at app//org.openqa.selenium.remote.service.DriverCommandExecutor.invokeExecute(DriverCommandExecutor.java:216)
            at app//org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:174)
            at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:527)
            ... 8 more

            Caused by:
            java.util.concurrent.TimeoutException
                at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
                at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
                at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:382)
                ... 18 more

SeleniumTest > headlessOld() STANDARD_ERROR
    Aug 29, 2024 12:25:03 PM org.openqa.selenium.devtools.CdpVersionFinder findNearestMatch
    WARNING: Unable to find an exact match for CDP version 128, returning the closest version; found: 127; Please update to a Selenium version that supports CDP version 128

Gradle Test Executor 1 finished executing tests.
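
The innermost cause in the trace above is CompletableFuture.get with a timeout giving up while waiting for the response to the new-session request. A minimal, Selenium-free sketch of that mechanism, assuming a future that is never completed as a stand-in for a driver that never answers (class and method names are hypothetical):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedGetDemo {

  private TimedGetDemo() {}

  /** Waits for the response and returns "TIMEOUT" if nothing arrives in time. */
  static String awaitResponse(CompletableFuture<String> response, long timeoutMillis) {
    try {
      return response.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // Same exception type as the innermost "Caused by" in the log.
      return "TIMEOUT";
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return "INTERRUPTED";
    } catch (ExecutionException e) {
      return "FAILED: " + e.getCause();
    }
  }

  public static void main(String[] args) {
    // Never completed: models a browser that never reports a started session.
    CompletableFuture<String> neverAnswered = new CompletableFuture<>();
    System.out.println(awaitResponse(neverAnswered, 200));
    System.out.println(awaitResponse(CompletableFuture.completedFuture("session created"), 200));
  }
}
```

The point is only that the exception says the client stopped waiting; it carries no information about why the other side stayed silent.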

Operating System

Alpine 3.20, Debian 12

Selenium version

4.19.1, 4.23

What are the browser(s) and version(s) where you see this issue?

Chrome 128

What are the browser driver(s) and version(s) where you see this issue?

Chromedriver 128

Are you using Selenium Grid?

No


@1dEraNCeSIv0, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@pujagani
Contributor

Thank you for sharing the details. I tried to reproduce the issue but was not able to.

Docker command:
docker run --rm -it -p 4444:4444 -p 5900:5900 -p 7900:7900 --shm-size 2g selenium/standalone-chromium:latest

Selenium Java code:

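import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

public class ChromeHeadlessv128 {

  public static void main(String[] argv) throws Exception {
    ChromeOptions options = new ChromeOptions();
    options.addArguments("--headless=new");

    // No server URL is passed, so the default remote address is used
    // (the container above publishes the Grid on port 4444).
    WebDriver driver = new RemoteWebDriver(options, false);
    driver.get("https://www.google.com/");

    driver.getTitle();
    driver.quit();
  }
}

Is the error happening every time, or is it intermittent? How can we reproduce this?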

@pujagani
Contributor

pujagani commented Aug 30, 2024

I am able to reproduce it if I run multiple sessions in parallel or run multiple sessions sequentially.
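
To compare sequential and parallel runs, a small harness can count how many attempts fail. A minimal sketch using only the JDK; the Runnable passed in is a placeholder for whatever creates and quits a driver session, and all names here are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SessionProbe {

  private SessionProbe() {}

  /** Runs the attempt `attempts` times on `threads` workers and returns the failure count. */
  static int countFailures(Runnable attempt, int attempts, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> results = new ArrayList<>();
      for (int i = 0; i < attempts; i++) {
        results.add(pool.submit(attempt));
      }
      int failures = 0;
      for (Future<?> result : results) {
        try {
          result.get(); // rethrows (wrapped) whatever the attempt threw
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          failures++;
        } catch (Exception e) {
          failures++;
        }
      }
      return failures;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // Placeholder: a real attempt would create a driver, load a page, and quit.
    Runnable attempt = () -> { };
    System.out.println("sequential failures: " + countFailures(attempt, 10, 1));
    System.out.println("parallel failures:   " + countFailures(attempt, 10, 4));
  }
}
```

With one thread the attempts run strictly one after another, so the same method covers both cases described above.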

@1dEraNCeSIv0
Author

1dEraNCeSIv0 commented Aug 30, 2024

I tried reproducing it again locally and noticed that the image was broken due to line endings changing upon up- and download.
I've also had to allow newer chrome versions as the alpine repo doesn't seem to keep the specific 128 version that was up to date yesterday available. Note that this means that once 129 becomes available the Dockerfile will probably build that into the image instead. But I don't know of any quick way to pin the version.

Long story short, I believe I've fixed the Dockerfile and the following steps should now work again to reproduce the issue using the repository linked above:

  • Clone the repo
  • Navigate to project root folder
  • run docker build -t reproducer . (or any other image name)
  • run docker run reproducer "/root/gradlew -i test"
  • wait for the timeout to happen (around 5min)

If there are any issues with the image, please let me know. It should trigger the issue consistently; my error rate so far is 100% in maybe 10 attempts.
Regarding parallelism or running multiple sessions, the demo repo above uses the default settings for all of these - but I'm not sure what these are.

@VietND96
Member

VietND96 commented Sep 2, 2024

@pujagani, can you try to reproduce this again with the image selenium/standalone-chromium:latest (updated on Aug-31 1:00 AM IST)? I suspect the issue appears starting with chromium version 128.0.6613.113.

@pujagani
Contributor

pujagani commented Sep 9, 2024

@VietND96 Thank you! Let me try it out and provide my findings here.

@pujagani
Contributor

pujagani commented Sep 9, 2024

I am able to reproduce the issue (using #14457 (comment)), but not consistently: it failed one time with "selenium/standalone-chromium:latest" when using "options.addArguments("--headless=new");". Without headless, or when using the old headless mode "options.addArguments("--headless");", it works as expected every time. I am unable to find a pattern here.

@pujagani
Contributor

pujagani commented Sep 9, 2024

With the demo repo shared, I am able to see the error described in the issue. But those are two different setups. I was running tests on my machine pointing to the docker-selenium Grid and was not able to reproduce the issue reliably on the last attempt, whereas the repo runs the tests inside the docker container locally without using the Grid. I have a feeling this is not a Selenium issue.

@pujagani
Contributor

pujagani commented Sep 9, 2024

In the demo repo shared I have made the following updates:

  1. Updated Selenium to the latest version:

testImplementation("org.seleniumhq.selenium:selenium-java:4.24.0")

  2. Updated the chromium and chromedriver versions in the Dockerfile:

RUN apk add chromium>128.0.6613.119-r0 chromium-chromedriver>128.0.6613.119-r0

After this, I no longer see the error. Sharing the output below:

Caching disabled for task ':test' because:
  Build cache is disabled
Task ':test' is not up-to-date because:
  No history is available.
Starting process 'Gradle Test Executor 1'. Working directory: /root Command: /opt/java/openjdk/bin/java -Dorg.gradle.internal.worker.tmpdir=/root/build/tmp/test/work @/root/.gradle/.tmp/gradle-worker-classpath15735176167891063660txt -Xmx512m -Dfile.encoding=UTF-8 -Duser.country=US -Duser.language=en -Duser.variant -ea worker.org.gradle.process.internal.worker.GradleWorkerMain 'Gradle Test Executor 1'
Successfully started process 'Gradle Test Executor 1'

Gradle Test Executor 1 started executing tests.
Gradle Test Executor 1 finished executing tests.

> Task :test
Finished generating test XML results (0.005 secs) into: /root/build/test-results/test
Generating HTML test report...
Finished generating test html results (0.011 secs) into: /root/build/reports/tests/test

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.10/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 17s
2 actionable tasks: 1 executed, 1 up-to-date

@pujagani
Contributor

pujagani commented Sep 9, 2024

@1dEraNCeSIv0 Can you please try it and provide an update?

@1dEraNCeSIv0
Author

I've incorporated your changes into the repository, but the result is unchanged. Feel free to check whether I made an error when editing the project.

Caching disabled for task ':test' because:
  Build cache is disabled
Task ':test' is not up-to-date because:
  No history is available.
Starting process 'Gradle Test Executor 1'. Working directory: /root Command: /opt/java/openjdk/bin/java -Dorg.gradle.internal.worker.tmpdir=/root/build/tmp/test/work @/root/.gradle/.tmp/gradle-worker-classpath6100839871221666423txt -Xmx512m -Dfile.encoding=UTF-8 -Duser.country=US -Duser.language=en -Duser.variant -ea worker.org.gradle.process.internal.worker.GradleWorkerMain 'Gradle Test Executor 1'
Successfully started process 'Gradle Test Executor 1'

SeleniumTest > headlessNew() FAILED
    org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
    Host info: host: '42d9f1c3b18f', ip: '172.17.0.2'
        at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:563)
        at app//org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:245)
        at app//org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:174)
        at app//org.openqa.selenium.chromium.ChromiumDriver.<init>(ChromiumDriver.java:114)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:88)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:83)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:72)
        at app//SeleniumTest.createChromeDriver(SeleniumTest.java:26)
        at app//SeleniumTest.headlessNew(SeleniumTest.java:9)

        Caused by:
        org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
        Build info: version: '4.24.0', revision: '748ffc9bc3'
        System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.15.153.1-microsoft-standard-WSL2', java.version: '21.0.4'
        Driver info: driver.version: ChromeDriver
            at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:418)
            at app//org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
            at app//org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
            at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:374)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:89)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:75)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:61)
            at app//org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:162)
            at app//org.openqa.selenium.remote.service.DriverCommandExecutor.invokeExecute(DriverCommandExecutor.java:216)
            at app//org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:174)
            at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:545)
            ... 8 more

            Caused by:
            java.util.concurrent.TimeoutException
                at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
                at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
                at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:401)
                ... 18 more

Gradle Test Executor 1 finished executing tests.

> Task :test FAILED

2 tests completed, 1 failed
Finished generating test XML results (0.01 secs) into: /root/build/test-results/test
Generating HTML test report...
Finished generating test html results (0.017 secs) into: /root/build/reports/tests/test

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':test'.
> There were failing tests. See the report at: file:///root/build/reports/tests/test/index.html

* Try:
> Run with --scan to get full insights.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.
BUILD FAILED in 3m 29s


For more on this, please refer to https://docs.gradle.org/8.10/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.
2 actionable tasks: 1 executed, 1 up-to-date

FWIW I also checked the specific installed chromium version when I run the image and it's the one you used

~ # apk list -i | grep chrome
chromium-chromedriver-128.0.6613.119-r0 x86_64 {chromium} (BSD-3-Clause) [installed]
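
A check like the apk list call above can also be done from the test side, for example to log the browser's major version next to each failure. A minimal sketch that pulls the major version out of a package string in the form shown above; the class name and the idea of logging or branching on the result are assumptions, not part of the reproducer:

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChromiumVersion {

  // Matches four-part dotted versions such as 128.0.6613.119 and captures the first part.
  private static final Pattern VERSION = Pattern.compile("(\\d+)\\.\\d+\\.\\d+\\.\\d+");

  private ChromiumVersion() {}

  /** Returns the major version found in the text, or empty if none is present. */
  static OptionalInt majorOf(String text) {
    Matcher m = VERSION.matcher(text);
    return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
  }

  public static void main(String[] args) {
    String line =
        "chromium-chromedriver-128.0.6613.119-r0 x86_64 {chromium} (BSD-3-Clause) [installed]";
    System.out.println(majorOf(line));
  }
}
```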

@pujagani
Contributor

pujagani commented Sep 9, 2024

Thank you for trying. I am not sure how to help further since I am unable to reproduce it consistently on my end.

@1dEraNCeSIv0
Author

I've finally found that one of my colleagues can run the tests and that they consistently work for them as well. I'll be digging more into that next week; hopefully I'll be able to narrow down the exact causes of the error. I'll let you know once I find out more.

@1dEraNCeSIv0
Author

Okay, I've found some time to look into this further.

Turns out that my colleague who can run the tests consistently... cannot actually run them consistently. Upon retrying today they exhibited the same behavior I've mostly been getting.

So I've tried it on a Windows machine and get the error 100% of the time.
I've tried it on a Debian-based Linux machine and I get the error some 90% of the time.
My colleague tried on NixOS and they got the same error.
My other colleague tried on an Ubuntu-based Linux distribution and they get it ... I don't know how often, but they've at least seen both success and error.

Now, ideally none of that would matter anyway because within the docker image we should all have the same state to run the same code with, getting the same results. Apparently this is not the case but at this point I have no idea what the differences are.

Some other things I tried: Upgrading to chromium / chromium-driver 129. It has had mixed results.

  • First I ran into a couple of instances where upgrading to 129 would apparently fix the issue for all further tests in the container. Up to 10 test executions in a row that would work.
  • Later I realized that the tests still fail once I run /root/gradlew clean after the update
  • I've also realized that even in headless=old mode the error can happen, it just happens more rarely

So the takeaway is that unspecified things happened that fixed the error temporarily but generally speaking 129 seems to exhibit the same issue we've been seeing with 128.

The only way I know of that reliably avoids the timeout error is using a chromium/chromedriver version < 127. And if it works perfectly on your end, I wish I knew what causes that.


3 participants