
Simulations on nodes running serially. #12

Closed · alanroche opened this issue Jun 27, 2016 · 8 comments

@alanroche

Hi,

We are using version 1.0.6.
We have noticed that tests on multiple nodes seem to run serially, i.e. the simulation runs against one node, then the next, then the next, and so on.

Standard output etc. is not interleaved across nodes as one would expect.

It almost looks as if the thread pool executor in GatlingAwsMojo is running on a single thread, though the configuration doesn't suggest so. Another possibility is that a synchronized block has been introduced in the SSH client.

I am going to investigate this in more detail tomorrow, but we are wondering whether this is something someone has already noticed.
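For context, this is roughly the behaviour we expected from the plugin's executor. A minimal sketch with hypothetical node addresses (not the plugin's actual code): a fixed thread pool starts one simulation per node, and the output interleaves unless something inside the task serializes it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelDispatchSketch {
    public static void main(String[] args) throws InterruptedException {
        // Hypothetical node addresses; the real plugin resolves EC2 instances.
        List<String> nodes = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");

        // One worker per node, so every simulation should start immediately.
        ExecutorService pool = Executors.newFixedThreadPool(nodes.size());
        for (String node : nodes) {
            pool.submit(() -> {
                // Without a shared lock inside the task, these lines interleave
                // across threads instead of appearing node by node.
                System.out.println(Thread.currentThread().getName() + " starting simulation on " + node);
                // ... run the simulation over SSH here ...
                System.out.println(Thread.currentThread().getName() + " finished " + node);
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```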

Thanks,
Alan

@ingojaeckel
Contributor

Hey @alanroche,

Thanks for reporting this! That sounds like a regression in 1.0.6. Can you confirm whether this was working fine on 1.0.5?

Thanks,
Ingo

@alanroche
Author

alanroche commented Jun 28, 2016

I tried with a 1.0.6-SNAPSHOT which we believe was working OK before (on a different machine), and we get the same results.

Going to try 1.0.5 next.

@alanroche
Author

alanroche commented Jun 28, 2016

OK, so the same issue is there on 1.0.5, but after some investigation I have found the problem and a solution/workaround.

This had a few of us stuck for a day or two, so it would be helpful to note it in the README.md as a gotcha; it was a tricky one to track down and could burn a lot of time for other users who run into it.

The issue comes from the SSH client generating secure random numbers using a native (OS) random number generator. We are running on VMs, so it is likely the problem is exacerbated by running on a VM, though it is hard to say for sure.

The problem is basically this:
https://issues.jenkins-ci.org/browse/JENKINS-20108
http://stackoverflow.com/questions/137212/how-to-solve-performance-problem-with-java-securerandom

The culprit thread in the stack trace is this:

"pool-4-thread-2" #25 prio=5 os_prio=0 tid=0x00007f0e1493b800 nid=0x860 runnable [0x00007f0dcbb92000] java.lang.Thread.State: RUNNABLE at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:255) at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:410) at sun.security.provider.NativePRNG$RandomIO.implGenerateSeed(NativePRNG.java:427) - locked <0x00000000e0c432e0> (a java.lang.Object) at sun.security.provider.NativePRNG$RandomIO.access$500(NativePRNG.java:329) at sun.security.provider.NativePRNG.engineGenerateSeed(NativePRNG.java:224) at java.security.SecureRandom.generateSeed(SecureRandom.java:533) at net.schmizz.sshj.transport.random.BouncyCastleRandom.<init>(BouncyCastleRandom.java:44) at net.schmizz.sshj.transport.random.BouncyCastleRandom$Factory.create(BouncyCastleRandom.java:36) at net.schmizz.sshj.transport.random.BouncyCastleRandom$Factory.create(BouncyCastleRandom.java:31) at net.schmizz.sshj.transport.random.SingletonRandomFactory.<init>(SingletonRandomFactory.java:27) at net.schmizz.sshj.DefaultConfig.initRandomFactory(DefaultConfig.java:117) at net.schmizz.sshj.DefaultConfig.<init>(DefaultConfig.java:93) at net.schmizz.sshj.SSHClient.<init>(SSHClient.java:143) at com.ea.gatling.SshClient.getSshClient(SshClient.java:115) at com.ea.gatling.SshClient.executeCommand(SshClient.java:55) at com.ea.gatling.AwsGatlingExecutor.runGatlingTest(AwsGatlingExecutor.java:114) at com.ea.gatling.AwsGatlingExecutor.run(AwsGatlingExecutor.java:141) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)

There are a few solutions. /dev/urandom actually works fine; the problem for us is that seeding uses /dev/random rather than /dev/urandom.

  • Set -Djava.security.egd=file:/dev/urandom in MAVEN_OPTS
  • Comment out securerandom.source=file:/dev/random in $JAVA_HOME/lib/security/java.security
  • OR update securerandom.source to file:/dev/./urandom (the extra "./" is not a typo; it is apparently needed because Java special-cases the plain /dev/urandom value and still falls back to /dev/random)

Later this evening I will look into whether there is a way to incorporate this into the code as a fix, even if just to detect the scenario and issue a warning.
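To make the symptom easy to confirm in the meantime, here is a rough diagnostic sketch (not part of the plugin) that times the same SecureRandom.generateSeed() call sshj makes when constructing its random factory. On an entropy-starved VM seeding from /dev/random it can stall for seconds or longer; with the /dev/./urandom workaround it should return almost immediately.

```java
import java.security.SecureRandom;

// Rough diagnostic for the blocking-entropy problem described above.
// Per the stack trace, BouncyCastleRandom's constructor calls
// SecureRandom.generateSeed(), which on Linux (NativePRNG) reads the
// configured seed device and can block when entropy is low.
public class SeedTimingCheck {
    public static void main(String[] args) {
        long start = System.nanoTime();
        byte[] seed = new SecureRandom().generateSeed(32);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("generateSeed(" + seed.length + " bytes) took " + elapsedMs + " ms");
    }
}
```

Running it once as-is and once with -Djava.security.egd=file:/dev/./urandom should show the difference clearly.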

@ingojaeckel
Contributor

Thanks for your investigation on this! Can you post more information about your environment, e.g. JDK version, OS, Jenkins version, and Maven version?

@alanroche
Author

  • JDK/JRE 8u92
  • Happens on both CentOS 6.8 and Ubuntu 16.04
  • Maven 3.3.9

I will need to check the Jenkins version in the morning when I get back into the office, though the issue isn't really related to Jenkins. It is interesting, though, that Jenkins encountered the same issue with its slaves, which also connect over SSH.


@petrvlcek
Contributor

Hi,

I'm not sure if this is the same problem, but I have experienced very poor performance of the SSH client when running load tests on Jenkins.

I have successfully resolved this by reusing the SSH configuration for all subsequent creations of the SSH client. My solution is in pull request #17 if you are interested.
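For anyone who finds this later, the rough idea is sketched below (assuming sshj's SSHClient(Config) constructor; this is not necessarily the exact change in the PR): build the expensive DefaultConfig once and share it, so the blocking seed generation in SingletonRandomFactory only happens a single time.

```java
import net.schmizz.sshj.DefaultConfig;
import net.schmizz.sshj.SSHClient;

// Sketch of reusing one SSH configuration across clients. Per the stack trace,
// DefaultConfig is where the random factory (and its blocking generateSeed()
// call) gets initialized, so constructing it once avoids paying that cost for
// every SSH connection.
public class SharedSshConfig {
    private static final DefaultConfig SHARED_CONFIG = new DefaultConfig();

    public static SSHClient newClient() {
        // Passing the shared config skips building a fresh DefaultConfig
        // (and a fresh SingletonRandomFactory) per connection.
        return new SSHClient(SHARED_CONFIG);
    }
}
```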

@ingojaeckel
Contributor

Closing this since @petrvlcek's PR got merged at the end of 2016. Let me know if this is still an issue, @alanroche.
