@mergify mergify bot commented Dec 10, 2025

Release notes

Removal of duplicated gems in logstash artifacts.

What does this PR do?

Bundler manages the gem environment that is shipped with logstash artifacts.
By default, bundler installs gems that are newer than, and therefore duplicate,
the copies shipped with the ruby distribution (in logstash's case jruby).
Duplicate gems in the shipped environment can cause code loading problems such
as ambiguous gem specs or gem activation errors. This commit adds a step that
computes which gems managed with bundler (and therefore direct/transitive
dependencies of logstash/plugins) duplicate gems shipped with jruby, and removes
the jruby copies. Note that deduplication happens in two locations: the stdlib
gems and what jruby refers to as "bundled" gems. The existing pattern for
excluding files from artifacts is used to implement the deduplication. Note that
for the standard lib gems only the duplicate gemspec files are removed, because
removing the code itself triggers noisy warnings from ruby and code loading
problems.
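
As a rough illustration of the step described above, the sketch below turns three lists of gem names into exclude globs. It is not the rake code from this PR: `duplicate_gem_excludes`, `JRUBY_GEM_ROOT` and the three parameters are hypothetical names, and the glob shapes are copied from the exclude_paths output shown in the test section.

```ruby
# Hypothetical helper, not the PR's actual implementation.
# Root of the gems that ship inside the vendored jruby (path taken from the
# exclude_paths output in this PR).
JRUBY_GEM_ROOT = "vendor/jruby/lib/ruby/gems/shared"

# bundler_gems:       names of gems managed by bundler
# jruby_bundled_gems: names of the "bundled" gems shipped with jruby
# jruby_default_gems: names of the stdlib (default) gems shipped with jruby
def duplicate_gem_excludes(bundler_gems, jruby_bundled_gems, jruby_default_gems)
  excludes = []

  # Bundled gems: drop the gem directory, its contents and its gemspec.
  (bundler_gems & jruby_bundled_gems).sort.each do |name|
    excludes << "#{JRUBY_GEM_ROOT}/gems/#{name}-*/**/*"
    excludes << "#{JRUBY_GEM_ROOT}/gems/#{name}-*"
    excludes << "#{JRUBY_GEM_ROOT}/specifications/#{name}-*.gemspec"
  end

  # Stdlib gems: drop only the gemspec and leave the code in place.
  (bundler_gems & jruby_default_gems).sort.each do |name|
    excludes << "#{JRUBY_GEM_ROOT}/specifications/default/#{name}-*.gemspec"
  end

  excludes
end
```

Calling `duplicate_gem_excludes(%w[rexml uri], %w[rexml rake], %w[uri json])` would yield three globs for rexml and a single gemspec glob for uri.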

Why is it important/What is the impact to the user?

In some cases security scanners would pick up the vendored/standard lib gems shipped with the jruby distributed in logstash artifacts, which typically trail in version. Because the newer code was what logstash actually loaded, the old copies were not a practical threat, but the scanner would still produce noise and require justifications. By removing old/duplicated gems we remove these false positives from the scanners.

How to test this PR locally

Build a container artifact and look for duplicated gems:

➜  logstash git:(deduplicate-gem-env) ✗ ARCH="aarch64" rake artifact:docker
Using system java: /Users/cas/.jenv/shims/java
Skipping bundler install...
Building logstash-core using gradle
./gradlew assemble
To honour the JVM settings for this build a single-use Daemon process will be forked. For more on this, please refer to https://docs.gradle.org/8.11.1/userguide/gradle_daemon.html#sec:disabling_the_daemon in the Gradle documentation.
Daemon will be stopped at the end of the build

> Task :downloadJRuby UP-TO-DATE
Download https://repo1.maven.org/maven2/org/jruby/jruby-dist/9.4.13.0/jruby-dist-9.4.13.0-bin.tar.gz

BUILD SUCCESSFUL in 4s
33 actionable tasks: 2 executed, 31 up-to-date
[plugin:install-default] Installing default plugins
Installing logstash-codec-avro, logstash-codec-cef, logstash-codec-collectd, logstash-codec-dots, logstash-codec-edn, logstash-codec-edn_lines, logstash-codec-es_bulk, logstash-codec-fluent, logstash-codec-graphite, logstash-codec-json, logstash-codec-json_lines, logstash-codec-line, logstash-codec-msgpack, logstash-codec-multiline, logstash-codec-netflow, logstash-codec-plain, logstash-codec-rubydebug, logstash-filter-aggregate, logstash-filter-anonymize, logstash-filter-cidr, logstash-filter-clone, logstash-filter-csv, logstash-filter-date, logstash-filter-de_dot, logstash-filter-dissect, logstash-filter-dns, logstash-filter-drop, logstash-filter-elastic_integration, logstash-filter-elasticsearch, logstash-filter-fingerprint, logstash-filter-geoip, logstash-filter-grok, logstash-filter-http, logstash-filter-json, logstash-filter-kv, logstash-filter-memcached, logstash-filter-metrics, logstash-filter-mutate, logstash-filter-prune, logstash-filter-ruby, logstash-filter-sleep, logstash-filter-split, logstash-filter-syslog_pri, logstash-filter-throttle, logstash-filter-translate, logstash-filter-truncate, logstash-filter-urldecode, logstash-filter-useragent, logstash-filter-uuid, logstash-filter-xml, logstash-input-azure_event_hubs, logstash-input-beats, logstash-input-couchdb_changes, logstash-input-dead_letter_queue, logstash-input-elasticsearch, logstash-input-exec, logstash-input-file, logstash-input-ganglia, logstash-input-gelf, logstash-input-generator, logstash-input-graphite, logstash-input-heartbeat, logstash-input-http, logstash-input-http_poller, logstash-input-jms, logstash-input-pipe, logstash-input-redis, logstash-input-stdin, logstash-input-syslog, logstash-input-tcp, logstash-input-twitter, logstash-input-udp, logstash-input-unix, logstash-input-elastic_serverless_forwarder, logstash-integration-jdbc, logstash-integration-kafka, logstash-integration-logstash, logstash-integration-rabbitmq, logstash-integration-snmp, logstash-integration-aws, 
logstash-output-csv, logstash-output-elasticsearch, logstash-output-email, logstash-output-file, logstash-output-graphite, logstash-output-http, logstash-output-lumberjack, logstash-output-nagios, logstash-output-null, logstash-output-pipe, logstash-output-redis, logstash-output-stdout, logstash-output-tcp, logstash-output-udp, logstash-output-webhdfs
Installation successful
[artifact:archives] Building tar.gz/zip of default plugins for OS: linux, arch: arm64
Adding duplicate gems to exclude path: base64, bigdecimal, cgi, date, ffi, fileutils, jar-dependencies, jruby-openssl, json, logger, net-http, net-imap, net-pop, net-protocol, net-smtp, psych, racc, rake, rexml, ruby2_keywords, timeout, uri
Full exclude_paths list:
 - **/*.gem
 - **/test/files/slow-xpath.xml
 - **/logstash-*/spec
 - bin/bundle
 - bin/rspec
 - bin/rspec.bat
 - vendor/**/gems/*/test/**/*
 - vendor/**/gems/*/spec/**/*
 - vendor/**/gems/**/Gemfile.lock
 - vendor/**/gems/**/Gemfile
 - vendor/jruby/lib/ruby/gems/shared/gems/jar-dependencies-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/jar-dependencies-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/jar-dependencies-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/net-imap-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/net-imap-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/net-imap-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/net-pop-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/net-pop-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/net-pop-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/net-smtp-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/net-smtp-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/net-smtp-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/racc-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/racc-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/racc-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/rake-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/rake-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/rake-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/gems/rexml-*/**/*
 - vendor/jruby/lib/ruby/gems/shared/gems/rexml-*
 - vendor/jruby/lib/ruby/gems/shared/specifications/rexml-*.gemspec
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/base64-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/base64.rb
 - vendor/jruby/lib/ruby/stdlib/base64/**/*
 - vendor/jruby/lib/ruby/stdlib/base64
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/bigdecimal-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/bigdecimal.rb
 - vendor/jruby/lib/ruby/stdlib/bigdecimal/**/*
 - vendor/jruby/lib/ruby/stdlib/bigdecimal
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/cgi-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/cgi.rb
 - vendor/jruby/lib/ruby/stdlib/cgi/**/*
 - vendor/jruby/lib/ruby/stdlib/cgi
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/date-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/date.rb
 - vendor/jruby/lib/ruby/stdlib/date/**/*
 - vendor/jruby/lib/ruby/stdlib/date
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/ffi-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/ffi.rb
 - vendor/jruby/lib/ruby/stdlib/ffi/**/*
 - vendor/jruby/lib/ruby/stdlib/ffi
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/fileutils-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/fileutils.rb
 - vendor/jruby/lib/ruby/stdlib/fileutils/**/*
 - vendor/jruby/lib/ruby/stdlib/fileutils
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/jar-dependencies-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/jar-dependencies.rb
 - vendor/jruby/lib/ruby/stdlib/jar-dependencies/**/*
 - vendor/jruby/lib/ruby/stdlib/jar-dependencies
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/jruby-openssl-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/jruby-openssl.rb
 - vendor/jruby/lib/ruby/stdlib/jruby-openssl/**/*
 - vendor/jruby/lib/ruby/stdlib/jruby-openssl
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/json-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/json.rb
 - vendor/jruby/lib/ruby/stdlib/json/**/*
 - vendor/jruby/lib/ruby/stdlib/json
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/logger-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/logger.rb
 - vendor/jruby/lib/ruby/stdlib/logger/**/*
 - vendor/jruby/lib/ruby/stdlib/logger
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/net-http-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/net-http.rb
 - vendor/jruby/lib/ruby/stdlib/net-http/**/*
 - vendor/jruby/lib/ruby/stdlib/net-http
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/net-protocol-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/net-protocol.rb
 - vendor/jruby/lib/ruby/stdlib/net-protocol/**/*
 - vendor/jruby/lib/ruby/stdlib/net-protocol
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/psych-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/psych.rb
 - vendor/jruby/lib/ruby/stdlib/psych/**/*
 - vendor/jruby/lib/ruby/stdlib/psych
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/racc-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/racc.rb
 - vendor/jruby/lib/ruby/stdlib/racc/**/*
 - vendor/jruby/lib/ruby/stdlib/racc
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/ruby2_keywords-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/ruby2_keywords.rb
 - vendor/jruby/lib/ruby/stdlib/ruby2_keywords/**/*
 - vendor/jruby/lib/ruby/stdlib/ruby2_keywords
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/timeout-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/timeout.rb
 - vendor/jruby/lib/ruby/stdlib/timeout/**/*
 - vendor/jruby/lib/ruby/stdlib/timeout
 - vendor/jruby/lib/ruby/gems/shared/specifications/default/uri-*.gemspec
 - vendor/jruby/lib/ruby/stdlib/uri.rb
 - vendor/jruby/lib/ruby/stdlib/uri/**/*
 - vendor/jruby/lib/ruby/stdlib/uri
[artifact:tar] building build/logstash-9.3.0-SNAPSHOT-linux-aarch64.tar.gz
[docker] Building docker image
➜  logstash git:(deduplicate-gem-env) ✗ docker image ls
REPOSITORY                                 TAG              IMAGE ID       CREATED          SIZE
docker.elastic.co/logstash/logstash-full   9.3.0-SNAPSHOT   fa5a1591bf02   54 seconds ago   1.48GB
docker.elastic.co/logstash/logstash        9.3.0-SNAPSHOT   fa5a1591bf02   54 seconds ago   1.48GB
python                                     3                671d8548cfc6   2 weeks ago      1.61GB
➜  logstash git:(deduplicate-gem-env) ✗ docker run -it fa5a1591bf02 /bin/bash
bash-5.1$ find / -name '*rexml*'
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/doc/rexml
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/lib/rexml
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/lib/rexml/rexml.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/lib/rexml.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-filter-xml-4.3.2/lib/logstash/filters/xml/patch_rexml.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/aws-sdk-core-3.234.0/lib/aws-sdk-core/xml/parser/rexml_engine.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/specifications/rexml-3.4.4.gemspec
/usr/share/logstash/vendor/jruby/lib/ruby/gems/shared/gems/rss-0.2.9/lib/rss/rexmlparser.rb
find: ‘/root’: Permission denied
find: ‘/var/cache/ldconfig’: Permission denied
find: ‘/proc/tty/driver’: Permission denied
bash-5.1$ find / -name '*uri*'
/sys/kernel/security
/sys/module/spurious
/usr/lib64/security
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/twitter-6.2.0/lib/twitter/entity/uri.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rexml-3.4.4/lib/rexml/security.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/http-cookie-1.1.0/lib/http/cookie/uri_parser.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/addressable-2.8.7/lib/addressable/uri.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/sequel-5.97.0/lib/sequel/plugins/blacklist_security.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/sequel-5.97.0/lib/sequel/plugins/whitelist_security.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/tzinfo-data-1.2025.2/lib/tzinfo/data/definitions/Indian/Mauritius.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/tzinfo-data-1.2025.2/lib/tzinfo/data/definitions/Europe/Zurich.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/uri-1.0.4
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/uri-1.0.4/lib/uri
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/uri-1.0.4/lib/uri.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rack-session-2.1.1/security.md
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/rack-protection-4.2.1/lib/rack/protection/content_security_policy.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/http-3.3.0/lib/http/uri.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/elasticsearch-api-8.19.1/lib/elasticsearch/api/actions/security
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/elasticsearch-api-8.19.1/lib/elasticsearch/api/namespace/security.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/mustermann-3.0.4/bench/uri_parser_object.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/mustermann-3.0.4/bench/capturing.rb
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/nio4r-2.7.4-java/ext/libev/ev_iouring.c
/usr/share/logstash/vendor/bundle/jruby/3.1.0/specifications/uri-1.0.4.gemspec
/usr/share/logstash/vendor/jruby/lib/ruby/gems/shared/specifications/default/open-uri-0.3.0.gemspec
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/vendor/uri
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/vendor/uri/lib/uri
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/vendor/uri/lib/uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/vendor/optparse/lib/optparse/uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/security.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/security
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/security_option.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/s3_uri_signer.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/rubygems/uri_formatter.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/vendor/uri
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/vendor/uri/lib/uri
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/vendor/uri/lib/uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/vendored_uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/uri_credentials_filter.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/bundler/uri_normalizer.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/open-uri.rb
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/optparse/uri.rb
/usr/share/logstash/lib/pluginmanager/pack_fetch_strategy/uri.rb
/usr/share/logstash/logstash-core/lib/logstash/util/safe_uri.rb
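
Scanning `find` output by eye gets tedious for more than a couple of gems. The sketch below is one possible way to automate the check by grouping gemspec files by gem name and reporting names that appear more than once. `gem_name`, `duplicated_gems` and the `LOGSTASH_HOME` environment variable are assumptions made for illustration. The name parsing assumes gemspec files are named `<name>-<version>[-<platform>].gemspec` and that no hyphen-separated part of a gem name starts with a digit.

```ruby
# Matches "<name>-<version>[-<platform>].gemspec" and captures the name.
# Assumption: the version is the first hyphen-separated part starting with a digit.
SPEC_NAME = /\A(.+?)-(\d[^-]*)(?:-.+)?\.gemspec\z/

# Extracts the gem name from a gemspec path, or nil if it does not parse.
def gem_name(spec_path)
  match = SPEC_NAME.match(File.basename(spec_path))
  match && match[1]
end

# Returns { gem_name => [spec paths] } for every name with more than one gemspec.
def duplicated_gems(spec_paths)
  spec_paths
    .group_by { |path| gem_name(path) }
    .reject { |name, paths| name.nil? || paths.size < 2 }
end

if __FILE__ == $PROGRAM_NAME
  # LOGSTASH_HOME is a hypothetical override; the default is the install
  # path seen in the container above.
  home = ENV.fetch("LOGSTASH_HOME", "/usr/share/logstash")
  pattern = File.join(
    home,
    "vendor/{bundle/jruby/*,jruby/lib/ruby/gems/shared}/specifications/**/*.gemspec"
  )
  duplicated_gems(Dir.glob(pattern)).each do |name, paths|
    puts "#{name}:"
    paths.each { |path| puts "  #{path}" }
  end
end
```

An empty report would suggest that no gem has a gemspec in both the bundler environment and the jruby environment.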

Related issues


This is an automatic backport of pull request #18340 done by [Mergify](https://mergify.com).

* Remove duplicate gems when producing logstash artifacts

Bundler manages the gem environment that is shipped with logstash artifacts.
By default, bundler installs gems that are newer than, and therefore duplicate,
the copies shipped with the ruby distribution (in logstash's case jruby).
Duplicate gems in the shipped environment can cause code loading problems such
as ambiguous gem specs or gem activation errors. This commit adds a step that
computes which gems managed with bundler (and therefore direct/transitive
dependencies of logstash/plugins) duplicate gems shipped with jruby, and
*removes* the jruby copies. Note that deduplication happens in two locations:
the stdlib gems and what jruby refers to as "bundled" gems. The existing pattern
for excluding files from artifacts is used to implement the deduplication.

* only remove gemspecs for duplicated stdlib gems

* Make deduplicate a separate rake task and prevent gradle errors

Deduplication should happen as a dependency of installing default gems. In the
current workflow we have a top level gradle task for packaging which calls out
to rake. Rake then invokes a *separate* gradle process. Because deduplication
modifies the jruby installation, when the separate gradle process goes to check
if jruby is installed, it sees a modified jruby and tries to re-install it. We
work around this by changing how gradle detects whether jruby needs to be
installed.

* Ensure the set of gems tested at unit level matches packages

This commit adds the installDefaultGems task to the unit test tasks. This
ensures that the gem env tested at the unit level matches the deduplicated one
at the integration/acceptance level. Takes over #18330

* WIP: Use logstash_gem_home for gemInstaller

This commit changes gemInstaller so that the centralized gem_home from
Logstash::Environment is used instead of hard-coding a fragile path. The
tests were the only consumer of the optional positional parameter in the
`install` class method.

* Fix gem env setup for ruby unit tests

After a deep dive comparing the state of logstash running from a compiled
artifact vs the unit tests, I finally figured out that the use of the bundler
`setup!` method in unit tests is incompatible with a couple of tests.
Specifically, that method puts bundler-installed gems ahead of the standard lib
gems in the load path. This commit solves that by re-positioning the standard
lib back at the front of the load path.
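
The re-positioning described above can be pictured as a stable partition of the load path. This is a minimal sketch, not the code from this commit: `stdlib_first` is a hypothetical helper, and it assumes `RbConfig::CONFIG["rubylibdir"]` points at the standard library directory of the running ruby.

```ruby
require "rbconfig"

# Returns a copy of load_path with the stdlib directory (and anything below
# it) moved to the front. Relative order inside both groups is preserved.
def stdlib_first(load_path, stdlib_dir = RbConfig::CONFIG["rubylibdir"])
  prefix = stdlib_dir + "/"
  stdlib, rest = load_path.partition do |entry|
    entry.to_s == stdlib_dir || entry.to_s.start_with?(prefix)
  end
  stdlib + rest
end

# One possible way to apply it in place after bundler has set up the path:
#   $LOAD_PATH.replace(stdlib_first($LOAD_PATH))
```

With the standard lib first, a plain `require "uri"` would resolve to the stdlib copy before any bundler-installed copy further down the path.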

* Show how using `--prefer-local` causes issues

Ideally bundler would consider default/stdlib gems when doing dependency
resolution, to avoid duplication in the first place. This seems to break the
pluginmanager. Verify this happens in CI...

* Revert "Show how using `--prefer-local` causes issues"

This reverts commit 5a3b2bb.

* fix rebase error

(cherry picked from commit e08abb8)

@donoghuc donoghuc closed this Dec 10, 2025
@elasticmachine

💚 Build Succeeded

cc @donoghuc
