Skip to content

Conversation

@jengebr
Copy link
Contributor

@jengebr jengebr commented Nov 11, 2025

HashMap.putAll() optimizations: Eliminating Megamorphic Call Site Bottlenecks

Summary

This PR addresses performance bottlenecks in HashMap.putMapEntries() by implementing direct optimizations for specific input types: j.u.HashMap and j.u.Collections$UnmodifiableMap. The optimizations target HashMap(Map) constructor and putAll() operations based on the real-world megamorphic behavior identified in JDK-8368292, delivering significant performance improvements when multiple Map subtypes are used.

Problem Context

Megamorphic Call Site Overhead in Map Iteration

HashMap.putMapEntries() currently uses a generic approach that suffers from megamorphic call site overhead when applications perform bulk creation or population of HashMaps from various source map types:

  1. m.entrySet() becomes megamorphic across different map implementations
  2. entrySet().iterator() creates different iterator types
  3. entry.getKey() and entry.getValue() calls vary by map type
  4. Individual putVal() calls for each entry

When the source is Collections$UnmodifiableMap, the problem is compounded by megamorphic wrappers around the already-megamorphic iteration methods. In cases where the unwrapped map is also a HashMap, both the wrapper overhead and the iteration overhead can be eliminated with a single optimization.

Optimized Methods

HashMap

  • putMapEntries(Map<? extends K, ? extends V> m, boolean evict): Added fast paths for UnmodifiableMap unwrapping and HashMap-to-HashMap copying
  • putMapEntries(HashMap<? extends K, ? extends V> src, boolean evict): copies HashMap-to-HashMap via direct Node processing. Avoids polymorphic issues and eliminates redundant calls to HashMap.hash().

Implementation Details

HashMap-to-HashMap Fast Path

Directly iterates over src.table to eliminate entrySet() allocation and polymorphic iterator calls, using the internal Node structure for maximum efficiency.

UnmodifiableMap Unwrapping

Detects UnmodifiableMap instances and accesses the underlying map directly via the m field, eliminating wrapper-induced megamorphic call sites. UnmodifiableMap visibility changed from private to package-private to enable this direct access.

Performance Impact

Source Type Size Poisoned Baseline Optimized Improvement
HashMap 0 true 9.429 ns/op 8.875 ns/op 6% faster
HashMap 5 true 181.645 ns/op 80.012 ns/op 56% faster
HashMap 25 true 796.415 ns/op 283.643 ns/op 64% faster
TreeMap 0 true 9.356 ns/op 9.155 ns/op -
TreeMap 5 true 169.233 ns/op 169.566 ns/op -
TreeMap 25 true 792.667 ns/op 791.195 ns/op -
ConcurrentHashMap 0 true 9.275 ns/op 9.156 ns/op -
ConcurrentHashMap 5 true 229.291 ns/op 227.382 ns/op -
ConcurrentHashMap 25 true 1071.336 ns/op 1105.111 ns/op -
UnmodifiableMap(HashMap) 0 true 13.563 ns/op 9.253 ns/op 32% faster
UnmodifiableMap(HashMap) 5 true 318.184 ns/op 76.264 ns/op 76% faster
UnmodifiableMap(HashMap) 25 true 1510.489 ns/op 284.001 ns/op 81% faster
UnmodifiableMap(TreeMap) 5 true 323.412 ns/op 186.990 ns/op 42% faster
UnmodifiableMap(TreeMap) 25 true 1529.959 ns/op 809.125 ns/op 47% faster

This shows substantial improvements (32-81%) for optimized paths while maintaining no significant regression for non-optimized paths.

Call Site Poisoning Impact

Source Type Size Unpoisoned Poisoned Degradation
HashMap 0 4.949 ns/op 9.429 ns/op 91% slower
HashMap 5 86.796 ns/op 181.645 ns/op 109% slower
HashMap 25 322.196 ns/op 796.415 ns/op 147% slower
TreeMap 0 5.043 ns/op 9.356 ns/op 86% slower
TreeMap 5 102.329 ns/op 169.233 ns/op 65% slower
TreeMap 25 433.668 ns/op 792.667 ns/op 83% slower
ConcurrentHashMap 0 5.054 ns/op 9.275 ns/op 84% slower
ConcurrentHashMap 5 135.616 ns/op 229.291 ns/op 69% slower
ConcurrentHashMap 25 591.134 ns/op 1071.336 ns/op 81% slower
UnmodifiableMap(HashMap) 0 5.319 ns/op 13.563 ns/op 155% slower
UnmodifiableMap(HashMap) 5 146.471 ns/op 318.184 ns/op 117% slower
UnmodifiableMap(HashMap) 25 393.053 ns/op 1510.489 ns/op 284% slower
UnmodifiableMap(TreeMap) 0 5.331 ns/op 13.539 ns/op 154% slower
UnmodifiableMap(TreeMap) 5 78.215 ns/op 323.412 ns/op 314% slower
UnmodifiableMap(TreeMap) 25 332.189 ns/op 1529.959 ns/op 361% slower

FAQ

Q: Why use exact class check instead of instanceof for HashMap?
A: This prevents the fast path from becoming megamorphic if HashMap subclasses are common, ensuring the optimization remains effective. Also, subclasses might override behavior in ways that make direct table access incorrect.

Q: Is it safe to make UnmodifiableMap package-private?
A: Yes. The class remains non-public and is only accessible within java.util. The change enables internal optimizations without exposing implementation details to external code.

Q: Are these optimizations safe?
A: Yes. HashMap-to-HashMap copying maintains identical behavior, and UnmodifiableMap unwrapping accesses the exact same data through a more direct path.

Q: What about other Map types?
A: Profiling data shows these are the biggest opportunities. Others exist but are lower-priority.

Q: How do these optimizations interact with existing code?
A: These are internal implementation optimizations that maintain full API compatibility. No external code changes are required.

Testing

Comprehensive unit tests added to MOAT.java covering:

  • HashMap.putAll() with various source map types
  • UnmodifiableMap handling in putAll operations
  • Correctness verification for all optimization paths

Tier-1 tests passed.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8371656: HashMap.putAll() optimizations (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28243/head:pull/28243
$ git checkout pull/28243

Update a local copy of the PR:
$ git checkout pull/28243
$ git pull https://git.openjdk.org/jdk.git pull/28243/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 28243

View PR using the GUI difftool:
$ git pr show -t 28243

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28243.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 11, 2025

👋 Welcome back jengebr! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Nov 11, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk
Copy link

openjdk bot commented Nov 11, 2025

@jengebr The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@jengebr jengebr changed the title Optimizing HashMap.putAll() and .<init> for HashMap and C$UM 8371656: HashMap.putAll() optimizations Nov 11, 2025
@jengebr jengebr marked this pull request as ready for review November 11, 2025 21:55
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 11, 2025
@mlbridge
Copy link

mlbridge bot commented Nov 11, 2025

Webrevs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs [email protected] rfr Pull request is ready for review

Development

Successfully merging this pull request may close these issues.

1 participant