[SPARK-53446][CORE] Optimize BlockManager remove operations with cached block mappings #52210
Conversation
```scala
Option(rddToBlockIds.get(rddBlockId.rddId)).foreach { blockSet =>
  blockSet.remove(blockId)
  if (blockSet.isEmpty) {
    rddToBlockIds.remove(rddBlockId.rddId)
  }
}
```
I think there is a race condition here:

- thread 1 calls `addToCache` and gets the block set for its RDD id
- thread 2 calls `removeFromCache`, gets the block set for the same RDD id, removes the last block id, and then removes this RDD id from the cache
- thread 1 adds the block id, but it's a no-op because this map entry is now dangling
Agree with @cloud-fan, use compute instead.

For example, something like this:

```scala
// Use `import java.util.{Set => JSet}` and change the value type of the sets to JSet[BlockId]
def removeFromCache(
<snip>

def doRemove[K](map: ConcurrentHashMap[K, JSet[BlockId]], key: K, block: BlockId): Unit = {
  map.compute(key, (_, set) => {
    if (null != set) {
      set.remove(block)
      // Drop the whole map entry once its last block id is gone
      if (set.isEmpty) null else set
    } else {
      // Entry already missing; nothing to remove
      null
    }
  })
}

<snip>

case rddBlockId: RDDBlockId =>
  doRemove(rddToBlockIds, rddBlockId.rddId, blockId)
case broadcastBlockId: BroadcastBlockId =>
  doRemove(broadcastToBlockIds, broadcastBlockId.broadcastId, blockId)
// and so on
```
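For the race described above to be fully closed, the add side needs the same treatment; a hypothetical counterpart (not from the PR, names assumed) could look like:

```scala
// Hypothetical add-side counterpart, assuming the same
// ConcurrentHashMap[K, JSet[BlockId]] fields as in the sketch above.
def doAdd[K](map: ConcurrentHashMap[K, JSet[BlockId]], key: K, block: BlockId): Unit = {
  map.compute(key, (_, set) => {
    // compute() serializes add/remove per key, so a concurrent doRemove
    // can no longer strand this block in a detached set.
    val s: JSet[BlockId] = if (set == null) ConcurrentHashMap.newKeySet[BlockId]() else set
    s.add(block)
    s
  })
}
```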
```scala
exceptionWasThrown = false
if (res.isEmpty) {
  // the block was successfully stored
  addToCache(blockId)
}
```
is doPut the only entry point that can add blocks?
For adding a new block, yes; it goes through doPut
can we put this extra mapping in the lower level?
Took a quick pass, thanks for working on this!
```scala
/**
 * Add a block ID to the appropriate cache mapping based on its type.
 */
private def addToCache(blockId: BlockId): Unit = {
```
nit: Move these helper methods towards the bottom.
```scala
val blocksToRemove = Option(rddToBlockIds.get(rddId)) match {
  case Some(blockSet) =>
    blockSet.asScala.toSeq
  case None =>
    Seq.empty
}
```
Remove it proactively from the map. This also makes the size returned consistent with what is actually removed (in case of race conditions).
Same for the other cases as well.
Suggested change:

```scala
val blocksToRemove = Option(rddToBlockIds.remove(rddId)).
  map(_.asScala.toSeq).getOrElse(Seq.empty)
```
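This works because `ConcurrentHashMap.remove(key)` atomically detaches and returns the previous value, so of two racing removers exactly one observes the set. A tiny standalone illustration (hypothetical, not PR code):

```scala
import java.util.concurrent.ConcurrentHashMap

val m = new ConcurrentHashMap[Int, java.util.Set[String]]()
m.put(1, java.util.Set.of("a", "b"))

// Exactly one of two racing removers receives the non-null set;
// the other gets null back and reports zero blocks removed.
val blocksToRemove = Option(m.remove(1)).map(_.size).getOrElse(0)  // 2 here
```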
```scala
blockInfoManager.removeBlock(blockId)
removeFromCache(blockId)
hasRemoveBlock = true
```
Do this in the finally block as well.
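A rough sketch of the suggested shape (hypothetical; identifiers taken from the quoted snippet):

```scala
try {
  blockInfoManager.removeBlock(blockId)
  hasRemoveBlock = true
} finally {
  // Runs on both success and failure paths, keeping the
  // reverse mappings consistent even if removal throws.
  removeFromCache(blockId)
}
```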
What changes were proposed in this pull request?
This PR optimizes BlockManager remove operations by introducing cached mappings to eliminate O(n) linear scans. The main changes are:
Introduced three concurrent hash maps to track block ID associations (a minimal sketch of these fields follows this list):

- `rddToBlockIds`: Maps an RDD ID to its block IDs
- `broadcastToBlockIds`: Maps a broadcast ID to its block IDs
- `sessionToBlockIds`: Maps a session UUID to its cached block IDs

Added cache maintenance methods:

- `addToCache(blockId)`: Updates the caches when blocks are stored
- `removeFromCache(blockId)`: Updates the caches when blocks are deleted

Reworked remove operations to use cached lookups:

- `removeRdd()`, `removeBroadcast()`, and `removeCache()` now perform O(1) lookups instead of scanning all entries

Integrated with the block lifecycle:

- `doPutIterator()` calls `addToCache()` after successful block storage
- `removeBlock()` calls `removeFromCache()` when blocks are removed
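A minimal sketch of how these fields might be declared (field names from the PR description; the key types are assumptions based on Spark's `RDDBlockId` and `BroadcastBlockId` definitions):

```scala
import java.util.{Set => JSet}
import java.util.concurrent.ConcurrentHashMap

// Reverse mappings from owner id to the block ids it produced.
// Assumed key types: RDDBlockId.rddId is an Int, BroadcastBlockId.broadcastId
// is a Long, and the session UUID is represented as a String.
private val rddToBlockIds = new ConcurrentHashMap[Int, JSet[BlockId]]()
private val broadcastToBlockIds = new ConcurrentHashMap[Long, JSet[BlockId]]()
private val sessionToBlockIds = new ConcurrentHashMap[String, JSet[BlockId]]()
```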
Why are the changes needed?

Previously, `removeRdd()`, `removeBroadcast()`, and `removeCache()` required scanning all blocks in `blockInfoManager.entries` to find matches. This approach becomes a serious bottleneck when large numbers of blocks accumulate. The original `removeRdd()` method already contained a TODO noting that an additional mapping would be needed to avoid linear scans; this PR implements that improvement.

Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit tests covering `removeRdd()`, `removeBroadcast()`, and `removeCache()`, including edge cases.

Before optimization

After optimization
The optimization delivers significant performance improvements for block cleanup under large data volumes, reducing the overhead caused by frequent GC when blocks accumulate.
Was this patch authored or co-authored using generative AI tooling?
No.