Imagine the following scenario: the L1 data cache is configured with fewer ways than threads_per_core. All threads execute a data load whose address maps to the same cache set, each with a different tag. All requested lines are present in the L2 cache.
The L1 cache system will get into a deadlock. The first thread generates a miss, gets rolled back, and requests a fill from the L2 cache. The corresponding response is written into the L1 cache a few cycles later. The same happens for the remaining threads. So when the first thread gets scheduled again, its load misses again, because its cache line has already been replaced before it could be read even once.
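To make the scenario easier to reason about, here is a minimal Python sketch of a single cache set. Everything in it is an assumption made for illustration and not taken from the actual design: the function name `simulate`, FIFO replacement, a round-robin issue of one load per cycle, and the timing parameters `fill_latency` (cycles from miss to the fill being written) and `replay_latency` (cycles from wake-up until the rolled-back load reaches the cache again). The default timings are chosen so that fills arrive back to back; with other timings some threads may slip through.

```python
from collections import deque


def simulate(num_threads, num_ways, fill_latency=1, replay_latency=3,
             max_cycles=1000):
    """Return how many threads completed their load within max_cycles.

    Hypothetical model: thread t loads a line with tag t, and all tags
    map to the same set. Replacement is assumed to be FIFO.
    """
    cache_set = deque(maxlen=num_ways)  # appending to a full deque evicts the oldest tag
    ready_at = [0] * num_threads        # cycle at which a thread may issue; None = waiting for fill
    fills = {}                          # arrival cycle -> threads whose fill arrives then
    done = set()
    next_thread = 0                     # round-robin pointer

    for cycle in range(max_cycles):
        # 1. Write the L2 responses arriving this cycle and wake their threads.
        for t in fills.pop(cycle, []):
            if t not in cache_set:
                cache_set.append(t)
            ready_at[t] = cycle + replay_latency

        # 2. Issue at most one load from the next eligible thread.
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if t in done or ready_at[t] is None or ready_at[t] > cycle:
                continue
            if t in cache_set:
                done.add(t)             # hit: the load completes
            else:
                ready_at[t] = None      # miss: roll back and request a fill
                fills.setdefault(cycle + fill_latency, []).append(t)
            next_thread = (t + 1) % num_threads
            break

        if len(done) == num_threads:
            break
    return len(done)


print(simulate(num_threads=4, num_ways=2))  # 0: no load ever completes
print(simulate(num_threads=4, num_ways=4))  # 4: every load completes
```

With 4 threads and 2 ways, each line is evicted by the fills of the following threads before its owner gets to replay the load, so the threads keep missing indefinitely. With as many ways as threads, every replayed load hits.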
Any idea how to fix this problem? Sure, another replacement strategy would fix this specific case, but I think it is a more general problem.