Hi,
Would you mind explaining why the following code is more memory efficient than just dividing one of them by sqrt(e)?

    queries = queries / (e ** (1/4))
    keys = keys / (e ** (1/4))
    # - Instead of dividing the dot products by sqrt(e), we scale the keys and values.
    #   This should be more memory efficient
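
For context, here is a minimal sketch of what I am comparing. As far as I can tell, all three variants give the same scaled dot products, since (q / e^(1/4)) · (k / e^(1/4)) = (q · k) / sqrt(e). The sizes `t` and `e` and the helper names `dot_products` and `scale` are my own assumptions for illustration; they are not taken from the original code, which works on batched tensors.

```python
import math
import random

def dot_products(queries, keys):
    """Return the t x t matrix of dot products between every query and key."""
    return [[sum(q_i * k_i for q_i, k_i in zip(q, k)) for k in keys]
            for q in queries]

def scale(rows, factor):
    """Divide every element of a list of rows by factor."""
    return [[x / factor for x in row] for row in rows]

random.seed(0)
t, e = 5, 16  # hypothetical sequence length and embedding size
queries = [[random.gauss(0, 1) for _ in range(e)] for _ in range(t)]
keys = [[random.gauss(0, 1) for _ in range(e)] for _ in range(t)]

# Variant A: divide the t x t dot products by sqrt(e).
scaled_after = scale(dot_products(queries, keys), math.sqrt(e))

# Variant B (the quoted code): divide queries and keys by e ** (1/4) each.
scaled_both = dot_products(scale(queries, e ** (1/4)), scale(keys, e ** (1/4)))

# Variant C (my question): divide only one of them, here the queries, by sqrt(e).
scaled_one = dot_products(scale(queries, math.sqrt(e)), keys)

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

diff_both = max_abs_diff(scaled_after, scaled_both)
diff_one = max_abs_diff(scaled_after, scaled_one)
```

Both differences should be at floating-point rounding level. In terms of elements touched, variant A scales a t x t matrix, variant B scales two t x e matrices, and variant C scales a single t x e matrix, which is why I would have expected variant C to be at least as cheap as variant B.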
|
Thank you.