Commit 5fd66ee

Qualcomm AI Engine Direct - fix sliding attention update bug
Differential Revision: D82745889
Pull Request resolved: #14411
1 parent c2ddeec commit 5fd66ee

1 file changed, +1 -2 lines changed

examples/qualcomm/oss_scripts/llama/runner/kv_manager.cpp

Lines changed: 1 addition & 2 deletions
@@ -242,9 +242,8 @@ void KVManager<T>::update_attention_mask(
         std::fill_n(
             cur_ptr, std::abs(n_past + ar_len) - avalible_cache_len, neg_val);
       }
-
-      cur_ptr += metadata_.context_len;
     }
+    cur_ptr += metadata_.context_len;
   }
 }
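
The change moves `cur_ptr += metadata_.context_len;` out by one brace level, so the cursor advances once per pass through the enclosing block instead of only when the inner block runs. The sketch below illustrates why that matters for a row-major mask. It is not the KVManager code: `mask_rows`, `needs_mask`, and `n_mask` are hypothetical names, and it assumes the inner block is a per-row condition inside a loop over rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a row-major mask update (not the real runner code).
// The mask has `rows` rows of `context_len` entries each. Rows flagged in
// `needs_mask` get their first `n_mask` entries set to `neg_val`.
std::vector<int> mask_rows(
    std::size_t rows,
    std::size_t context_len,
    const std::vector<bool>& needs_mask,
    std::size_t n_mask,
    int neg_val) {
  assert(needs_mask.size() == rows);
  std::vector<int> mask(rows * context_len, 0);
  int* cur_ptr = mask.data();
  for (std::size_t i = 0; i < rows; ++i) {
    if (needs_mask[i]) {
      // Clamp the count so a write never runs past the end of the row.
      std::fill_n(cur_ptr, std::min(n_mask, context_len), neg_val);
    }
    // Advance on every iteration. If this line sat inside the `if` above,
    // a skipped row would leave the cursor behind and every later write
    // would land one row too early.
    cur_ptr += context_len;
  }
  return mask;
}
```

Calling `mask_rows(3, 4, {false, true, false}, 2, -1)` leaves row 0 untouched and writes the two masked entries at the start of row 1. With the advance placed inside the conditional, the same call would write them into row 0 instead.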
