Issues: dvlab-research/LongLoRA
not able to reproduce the passkey retrieval accuracy
#195
opened Sep 16, 2024 by
zhuconv
updated Sep 22, 2024
Does this code support fine-tuning qwen/baichuan into a Chinese long-context model, and what changes to the code would be needed?
#191
opened Jul 22, 2024 by
jy-101361-1810897
updated Jul 22, 2024
I am unable to reproduce the results from the paper for llama-7B-32k-longlora ppl.
#188
opened May 28, 2024 by
masteryqq
updated May 28, 2024
Which training set was used to obtain “Model with context extension via improved LoRA fine-tuning” (LoRA+)?
#184
opened Apr 22, 2024 by
ZackZikaiXiao
updated Apr 22, 2024
How were the questions and answers for long context (LongAlpaca) made?
#183
opened Mar 4, 2024 by
ddoyles
updated Mar 4, 2024
When I set per_device_train_batch_size=2, the S2-Attn would not shift as expected
#182
opened Mar 1, 2024 by
linhaojia13
updated Mar 4, 2024
HF models missing rope scaling in the config
#181
opened Feb 29, 2024 by
hsiehjackson
updated Feb 29, 2024
Regarding the results in Table 8 and Table 14
#177
opened Feb 4, 2024 by
Statisticss
updated Feb 4, 2024
About the different datasets and corresponding models
#176
opened Feb 2, 2024 by
Statisticss
updated Feb 2, 2024
Training an LLM with shifted sparse attention from scratch?
#173
opened Jan 24, 2024 by
we1k
updated Jan 24, 2024
merge_lora_weights_and_save_hf_model.py Error while deserializing header: HeaderTooLarge
#172
opened Jan 23, 2024 by
Spongeorge
updated Jan 23, 2024
LongLoRA + Flash Attention 2 causing illegal memory access
#148
opened Nov 21, 2023 by
ArturNiederfahrenhorst
updated Jan 19, 2024
Is it possible to increase the context length of phi-2 using LongLoRA? If yes, what changes are needed to support it?
#169
opened Jan 18, 2024 by
dbanka
updated Jan 19, 2024
For the evaluation results in the paper, is the attention used at inference shifted sparse attention or full attention?
#170
opened Jan 19, 2024 by
zhangxiann
updated Jan 19, 2024