Describe the issue
Hi there, nice work! When I tried to reproduce the results of STIC, I did not see any improvement from the STIC stage-1 preference optimization. My training settings are the same as yours. I tried two different versions of LLaVA-1.6 (vicuna-7b and mistral-7b) on the ScienceQA test set. Here are the results:
llava-mistral-7b on the ScienceQA test set without STIC stage-1
llava-mistral-7b on the ScienceQA test set with STIC stage-1 (using your provided LoRA weights)
We do get results here, but they are not consistent with the paper (which reports approximately 60).
I also tried STIC stage-1 with llava-vicuna-7b, and there I saw no improvement at all. We did not change the training data or settings.
llava-vicuna-7b (original)
llava-vicuna-7b after STIC stage-1 (trained on 4x 48GB L20 GPUs in our environment)
I am also sharing the training loss log and LoRA settings here; they look normal.
What should I do to get an improvement from the preference optimization? Any help would be greatly appreciated. Thank you!
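
For reference, this is roughly how I attach your provided stage-1 LoRA adapter to the base model before running the ScienceQA evaluation (a minimal sketch using HuggingFace transformers + peft, not the exact STIC eval script; the adapter path is a placeholder for my local setup):

```python
# Minimal sketch of applying the stage-1 LoRA adapter for evaluation.
# Assumes HuggingFace transformers + peft; ADAPTER is a placeholder path.
import torch
from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor
from peft import PeftModel

BASE = "llava-hf/llava-v1.6-mistral-7b-hf"  # base LLaVA-1.6 mistral-7b
ADAPTER = "./stic-stage1-lora"              # provided LoRA weights (placeholder)

processor = LlavaNextProcessor.from_pretrained(BASE)
model = LlavaNextForConditionalGeneration.from_pretrained(
    BASE, torch_dtype=torch.float16, device_map="auto"
)

# Attach the LoRA adapter, then merge it into the base weights for inference.
model = PeftModel.from_pretrained(model, ADAPTER)
model = model.merge_and_unload()
model.eval()
```

Please let me know if this differs from how the adapter is supposed to be applied.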