
Cannot reproduce the results; is there anything wrong? #18

Open
QiuJYWX opened this issue May 16, 2024 · 4 comments
QiuJYWX commented May 16, 2024

#!/bin/bash

CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
    --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
    --testset_path ../decompile-eval/decompile-eval.json \
    --gpus 2 \
    --max_total_tokens 2048 \
    --max_new_tokens 2000 \
    --repeat 1 \
    --num_workers 32 \
    --gpu_memory_utilization 0.82 \
    --temperature 0

Optimization O0: Compile Rate: 0.9268, Run Rate: 0.5488
Optimization O1: Compile Rate: 0.9268, Run Rate: 0.3598
Optimization O2: Compile Rate: 0.8902, Run Rate: 0.3537
Optimization O3: Compile Rate: 0.8902, Run Rate: 0.3171

albertan017 (Owner) commented
Thanks for testing the code. Please use decompile-eval-executable-gcc-obj.json. All the evaluations and models are based on executables, which differs from our previous setting (object files, not linked).
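(A minimal re-run sketch: it assumes the flags from the original post stay the same and only the testset path changes. The filename comes from this reply; the directory layout is the reporter's local one.)

CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
    --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
    --testset_path ../decompile-eval/decompile-eval-executable-gcc-obj.json \
    --gpus 2 \
    --max_total_tokens 2048 \
    --max_new_tokens 2000 \
    --repeat 1 \
    --num_workers 32 \
    --gpu_memory_utilization 0.82 \
    --temperature 0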

Updates

  • [2024-05-16]: Please use decompile-eval-executable-gcc-obj.json. The source codes are compiled into executable binaries and disassembled into assembly instructions (see the sketch after this list).
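As an illustrative aside, the difference between the two settings might look like the following shell sketch (file names are hypothetical and the benchmark's exact pipeline may differ; the gcc/objdump invocations are standard):

# Hypothetical file names; standard gcc/objdump invocations.
# Previous setting: unlinked object file, disassembled directly.
gcc -c -O0 func.c -o func.o
objdump -d func.o > func_obj.s

# Current setting (decompile-eval-executable-gcc-obj.json):
# compile and link into an executable, then disassemble.
gcc -O0 func.c main.c -o prog
objdump -d prog > prog_exe.s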

QiuJYWX commented May 17, 2024

Thanks for the reply; I will try again.

QiuJYWX commented Jun 21, 2024

Hi @albertan017,

Thanks for the new release. Since DeepSeek has released the more powerful DeepSeek-Coder-V2 (16B and 236B), it should achieve better decompilation results. If you have enough GPU budget, you could use DeepSeek-Coder-V2 as the base model to further improve LLM4Decompile.

albertan017 (Owner) commented

Yes, DeepSeek-Coder-V2 demonstrates a strong ability to decompile binaries, achieving decompilation results comparable to those of GPT-4o (avg. 15% on HumanEval-Decompile). Our efforts on llm4decompile-ref are ongoing (it achieves much better results than direct decompilation). We are not working with the 236B version, as it is far beyond our budget.
