[11-03] Group meeting introduction: issue-PR link prediction via knowledge-aware heterogeneous graph learning #305

andyhuang18 · 2024-11-03T14:44:50Z

Description

Presenter: 黄温瑞

This meeting covers a paper published in IEEE Transactions on Software Engineering (CCF-A).

Paper link:

Improving_Issue-PR_Link_Prediction_via_Knowledge-Aware_Heterogeneous_Graph_Learning.pdf

Paper abstract:

Links between issues and pull requests (PRs) assist GitHub developers in tackling technical challenges, gaining development inspiration, and improving repository maintenance. In realistic repositories, these links are still insufficiently established. Aiming at this situation, existing works focus on issues and PRs themselves and employ text similarity with additional information like issue size to predict issue-PR links, yet their effectiveness is unsatisfactory. The limitation is that issues and PRs are not isolated on GitHub. Rather, they are related to multiple GitHub sources, including repositories and submitters, which, through their diverse relationships, can supply potential and crucial knowledge about technical domains, developmental insights, and cross-repository technical details. To this end, we propose Auto IP Linker (AIPL), which introduces the heterogeneous graph to model multiple GitHub sources with their relationships. Further, it leverages the metapath-based technique to reveal and incorporate the potential information for a more comprehensive understanding of issues and PRs. Firstly, we identify 4 types of GitHub sources related to issues and PRs (repositories, users, issues, PRs) as well as their relationships, and model them into task-specific heterogeneous graphs. Next, we analyze information transmitted among issues or PRs to reveal which knowledge is crucial for them. Based on our analysis, we formulate a series of metapaths and employ the metapath-based technique to incorporate various information for learning the knowledge-aware embedding of issues and PRs. Finally, we can infer whether an issue and a PR can be linked based on their embedding. We evaluate the performance of AIPL on real-world data sets collected from GitHub. The results show that, compared to the baselines, AIPL can achieve average improvements of 15.94%, 15.19%, 20.52%, and 18.50% in terms of Accuracy, Precision, Recall, and F1-score.
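As a rough illustration of the metapath idea in the abstract, the sketch below builds a toy heterogeneous graph over the four node types it names (repository, user, issue, PR) and collects the nodes reachable along a given metapath. The class `HeteroGraph`, its method names, and the example edges are assumptions made for illustration. This is not the AIPL implementation, which learns embeddings along metapaths rather than only enumerating neighbors.

```python
from collections import defaultdict


class HeteroGraph:
    """Toy heterogeneous graph; a node is a (node_type, node_id) tuple."""

    def __init__(self):
        # (node, neighbor_type) -> set of neighbors of that type
        self.adj = defaultdict(set)

    def add_edge(self, u, v):
        # Store the edge in both directions, indexed by the neighbor's type.
        self.adj[(u, v[0])].add(v)
        self.adj[(v, u[0])].add(u)

    def metapath_neighbors(self, start, metapath):
        """Nodes reached from `start` by following the node-type sequence `metapath`."""
        if start[0] != metapath[0]:
            raise ValueError("start node type must match the first metapath type")
        frontier = {start}
        for node_type in metapath[1:]:
            frontier = {
                nxt
                for cur in frontier
                for nxt in self.adj.get((cur, node_type), ())
            }
        return frontier


# Hypothetical example data (not from the paper's data set).
g = HeteroGraph()
issue1, pr7, pr8 = ("issue", 1), ("pr", 7), ("pr", 8)
alice, bob = ("user", "alice"), ("user", "bob")
repo = ("repo", "demo/app")

g.add_edge(issue1, alice)  # alice submitted issue 1
g.add_edge(pr7, alice)     # alice submitted PR 7
g.add_edge(pr8, bob)       # bob submitted PR 8
for node in (issue1, pr7, pr8):
    g.add_edge(node, repo)  # all three belong to the same repository

# PRs sharing a submitter with issue 1 (issue -> user -> PR).
same_submitter = g.metapath_neighbors(issue1, ("issue", "user", "pr"))
# PRs in the same repository as issue 1 (issue -> repo -> PR).
same_repo = g.metapath_neighbors(issue1, ("issue", "repo", "pr"))
```

Different metapaths surface different candidate PRs for the same issue, which is the kind of relational context that text similarity alone does not see.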

AIPL framework overview:

AIPL performance results:

Feature:

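Richer node types and edge types
Recognition of abbreviated references to Issue, PullRequest, and SHA entities
Merging into a single node when a PullRequest's issue_id is not unique
Extending entity ids across projects so that every id is globally unique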
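To make two of the items above concrete (recognizing abbreviated references and making ids globally unique), here is a minimal sketch. The regular expressions, the `global_id` format, and the function names are hypothetical and chosen for illustration only; they do not represent the API of any existing tool, and GitHub's real autolink rules cover more forms than these patterns do.

```python
import re

# Assumed patterns: "#12" or "owner/repo#34" for issues/PRs, 7-40 lowercase
# hex characters for commit SHAs. Real text will produce false positives
# (any long hex-looking word) and misses (full URLs, uppercase SHAs).
ISSUE_OR_PR_REF = re.compile(r"(?:(?P<repo>[\w.-]+/[\w.-]+))?#(?P<number>\d+)\b")
SHA_REF = re.compile(r"\b[0-9a-f]{7,40}\b")


def global_id(repo, local_number):
    """Qualify a per-repository number with its repository name."""
    return f"{repo}#{local_number}"


def extract_refs(text, current_repo):
    """Return (kind, id) pairs for abbreviated references found in `text`.

    A bare "#N" cannot be told apart as an issue or a PR from the text
    alone, so both are reported under the single kind "issue_or_pr".
    """
    refs = []
    for match in ISSUE_OR_PR_REF.finditer(text):
        repo = match.group("repo") or current_repo
        refs.append(("issue_or_pr", global_id(repo, int(match.group("number")))))
    for match in SHA_REF.finditer(text):
        refs.append(("sha", match.group(0)))
    return refs


# Hypothetical comment body and repository name.
refs = extract_refs("Fixes #12 and octo/demo#34, see commit 1a2b3c4d", "me/app")
```

Qualifying the number with the repository name keeps `#12` in one project from colliding with `#12` in another once both are placed in the same graph.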

birdflyi mentioned this issue Nov 18, 2024

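[11-18] Group meeting introduction: heterogeneous graph construction based on the collaboration information extraction tool GH_CoRE (GitHub_Collaboration_Relation_Extraction) and discussion of related downstream tasks #309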

Open

birdflyi commented Nov 4, 2024
