CS224W - Bag of Tricks for Node Classification with GNN - GAT Normalization #5
Conversation
Just a few nits on the description:
def gat_norm(  # noqa: F811
    edge_index: Adj,
    edge_weight: Tensor,
    num_nodes: Optional[int] = None,
If we never pass this, I think we should remove it from the function signature. But would it make sense to use `size.size(1)` in the case where the user passes it? (This only exists on `GATConv`, not `GATv2Conv`.)
Sorry, on second look, `size` is already factored in when computing `alpha`, right? So can't we just use `alpha`'s shape? Then we can get rid of the `num_nodes` parameter.
Hmm, I don't think that is necessarily true, especially when the input is sparse.
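For context, a minimal sketch of how the node count could be recovered when `num_nodes` is not passed; the helper name is made up and this is an illustration only (PyG's `maybe_num_nodes` utility plays a similar role). It also shows why `alpha`'s shape alone is not enough: `alpha` has one row per edge, not per node.

```python
from typing import Optional, Union

import torch
from torch import Tensor
from torch_sparse import SparseTensor


def infer_num_nodes(edge_index: Union[Tensor, SparseTensor],
                    num_nodes: Optional[int] = None) -> int:
    # Hypothetical helper, for illustration only.
    if num_nodes is not None:
        return num_nodes                # trust an explicit value
    if isinstance(edge_index, SparseTensor):
        return edge_index.size(0)       # the sparse adjacency knows its size
    # Dense edge_index of shape [2, num_edges]: fall back to the max index,
    # which undercounts if trailing nodes are isolated.
    return int(edge_index.max()) + 1
```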
torch_geometric/nn/conv/gat_conv.py (Outdated)
    return to_torch_csr_tensor(edge_index), att_mat

    assert flow in ['source_to_target', 'target_to_source']
Maybe move this to the top, as it's relevant to all the tensor-type cases? Actually, we don't currently use `flow` in the `SparseTensor` case; should we use it when computing the degree, as in the other tensor-type cases?
Should we use `flow` when computing `deg` in the `SparseTensor` case too, or would that be wrong?
Based on `gcn_norm`, I don't think we need it.
Could this be a bug in GCN norm? Any idea why we would need it in the other cases but not here? I thought switching `flow` is supposed to effectively swap the direction of the edges. This way, won't we compute the same degrees regardless of `flow`, which is not correct in the directed case? Maybe worth asking on the PR.
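To make the question concrete, here is a minimal sketch (illustration only, not the library's implementation; the helper name is made up) of how `flow` could pick the endpoint the degree is accumulated on for a dense `edge_index`:

```python
import torch


def weighted_degree(edge_index: torch.Tensor, edge_weight: torch.Tensor,
                    num_nodes: int,
                    flow: str = 'source_to_target') -> torch.Tensor:
    # Which endpoint of each edge the weight is accumulated on depends on
    # `flow`; for directed graphs the two choices give different degrees.
    assert flow in ['source_to_target', 'target_to_source']
    row, col = edge_index[0], edge_index[1]
    idx = col if flow == 'source_to_target' else row
    deg = torch.zeros(num_nodes, dtype=edge_weight.dtype,
                      device=edge_weight.device)
    deg.scatter_add_(0, idx, edge_weight)
    return deg
```

One possible (unconfirmed) explanation for the `SparseTensor` case: PyG conventionally stores the adjacency transposed there (`adj_t`), so summing over a fixed dimension may already account for the flow direction.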
Typo "paper's. Also implementation" -> "paper's implementation" |
torch_geometric/nn/conv/gat_conv.py (Outdated)
    "The usage of 'normalize' is not supported "
    "for bipartite message passing.")

    if self.normalize:
We could combine the if statements, but actually, is there much advantage to putting the error here instead of where you already have the assert statements?
Force-pushed from 0ff800b to 76cab5d ("…sage passing yet and add num_nodes as parameters").
Add `normalize` parameter to `GATConv` and `GATv2Conv`.
Part of #4 (TODO update this) for our final project for the Stanford CS224W course, this allows "GAT with Symmetric Normalized Adjacency Matrix" as described in "Bag of Tricks for Node Classification with Graph Neural Networks".
Details
- `gat_norm` is inspired from `gcn_norm`, for when `edge_index` is a `SparseTensor`, a torch sparse tensor (`is_torch_sparse_tensor`), or a dense torch `Tensor`.
- `gat_norm` is called after computing the `alpha` coefficients and returns the updated values of `edge_index` and `alpha`. The outputs of `gat_norm` are passed as inputs to `self.propagate` (see the first sketch after this list).
- `GATConv` and `GATv2Conv` `add_self_loops` parameter: we remove self loops from the initial graph before calling `gat_norm` and add self loops with normalization in `gat_norm`, as described in the paper. We tried to use the tools already provided in the library, such as `torch_sparse.fill_diag`, `to_edge_index`, `add_remaining_self_loops`, `add_self_loops`, and `to_torch_csr_tensor`.
- When `is_torch_sparse_tensor(edge_index) == True`, we have an issue formatting back the `edge_index` index and the corresponding values in `att_mat` in the appropriate format. Our workaround consists of sorting the values of `att_mat` lexicographically so they match the index of `edge_index` for the subsequent `propagate` and `update` steps (see the second sketch after this list).
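To make the trick concrete, here is a minimal sketch of the symmetric normalization for the dense `edge_index` case, assuming `alpha` has shape `[num_edges, heads]`; the function name and details are illustrative, not the PR's exact implementation:

```python
import torch
from torch_geometric.utils import add_self_loops, degree, remove_self_loops


def gat_norm_dense_sketch(edge_index: torch.Tensor, alpha: torch.Tensor,
                          num_nodes: int):
    # Drop existing self loops, then add them back so that every node also
    # gets an attention value on its own self loop (filled with 1.0 here).
    edge_index, alpha = remove_self_loops(edge_index, alpha)
    edge_index, alpha = add_self_loops(edge_index, alpha, fill_value=1.,
                                       num_nodes=num_nodes)

    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, applied edge-wise
    # to the attention coefficients.
    row, col = edge_index[0], edge_index[1]
    deg = degree(col, num_nodes, dtype=alpha.dtype)
    deg_inv_sqrt = deg.pow(-0.5)
    deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0.
    alpha = deg_inv_sqrt[row].view(-1, 1) * alpha * deg_inv_sqrt[col].view(-1, 1)
    return edge_index, alpha
```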
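And a sketch of the sorting workaround mentioned above for the torch sparse tensor case: edges are put back into lexicographic `(row, col)` order, as a CSR layout expects, and the attention values are permuted the same way so they stay aligned. The helper name is hypothetical.

```python
import torch


def sort_edges_lexicographically(edge_index: torch.Tensor,
                                 att_mat: torch.Tensor, num_nodes: int):
    # Order edges by (row, col) and permute the attention values identically
    # so each value still corresponds to its edge after the reordering.
    row, col = edge_index[0], edge_index[1]
    perm = (row * num_nodes + col).argsort()
    return edge_index[:, perm], att_mat[perm]
```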
Benchmarks
I have the following metrics with one T4 GPU; it performs better for the CiteSeer and PubMed datasets, at a computation-time cost,
with the following run commands: