CS224W - Bag of Tricks for Node Classification with GNN - Label Usage #2
base: master
Conversation
""" | ||
# re-assign train_idx to be nodes in mini-batch | ||
if batch is not None: | ||
train_idx = batch |
batch is never used again, so I'm not sure it makes sense to have both batch and train_idx parameters.
@liuvince is the current impl what you had in mind for minibatches? In the case of minibatching, I think x and y would just not be the whole dataset.
I can remove batch and indicate that train_idx can be used for batch indices, since the batches passed in contain the global indices. That seems more intuitive now that I've read your comment and looked through it again.
Can we name it mask to ensure aligned naming with https://pytorch-geometric.readthedocs.io/en/2.5.2/generated/torch_geometric.nn.models.LabelPropagation.html and https://pytorch-geometric.readthedocs.io/en/2.5.2/generated/torch_geometric.nn.models.CorrectAndSmooth.html, for example?
I propose the following logic:

In training mode (when self.training == True):
- when mask is set to None, we should perform the split in the forward pass.
- when mask is set to a non-null value, we assume it corresponds to the nodes whose labels are used as node features.

In test mode:
- when mask is set to None, we use the whole input as unlabeled data.
- when mask is set to a non-null value, we assume it corresponds to the nodes whose labels are used as node features, which is the whole or sampled training dataset.

Does it make sense?
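A minimal sketch of how this proposal could look in the forward pass (the method shape, the split_ratio attribute, and the random-split strategy are assumptions for illustration, not the PR's actual code):

import torch
import torch.nn.functional as F
from torch import Tensor
from typing import Optional

def forward(self, x: Tensor, y: Tensor, mask: Optional[Tensor] = None) -> Tensor:
    # Label channels start at zero for every node.
    onehot = torch.zeros(x.size(0), self.num_classes, device=x.device)
    if self.training and mask is None:
        # No mask given in training mode: split inside the forward pass,
        # reusing true labels for a random fraction of the nodes.
        perm = torch.randperm(x.size(0), device=x.device)
        mask = perm[:int(self.split_ratio * x.size(0))]  # assumed attribute
    if mask is not None:
        # `mask` marks the nodes whose true labels are used as features
        # (in test mode this is the whole or sampled training set;
        # mask=None in test mode treats the whole input as unlabeled).
        onehot[mask] = F.one_hot(y[mask].view(-1), self.num_classes).float()
    return torch.cat([x, onehot], dim=-1)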
Yeah, I can rename it to mask to keep with convention. Would mask ever be None? I assumed mask would always contain the indices to be trained/tested on. It would make sense to not split the indices and keep them unlabeled during testing.
Also not sure it makes sense to allow None for mask. In both training and evaluation we need to know which set of nodes it's ok to use true labels for.
Can we update the mask description to be more similar to the examples Vincent linked? Something like "A mask or index tensor denoting which nodes' true labels can be used."
As is, it's a little confusing because it's called "mask" but we describe it as an index tensor. A mask is generally a bool tensor of shape (N,), while an index tensor is a list of the indices where you would put "true" in the mask, so it would have shape (k,) where k <= N. We can skip this detail on the expected size for this parameter since it is a common concept. I think the code should work for both cases without any changes.
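For reference, a small sketch of why the same indexing code handles both forms (the shapes and values here are illustrative assumptions):

import torch
import torch.nn.functional as F

num_nodes, num_classes = 5, 3
y = torch.tensor([0, 2, 1, 0, 2])

# Boolean mask of shape (N,): True where the true label may be used.
bool_mask = torch.tensor([True, False, True, False, False])
# Equivalent index tensor of shape (k,), with k <= N.
idx_mask = torch.tensor([0, 2])

# PyTorch advanced indexing treats both forms the same way, so the
# assignment below works unchanged for either kind of mask.
for mask in (bool_mask, idx_mask):
    onehot = torch.zeros(num_nodes, num_classes)
    onehot[mask] = F.one_hot(y[mask], num_classes).float()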
# add labels to features for train_labels_idx nodes
# zero value nodes in train_pred_idx
onehot = torch.zeros([x.shape[0], self.num_classes]).to(x.device)
We might consider making this initialization user-configurable. I'd be interested to benchmark whether using the mean label works better when the number of reuse iterations is zero or low.
Something like init: Union[Tensor, float, 'mean'] = 0 # How to initialize unlabeled examples. 'mean' computes the mean label of the train_idx. If a tensor is passed, it must be one-dimensional of length num_classes.
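A sketch of what that configurable initialization could look like (this init_onehot helper and its argument handling are hypothetical, following the suggestion above):

from typing import Union
import torch
import torch.nn.functional as F
from torch import Tensor

def init_onehot(x: Tensor, y: Tensor, train_idx: Tensor, num_classes: int,
                init: Union[Tensor, float, str] = 0.0) -> Tensor:
    # Hypothetical helper: initialize the label channels for unlabeled nodes.
    if isinstance(init, str) and init == 'mean':
        # 'mean' fills every row with the mean one-hot label of train_idx,
        # i.e. the empirical class distribution of the training nodes.
        mean_label = F.one_hot(y[train_idx].view(-1), num_classes).float().mean(dim=0)
        onehot = mean_label.repeat(x.size(0), 1).to(x.device)
    elif isinstance(init, Tensor):
        # A 1-D tensor of length num_classes, broadcast to every node.
        assert init.dim() == 1 and init.numel() == num_classes
        onehot = init.to(x.device).repeat(x.size(0), 1)
    else:
        # A float fills every entry uniformly (0.0 reproduces the current code).
        onehot = torch.full((x.size(0), num_classes), float(init), device=x.device)
    return onehot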
Code lgtm modulo note on no need for explicit
Add label_usage.py to torch_geometric.nn.models and a unit test. Part of #4 (TODO update this) for our final project for the Stanford CS224W course, this implements Label Usage as described in "Bag of Tricks for Node Classification with Graph Neural Networks".
Description of Label Usage
Input dimension of the model is num_features + num_classes to accommodate concatenation of labels with features.
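To illustrate the widened input dimension, a sketch with an assumed GCN backbone (the model choice and sizes are illustrative, not the PR's benchmark setup):

import torch
from torch_geometric.nn import GCN

num_features, num_classes = 128, 40  # ogbn-arxiv dimensions, for illustration
# The backbone consumes [features || label one-hots], hence the wider input.
model = GCN(in_channels=num_features + num_classes, hidden_channels=64,
            num_layers=3, out_channels=num_classes)

x = torch.randn(10, num_features)
onehot = torch.zeros(10, num_classes)  # zeroed label channels for unlabeled nodes
edge_index = torch.tensor([[0, 1], [1, 0]])
out = model(torch.cat([x, onehot], dim=-1), edge_index)  # shape: (10, num_classes)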
Benchmark
Run on the Arxiv (ogbn-arxiv) dataset with 100 epochs, 10 recycling iterations, and a split ratio of 0.6 using a GAT model.