Why does the feature in instruction_bert_feature.json look like this? #8

Open
gqsmmz opened this issue Aug 6, 2024 · 2 comments

Comments


gqsmmz commented Aug 6, 2024

I found that for dataset_train = Traj_dataset(args, "train") in main.py, the instruction feature that gets read is self.substructure_feature[instruction][1][0][0].

As I understand it, a sentence such as "I need a comfortable place to sit." should yield a [1, 7, 1024] feature tensor from the BERT Large model, but self.substructure_feature["I need a comfortable place to sit."][1][0] has shape [10, 1024], and other sentences likewise have 3 more entries than their actual number of words. Is this because two double quotes ("") and a period (.) are added?
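One possible explanation for the count of 10, sketched under assumptions: BERT-style tokenizers wrap the input in a [CLS] token at the start and a [SEP] token at the end, and treat the final period as its own token. The snippet below only approximates that behaviour with a regular expression; it is not the real WordPiece tokenizer, it assumes every word in the sentence maps to a single token, and the helper name approx_bert_tokens is hypothetical.

```python
import re


def approx_bert_tokens(sentence):
    """Roughly mimic BERT-style tokenization for simple sentences.

    Assumption: every word is in the vocabulary as a single WordPiece,
    so no word is split into sub-tokens. This is NOT the real tokenizer.
    """
    # Split into runs of word characters and single punctuation marks.
    pieces = re.findall(r"\w+|[^\w\s]", sentence)
    # BERT inputs are wrapped in the special [CLS] and [SEP] tokens.
    return ["[CLS]"] + pieces + ["[SEP]"]


tokens = approx_bert_tokens("I need a comfortable place to sit.")
print(len(tokens), tokens)
```

Under these assumptions the 7 words become 10 positions, and the 3 extra ones would be [CLS], the period, and [SEP] rather than the double quotes.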

And what data does self.substructure_feature[instruction][0] hold?

Shouldn't we take all the features in self.substructure_feature[instruction][1][0] as input when reading the entire sentence? Why take only the zeroth element, self.substructure_feature[instruction][1][0][0], as input? What feature does this zeroth element represent?

When I take all of self.substructure_feature[instruction][1][0] as input, the following error occurs. It may be due to insufficient memory. Is this why you only read the zeroth element? And can the zeroth element represent the characteristics of the entire sentence?

image

@whcpumpkin
Owner

Hi,
I have somewhat forgotten how I saved the BERT features. Unfortunately, the HDD that holds the original feature-extraction code seems to be corrupted (it looks like a SATA cable issue). I'll replace it as soon as I can and get back to you with more details.

I believe I applied a pooling operation to the features of the sentence, like this:
#5 (comment)
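
A minimal sketch of what such a pooling step could look like, assuming the per-token features form a [num_tokens, hidden_size] matrix (for example [10, 1024]). The original extraction code is not available in this thread, so the function names and the choice of mean pooling here are assumptions for illustration, not the repository's actual method.

```python
def mean_pool(token_features):
    """Average per-token vectors into one sentence-level vector."""
    num_tokens = len(token_features)
    hidden_size = len(token_features[0])
    return [
        sum(row[d] for row in token_features) / num_tokens
        for d in range(hidden_size)
    ]


def first_token(token_features):
    """Alternative: keep only the vector at position 0."""
    return token_features[0]


# Toy stand-in for a [10, 1024] BERT output: 3 tokens, hidden size 4.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [2.0, 2.0, 2.0, 2.0],
]

sentence_vector = mean_pool(features)
print(len(sentence_vector), sentence_vector)
```

Either way the result is a single fixed-size vector per sentence, so the input size no longer grows with sentence length.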

Author

gqsmmz commented Aug 6, 2024

Thanks, I get it!
