
[Bert] Feature: Custom Model Outputs #31

Merged · 2 commits · May 8, 2024

Conversation

@bkonkle (Contributor) commented May 7, 2024

Closes #22

  • Adds an optional Pooler layer for text classification using models like plain BERT instead of RoBERTa.
  • Outputs both the last hidden states and the optional Pooler output when it is enabled (see the sketch after this list).
  • Adds the pooler layer to the loader so pretrained models can be used.
  • Makes pad_token_idx public for things like the batcher to use.
  • Removes .clone() in a few places where it isn't needed.
  • Adds .envrc to .gitignore for direnv users.
  • Adds .vscode to .gitignore for VS Code users.
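
For context, a minimal sketch of what such a dual output could look like; the type and field names here are illustrative assumptions, not the exact code in this PR:

```rust
use burn::tensor::{backend::Backend, Tensor};

/// Illustrative output type: the last hidden states are always returned,
/// while the pooled representation is only present when the pooling layer
/// is enabled in the model config.
#[derive(Debug)]
pub struct BertOutput<B: Backend> {
    /// Last hidden states: [batch_size, seq_len, d_model]
    pub hidden_states: Tensor<B, 3>,
    /// Pooled output: [batch_size, d_model]; `None` when pooling is disabled.
    pub pooled_output: Option<Tensor<B, 2>>,
}
```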

@nathanielsimard (Member) left a comment

LGTM. Pinging @laggui for an additional review before merging.

@laggui (Member) left a comment

Overall LGTM! One minor comment regarding the implementation.

However, I cloned your fork with the branch changes and tried to run the example with the same arguments as in the README, and it failed:

Model variant: roberta-base
thread 'main' panicked at src/loader.rs:304:56:
Config file present: InvalidFormat("missing field `with_pooling_layer` at line 21 column 1")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It seems the downloaded config does not include the newly added field, so parsing fails. We should handle that in the implementation.

Have you tried the example with your changes, @bkonkle?

@@ -18,13 +18,13 @@ pub struct BertEmbeddingsConfig {

#[derive(Module, Debug)]
pub struct BertEmbeddings<B: Backend> {
pub pad_token_idx: usize,
Member

Any particular reason why this is now public?

Contributor (Author)

This is specifically so that I can use it here, in the batcher for my text classification pipeline (and later for the token classification pipeline): https://github.com/bkonkle/burn-transformers/blob/0.1.0/src/pipelines/text_classification/batcher.rs#L72
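
As a hypothetical illustration of why a batcher wants access to the padding id (names here are made up; the linked burn-transformers batcher is the real code), padding a batch might look like this:

```rust
/// Right-pad every tokenized sequence in a batch to the longest length,
/// using the model's `pad_token_idx` as the fill value.
fn pad_batch(sequences: Vec<Vec<usize>>, pad_token_idx: usize) -> Vec<Vec<usize>> {
    let max_len = sequences.iter().map(|s| s.len()).max().unwrap_or(0);
    sequences
        .into_iter()
        .map(|mut tokens| {
            tokens.resize(max_len, pad_token_idx); // no-op if already max_len
            tokens
        })
        .collect()
}
```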

@bkonkle (Contributor, Author) commented May 7, 2024

Thanks for the review! I tested it, but then made a last-minute change to add with_pooling_layer to the config instead of passing it in as an argument. 😅 I failed to test it afterwards, and since this config property isn't present in the original BERT model config I need to default it to false. I'll fix it shortly.

I might be able to work up some GitHub Actions code to run those examples automatically as part of PR checks. I'll open a separate PR for that if I do.

Update: Defaulting it to false with #[config(default = false)] doesn't actually prevent the error, since the loader still expects the field when it attempts to parse the config from the base model file. The fact that it's not present in the base config file is why I originally structured the flag as an argument, but when Nathan suggested moving it into the config I didn't think it would be an issue. 😅 I'm working towards a solution now.

@bkonkle (Contributor, Author) commented May 7, 2024

I went with pub with_pooling_layer: Option<bool> to avoid the problems with loading the base model config, coupled with .unwrap_or(false) to resolve the wrapped value.
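
A minimal sketch of that approach, with the struct and method names assumed for illustration (the optional field lets older pretrained config files that omit the flag still deserialize):

```rust
use burn::config::Config;

#[derive(Config)]
pub struct BertModelConfig {
    /// Optional pooling layer on top of the encoder; absent in older configs.
    pub with_pooling_layer: Option<bool>,
}

impl BertModelConfig {
    /// Resolve the optional flag, treating a missing value as "no pooling".
    pub fn pooling_enabled(&self) -> bool {
        self.with_pooling_layer.unwrap_or(false)
    }
}
```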

Examples are working again for me:

Model variant: roberta-base
Input: Shape { dims: [3, 63] } // (Batch Size, Seq_len)
Roberta Sentence embedding Shape { dims: [3, 768] } // (Batch Size, Embedding_dim)

Model variant: bert-base-uncased
Input: Shape { dims: [3, 64] } // (Batch Size, Seq_len)
Roberta Sentence embedding Shape { dims: [3, 768] } // (Batch Size, Embedding_dim)

Model variant: roberta-large
Input: Shape { dims: [3, 63] } // (Batch Size, Seq_len)
Roberta Sentence embedding Shape { dims: [3, 1024] } // (Batch Size, Embedding_dim)

I'm using this in the burn-transformers library like this: https://github.com/bkonkle/burn-transformers/blob/0.2.0/src/models/bert/sequence_classification/text_classification.rs#L130-L131
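
As a hypothetical sketch of that downstream use, reusing the illustrative BertOutput type from earlier (not the actual burn-transformers code), a classifier might pick its features like this:

```rust
use burn::tensor::{backend::Backend, Tensor};

/// Prefer the pooled `[CLS]` vector when the pooling layer was enabled;
/// otherwise mean-pool the last hidden states over the sequence dimension.
fn classification_features<B: Backend>(output: BertOutput<B>) -> Tensor<B, 2> {
    match output.pooled_output {
        Some(pooled) => pooled,
        None => output.hidden_states.mean_dim(1).squeeze(1),
    }
}
```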

@laggui (Member) left a comment

Thanks for your contribution! 🙂

I'll approve with the latest changes.

@laggui merged commit 14ae737 into tracel-ai:main on May 8, 2024 (2 checks passed).
Linked issue: Custom BERT Model outputs