Layers are the fundamental building blocks for NLP models. They can be used to
assemble new tf.keras
layers or models.
- MultiHeadAttention implements an optionally masked attention between query, key, and value tensors, as described in "Attention Is All You Need". If from_tensor and to_tensor are the same, this is self-attention (see the sketch after this list).
- CachedAttention implements an attention layer with a cache, used for auto-regressive decoding.
- MatMulWithMargin implements a matrix multiplication with margin layer used for training retrieval/ranking tasks, as described in "Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax".
- MultiChannelAttention implements a variant of multi-head attention that can be used to merge multiple streams for cross-attention.
- TalkingHeadsAttention implements talking-heads attention, as described in "Talking-Heads Attention".
- Transformer implements an optionally masked transformer as described in "Attention Is All You Need".
- TransformerDecoderBlock is made up of self multi-head attention, cross multi-head attention, and a feedforward network.
- RandomFeatureGaussianProcess implements a random-feature-based Gaussian process, as described in "Random Features for Large-Scale Kernel Machines" (see the sketch after this list).
- ReZeroTransformer implements a Transformer with ReZero, as described in "ReZero is All You Need: Fast Convergence at Large Depth".
- OnDeviceEmbedding implements efficient embedding lookups designed for TPU-based models (see the embedding sketch after this list).
- PositionalEmbedding creates a positional embedding as described in "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding".
- SelfAttentionMask creates a 3D attention mask from a 2D tensor mask (see the sketch after this list).
- SpectralNormalization implements a tf.keras layer wrapper that applies spectral normalization regularization to the wrapped layer. See "Spectral Norm Regularization for Improving the Generalizability of Deep Learning".
- MaskedSoftmax implements a softmax with an optional masking input. If no mask is provided, it performs a standard softmax; if a mask tensor is provided (1 in positions where the data should be allowed through, 0 where it should be masked), the masked positions in the output are set to approximately zero (see the sketch after this list).
- MaskedLM implements a masked language model. It assumes the embedding table variable is passed to it.
- ClassificationHead implements a pooling head over a sequence of embeddings, commonly used for classification tasks.
- GaussianProcessClassificationHead implements a spectral-normalized neural Gaussian process (SNGP)-based classification head, as described in "Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness".
- GatedFeedforward implements the gated linear unit feedforward layer described in "GLU Variants Improve Transformer" (see the sketch after this list).
- MultiHeadRelativeAttention implements a variant of multi-head attention with support for relative position encodings, as described in "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context" (https://arxiv.org/abs/1901.02860). It also has extended support for segment-based attention, a re-parameterization introduced in "XLNet: Generalized Autoregressive Pretraining for Language Understanding" (https://arxiv.org/abs/1906.08237).
- TwoStreamRelativeAttention implements a variant of multi-head relative attention as described in "XLNet: Generalized Autoregressive Pretraining for Language Understanding" (https://arxiv.org/abs/1906.08237). It takes in a query stream and a content stream and applies self-attention.
- TransformerXL implements Transformer-XL, introduced in "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context" (https://arxiv.org/abs/1901.02860). This contains TransformerXLBlock, a block with either one- or two-stream relative self-attention and a subsequent feedforward network, and TransformerXL, which contains the attention biases and stacks multiple TransformerXLBlocks.
- MobileBertEmbedding and MobileBertTransformer implement the embedding layer and the transformer layer proposed in the MobileBERT paper.
- BertPackInputs, BertTokenizer, and SentencepieceTokenizer implement layers that tokenize raw text and pack it into the inputs expected by BERT models (see the sketch after this list).
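
The sketches below illustrate a few of the layers above. First, a minimal self-attention example. It uses the standard tf.keras.layers.MultiHeadAttention interface (query/value/key plus an optional attention_mask), which this library's attention layer is expected to mirror closely; all shapes and hyperparameters are illustrative.

```python
import tensorflow as tf

# Self-attention: query, key, and value all come from the same tensor.
batch, seq_len, hidden = 2, 16, 64
x = tf.random.normal([batch, seq_len, hidden])

mha = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=hidden // 8)

# Boolean attention mask of shape [batch, seq_len, seq_len];
# True marks positions that are allowed to attend to each other.
mask = tf.ones([batch, seq_len, seq_len], dtype=tf.bool)

output = mha(query=x, value=x, key=x, attention_mask=mask)
print(output.shape)  # (2, 16, 64)
```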
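
Next, a conceptual sketch of token plus learned position embeddings, in the spirit of OnDeviceEmbedding and PositionalEmbedding. It uses plain tf.keras.layers.Embedding for clarity; the library layers have their own constructor arguments, so this is not their API.

```python
import tensorflow as tf

# Conceptual token + position embedding lookup (illustrative sizes).
vocab_size, max_len, width = 30522, 128, 64

token_ids = tf.constant([[101, 2023, 2003, 102]])         # [batch, seq_len]
word_emb = tf.keras.layers.Embedding(vocab_size, width)   # token embedding table
pos_emb = tf.keras.layers.Embedding(max_len, width)       # learned position table

positions = tf.range(tf.shape(token_ids)[1])               # [0, 1, 2, 3]
embeddings = word_emb(token_ids) + pos_emb(positions)      # broadcast over batch
print(embeddings.shape)  # (1, 4, 64)
```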
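
The idea behind SelfAttentionMask is broadcasting a 2D padding mask of shape [batch, seq_len] into a 3D attention mask of shape [batch, seq_len, seq_len], so every query position may attend only to non-padding key positions. The sketch below reproduces that with plain TensorFlow ops, not the layer's exact implementation.

```python
import tensorflow as tf

# 2D padding mask: 1 = real token, 0 = padding.
padding_mask = tf.constant([[1, 1, 1, 0],
                            [1, 1, 0, 0]], dtype=tf.float32)   # [batch, seq_len]

# Tile the key-side mask across the query dimension.
seq_len = tf.shape(padding_mask)[1]
attention_mask = tf.tile(padding_mask[:, tf.newaxis, :], [1, seq_len, 1])
print(attention_mask.shape)  # (2, 4, 4)
```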
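
The masked-softmax idea can be written in a few lines of plain TensorFlow: a large negative number is added to masked positions so that, after the softmax, they receive approximately zero probability. This is a conceptual sketch, not the MaskedSoftmax layer's code.

```python
import tensorflow as tf

scores = tf.constant([[2.0, 1.0, 0.5, 3.0]])
mask = tf.constant([[1.0, 1.0, 1.0, 0.0]])   # 1 = keep, 0 = mask out

adder = (1.0 - mask) * -1e9                   # ~ -inf at masked positions
probs = tf.nn.softmax(scores + adder, axis=-1)
print(probs.numpy())                          # masked position is ~0
```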
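
The random-feature trick behind RandomFeatureGaussianProcess (Rahimi & Recht) can also be sketched directly: cosine features with random weights approximate an RBF kernel, and the GP layer then learns a linear output on top of those features. Dimensions and scales below are arbitrary, and this is a sketch of the underlying math, not the layer itself.

```python
import numpy as np

# Random Fourier features: phi(x) = sqrt(2/D) * cos(x @ W + b), with
# W ~ N(0, I) and b ~ Uniform(0, 2*pi), approximate exp(-||x - y||^2 / 2).
rng = np.random.default_rng(0)
d_in, num_features = 16, 4096

W = rng.normal(size=(d_in, num_features))
b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

def phi(x):
    return np.sqrt(2.0 / num_features) * np.cos(x @ W + b)

x = rng.normal(size=d_in)
y = x + 0.1 * rng.normal(size=d_in)

print(phi(x) @ phi(y))                       # approximate kernel value
print(np.exp(-0.5 * np.sum((x - y) ** 2)))   # exact RBF kernel value
```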
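
A gated feedforward block in the spirit of "GLU Variants Improve Transformer" multiplies an activated projection by a parallel linear "gate" projection before the output projection. The sketch below uses plain Keras Dense layers; names and sizes are illustrative and not the GatedFeedforward API.

```python
import tensorflow as tf

hidden, intermediate = 64, 256
x = tf.random.normal([2, 16, hidden])

gate = tf.keras.layers.Dense(intermediate, activation="gelu")(x)   # activated branch
linear = tf.keras.layers.Dense(intermediate)(x)                    # linear branch
output = tf.keras.layers.Dense(hidden)(gate * linear)              # output projection
print(output.shape)  # (2, 16, 64)
```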
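
Finally, a rough usage sketch of the tokenize-and-pack flow. The constructor arguments (vocab_file, lower_case, seq_length, special_tokens_dict) and the get_special_tokens_dict() call are assumed from the TensorFlow Models text examples and may differ across versions; "vocab.txt" is a placeholder path. Verify against the current API before relying on it.

```python
import tensorflow as tf
from official.nlp.modeling import layers

# "vocab.txt" is a placeholder path to a BERT WordPiece vocabulary file.
tokenizer = layers.BertTokenizer(vocab_file="vocab.txt", lower_case=True)
packer = layers.BertPackInputs(
    seq_length=128,
    special_tokens_dict=tokenizer.get_special_tokens_dict())

sentences = tf.constant(["hello world", "layers are building blocks"])
tokens = tokenizer(sentences)       # ragged token ids, one row per sentence
encoder_inputs = packer([tokens])   # dict: input_word_ids, input_mask, input_type_ids
```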