
Various parts of the spacetimeformer embedding #11

Open
qAp opened this issue Jan 25, 2022 · 4 comments

qAp commented Jan 25, 2022

Overall, the spacetimeformer embedding, spacetimeformer_model.nn.embed.SpacetimeformerEmbedding, takes inputs of shape (N, L, d_x) and (N, L, d_y), and outputs several embeddings of shape (N, L * d_y, d_model).

  • N is the batch size.
  • L is the sequence length. For the encoder, this is context_points; for the decoder, it's target_points.
  • d_x is the number of input features.
  • d_y is the number of output features.

The overall embedding consists of several embeddings: x_emb, y_emb, var_emb, and given_emb.
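
For concreteness, here is a shape-only sketch of these dimensions (all sizes are hypothetical):

```python
import torch

# Hypothetical sizes, for illustration only.
N, L, d_x, d_y, d_model = 2, 10, 3, 14, 32

x = torch.randn(N, L, d_x)  # time features per step
y = torch.randn(N, L, d_y)  # target values per step

# Spatiotemporal flattening: each (timestep, variable) pair becomes its
# own token, so a length-L sequence with d_y variables yields L * d_y
# tokens, each eventually embedded into d_model dimensions.
print(L * d_y)                # 140 tokens per sequence
print((N, L * d_y, d_model))  # (2, 140, 32): shape of each output embedding
```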

qAp commented Jan 25, 2022

x_emb
This is the time2vec embedding, which maps a timestamp to a vector. For more details, see: #9 (comment).

As there are only L timestamps, there are only L distinct time2vec vectors per sequence, so they are repeated d_y times to cover all L * d_y tokens:

t2v_emb = self.x_emb(x).repeat(1, d_y, 1)  # (N, L, dim) -> (N, L * d_y, dim)
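
A minimal, self-contained sketch of this repeat (a random tensor stands in for the time2vec output; all sizes are hypothetical):

```python
import torch

N, L, d_y, time_dim = 2, 10, 14, 6  # hypothetical sizes

# Stand-in for self.x_emb(x): one time2vec vector per timestamp.
t2v = torch.randn(N, L, time_dim)

# Tile along the sequence dimension so each of the L * d_y flattened
# tokens carries the embedding of its timestamp.
t2v_emb = t2v.repeat(1, d_y, 1)
print(t2v_emb.shape)  # torch.Size([2, 140, 6])
```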

qAp commented Jan 25, 2022

y_emb is for embedding the target values y.

There are L * d_y target values, each with an associated timestamp, so each value is concatenated with its time2vec embedding vector. The result is a vector one element longer than the time2vec vector, which is then passed through y_emb (in fact a linear layer) to become an embedding of length d_model.

For the whole batch, this gives a tensor of shape (N, L * d_y, d_model).
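
A sketch of this step, assuming a variable-major flattening consistent with the repeat above (names and sizes are illustrative; the linear layer stands in for y_emb):

```python
import torch
import torch.nn as nn

N, L, d_y, time_dim, d_model = 2, 10, 14, 6, 32  # hypothetical sizes

y = torch.randn(N, L, d_y)
t2v_emb = torch.randn(N, L * d_y, time_dim)  # x_emb output, repeated d_y times

# Flatten the targets variable-major so token k*L + t holds variable k
# at time t, matching the layout of the repeated time embeddings.
y_flat = torch.cat(y.chunk(d_y, dim=-1), dim=1)  # (N, L * d_y, 1)

# Attach each value's timestamp embedding, then project to d_model.
y_emb = nn.Linear(time_dim + 1, d_model)
val_time_emb = y_emb(torch.cat([y_flat, t2v_emb], dim=-1))
print(val_time_emb.shape)  # torch.Size([2, 140, 32])
```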

qAp commented Jan 25, 2022

var_emb embeds the target variables themselves, not their values.

For example, if there are 14 target variables, they can be indexed from 0 to 13; var_emb then maps index 0 to one vector, index 1 to another vector, and so on.

Or, in this competition, 'Bitcoin' is mapped to one vector, 'Binance' to another, etc.

In general, there are d_y embedding vectors, one for each target variable.

A variable's identity is independent of time, so each of the d_y embedding vectors is repeated L times.

For the batch, this gives a tensor of embeddings of shape (N, L * d_y, d_model).
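
A sketch of this with nn.Embedding (sizes are hypothetical; the index layout follows the variable-major flattening above):

```python
import torch
import torch.nn as nn

N, L, d_y, d_model = 2, 10, 14, 32  # hypothetical sizes

# One learned vector per target variable.
var_embedding = nn.Embedding(num_embeddings=d_y, embedding_dim=d_model)

# Variable-major layout: index 0 for the first L tokens, index 1 for
# the next L, and so on; identical for every sequence in the batch.
var_idx = torch.arange(d_y).repeat_interleave(L)  # (L * d_y,)
var_idx = var_idx.unsqueeze(0).expand(N, -1)      # (N, L * d_y)

var_emb = var_embedding(var_idx)
print(var_emb.shape)  # torch.Size([2, 140, 32])
```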

qAp commented Jan 25, 2022

given_emb embeds whether a target value y is available or not.

Here, "available or not" does not concern missing values in the original data, but rather whether the target value is meant to be predicted (not available) or to be used for prediction (available).

Being available is mapped to an embedding vector, while being unavailable is mapped to another.

For the entire batch, the resulting tensor is of shape (N, L * d_y, d_model).

This embedding tensor is summed with the embedding tensor returned by y_emb.
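
A sketch of the two-entry embedding and the final sum (the all-ones mask here is illustrative, e.g. an encoder sequence where every value is given):

```python
import torch
import torch.nn as nn

N, L, d_y, d_model = 2, 10, 14, 32  # hypothetical sizes

# Two learned vectors: one for "given" tokens (used for prediction),
# one for "not given" tokens (to be predicted).
given_embedding = nn.Embedding(num_embeddings=2, embedding_dim=d_model)

# Encoder-style mask: every target value is given (index 1);
# decoder target positions would instead carry index 0.
given = torch.ones(N, L * d_y, dtype=torch.long)
given_emb = given_embedding(given)

val_time_emb = torch.randn(N, L * d_y, d_model)  # stand-in for y_emb output
emb = val_time_emb + given_emb                   # summed, as noted above
print(emb.shape)  # torch.Size([2, 140, 32])
```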
