
LEAD: Linear Embedding Alignment across Deep Neural Network Language Models’ Representations

Note: This project/paper is not yet fully complete.

This repository contains the source code for the LEAD blog post by Gatlen Culp and Adriano Hernandez from MIT.

Published December 10th, 2024

Abstract

Recent advances in Large Language Models (LLMs) have demonstrated their remarkable ability to capture semantic information. We investigate whether different language embedding models learn similar semantic representations despite variations in architecture, training data, and initialization. Previous work explored model similarity through top-k results and Centered Kernel Alignment (CKA), with mixed results; however, in the field of large language embedding models, which we focus on, there is a gap: more modern similarity quantification methods from Computer Vision, such as model stitching, which operationalizes the notion of "similarity" in a way that emphasizes downstream utility, remain unexplored. We apply stitching by training linear and nonlinear (MLP) mappings, called "stitches," between embedding spaces, which aim to biject between embeddings of the same datapoints. We define two spaces as connectivity-aligned if stitches achieve low mean squared error, indicating approximate bijectivity.
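
To make the idea concrete, here is a minimal sketch (not the repo's actual training code) of fitting a linear stitch between two embedding spaces with PyTorch and the MSE objective described above. The tensor names, shapes, and hyperparameters are illustrative assumptions.

```python
import torch

# Paired embeddings of the same N documents from two models
# (hypothetical shapes; real data would come from the embedding models).
emb_a = torch.randn(5000, 768)    # source model's embeddings
emb_b = torch.randn(5000, 1024)   # target model's embeddings

# A linear "stitch" mapping the source space into the target space.
stitch = torch.nn.Linear(emb_a.shape[1], emb_b.shape[1])
optimizer = torch.optim.Adam(stitch.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(stitch(emb_a), emb_b)  # low MSE ~ connectivity-aligned
    loss.backward()
    optimizer.step()
```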

Our analysis spans 6 embedding datasets (5,000-20,000 documents), 18 models (20-30 layers each, including both open-source and OpenAI models), and stitches ranging from linear maps to MLPs 7 layers deep, with a focus on linear stitches. We hoped that stitching would recover the similarity between models, aligning with a strong interpretation of the platonic representation hypothesis. However, the picture appears to be more complicated. Our results suggest that embedding models are not linearly connectivity-aligned: linear stitches do not perform significantly better than mean estimators. A brief foray into MLPs suggests that shallow MLPs do not necessarily work out of the box either, but more work remains to be done on non-linear stitches, since we have not fully maximized their potential here. Stitches are important because their success can be used to define operational, and therefore useful, notions of representational similarity. Our findings buttress the hypothesis that alignment metrics such as CKA are not always informative of behavioral or feature overlap between models.
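
For reference, the mean-estimator baseline mentioned above can be sketched as follows: predict the per-dimension mean of the target embeddings for every input. If a trained stitch does not achieve a lower MSE than this, it has learned little beyond the target distribution's mean. The function name and shapes here are illustrative, not taken from the repo.

```python
import torch

def mean_baseline_mse(emb_b: torch.Tensor) -> torch.Tensor:
    # Predict the column-wise mean of the target embeddings for every row.
    mean_pred = emb_b.mean(dim=0, keepdim=True).expand_as(emb_b)
    return torch.nn.functional.mse_loss(mean_pred, emb_b)

# Usage: compare a trained stitch's MSE against this baseline.
baseline = mean_baseline_mse(torch.randn(5000, 1024))
```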

To read the rest of the blog post, click here.

Other Resources

View our HuggingFace datasets here and our trained stitches here.

Our work was heavily influenced by the Beyond Benchmarks paper, whose code we used; you can find the source here.