Discoveries in transformer anisotropy and intrinsic dimensions by AIRI. #ComplexDynamics

Transformer-based models, introduced by Google Brain researchers in 2017, have significantly impacted natural language processing and computer vision with their attention mechanism and learned embeddings. A team of researchers from several institutions studied how the anisotropy and local intrinsic dimension of intermediate embeddings in transformer models change during training. Anisotropy measures how unevenly embeddings are spread across the directions of the embedding space, that is, how strongly the cloud of vectors is stretched along a few axes. Local intrinsic dimension characterizes the complexity of the data in a small neighborhood of a point: roughly, how many independent directions are needed to describe the embeddings near it. Experiments with language models revealed consistent patterns in how anisotropy and intrinsic dimension behave, and these patterns differ between encoder and decoder models. The study also found that the intrinsic dimension of representations rises early in training and then gradually falls, indicating a two-phase process of inflation followed by compression. These insights improve understanding of transformer architectures and may help make training and inference more efficient. The research was presented at the EACL-2024 conference, with more details available in the preprint.
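To make the two quantities concrete, here is a minimal sketch in Python with NumPy. It is not taken from the study: the function names `anisotropy` and `local_intrinsic_dimension` and the parameter `k` are illustrative assumptions. Anisotropy is computed with one common definition (the share of total variance captured by the largest singular direction of the centered embeddings), and local intrinsic dimension with the Levina-Bickel maximum-likelihood estimator over k nearest neighbors. The study may use different definitions or estimators.

```python
import numpy as np


def anisotropy(embeddings):
    """Share of variance along the top singular direction (assumed definition).

    Values near 1 mean the cloud is stretched along one axis;
    values near 1/d mean it is spread evenly over d dimensions.
    """
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return float(singular_values[0] ** 2 / np.sum(singular_values ** 2))


def local_intrinsic_dimension(embeddings, k=20):
    """Per-point Levina-Bickel MLE of intrinsic dimension (swapped-in estimator).

    Assumes all points are distinct; duplicates give zero distances
    and break the logarithm.
    """
    # Brute-force pairwise Euclidean distances, fine for a few thousand points.
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting, column 0 is each point's distance to itself (zero); skip it.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    # log(T_k / T_j) for j = 1..k-1, where T_j is the distance to the j-th neighbor.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    return (k - 1) / log_ratios.sum(axis=1)


# Synthetic stand-ins for embeddings (no real model data is used here).
rng = np.random.default_rng(0)
isotropic = rng.normal(size=(2000, 10))          # evenly spread in 10 dimensions
stretched = isotropic * np.array([10.0] + [1.0] * 9)  # one dominant direction
plane_in_10d = np.hstack([rng.normal(size=(500, 2)), np.zeros((500, 8))])

print("anisotropy, isotropic cloud:", anisotropy(isotropic))
print("anisotropy, stretched cloud:", anisotropy(stretched))
print("mean local intrinsic dimension, 2-D plane in 10-D:",
      local_intrinsic_dimension(plane_in_10d).mean())
```

In a setup like the one the article describes, these functions would be applied to the embeddings of each layer at successive training checkpoints, so that the values can be tracked by layer and over training time.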

Source link: https://medium.com/airi-institute/complex-dynamics-of-anisotropy-and-intrinsic-dimensions-have-been-discovered-in-transformers-90d10aa91c3c?source=rss——llm-5