The Matrix: A Bayesian learning model for LLMs
arXiv:2402.03175 (cs)
Authors: Siddhartha Dalal and Vishal Misra
Abstract: In this paper, we introduce a Bayesian learning model to understand the behavior of Large Language Models (LLMs). We explore the optimization metric of LLMs, which is based on predicting the next token, and develop a novel model grounded in this principle. Our approach involves constructing an ideal generative text model represented by a multinomial transition probability matrix with a prior, and we examine how LLMs approximate this matrix. We discuss the continuity of the mapping between embeddings and multinomial distributions, and present the Dirichlet approximation theorem to approximate any prior. Additionally, we demonstrate how text generation by LLMs aligns with Bayesian learning principles and delve into the implications for in-context learning, specifically explaining why in-context learning emerges in larger models where prompts are considered as samples to be updated. Our findings indicate that the behavior of LLMs is consistent with Bayesian Learning, offering new insights into their functioning and potential applications.
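The Bayesian update the abstract alludes to can be illustrated with a standard Dirichlet-multinomial conjugate update for a single row of a next-token probability matrix. This is a minimal sketch, not the paper's construction: the three-token vocabulary, the uniform prior `alpha`, the observed `counts`, and the function name `posterior_next_token` are all assumptions made for illustration.

```python
import numpy as np

def posterior_next_token(alpha, counts):
    """Posterior-predictive next-token distribution for one matrix row.

    alpha  : Dirichlet prior pseudo-counts over the vocabulary (assumed).
    counts : how often each token followed this context in the prompt (assumed).
    With a Dirichlet prior and multinomial likelihood, the posterior is
    Dirichlet(alpha + counts), and its mean is the predictive distribution.
    """
    alpha = np.asarray(alpha, dtype=float)
    counts = np.asarray(counts, dtype=float)
    post = alpha + counts
    return post / post.sum()

# Hypothetical 3-token vocabulary with a uniform prior.
alpha = np.array([1.0, 1.0, 1.0])

# Before any in-context evidence, the prediction is just the prior mean.
prior_pred = posterior_next_token(alpha, np.zeros(3))

# Treat the prompt as samples: token 1 followed this context four times.
counts = np.array([0, 4, 0])
post_pred = posterior_next_token(alpha, counts)

print(prior_pred)  # uniform over the three tokens
print(post_pred)   # probability mass shifts toward token 1
```

In this toy setting, more in-context samples move the predictive distribution further from the prior and toward the empirical frequencies in the prompt, which is the sense in which a prompt can be read as evidence to be updated on.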
Submission history
From: Vishal Misra
[v1] Mon, 5 Feb 2024 16:42:10 UTC (305 KB)