Merullo et al. (2024) showed that LLMs leverage task vectors alongside residual streams for Word2Vec-like vector arithmetic in ICL tasks
How media typically covers Merullo
Research or work cited
A theoretical framework proves that nonlinear residual transformers perform in-context learning via vector arithmetic, with provable convergence and generalization guarantees including robustness to concept recombination.
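To make the idea of Word2Vec-like vector arithmetic in ICL concrete, here is a minimal toy sketch. It is not the experimental setup of Merullo et al. (2024) or the construction used in the theoretical framework. It assumes hand-built embeddings in which every country and its capital differ by one shared offset plus small noise. The names `emb`, `task_vector`, and `decode` are hypothetical and chosen only for illustration. A task vector is estimated from three demonstration pairs, added to the query embedding (standing in for an update to the residual stream), and the result is decoded by nearest cosine neighbor.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 32
pairs = [("France", "Paris"), ("Italy", "Rome"),
         ("Poland", "Warsaw"), ("Japan", "Tokyo")]

# Assumption: one shared "capital-of" offset relates each country to its capital.
offset = rng.normal(size=dim)
emb = {}
for country, capital in pairs:
    emb[country] = rng.normal(size=dim)
    # The capital is the country embedding shifted by the offset, plus small noise.
    emb[capital] = emb[country] + offset + 0.01 * rng.normal(size=dim)


def task_vector(demos):
    """Average the (output - input) differences over the demonstration pairs."""
    return np.mean([emb[y] - emb[x] for x, y in demos], axis=0)


def decode(vec):
    """Return the vocabulary token whose embedding has the highest cosine similarity."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(emb, key=lambda tok: cos(emb[tok], vec))


demos = pairs[:3]                        # in-context demonstrations
tv = task_vector(demos)                  # estimated task vector
prediction = decode(emb["Japan"] + tv)   # held-out query plus task vector
print(prediction)
```

In this toy setting the held-out query decodes to "Tokyo", because the averaged difference recovers the shared offset almost exactly. Real models are not guaranteed to have such a clean linear structure, which is the gap that the convergence and generalization guarantees mentioned above are meant to address.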