
Large Language Models process input with an ordered sequence of layers. Let $l_i(x)$ denote the result of calling the $i$th layer on some input $x$; the value returned by a model with three layers is denoted $M(x) = l_3(l_2(l_1(x)))$.
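The composition can be sketched in a few lines of Python. The three toy layers below are hypothetical stand-ins (real layers act on tensors, not scalars), and `compose` is an illustrative helper, not part of any library.

```python
from functools import reduce
from typing import Callable, Sequence

Layer = Callable[[float], float]

def compose(layers: Sequence[Layer]) -> Layer:
    """Return M such that M(x) = l_n(...l_2(l_1(x))...)."""
    def model(x: float) -> float:
        # Feed x through the layers in order, first layer first.
        return reduce(lambda acc, layer: layer(acc), layers, x)
    return model

# Hypothetical toy layers, chosen only to make the ordering visible.
l1 = lambda x: x + 1
l2 = lambda x: x * 2
l3 = lambda x: x - 3

layers = [l1, l2, l3]
M = compose(layers)            # M(x) = l3(l2(l1(x)))
l2_output = compose(layers[:2])  # output of the 2nd layer, before l3 runs
```

Under this framing, sampling the output of an intermediate layer amounts to composing only the first few layers, as `l2_output` does here.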

In this essay, I analyze the output of $l_{12}$ (of $32$) of Meta's seven-billion-parameter LLM (LLaMA-7b) using techniques from The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets by Samuel Marks and Max Tegmark.

I curate (code) datasets of questions about math problems (for example, "What is $2/2+6$?"). I send those questions through LLaMA-7b and sample the output of $l_{12}$. This output is high-dimensional and difficult to visualize, so I perform principal component analysis on the samples to find a two-dimensional representation of $l_{12}$'s output.
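The PCA step can be illustrated with NumPy alone. This is a minimal sketch, not the essay's actual code: the activations are random stand-ins rather than real samples from LLaMA-7b, the sample count of 40 is arbitrary, and `pca_project` is a hypothetical helper name. The feature dimension of 4096 matches LLaMA-7b's hidden size.

```python
import numpy as np

def pca_project(samples: np.ndarray, k: int = 2) -> np.ndarray:
    """Project rows of `samples` (n_samples, n_features) onto the top-k principal components."""
    # PCA operates on mean-centered data.
    centered = samples - samples.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Assumption: synthetic data standing in for sampled layer-12 outputs.
rng = np.random.default_rng(0)
activations = rng.normal(size=(40, 4096))

points = pca_project(activations, k=2)  # one (x, y) pair per sample
```

Each row of `points` gives the coordinates of one sample along the first and second principal components, which is the form needed for a two-dimensional scatter plot.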

Figure 1: Each point corresponds to a question (full list) in the form "What is $\circ$", where $\circ$ is a math problem in the form $a+b-c$ or $a-b+c$ resulting in either $4$ or $7$. Each question is sent through LLaMA-7b and the output of $l_{12}$ is sampled. PCA is then performed on the samples, and each sample is projected onto the top two principal components to create this plot. The horizontal axis corresponds to the first principal component, and the vertical one to the second.
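A dataset of this shape could be enumerated as follows. This is a hypothetical generator, not the one behind the linked list: the operand range (1 through 9), the absence of spaces in the expressions, and the trailing question mark are all assumptions made for illustration.

```python
from itertools import product

def make_questions(targets=(4, 7), max_operand=9):
    """Enumerate (question, answer) pairs for a+b-c and a-b+c with answers in `targets`."""
    questions = []
    # Assumption: operands are single digits from 1 to max_operand.
    for a, b, c in product(range(1, max_operand + 1), repeat=3):
        candidates = (
            (f"{a}+{b}-{c}", a + b - c),
            (f"{a}-{b}+{c}", a - b + c),
        )
        for expr, value in candidates:
            if value in targets:
                questions.append((f"What is {expr}?", value))
    return questions

questions = make_questions()
```

Keeping the answer alongside each question makes it straightforward to label the projected points by result when plotting.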
